Toorcon 2010 Talk

My over-caffeinated self somehow managed to stumble through the talk at ToorCon. I’m self-critical about the whole thing, but it was still a great experience overall, and I’m glad I did it.

I was totally nervous. This was my first ‘con’, the room was packed (people standing along the wall), and I spotted relatively famous hackers in the audience. I needed more beer!

Hopefully at the next one I’ll relax, slow down, not use filler words, etc. :)



email_spider

This was a small part of a project that was itself about 1/3 of my graduate project. I used it to collect certain information. Here is the excerpt from the paper.

Website Email Spider Program

In order to automatically process publicly available email addresses, a simple tool was developed, with source code available in Appendix A. An automated tool is able to process web pages in a way that is less error prone than manual methods, and it also makes processing the sheer number of websites possible (or at least less tedious).
This tool begins at a few root pages, which can be supplied as a comma-delimited list. From these, it searches for all unique links, keeping a queue of URLs already found so that pages are not usually revisited (although revisiting a page is still possible if the server is case insensitive or if equivalent pages are dynamically generated with unique URLs). In addition, the base class is passed a website scope so that pages outside of that scope are not spidered. By default, the scope is simply a regular expression including the top domain name of the organization.
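The queue-and-scope idea can be sketched without any network access. The following is a minimal Python 3 sketch, not the tool itself: `crawl`, `get_links`, and `fake_site` are hypothetical names, and a hard-coded link graph stands in for real HTTP requests.

```python
import re
from collections import deque

def crawl(roots, scope, get_links, max_pages=10000):
    """Breadth-first crawl that visits each in-scope URL at most once."""
    scope_re = re.compile(scope)
    unique_roots = list(dict.fromkeys(roots))  # drop duplicate roots, keep order
    queue = deque(unique_roots)
    seen = set(unique_roots)
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            # Only queue links that are new and match the scope expression.
            if link not in seen and scope_re.search(link):
                seen.add(link)
                queue.append(link)
    return visited

# Hypothetical link graph standing in for fetched pages.
fake_site = {
    "http://example.com/": ["http://example.com/a", "http://other.org/"],
    "http://example.com/a": ["http://example.com/", "http://example.com/b"],
    "http://example.com/b": [],
}

order = crawl(["http://example.com/"], r"example\.com",
              lambda u: fake_site.get(u, []))
print(order)
```

The out-of-scope link to other.org is never queued, and the link back to the root page is ignored because it has already been seen.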

The contents of each requested page are searched with the following regular expression to identify common email formats:

[\w_.-]{3,}@[\w_.-]{6,}

The 3 and 6 repeaters were necessary because of false positives otherwise obtained due to various encodings. This regular expression will not obtain all email addresses. However, it will obtain the most common addresses with a minimum of false positives. In addition, the obtained email addresses are run against a blacklist of uninteresting generic form addresses (such as help@example.com, info@example.com, or sales@example.com).
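As a rough illustration of the matching and blacklist steps, here is a small Python 3 sketch. The names `EMAIL_RE`, `GENERIC`, and `extract_emails` are hypothetical, the blacklist is an assumed subset, and filtering on the part before the `@` is one possible way to implement the blacklist rather than a description of the original tool.

```python
import re

# Same shape as the expression quoted above.
EMAIL_RE = re.compile(r"[\w_.-]{3,}@[\w_.-]{6,}")

# Hypothetical blacklist of generic local parts (the text before the "@").
GENERIC = {"help", "info", "sales"}

def extract_emails(html):
    """Return unique, non-generic addresses in order of first appearance."""
    found = []
    for addr in EMAIL_RE.findall(html):
        local = addr.split("@", 1)[0].lower()
        if local not in GENERIC and addr not in found:
            found.append(addr)
    return found

sample = ('<a href="mailto:alice.smith@example.com">Alice</a> '
          'info@example.com x@y.z bob_jones@mail.example.org')
print(extract_emails(sample))
```

The short string `x@y.z` is rejected by the repeat counts, and `info@example.com` is matched but then dropped by the blacklist.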

These email addresses are saved in memory and reported when the program completes or is interrupted. Note that because some pages are generated dynamically, the spider can potentially run indefinitely and must be interrupted (for example, a calendar application whose links go back in time without end). Most emails seemed to be obtained in the first 1,000 pages crawled, and a limit of 10,000 pages was chosen as a reasonable scope. This limit was reached several times, but because the spider uses a breadth-first search, most unique addresses were obtained early in the spidering process, and extending the number of pages tended to give diminishing returns. Despite this, websites with more pages also tended to return more email addresses (see analysis section).

Much of the logic in the spidering tool is dedicated to correctly parsing HTML. By their nature, web pages vary widely in how they write links, with many sites using a mix of directory traversal, absolute URLs, and partial URLs. It is no surprise there are so many security vulnerabilities related to browsers parsing this complex data.
There is also an effort to make the software somewhat more efficient by ignoring superfluous links to objects such as documents, executables, etc. If such a file is encountered anyway, an exception handler catches the processing error, but fetching these files still consumes resources.
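For comparison, much of this link handling can be delegated to the standard library. The sketch below is Python 3 and is not how the tool itself does it: `normalize` and `SKIP_EXT` are hypothetical names, and the extension list is abbreviated.

```python
import posixpath
from urllib.parse import urljoin, urldefrag, urlparse

# Abbreviated, hypothetical list of extensions not worth fetching.
SKIP_EXT = {".pdf", ".exe", ".doc", ".jpg", ".png", ".css", ".zip"}

def normalize(page_url, href):
    """Resolve href against page_url; return None for links to skip."""
    href = href.strip()
    if not href or href.startswith("#"):
        return None
    # urljoin handles "../", "/path", "//host/path" and absolute URLs.
    absolute, _fragment = urldefrag(urljoin(page_url, href))
    parts = urlparse(absolute)
    if parts.scheme not in ("http", "https"):
        return None  # e.g. mailto: or javascript: links
    ext = posixpath.splitext(parts.path)[1].lower()
    if ext in SKIP_EXT:
        return None
    return absolute

base = "http://example.com/dir/page.html"
for href in ["../other.html", "//cdn.example.com/x", "/staff/", "report.PDF"]:
    print(href, "->", normalize(base, href))
```

Lowercasing the extension before the lookup avoids listing both `.jpg` and `.JPG` separately.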

Using this tool is straightforward, but a certain familiarity is expected – it was not developed for an end user but for this specific experiment. For example, a URL is best processed in the format http://example.com/ since in its current state it would use example.com to verify that spidered addresses are within a reasonable scope. It prints debugging messages constantly because every site seemed to have unique parsing quirks. Although other formats and usages may work, there was little effort to make this software easy to use.
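To show why the http://example.com/ form matters, here is a hedged Python 3 sketch of deriving a default scope from a root URL. `default_scope` is a hypothetical helper. Unlike the tool, it escapes the host name so that the dot is matched literally, which is an assumption about what is wanted rather than the original behavior.

```python
import re
from urllib.parse import urlparse

def default_scope(root_url):
    """Build a scope regular expression from the host part of a root URL."""
    host = urlparse(root_url).netloc
    if not host:
        # A bare "example.com" has no scheme, so no host can be extracted.
        raise ValueError("expected a URL like http://example.com/")
    return re.escape(host)

scope = default_scope("http://example.com/")
print(bool(re.search(scope, "http://example.com/staff")))
print(bool(re.search(scope, "http://examplexcom.net/")))
```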

Here is the source.

#!/usr/bin/python

import HTMLParser
import urllib2
import re
import sys
import signal
import socket

socket.setdefaulttimeout(20)

#spider is meant for a single url
#proto can be http, https, or any
class PageSpider(HTMLParser.HTMLParser):
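  def __init__(self, url, scope, searchList=None, emailList=None, errorDict=None):
    HTMLParser.HTMLParser.__init__(self)
    self.url = url
    self.scope = scope
    #avoid mutable default arguments, which are shared between instances
    self.searchList = searchList if searchList is not None else []
    self.emailList = emailList if emailList is not None else []
    if errorDict is None:
      errorDict = {"decoding":0, "link":0, "parse":0}
    try:
      urlre = re.search(r"(\w+):[/]+([^/]+).*", self.url)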
      self.baseurl = urlre.group(2)
      self.proto = urlre.group(1)
    except AttributeError:
      raise Exception("URLFormat", "URL passed is invalid")
    if self.scope == None:
      self.scope = self.baseurl
    try:
      req = urllib2.urlopen(self.url)
      htmlstuff = req.read()
    except KeyboardInterrupt:
      raise
    except urllib2.HTTPError:
      #not able to fetch a url eg 404
      errorDict["link"] += 1
      print "Warning: link error"
      return
    except urllib2.URLError:
      errorDict["link"] += 1
      print "Warning: URLError"
      return
    except ValueError:
      errorDict["link"] += 1
      print "Warning link error"
      return
    except:
      print "Unknown Error", self.url
      errorDict["link"] += 1
      return
    emailre = re.compile(r"[\w_.-]{3,}@[\w_.-]{2,}\.[\w_.-]{2,}")
    nemail = re.findall(emailre, htmlstuff)
    for i in nemail:
      if i not in self.emailList:
        self.emailList.append(i)
    try:
      self.feed(htmlstuff)
    except HTMLParser.HTMLParseError:
      errorDict["parse"] += 1
      print "Warning: HTML Parse Error"
      pass
    except UnicodeDecodeError:
      errorDict["decoding"] += 1
      print "Warning: Unicode Decode Error"
      pass
  def handle_starttag(self, tag, attrs):
    if (tag == "a" or tag =="link") and attrs:
      #process the url formats, make sure the base is in scope
      for k, v in attrs:
        #check it's an htref and that it's within scope
        if  (k == "href" and
            ((("http" in v) and (re.search(self.scope, v))) or
            ("http" not in v)) and
            (not (v.endswith(".pdf") or v.endswith(".exe") or
             v.endswith(".doc") or v.endswith(".docx") or
             v.endswith(".jpg") or v.endswith(".jpeg") or
             v.endswith(".png") or v.endswith(".css") or
             v.endswith(".gif") or v.endswith(".GIF") or
             v.endswith(".mp3") or v.endswith(".mp4") or
             v.endswith(".mov") or v.endswith(".MOV") or
             v.endswith(".avi") or v.endswith(".flv") or
             v.endswith(".wmv") or v.endswith(".wav") or
             v.endswith(".ogg") or v.endswith(".odt") or
             v.endswith(".zip") or v.endswith(".gz") or
             v.endswith(".bz") or v.endswith(".tar") or
             v.endswith(".xls") or v.endswith(".xlsx") or
             v.endswith(".qt") or v.endswith(".divx") or
             v.endswith(".JPG") or v.endswith(".JPEG")))):
          #Also todo - modify regex so that >= 3 chars in front >= 7 chars in back
          url = self.urlProcess(v)
          #TODO 10000 is completely arbitrary
          if (url not in self.searchList) and (url != None) and len(self.searchList) < 10000:
            self.searchList.append(url)
  #returns complete url in the form http://stuff/bleh
  #as input handles (./url, http://stuff/bleh/url, //stuff/bleh/url)
  def urlProcess(self, link):
    link = link.strip()
    if "http" in link:
      return (link)
    elif link.startswith("//"):
      return self.proto + "://" + link[2:]
    elif link.startswith("/"):
      return self.proto + "://" + self.baseurl + link
    elif link.startswith("#"):
      return None
    elif ":" not in link and " " not in link:
      while link.startswith("../"):
        link = link[3:]
        #TODO [8:-1] is just a heuristic, but too many misses shouldn't be bad... maybe?
        if self.url.endswith("/") and ("/" in self.url[8:-1]):
          self.url = self.url[:self.url.rfind("/", 0, -1)] + "/"
      dir = self.url[:self.url.rfind("/")] + "/"
      return dir + link
    return None

class SiteSpider:
  def __init__(self, searchList, scope=None, verbocity=True, maxDepth=4):
    #TODO maxDepth logic
    #necessary to add to this list to avoid infinite loops
    self.searchList = searchList
    self.emailList = []
    self.errors = {"decoding":0, "link":0, "parse":0, "connection":0, "unknown":0}
    if scope == None:
      try:
        urlre = re.search(r"(\w+):[/]+([^/]+).*", self.searchList[0])
        self.scope = urlre.group(2)
      except AttributeError:
        raise Exception("URLFormat", "URL passed is invalid")
    else:
      self.scope = scope
    index = 0
    threshhold = 0
    while 1:
      try:
        PageSpider(self.searchList[index], self.scope, self.searchList, self.emailList, self.errors)
        if verbocity:
          print self.searchList[index]
          print " Total Emails:", len(self.emailList)
          print " Pages Processed:", index
          print " Pages Found:", len(self.searchList)
        index += 1
      except IndexError:
        break
      except KeyboardInterrupt:
        break
      except:
        threshhold += 1
        print "Warning: unknown error"
        self.errors["unknown"] += 1
        if threshhold >= 40:
          break
        pass
    garbageEmails =   [ "help",
                        "webmaster",
                        "contact",
                        "sales" ]
    print "REPORT"
    print "----------"
    for email in self.emailList:
      #garbageEmails holds local parts, so compare the text before the "@"
      if email.split("@")[0].lower() not in garbageEmails:
        print email
    print "\nTotal Emails:", len(self.emailList)
    print "Pages Processed:", index
    print "Errors:", self.errors

if __name__ == "__main__":
  SiteSpider(sys.argv[1].split(","))

overthewire vortex level 0

SPOILER. These games are awesome. Find them at http://www.overthewire.org.

#!/usr/bin/python

#edited so it doesn't quite work...

import socket
import struct

HOST='host'
PORT=1111
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST,PORT))

blob = ""
#no idea why 2 packets... but seems to be consistent
for i in range (0,2):
  data = s.recv(2048)
  blob = blob + data

print "DATA: ", data
print len(blob)
#blob should be 4 unsigned ints
intdata = struct.unpack("IIII", blob)
total=0
for i in intdata:
  total += i

myblob = struct.pack("I", total)
s.send(myblob)

pw = s.recv(1024)
print pw
s.close()
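
The "2 packets" comment above is probably a side effect of TCP being a byte stream: a single `recv` call may return fewer bytes than were sent. A common pattern is to loop until the expected number of bytes has arrived. The sketch below is Python 3 and runs against a local socket pair rather than the game server. `recv_exact` and `sum_u32` are hypothetical helpers, and the little-endian format and 32-bit wraparound are assumptions.

```python
import socket
import struct

def recv_exact(sock, count):
    """Keep calling recv until exactly count bytes have been read."""
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before %d bytes arrived" % count)
        buf += chunk
    return buf

def sum_u32(blob):
    """Sum four little-endian unsigned ints, wrapping at 32 bits (assumed)."""
    values = struct.unpack("<4I", blob)
    return sum(values) & 0xFFFFFFFF

# Local demonstration: send 16 bytes in two pieces, read them as one blob.
a, b = socket.socketpair()
payload = struct.pack("<4I", 1, 2, 3, 0xFFFFFFFF)
a.sendall(payload[:8])
a.sendall(payload[8:])
blob = recv_exact(b, 16)
total = sum_u32(blob)
a.close()
b.close()
print(len(blob), total)
```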

pydbg reverseme solution

Last week I wrote a keygen here.

This is an almost identical problem, but the binary has been patched to allow debugging (I may do this programmatically as well, but not yet). I wanted to solve this with programmatic debugging. Here is the exe:
Ice9pch3.

The code simply sets a breakpoint and prints the key to the screen. Also it patches the process memory so that the serial is valid.

import sys
import ctypes

from pydbg import *
from pydbg.defines import *


print "This is a very stupid keygen that uses a debug method and grabs the key from memory"
print "prints out the valid key, and writes it to memory"
print "Basically, pydbg 'hello, world'"
print "-------------"

if len(sys.argv) != 2:
    print "Error. USAGE: keygen.py C:fullpathice"
    sys.exit(-1)

def handler_breakpoint(mdbg):
    valid_str = ""
    #the valid serial is at 004030C8
    addr = 0x004030C8
    while 1:
        tmp = mdbg.read(addr, 1)
        addr += 1
        if tmp != "\x00":
            valid_str = valid_str + tmp
        else:
            break
    print "The valid string is: ", valid_str
    print "Writing this to memory..."
    #write this to memory at 004030b4
    #def write (self, address, data, length=0)
    wdata = ctypes.create_string_buffer(valid_str)
    mdbg.write(0x00403198, wdata, len(valid_str))
    #checking the write
    #print mdbg.read(0x00403198, len(valid_str) + 1)
    return DBG_CONTINUE

dbg = pydbg()
dbg.set_callback(EXCEPTION_BREAKPOINT, handler_breakpoint)
dbg.load(sys.argv[1])
dbg.debug_event_iteration()
#at 004011FF in execution, 
#def bp_set (self, address, description="", restore=True, handler=None):
dbg.bp_set(0x004011F5)
dbg.debug_event_loop()

Updated solution. I change a register now to circumvent the isdebuggerpresent call.

import sys
import ctypes

from pydbg import *
from pydbg.defines import *


print "This is a very stupid keygen that uses a debug method and grabs the key from memory"
print "prints out the valid key, and writes it to memory"
print "Basically, pydbg 'hello, world'"
print "-------------"

if len(sys.argv) != 2:
    print "Error. USAGE: keygen.py C:fullpathice"
    sys.exit(-1)

def handler_breakpoint(mdbg):
    if mdbg.get_register("EIP") == 0x004011F5:
        valid_str = ""
        #the valid serial is at 004030C8
        addr = 0x004030C8
        while 1:
            tmp = mdbg.read(addr, 1)
            addr += 1
            if tmp != "\x00":
                valid_str = valid_str + tmp
            else:
                break
        print "The valid string is: ", valid_str
        print "Writing this to memory..."
        #write this to memory at 004030b4
        #def write (self, address, data, length=0)
        #wdata = ctypes.create_string_buffer(valid_str)
        mdbg.write(0x00403198, valid_str, len(valid_str))
        #checking the write
        #print mdbg.read(0x00403198, len(valid_str) + 1)
    if mdbg.get_register("EIP") == 0x40106e:
        mdbg.set_register("EAX", 0)
    return DBG_CONTINUE

dbg = pydbg()
dbg.set_callback(EXCEPTION_BREAKPOINT, handler_breakpoint)
dbg.load(sys.argv[1])
dbg.debug_event_iteration()
#0x40106e is the point where we can circumvent the isdebugger present call
dbg.bp_set(0x40106e)
#at 004011FF in execution, 
#breakpoint for reading/writing at the final compare
dbg.bp_set(0x004011F5)
dbg.debug_event_loop()

Reverseme Windows Keygen

This one was challenging for me, and took me several hours, but was fun. I got caught up on certain parts that may not have been too difficult, but, yeah…

http://crackmes.de/users/tripletordo/ice9/

You can download the executable here Ice9.zip.

The first thing I noticed is probably the ‘trick’, which was simply a call to IsDebuggerPresent. I changed the jump immediately after it from JNE to JE so that the program only runs if a debugger is present, allowing me to attach my debugger.

00401071 74 0A JE SHORT Ice9.0040107D

This took a lot of trial and error. My strategy was to replicate the logic. Once I got to the point ‘ecx at 0040119c’ I was home free.

#include <cstdio>
#include <iostream>
#include <string>
using namespace std;

int main (int argc, char *argv[]) {
  if ( argc != 2) {
    cout<<"Bad usage, enter a name > 4 letters"<<endl;
    return 1;
  }
  string name = argv[1];
  string ostring = name;
  int i;
  //first reverse the string
  for (i=0; i<name.length(); i++) {
    name[i] = ostring [name.length()-i-1];
  }
  
  if (name.length() < 4) {
    cout << "name must be more than 4 letters chief"<<endl;
    return 1;
  }
  

  int v1 = 0;
  int cum = 0;
  for (i=1; i<name.length(); i++) {
    v1 = name[i];
	if (name[i] <= 90) {
	  if (v1 >= 65)
	    v1 += 44;
	}
	cum += v1;
  } //ecx at 0040119C
  
  cum = 9 * (12345 * (cum + 666) - 23);
  
  char chr_403119 [122];
  unsigned int v;
  i=0;
  //no bounds checking
  do {
    v = cum;
	cum /= 0xA;
	chr_403119[i++] = v % 10 + 48;
  } while (v / 10);
  chr_403119[i] = '\0';
  
  printf ("%s", chr_403119);
  string serial = "";

  //reverse the string, starting just before the terminating '\0'
  for (--i; i >= 0; --i) {
    serial += chr_403119[i];
  }
  cout<<serial<<endl;
  
  //append all chars except the 'first' three to the end 
  for (i=3; i< ostring.length(); i++) {
    serial += ostring[i];
  }
  
  cout<<serial<<endl;

}
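
For comparison, the same arithmetic can be written in a few lines of Python 3. This is a sketch that mirrors the C++ listing above and has not been checked against the binary. `ice9_serial` is a hypothetical name, and it assumes an ASCII name short enough that the 32-bit integer behavior of the C++ version does not come into play.

```python
def ice9_serial(name):
    """Mirror the C++ keygen: checksum the reversed name, then append a suffix."""
    if len(name) < 4:
        raise ValueError("name must be at least 4 characters")
    reversed_name = name[::-1]
    total = 0
    # Like the C++ loop, skip index 0 of the reversed name.
    for ch in reversed_name[1:]:
        value = ord(ch)
        if 65 <= value <= 90:  # uppercase letters get 44 added
            value += 44
        total += value
    total = 9 * (12345 * (total + 666) - 23)
    # Decimal digits of the total, followed by the name minus its first three chars.
    return str(total) + name[3:]

print(ice9_serial("abcd"))
```

Building the decimal string with `str` replaces the digit-by-digit loop and the second reversal in the C++ version.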

My plan on this one, since it was interesting enough and because it’s relatively easy to break at the final value, is to break this a completely different way. I’d like to write a python debugging script that bypasses the isdebuggerpresent and just grabs the final value in the compare at 004011FF. This should be relatively straightforward, and hopefully a good ‘hello, world’ to the world of python debugging. Stay tuned.

Nmap script to detect Debian OpenSSL Random Number Generator Weakness

This relies on HD’s keys, found at http://digitaloffense.net/tools/debian-openssl/

description = [[
Debian OpenSSH/OpenSSL Package Random Number Generator Weakness
]]

---
-- @output
-- 22/ssh open  ssh
-- |_ ssh_debian_weak: The following keys are vulnerable: 2048 RSA 1024 RSA

-- SSH Weak Debian Key Script
-- rev 1.0 (2010-02-07)
-- roughly based on ssh_debian_weak.nasl by Tenable
-- written by hand

author = "Rich Lundeen <mopey@webstersprodigy.net>"
license = "Same as Nmap--See http://nmap.org/book/man-legal.html"
categories = {"websters", "nessus", "act_gather_info"}

dependencies = {"ssh-hostkey"}

require("shortport")
require("ssh1")
require("ssh2")
require("nessus/nessus_conf")
portrule = shortport.port_or_service({22}, {"ssh"})

action = function(host, port)
  local keyval = nmap.registry.sshhostkey[host.ip]
  if keyval == nil then
    return
  end
  local output = ""
  for i,line in ipairs(keyval) do
    --TODO eventually binary search is nicer, but due to formats ready from HD
    --or if wanted later perhaps add the hex version to registry
    local linekey = string.gsub(ssh1.fingerprint_hex(line.fingerprint, 
                                line.algorithm, line.bits), ":", "")
    local crimp = pcre.new("^[^\s]+[\s]([^\s]+)[\s][^\s]+", 0, "C")
    local s, e, t = crimp:exec(linekey, 0, 0)
    linekey = string.sub(linekey, t[1], t[2])
    local fstring = (nessus_conf.nessus_conf["basedir"] .. 
                     "nselib/nessus/data/debian_weak_ssl/" .. 
                     line.algorithm:lower() .. "_" .. 
                     tostring(line.bits))
    local mfile = io.open(fstring, "r")
    for vulnkey in mfile:lines() do
      --TODO this could be made more efficient
      if string.find(vulnkey, linekey, 0) then
        output = output .. line.algorithm .. " " .. tostring(line.bits)
      end
    end
    mfile:close()
  end
  if output ~= "" then
    return output
  end
end
