Shellcode and You
The author is going to start writing about shellcode. But wait! It's obsolete, so why bother?
I'm going to start writing about shellcode: how you can develop it and how you can test it, even though it's obsolete. Why bother?
Well, learning how to develop and use shellcode is a first step toward exploring more complex exploit payloads and techniques. We'll take a look at shellcode, then return-to-libc, and then I'll explore return-oriented programming. But first, shellcode.
So what is it?
Well, shellcode is a sequence of binary machine instructions. As you'd expect, these are illegible to humans, and you'll generate them with a program of some kind. For example, let's take a look at some shellcode that'll spawn a shell on Linux (downloaded from shellstorm):
shellcode = "\x48\x31\xd2\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x48\xc1\xeb\x08\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05"
Easy to understand, right? Let's look at it a little differently:
char shellcode[] =
    "\x48\x31\xd2"                               // xor %rdx, %rdx
    "\x48\xbb\x2f\x2f\x62\x69\x6e\x2f\x73\x68"   // mov $0x68732f6e69622f2f, %rbx
    "\x48\xc1\xeb\x08"                           // shr $0x8, %rbx
    "\x53"                                       // push %rbx
    "\x48\x89\xe7"                               // mov %rsp, %rdi
    "\x50"                                       // push %rax
    "\x57"                                       // push %rdi
    "\x48\x89\xe6"                               // mov %rsp, %rsi
    "\xb0\x3b"                                   // mov $0x3b, %al
    "\x0f\x05";                                  // syscall
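A little easier to understand? What we have here are the actual machine opcodes for a Linux x86-64 system, stored as raw bytes in a character array with the appropriate endianness (the '\x' prefix is the hexadecimal character escape; with it you can write non-ASCII bytes). Conveniently, this particular chunk of shellcode comes with mnemonics in comments, so we can see what's going on. Basically, we're executing /bin/sh via a syscall to execve. The string is MOVed into the rbx register as //bin/sh, and the shr that follows shifts off the extra slash, leaving a zero byte to terminate the string. The final syscall instruction then invokes execve, whose number, 0x3b, was placed in al just before. Prior to the syscall, we set up the stack and the appropriate arguments according to the Linux x86-64 system call convention: rdi points to the path, rsi points to the argument array, and rdx, the environment pointer, is zeroed. Note that the code never clears rax itself, so it appears to rely on rax already being zero when it starts running.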
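If the mov/shr trick still looks like magic, here's a small sketch that replays just those two instructions in ordinary C, with no shellcode being executed. The helper name decode_path is made up for this illustration. It extracts the bytes one at a time, least significant first, to mimic how push lays the register out in memory on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Replays `mov $0x68732f6e69622f2f, %rbx` followed by `shr $0x8, %rbx`
 * and writes the 8 bytes into out in the order `push %rbx` would
 * store them in memory (least significant byte first). */
void decode_path(char out[8])
{
    assert(out != NULL);

    uint64_t rbx = 0x68732f6e69622f2fULL; /* the bytes of "//bin/sh" as one integer */
    rbx >>= 8;                            /* drop the first '/', top byte becomes 0 */

    for (int i = 0; i < 8; i++)
        out[i] = (char)((rbx >> (8 * i)) & 0xff);
}
```

Call decode_path on an 8-byte buffer and the buffer holds the NUL-terminated string /bin/sh. The likely reason for the detour is that the encoded payload then contains no zero bytes of its own, which matters when it has to survive a C string copy.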
So, essentially, when you have shellcode like this, your goal is to write the shellcode to memory in some way and arrange to point the instruction pointer its way. The classic way this is done is via a buffer overflow, but there are other ways too. With a buffer overflow, you take advantage of the design of the call stack on the target system. On most architectures, the call stack holds the return address for each function call, and your goal is to overwrite this return address with a pointer to the beginning of your shellcode.
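To make the overwrite concrete without actually smashing anything, here's a toy model of a stack frame in C. Everything in it (toy_frame, unchecked_copy, saved_return, the 16-byte buffer) is invented for this sketch. Real frames also carry a saved frame pointer, padding, and often a canary, so treat the layout as a simplification rather than what your compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 16

/* Toy frame: a local buffer immediately followed by the saved return
 * address. Simplified layout, assumed for illustration only. */
struct toy_frame {
    unsigned char buf[BUF_SIZE];
    unsigned char saved_ret[sizeof(uint64_t)];
};

/* Models the bug: the copy is bounded by the frame, not by BUF_SIZE,
 * so input longer than the buffer spills into saved_ret. */
void unchecked_copy(struct toy_frame *frame, const unsigned char *input, size_t len)
{
    assert(len <= sizeof *frame); /* keeps the demo itself well-defined */
    memcpy(frame, input, len);
}

/* Reads the saved return address back as a little-endian 64-bit value. */
uint64_t saved_return(const struct toy_frame *frame)
{
    uint64_t addr = 0;
    for (size_t i = 0; i < sizeof frame->saved_ret; i++)
        addr |= (uint64_t)frame->saved_ret[i] << (8 * i);
    return addr;
}
```

Feed unchecked_copy 16 filler bytes followed by 8 bytes of an address of your choosing, in little-endian order, and saved_return hands that address back. In a real overflow, the function's return would jump there.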
Now, this is a classic exploit, not that widely usable today. It does, however, force you to learn how computers work at a very basic level. It's also a valid, functional attack on many of the smaller internet of things devices that are hitting the market today, as these devices tend to be pretty insecure. But on a stock, modern operating system? Forget about it. There are defenses against these attacks in place today. You can, however, turn those off on some systems (notably Linux and Windows; iOS and macOS don't let you). Most of the time I write on iOS, but we'll use Linux to show you how buffer overflows and shellcode work.
If you don't have VMware Fusion or VirtualBox, download one of them, install a server image of the most recent Ubuntu LTS release, and we'll start exploring how these attacks really work.