blog.dideriksen.org: OSx86_64 `hello world` shellcode

Shellcode is always fun to write when you've been away from your assembly for a while. On OSX quite a few things are different from Linux, mostly around system calls. If this is your first time, you should check out the various hello world examples first. There are several guides out there, e.g. "Writing your own shellcode", some hello world examples are available on GitHub, and there is even a detailed guide (see the references at the end of this post).

If you're writing your assembly on OSX, you'll need to verify that your assembler understands the macho64 output format. Start by checking which version of NASM you have:

$ nasm -v

NASM version 0.98.40 (Apple Computer, Inc. build 11) compiled on Feb 10 2016

This is a rather outdated version that came with the Xcode tools on my system. It doesn't support macho64, so you'll need to get a newer NASM using either brew or MacPorts.

$ brew install nasm

==> Downloading https://homebrew.bintray.com/bottles/nasm-2.12.01.el_capitan.bottle.tar.gz

######################################################################## 100.0%

==> Pouring nasm-2.12.01.el_capitan.bottle.tar.gz

🍺 /usr/local/Cellar/nasm/2.12.01: 29 files, 2M

cannabissen@titanium:~/Documents/src/asm/$ nasm -v

NASM version 2.12.01 compiled on Mar 23 2016

$ sudo port install nasm

Password:

Warning: port definitions are more than two weeks old, consider updating them by running 'port selfupdate'.

---> Fetching archive for nasm

---> Attempting to fetch nasm-2.12_0.darwin_15.x86_64.tbz2 from http://osl.no.packages.macports.org/nasm

---> Attempting to fetch nasm-2.12_0.darwin_15.x86_64.tbz2.rmd160 from http://osl.no.packages.macports.org/nasm

---> Installing nasm @2.12_0

---> Activating nasm @2.12_0

---> Cleaning nasm

---> Updating database of binaries

---> Scanning binaries for linking errors

---> No broken files found.

cannabissen@titanium:~/Documents/src/c/$ nasm -v

NASM version 2.12 compiled on Mar 5 2016

At the time of writing, the NASM version in both brew and MacPorts is 2.12, with brew carrying the slightly newer 2.12.01 point release, so use whichever version you prefer. Lists of all the Apple system calls are available online, and you'll need the shift trick that is explained further down (see also the references at the end).

The following code is taken from thexploit's article on the issues with assembly on OSX 10.7 and above. I just tested the hello world on El Capitan.

; OSX86 hello world assembler for El Capitan!

section .data

hello_world db "Hello World!", 0x0a

section .text

global start

start:

mov rax, 0x2000004 ; System call write = 4

mov rdi, 1 ; Write to standard out = 1

mov rsi, hello_world ; The address of hello_world string

mov rdx, 13 ; The size to write: 12 characters plus the newline

syscall ; Invoke the kernel

mov rax, 0x2000001 ; System call number for exit = 1

mov rdi, 0 ; Exit success = 0

syscall ; Invoke the kernel

Nothing much to this example. There are two system calls: the first, 4, is a call to write, and the second, 1, is a call to exit. If you wonder why 0x2000000 is added to each call number, the explanation is that 64-bit Darwin encodes a system call class in the upper bits of the number: the Unix (BSD) class is 2, and shifted left by 24 bits that becomes 0x2000000.
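To make the arithmetic concrete, here is a minimal Python sketch of how the value in rax is put together. The helper name `darwin_syscall` is made up for this post, and the two constant names simply mirror the idea of a class value and a shift width, so treat them as illustrative rather than as anything you can import.

```python
# Sketch: composing the Darwin x86_64 system call number that goes into rax.
SYSCALL_CLASS_SHIFT = 24  # the class sits above the low 24 bits
SYSCALL_CLASS_UNIX = 2    # the Unix (BSD) class used in this post


def darwin_syscall(number, syscall_class=SYSCALL_CLASS_UNIX):
    """Combine a syscall class and a BSD syscall number (hypothetical helper)."""
    return (syscall_class << SYSCALL_CLASS_SHIFT) | number


if __name__ == "__main__":
    print(hex(darwin_syscall(4)))  # write
    print(hex(darwin_syscall(1)))  # exit
```

Running it prints 0x2000004 and 0x2000001, the two values used in the assembly above.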

As you may know, zero bytes cannot be used in shellcode. This is because the functions used to inject the shellcode are usually string functions, and when such a function meets a 0x00 byte it treats it as the end of the string. In the case above, it means that once strcpy reaches the first zero byte inside the encoded 0x2000004 constant (its second byte), it stops copying and the shellcode won't work as expected.
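A small Python sketch shows the effect. It only emulates what a C string copy would keep; `c_string_copy` is a made-up helper, and the five bytes assume the assembler picks the short `mov eax, imm32` encoding for the first instruction.

```python
def c_string_copy(payload: bytes) -> bytes:
    """Emulate strcpy: keep everything before the first 0x00 byte."""
    end = payload.find(b"\x00")
    return payload if end == -1 else payload[:end]


# Assumed encoding of `mov eax, 0x2000004`: opcode b8, then the
# 32-bit immediate in little-endian byte order (04 00 00 02).
MOV_EAX_WRITE = bytes.fromhex("b804000002")

if __name__ == "__main__":
    copied = c_string_copy(MOV_EAX_WRITE)
    print(copied.hex(), "->", len(copied), "of", len(MOV_EAX_WRITE), "bytes copied")
```

Only the opcode and the first byte of the immediate survive the copy, so the very first instruction is already cut in half.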

So to produce injectable shellcode you'll have to remove all the zeros from the code. This means that you could write the first part of the assembly like this:

global start

section .data

hello_world: db "Hello World!", 0x0a ; Defining the string to output

.len: equ $ - hello_world ; Calculating the length of the string:

; $ denotes the current address, so subtracting hello_world

; gives the length of "Hello World!" plus the 0x0a newline

section .text

start:

mov r8b, 0x02 ; Unix system call class = 2

shl r8, 24 ; Shift left by 24 bits: 0x02 becomes 0x2000000

or r8, 0x04 ; Add the system call number to the class:

; 0x2000000 becomes 0x2000004

; The 3 instructions above avoid any zero bytes in the

; 0x2000004 constant and effectively place the write

; system call number in r8

mov rax, r8 ; setting the system call

mov rdi, 0x1 ; write to std out

mov rsi, hello_world ; Add the address of the string to write

mov rdx, hello_world.len ; Add the length of the string to write,

; using the length calculated by the .len definition above

syscall ; Invoke the kernel

sub r8, 0x3 ; r8 still contains the last syscall number, 0x2000004,

; so subtracting 3 produces 0x2000001, the syscall for exit

mov rax, r8 ; setting the system call

xor rdi, rdi ; zeroing out rdi for the return value of the exit system call

syscall

$ nasm -f macho64 helloworld_shell.s

$ ld helloworld_shell.o

$ ./a.out

Hello World!

The code seems to run smoothly. Let's find the opcodes to put in the test shellcode program:

$ gobjdump -sd a.out

a.out: file format mach-o-x86-64

Contents of section .text:

1fd0 41b00249 c1e01849 83c8044c 89c0bf01 A..I...I...L....

1fe0 00000048 be002000 00000000 00ba0d00 ...H.. .........

1ff0 00000f05 4983e803 4c89c048 31ff0f05 ....I...L..H1...

Contents of section .data:

2000 48656c6c 6f20576f 726c6421 0a Hello World!.

Contents of section LC_THREAD.x86_THREAD_STATE64.0:

0000 00000000 00000000 00000000 00000000 ................

0010 00000000 00000000 00000000 00000000 ................

0020 00000000 00000000 00000000 00000000 ................

0030 00000000 00000000 00000000 00000000 ................

0040 00000000 00000000 00000000 00000000 ................

0050 00000000 00000000 00000000 00000000 ................

0060 00000000 00000000 00000000 00000000 ................

0070 00000000 00000000 00000000 00000000 ................

0080 d01f0000 00000000 00000000 00000000 ................

0090 00000000 00000000 00000000 00000000 ................

00a0 00000000 00000000 ........

Disassembly of section .text:

0000000000001fd0 <start>:

1fd0: 41 b0 02 mov $0x2,%r8b

1fd3: 49 c1 e0 18 shl $0x18,%r8

1fd7: 49 83 c8 04 or $0x4,%r8

1fdb: 4c 89 c0 mov %r8,%rax

1fde: bf 01 00 00 00 mov $0x1,%edi

1fe3: 48 be 00 20 00 00 00 movabs $0x2000,%rsi

1fea: 00 00 00

1fed: ba 0d 00 00 00 mov $0xd,%edx

1ff2: 0f 05 syscall

1ff4: 49 83 e8 03 sub $0x3,%r8

1ff8: 4c 89 c0 mov %r8,%rax

1ffb: 48 31 ff xor %rdi,%rdi

1ffe: 0f 05 syscall

The first instructions show that the zeros are gone from the system call setup, but yikes, it seems there are still several zero bytes present in the code. These lines are:

1fde: bf 01 00 00 00 mov $0x1,%edi

1fe3: 48 be 00 20 00 00 00 movabs $0x2000,%rsi

1fea: 00 00 00

1fed: ba 0d 00 00 00 mov $0xd,%edx

Well, what happens here? Moving 0x1 into edi also moves plenty of zero bytes. Why? The edi register is 32 bits wide and this form of mov always carries a full 32-bit immediate, so the assembler pads the value 0x1 out to 01 00 00 00. Let's remove these zeros. We already know the xor trick, so to produce 0x01 we just have to zero edi and then increment it:

xor edi,edi

inc edi

This sets up edi with 0x01 (stdout) for the write system call and takes care of line 1fde. The next line to look at is 1fed, which has a similar issue. Here we can write to the low byte of the register instead of incrementing: the value is the length of the string, and there is no point in incrementing 13 times.

xor edx,edx

mov dl, hello_world.len

The xor zeros out the whole edx register. We need to do this before writing through the low-byte register dl; otherwise bits left set elsewhere in the register would produce an incorrect length. This takes care of the zeros in line 1fed.
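To see where those zeros come from, here is a rough Python sketch that rebuilds the two offending encodings and compares them with the replacements. The helper names are made up; the opcode bytes bf, ba and b2 0d are the ones visible in the dumps in this post, while the bytes for `xor edi,edi` and `inc edi` are my own hand encoding of the 32-bit forms, so double check them against your assembler.

```python
import struct


def mov_r32_imm32(opcode: int, value: int) -> bytes:
    """Encode `mov r32, imm32`: one opcode byte plus a 4-byte little-endian immediate."""
    return bytes([opcode]) + struct.pack("<I", value)


def has_zero_byte(code: bytes) -> bool:
    """True if the machine code contains a 0x00 byte anywhere."""
    return b"\x00" in code


MOV_EDI_1 = mov_r32_imm32(0xBF, 1)    # mov edi, 0x1 -> bf 01 00 00 00
MOV_EDX_13 = mov_r32_imm32(0xBA, 13)  # mov edx, 0xd -> ba 0d 00 00 00

# Zero-free replacements (hand-encoded, see the note above).
XOR_INC_EDI = bytes.fromhex("31ff" "ffc7")  # xor edi,edi ; inc edi
XOR_MOV_DL = bytes.fromhex("31d2" "b20d")   # xor edx,edx ; mov dl,0xd

if __name__ == "__main__":
    for name, code in [("mov edi, 1", MOV_EDI_1), ("mov edx, 13", MOV_EDX_13),
                       ("xor/inc edi", XOR_INC_EDI), ("xor/mov dl", XOR_MOV_DL)]:
        print("%-12s %-16s zero bytes: %s" % (name, code.hex(), has_zero_byte(code)))
```

As a bonus, each replacement pair is four bytes long where the original mov took five.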

Let's now address the final and much worse line, 1fe3. Here movabs loads the address of the hello_world string into the rsi register. Apparently hello_world is placed at the data address 0x2000:

Contents of section .data:

2000 48656c6c 6f20576f 726c6421 0a Hello World!.

This means that we need to mangle the code a bit. There are two approaches for this. One is to build the text from its converted ASCII values and point rsi at the result. Another is to use the push-n-pop address trick. Let's try the push-n-pop method.

; Zero byte free hello world assembly

global start

section .data

hello_world: db "Hello World!", 0x0a ; Defining the string to output

.len: equ $ - hello_world ; Calculating the length of the string:

; $ denotes the current address, so subtracting hello_world

; gives the length of "Hello World!" plus the 0x0a newline

section .text

start:

mov r8b, 0x02 ; Unix system call class = 2

shl r8, 24 ; Shift left by 24 bits: 0x02 becomes 0x2000000

or r8, 0x04 ; Add the system call number to the class:

; 0x2000000 becomes 0x2000004

; The 3 instructions above avoid any zero bytes in the

; 0x2000004 constant and effectively place the write

; system call number in r8

jmp in ; trick to get the string address onto the stack,

; avoiding the zeros of the absolute address 0x2000

out:

pop rdi ; get the address of the helloworld string

mov rsi, rdi ; put the string address in the correct register for the syscall

mov rax, r8 ; setting the system call

xor rdi,rdi ; zeroing out rdi before setting the stdout handle

inc rdi ; rdi = 1 (stdout), set using no zero bytes

xor edx,edx ; zeroing out edx before setting the length of the string

mov dl, hello_world.len ; Add the length of the string to write,

; using the length calculated by the .len definition above

syscall ; Invoke the kernel

sub r8, 0x3 ; r8 still contains the last syscall number, 0x2000004,

; so subtracting 3 produces 0x2000001, the syscall for exit

mov rax, r8 ; setting the system call

xor rdi, rdi ; zeroing out rdi for the return value of the exit system call

syscall

in:

call out ; Call the out label as a function, which pushes the

; address of the string below (the return address) onto the stack

db "Hello World!", 0x0a ; the string itself; out pops its address into rdi

Notice the 3 changes in the final program: the xor and inc pair that sets rdi to 1, the xor and mov dl pair that sets the length, and the `jmp, call-n-pop` method, which gets the address of a constant (the string) popped into rdi using language constructs instead of a static memory address. There you go, the zero-byte-free hello world shellcode. Let's check whether we have excluded all zeros:

$ gobjdump -sd a.out

a.out: file format mach-o-x86-64

Contents of section .text:

1fc2 41b00249 c1e01849 83c804eb 1f5f4889 A..I...I....._H.

1fd2 fe4c89c0 4831ff48 ffc731d2 b20d0f05 .L..H1.H..1.....

1fe2 4983e803 4c89c048 31ff0f05 e8dcffff I...L..H1.......

1ff2 ff48656c 6c6f2057 6f726c64 210a .Hello World!.

Contents of section .data:

2000 48656c6c 6f20576f 726c6421 0a Hello World!.

Contents of section LC_THREAD.x86_THREAD_STATE64.0:

0000 00000000 00000000 00000000 00000000 ................

0010 00000000 00000000 00000000 00000000 ................

0020 00000000 00000000 00000000 00000000 ................

0030 00000000 00000000 00000000 00000000 ................

0040 00000000 00000000 00000000 00000000 ................

0050 00000000 00000000 00000000 00000000 ................

0060 00000000 00000000 00000000 00000000 ................

0070 00000000 00000000 00000000 00000000 ................

0080 c21f0000 00000000 00000000 00000000 ................

0090 00000000 00000000 00000000 00000000 ................

00a0 00000000 00000000 ........

Disassembly of section .text:

0000000000001fc2 <start>:

1fc2: 41 b0 02 mov $0x2,%r8b

1fc5: 49 c1 e0 18 shl $0x18,%r8

1fc9: 49 83 c8 04 or $0x4,%r8

1fcd: eb 1f jmp 1fee <in>

0000000000001fcf <out>:

1fcf: 5f pop %rdi

1fd0: 48 89 fe mov %rdi,%rsi

1fd3: 4c 89 c0 mov %r8,%rax

1fd6: 48 31 ff xor %rdi,%rdi

1fd9: 48 ff c7 inc %rdi

1fdc: 31 d2 xor %edx,%edx

1fde: b2 0d mov $0xd,%dl

1fe0: 0f 05 syscall

1fe2: 49 83 e8 03 sub $0x3,%r8

1fe6: 4c 89 c0 mov %r8,%rax

1fe9: 48 31 ff xor %rdi,%rdi

1fec: 0f 05 syscall

0000000000001fee <in>:

1fee: e8 dc ff ff ff callq 1fcf <out>

1ff3: 48 rex.W

1ff4: 65 6c gs insb (%dx),%es:(%rdi)

1ff6: 6c insb (%dx),%es:(%rdi)

1ff7: 6f outsl %ds:(%rsi),(%dx)

1ff8: 20 57 6f and %dl,0x6f(%rdi)

1ffb: 72 6c jb 2069 <hello_world+0x69>

1ffd: 64 21 0a and %ecx,%fs:(%rdx)
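The call at 1fee is worth a second look, because its direction is what keeps it free of zeros. Here is a small Python sketch of the displacement arithmetic, using the addresses from the disassembly above; the two helper functions are made up for illustration, and the forward call at the end is a hypothetical layout, not something in the program.

```python
import struct

CALL_LEN = 5  # opcode e8 plus a 4-byte signed displacement


def rel32_target(insn_addr: int, encoding: bytes) -> int:
    """Target of a `call rel32` at insn_addr: next instruction plus displacement."""
    if len(encoding) != CALL_LEN or encoding[0] != 0xE8:
        raise ValueError("not a 5-byte call rel32")
    (disp,) = struct.unpack("<i", encoding[1:])
    return insn_addr + CALL_LEN + disp


def call_rel32(insn_addr: int, target: int) -> bytes:
    """Encode `call target` for a call instruction placed at insn_addr."""
    return b"\xe8" + struct.pack("<i", target - (insn_addr + CALL_LEN))


if __name__ == "__main__":
    # Backward call from the dump: in (0x1fee) calls out (0x1fcf).
    print(hex(rel32_target(0x1FEE, bytes.fromhex("e8dcffffff"))))
    print(call_rel32(0x1FEE, 0x1FCF).hex())
    # Hypothetical forward call over the same distance: small positive
    # displacement, so the upper bytes are zero.
    print(call_rel32(0x1FCF, 0x1FEE).hex())
```

The backward call has a negative displacement, which sign-extends to ff bytes, while a short forward call would be padded with 00 bytes. That is why the code first jumps forward with a 2-byte jmp and then calls backward. The return address the call pushes is 0x1ff3, which is exactly where the string bytes start.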

It sure seems like all the zeros are gone from the code. To extract the opcodes from the binary we can use a tool very similar to objdump, namely otool:

$ otool -t a.out

a.out:

(__TEXT,__text) section

0000000000001fc2 41 b0 02 49 c1 e0 18 49 83 c8 04 eb 1f 5f 48 89

0000000000001fd2 fe 4c 89 c0 48 31 ff 48 ff c7 31 d2 b2 0d 0f 05

0000000000001fe2 49 83 e8 03 4c 89 c0 48 31 ff 0f 05 e8 dc ff ff

0000000000001ff2 ff 48 65 6c 6c 6f 20 57 6f 72 6c 64 21 0a

Notice that it dumps only the opcodes in this case. otool can show the disassembly as well, but right here we are interested in the raw machine bytes only.

$ otool -t a.out | cut -d ' ' -f 2- | grep ' '

41 b0 02 49 c1 e0 18 49 83 c8 04 eb 1f 5f 48 89

fe 4c 89 c0 48 31 ff 48 ff c7 31 d2 b2 0d 0f 05

49 83 e8 03 4c 89 c0 48 31 ff 0f 05 e8 dc ff ff

ff 48 65 6c 6c 6f 20 57 6f 72 6c 64 21 0a

This produces only the machine code bytes. You can count the number of bytes using `wc -w`:

$ otool -t a.out | cut -d ' ' -f 2- | grep ' ' | wc -w

That gives a 62-byte hello world shellcode. We had better run it to check that it works:

$ nasm -f macho64 helloworld_zero.s

$ ld helloworld_zero.o

$ ./a.out

Hello World!

$ otool -tv a.out

a.out:

(__TEXT,__text) section

start:

0000000000001fc2 movb $0x2, %r8b

0000000000001fc5 shlq $0x18, %r8

0000000000001fc9 orq $0x4, %r8

0000000000001fcd jmp 0x1fee

out:

0000000000001fcf popq %rdi

0000000000001fd0 movq %rdi, %rsi

0000000000001fd3 movq %r8, %rax

0000000000001fd6 xorq %rdi, %rdi

0000000000001fd9 incq %rdi

0000000000001fdc xorl %edx, %edx

0000000000001fde movb $0xd, %dl

0000000000001fe0 syscall

0000000000001fe2 subq $0x3, %r8

0000000000001fe6 movq %r8, %rax

0000000000001fe9 xorq %rdi, %rdi

0000000000001fec syscall

in:

0000000000001fee callq 0x1fcf

0000000000001ff3 gs

0000000000001ff5 insb %dx, %es:(%rdi)

0000000000001ff6 insb %dx, %es:(%rdi)

0000000000001ff7 outsl (%rsi), %dx

0000000000001ff8 andb %dl, 0x6f(%rdi)

0000000000001ffb jb 0x2069

0000000000001ffd andl %ecx, %fs:(%rdx)

It seems to do the job just fine, and the disassembly is the program we expected. To extract the machine code using command line tools, you can experiment with this expression:

$ echo "char shellcode[] = $(otool -t a.out | cut -d ' ' -f 2- | grep ' ' | sed -e "s|[ ]|\\\x|g" -e "s|^|\"\\\x|g" -e "s|[\\]x$|\"|g")"

char shellcode[] = "\x41\xb0\x02\x49\xc1\xe0\x18\x49\x83\xc8\x04\xeb\x1f\x5f\x48\x89"

"\xfe\x4c\x89\xc0\x48\x31\xff\x48\xff\xc7\x31\xd2\xb2\x0d\x0f\x05"

"\x49\x83\xe8\x03\x4c\x89\xc0\x48\x31\xff\x0f\x05\xe8\xdc\xff\xff"

"\xff\x48\x65\x6c\x6c\x6f\x20\x57\x6f\x72\x6c\x64\x21\x0a"

The sed expression above makes it a bit easier to copy and paste the shellcode into your test C program; the command just uses sed to format the bytes in C character array syntax. Or use even more sed magic to replace your current test shellcode ;) That last part I leave to you.... Cheers...
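If sed is not your thing, the same formatting can be sketched in a few lines of Python, with a zero-byte check thrown in. This is only a sketch: the function names are made up, and the parser assumes the `otool -t` layout shown in this post (an address followed by space-separated hex bytes), which may differ in other otool versions.

```python
# Rows copied from the `otool -t a.out` output earlier in this post.
OTOOL_TEXT = """\
a.out:
(__TEXT,__text) section
0000000000001fc2 41 b0 02 49 c1 e0 18 49 83 c8 04 eb 1f 5f 48 89
0000000000001fd2 fe 4c 89 c0 48 31 ff 48 ff c7 31 d2 b2 0d 0f 05
0000000000001fe2 49 83 e8 03 4c 89 c0 48 31 ff 0f 05 e8 dc ff ff
0000000000001ff2 ff 48 65 6c 6c 6f 20 57 6f 72 6c 64 21 0a
"""


def parse_otool_text(dump):
    """Return one bytes object per data row of an `otool -t` style dump."""
    rows = []
    for line in dump.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # e.g. the "a.out:" header
        try:
            int(fields[0], 16)  # the first field must be the address
            row = bytes(int(field, 16) for field in fields[1:])
        except ValueError:
            continue  # e.g. the "(__TEXT,__text) section" header
        rows.append(row)
    return rows


def to_c_array(rows, name="shellcode"):
    """Format the rows as a C string literal, refusing code that contains 0x00."""
    code = b"".join(rows)
    if b"\x00" in code:
        raise ValueError("zero byte at offset %d" % code.index(b"\x00"))
    literals = ['"' + "".join("\\x%02x" % b for b in row) + '"' for row in rows]
    return "char %s[] = %s;" % (name, "\n".join(literals))


if __name__ == "__main__":
    rows = parse_otool_text(OTOOL_TEXT)
    print(sum(len(row) for row in rows), "bytes")
    print(to_c_array(rows))
```

Unlike the sed expression, this version also adds the closing semicolon and raises an error if a zero byte sneaks back in, so it doubles as a quick sanity check after every change to the assembly.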

References:

[0x00] http://dustin.schultz.io/blog/2010/11/15/mac-os-x-64-bit-assembly-system-calls/

[0x01] http://blackwinghq.com/assets/labs/presentations/MachShellcode2.pdf

[0x02] https://gist.github.com/FiloSottile/7125822

[0x03] http://thexploit.com/secdev/finding-the-syscall-implementations-in-os-x/

[0x04] http://x86-64.org/documentation/abi.pdf