Virus-writing Bulletin

Code Mutations via Behaviour Analysis

-
. ● glósóli ● .


                *******************************************
                   Code Mutations via Behaviour Analysis  
                                by Dr. Theodore Spankman
                *******************************************




  1) Introduction

  2) Basics - Register Operators only
     2.1) Analyse me: Local Behaviour Table
     2.2) Create Random Code
     2.3) Compare me: Blackbox-Analyse
     2.4) More Freedom: Global Behaviour Table

  3) Extentions
     3.1) Including the Stack
     3.2) Including Memory

  4) Possible Extentions
     4.1) Hide me: Global Behaviour Table at Runtime
     4.2) Permutation - easy as it can be
     4.3) Jump somewhere
     4.4) Behaviour Flow
     4.5) Automatic shrinking/extention/trash

  5) Conclusion








  1) Introduction

  The basic idea is: The file analyses the behaviour of its own code and
  compares it with the behaviour of a randomly generated code. If the
  behaviour is the same, the original file-code will be substituted by the
  new random code.

  This article describes how this idea can be realized.





  2) Basics - Register Operators only

  2.1) Analyse me: Local Behaviour Table

       The program's code is splitted into 8byte blocks of code, padded with
       NOPs). With that knowlegde we can parse the whole file, and analyse
       the behaviour of each 8byte part.

       We only use Register operators for the moment. When we run an unknown
       code, all that can be changed are the 8 Registers and FLAGS Registers.

       We define a table consisting of 8+1 * 4 bytes:

       - - - - - - - - - - - - - - - - - -
                 Local Behaviour Table (LBT):

                 Each behaviour Block consists of:
                 Registers: 9*4=36 byte
                  0x00 dd EAX
                  0x04 dd EBX
                  0x08 dd ECX
                  0x0C dd EDX
                  0x10 dd EBP
                  0x14 dd ESI
                  0x18 dd EDI
                  0x1C dd FLAGS
                  0x20 dd ESP
       - - - - - - - - - - - - - - - - - -

       Before running the unknown code, we set all registers (except of EPS)
       to zero. After running the code, we save the change of the registers:

       - - - - - - - - - - - - - - - - - -
          sub   dword[LBT+0x00], eax
          sub   dword[LBT+0x04], ebx
                ...
          sub   dword[LBT+0x20], esp
       - - - - - - - - - - - - - - - - - -


       (For the flag-register: pushfd/pop Reg32)
       With that table, we know the principal behaviour of out code. Thats
       all we need for the moment.



  2.2) Create Random Code

       We want to create an 8 byte random code. The best way is to use a
       template system. One template could look like this:

       - - - - - - - - - - - - - - - - - -
           OPAlImm8:
                cmp     ebp, (RandomCodeBufferSize-2)
                jg      EndCRCCycle

                mov     ebx, 9
                xor     ecx, ecx
                call    CreateSpecialRndNumber

                mov     byte[RandomCode+ebp], 0x04   ; add al, Imm8
                cmp     dl, 0
                je      OPAlImm8Cont

                    ...

                mov     byte[RandomCode+ebp], 0x3C   ; cmp al, Imm8 
                cmp     dl, 7
                je      OPAlImm8Cont

                jmp  OPAlImm8End

                OPAlImm8Cont:
                mov     byte[RandomCode+ebp+1], dl
                add     ebp, 0x2
           jmp  EndCRCCycle
           OPAlImm8End:
       - - - - - - - - - - - - - - - - - -

       One important restriction is the ESP. The Stack Pointer must stay in 
       the reserved Stack Memory (from [fs:0x04]-[fs:0x08]). Therefore the
       RCG must not be the source of MOV, ADD, ... . INC and DEC are usually
       OK if you restore the ESP variable after running the unknown code.



  2.3) Compare me: Blackbox-Analyse

       We could create a Local Behaviour Table for the file code and the
       random code now, and compare the tables. Thats all?

       No! We have used an initial condition (setting all registers to zero),
       thus the LBT is dependent on that. For example these codes would have
       the same LBT:

                mov eax, 1           inc al
                inc bl               inc bl

       Or this one: 

                mov eax, ebx         xor eax, eax

       That means, when we found equivalent LBTs, we have to make further
       tests using different initial conditions.

       We can change the registers and the flags to random values, and create
       new LBTs with that new conditions. If we run that test with always
       changing conditions for a big number of times (I used 100 and 1000 for
       register only and flags tests, respectivly), and every single LBT of
       the programs code and the random code is equal, we found a code with
       same behaviour, and can substitute the old one.



  2.4) More Freedom: Global Behaviour Table

       It is a very strict restriction to the new random code that it has to
       have exactly the same behaviour. Usually its not required to have have
       the same behaviour in all registers, most are unused anyway.

       We could use this additional freedom, and creating a Global Behaviour
       Table (GBT) and compile-time, giving the code the information of which
       registers have to be the same and which can be different.

       Its a good idea to use a macro and symbolic variables for that
       (everything else would be insane):

       - - - - - - - - - - - - - - - - - -
       ; Restrictions:
       ; A=EAX, B=EBX, C=ECX, D=EDX, P=EPB, S=ESI, I=EDI, F=FLAGS

       rNoRes	  EQU 00000000b
       rA	  EQU 00000001b
       rB	  EQU 00000010b
       rAB	  EQU 00000011b
       rC	  EQU 00000100b
       ...
       rABCDPSIF  EQU 11111111b

       GlobalBehaviourTableList equ   

       macro cc restr*, instr*, op1, op2
       {
           ; Macro for padding commands to 8byte each and for
           ; adding the restrictions to the GlobalBehaviourTableList

           local StartCommand, EndCommand
           StartCommand:
           if op2 eq
               if op1 eq
                   instr
               else
                   instr op1
               end if
           else
               instr op1,op2
           end if
           EndCommand:
           times (8-EndCommand+StartCommand): nop          ; padding

           match any, GlobalBehaviourTableList             ; Add further elements to list
           \{
               GlobalBehaviourTableList equ GlobalBehaviourTableList,restr
           \}

           match , GlobalBehaviourTableList                ; Add first element to list
           \{
               GlobalBehaviourTableList equ restr
           \}
       }
       - - - - - - - - - - - - - - - - - -

       There is no restiction for ESP, as this register always has to be
       equivalent in file code and random code.

       One simple example:
       - - - - - - - - - - - - - - - - - -
                cc rNoRes, nop
                cc rA, mov, eax, 0x0          ; eax=0
                cc rA, inc, eax               ; eax=1
                cc rA, add, eax, 2            ; eax=3
                cc rA, nop                    ; eax=3
                cc rC, mov, ecx, eax          ; ecx=3
                cc rC, nop                    ; ecx=3  
       - - - - - - - - - - - - - - - - - -

       will be translated to

       - - - - - - - - - - - - - - - - - -
       00402000 > $ 90             NOP
       00402001   . 90             NOP
       00402002   . 90             NOP
       00402003   . 90             NOP
       00402004   . 90             NOP
       00402005   . 90             NOP
       00402006   . 90             NOP
       00402007   . 90             NOP
       00402008   . B8 00000000    MOV EAX,0
       0040200D   . 90             NOP
       0040200E   . 90             NOP
       0040200F   . 90             NOP
       00402010   . 40             INC EAX
       00402011   . 90             NOP
       00402012   . 90             NOP
       00402013   . 90             NOP
       00402014   . 90             NOP
       00402015   . 90             NOP
       00402016   . 90             NOP
       00402017   . 90             NOP
       00402018   . 83C0 02        ADD EAX,2
       0040201B   . 90             NOP
       0040201C   . 90             NOP
       0040201D   . 90             NOP
       0040201E   . 90             NOP
       0040201F   . 90             NOP
       00402020   . 90             NOP
       00402021   . 90             NOP
       00402022   . 90             NOP
       00402023   . 90             NOP
       00402024   . 90             NOP
       00402025   . 90             NOP
       00402026   . 90             NOP
       00402027   . 90             NOP
       00402028   . 89C1           MOV ECX,EAX
       0040202A   . 90             NOP
       0040202B   . 90             NOP
       0040202C   . 90             NOP
       0040202D   . 90             NOP
       0040202E   . 90             NOP
       0040202F   . 90             NOP
       00402030   . 90             NOP
       00402031   . 90             NOP
       00402032   . 90             NOP
       00402033   . 90             NOP
       00402034   . 90             NOP
       00402035   . 90             NOP
       00402036   . 90             NOP
       00402037   . 90             NOP
       - - - - - - - - - - - - - - - - - -

       After a few minutes, the new code looks like that:

       - - - - - - - - - - - - - - - - - -
       00402000 > $ 80D5 39        ADC CH,39
       00402003   . 90             NOP
       00402004   . 92             XCHG EAX,EDX
       00402005   . 4B             DEC EBX
       00402006   . 8BD6           MOV EDX,ESI
       00402008   . 45             INC EBP
       00402009   . 20C8           AND AL,CL
       0040200B   . 33C0           XOR EAX,EAX
       0040200D   . 2BF2           SUB ESI,EDX
       0040200F   . 4F             DEC EDI
       00402010   . 88EE           MOV DH,CH
       00402012   . 39CD           CMP EBP,ECX
       00402014   . 12F5           ADC DH,CH
       00402016   . 40             INC EAX
       00402017   . 4F             DEC EDI
       00402018   . 22FA           AND BH,DL
       0040201A   . 14 02          ADC AL,2
       0040201C   . 13C8           ADC ECX,EAX
       0040201E   . 3C 07          CMP AL,7
       00402020   . 83ED C6        SUB EBP,-3A
       00402023   . 29DE           SUB ESI,EBX
       00402025   . 83DA D9        SBB EDX,-27
       00402028   . 8BD8           MOV EBX,EAX
       0040202A   . 80E2 37        AND DL,37
       0040202D   . 23D4           AND EDX,ESP
       0040202F   . 91             XCHG EAX,ECX
       00402030   . 2C 05          SUB AL,5
       00402032   . 88CE           MOV DH,CL
       00402034   . 21EF           AND EDI,EBP
       00402036   . 4F             DEC EDI
       00402037   . 90             NOP                   ; ecx=3
       - - - - - - - - - - - - - - - - - -




  3) Extentions

  3.1) Including the Stack

       Including Stack-Commands is quite easy. There are some things to note:

       What happens at 'pop Reg32'? Reg32 changes and ESP changes, we already
       have all of that informations in the LBT, so nothing to change for POP.

       But what is with 'push Reg32/Imm32'? ESP changes, but we dont know what
       is on the stack from the above rules. 'PUSH 0x0' would result in the
       same LBT as 'PUSH 0x1' does.

       Therefore we have to add an additional dword to the LBT, defining the
       latest value of the stack (if newESP=oldESP-4 changed):

       - - - - - - - - - - - - - - - - - -
                  Local Behaviour Table (LBT):
                          ...
                     Stack: 1*4=4 byte
                     0x44 STACK change (if push is used)
       - - - - - - - - - - - - - - - - - -

       One restriction to the random code appears, too:
       You must not use pop/push as trash (push/pop is ok). As an example,
       you have:

       - - - - - - - - - - - - - - - - - -
                  xor eax, eax
                  ret
       - - - - - - - - - - - - - - - - - -

       This could be translated to:

       - - - - - - - - - - - - - - - - - -
                  xor eax, eax
                  pop eax
                  push 0
                  ret
       - - - - - - - - - - - - - - - - - -

       In principal, a legal substituation from the rules above, but fatal.

       One can solve that in two ways: Increase the Local Behaviour Table by
       8 dwords, and compare the changing of the stack; or you could just
       add a byte to note whether the pop occured before the push (this is what
       I've used in my example virus).





  3.2) Including Memory

       It is very tricky to use memory instructions for the random code and
       get the behaviour later.

       First thing is, that we have to use an Exception Handler. I tried SEH,
       but there are problems because the EXCEPTION_REGISTRATION and the
       callback function handle has to be on the stack, and this value can be
       changed by the random code (for instance "SBB DWORD PTR SS:[ESP+B],EAX",
       which actually happened).
       As it's the worst thing when the Exception Handler will be destroyed, we
       have to use something else: VEH (Vectored Exception Handler). This is
       provided by Windows XP+ as AddVectoredExceptionHandler-API.

       Second problem: The memory may overwrite any data that is not protected.
       That means: Its own code (if its in the .data section), any data created
       with Virtual Alloc (and was no Read/Execution-Only-Flag).
       We have to use VirtualProtect to any allocated memory.

       Third problem: We can not and should not protect the .data section, as
       we want new code-substituations for that, too. But the random code can
       overwrite some important data in the .data section. A solution is to
       mirror the .data memory with READ-ONLY potection (or NOACCESS flag)
       After the execution and analyse of the random code, we can restore the
       original .data section.

       Fourth problem: We have to save the pointer to the DataMirror somewhere.
       The register may change, so we have to use the .data section. But this
       pointer can be overwritten by random code, then we dont know the place
       of the DataMirror anymore, and can not restore the original .data - big
       problem!
       However, we can use some self-repair technique: We store the pointer to
       the DataMirror three times at totally different positions in the .data
       section, and after execution use that algorithm:

       - - - - - - - - - - - - - - - - - -
              mov eax, hDataMirror1;
              if (hDataMirror1!=hDataMirror2) { mov eax, hDataMirror3; }
       - - - - - - - - - - - - - - - - - -

       eax is the pointer to the DataMirror if there is maximum one destroyed
       pointer. This should be enough for us (we just have 8 byte of random
       code).

       Fifth problem: The instructions may rely on the values of the memory,
       just the same thing as with initial register values. We have to fill the
       memory as well as the stack with random memory, to prevent that
       dependence on the initial values.

       Sixth problem: The result of the random code (the local behaviour table)
       can not be saved in .data (as this one will be restored later) or in the
       DataMirror (as this one is used for comparing). So one has to allocate
       additional memory for saving the LBT temporarily, and copy it to the
       restored .data Section later.


       For saving the results I added 2*4 additional dwords to the Local
       Behaviour Table:

       - - - - - - - - - - - - - - - - - -
                  Memory: 2*4*4=8*4=32 byte:
                     0x24 dd MemOffset1
                     0x28 dword[MemOffset1]
                     0x2C dd MemOffset2
                     0x30 dword[MemOffset2]
                     0x34 dd MemOffset3
                     0x38 dword[MemOffset3]
                     0x3C dd MemOffset4
                     0x40 dword[MemOffset4]
       - - - - - - - - - - - - - - - - - -

       There is space for saving results of four changes of the random code,
       each change uses two dwords, one for the addresse that has been changed,
       and one for the actual data that has been changed.

       The chance that a valid change appears is very slow if we use totally
       random initial values. One could increase the chance by using the
       following trick:

       All registers get a small random value between 0 and ~200, and one
       variable gets the value of the .data-section:

            eax=0x33    ebp=0x67
            ebx=0x8A    esi=.data + 0x12
            ecx=0x04    edi=0xAA
            edx=0x12    esp=esp

       This increases the chance of valid memory changes alot - for instance:

            mov dword[esi+ecx], eax

       will be valid, which would have not be if the registers were totally
       random.

       This diagram shows what will be done with the memory blocks while
       analysing one unknown code:


     Preparation
     ===========

                        ___________ 
            mirroring  |           |
           ----------> | Data      |
          |            |   Mirror  |
     ___________       |           |
    |           |      |           |
    | .data     |      |___________|       ___________ 
    |           |                         |           |
    |           |                         | Random    |
    |           |  -------------------->  |   Data    |
    |___________|    important values     |           |
                                          |           |
                                          |___________|


                        ___________ 
                       |           |
                       | Data      |
                       |   Mirror  |
     ___________       |           |
    |           |      |           |
    | .data     |      |___________|       ___________ 
    |           |                         |           |
    |           |                         | Random    |
    |           |  <--------------------  |   Data    |
    |___________|     randomize .data     |           |
                                          |           |
                                          |___________|




                    E X E C U T I O N



     Comparison
     ==========
                        ___________ 
                       |           |
                       | Data      |
                       |   Mirror  |
     ___________       |           |
    |           |      |           |
    | .data     |      |___________|       ___________ 
    |           |                         |           |
    |           |                         | Random    |
    |           |  <<------------------>> |   Data    |
    |___________|       compare data      |           |
                             |            |           |
                             |            |___________|
                             |             ___________
                             |            |           |
                             -----------> | temp. LBT |
                               results    |___________|


     Restoration 
     ===========


                        ___________ 
            restore    |           |
           ----------  | Data      |
          |            |   Mirror  |
          V            |           |
     ___________       |           |
    |           |      |___________|
    | .data     |      
    |           |  
    |           |                        ___________
    |           |        save LBT       |           |
    |___________|  <------------------- | temp. LBT |
                                        |___________|  
                        

                     



  4) Possible extentions

     These are some ideas that increase the power of this technique and
     are easy to implement - everything without the need of deep code
     analyse or full disassembling.

  4.1) Hide me: Global Behaviour Table at Runtime

       The technique above uses a behaviour-table hardcoded in the file,
       which could be a source for detection by lazy avers.
       There is no need for that, one could create the GBT at runtime, and
       write it to the memory. It's just a little bit tricky as the GBT
       is defined implicit for all commands - but some fake-code or hand-
       adjusted GBT can solve that problem.


  4.2) Permutation - easy as it can be

       As the whole code consists of 8byte command blocks, its very easy
       to use permutation - there is no need of disassembling as in usual
       permutation engines. One can simply copy the different 8byte blocks
       to a random place in the file and connect them with Jmp-Operations.
          

  4.3) Jump somewhere

       One weakness of this technique may be the handling of jmp/ret/call
       commands. One must not execute them due to unexpected behaviour,
       therefore I have defined a further restriction in the GBT: rNoEmul.
       If this command appears, the corresponding 8byte block will not be
       analysed, thus no equivalent codes will be found.
       As the code knows where a jump will be (because of that rNoEmul
       byte in the GBT), it can include trash or change the instruction
       (call -> pop Reg32 + jmp Reg32). When trash is included, it should
       restrict the trash to not contain significant operation opcodes
       (0xC3 = ret, 0xE8 = call, 0xEB = jmp, 0x7x = jxxx).


  4.4) Behaviour Flow

       Imagine you have a code like this:

       - - - - - - - - - - - - - - - - - -
                  cc rA,   mov, eax, 0
                  cc rAB,  mov, ebx, eax
                  cc rB,   inc, ebx
                  cc rBC,  xor, ecx, ecx
       - - - - - - - - - - - - - - - - - -

       Command 3 and 4 are independent and could be exchanged. But how can
       the code itself find that out?

       We could analyse both commands (16 bytes) together in a black-box
       test and compare output. If the behaviour of the changed 16 bytes
       is the same as the original one, we could use the new one. We can
       see immediatly that line 2 and 3 can not be exchanged - exactly the
       same result we would get in the blackbox test.

       The only thing we have to think of is to adjust the restrictions,
       which is straight forward.

       In the end, we get some kind of behaviour flow, a very beautiful
       result:

       - - - - - - - - - - - - - - - - - -
      cc rA,   mov, eax, 0        cc rA,   mov, eax, 0          cc rA,   mov, eax, 0
      cc rAB,  mov, ebx, eax      cc rAB,  mov, ebx, eax        cc rAC,  xor, ecx, ecx
      cc rB,   inc, ebx      ->   cc rBC,  xor, ecx, ecx   ->   cc rABC, mov, ebx, eax
      cc rBC,  xor, ecx, ecx      cc rBC,  inc ebx              cc rBC,  inc ebx
       - - - - - - - - - - - - - - - - - -



  4.5) Automatic shrinking/extention/trash

       Shrinking and extention is usually a very hard buisness - requires
       full code disassembling to some meta-languages. Here it is much
       easier, the only thing we have to do is adjusting the jump addresses.

       For shrinking, we just use take a 16byte block (instead of the
       usual 8 byte block) for analysing, and compare it with a 8 byte
       block of random data - very simple!

       The extention is the inverse process - we take a 8byte code from
       the file code and compare it with a 16byte block of random code.
       The random 16byte block should consist of 2 independent 8 byte
       blocks, to prevent problems when this code will be analysed alone.
       This is very trivial too.

       The trash is a similar technique: We get an 8 byte code and analyse
       its behaviour with respect to the restrictions of the direct-above
       block. Again - very simple.






  5) Conclusion

     Analysing the behaviour of code and substituate with random instructions
     is powerful as well as beautiful.

     We should use more the power of unpredictable randomness!



                                                        Dr. Theodore Spankman
                                                                December 2010



     PS1: The initial idea comes from an article released in August 2004:
          http://vx.netlux.org/lib/vsp14.html
     
     PS2: Special thanks goes to hh86 for motivation, help and useful ideas
          in connection with this research... Merci! :)
Virus-writing Bulletin 2011