Polymorphism
─────────────────────────────────────────────────────────────────────────────>
                                                                Aníbal Lecter

I know it may be a bit much to feature both an encryption article and a
polymorphism one in the same issue, but this one is dedicated to those of
you who have a more advanced level. If you are still a bit confused about
encryption, better forget this article and try with YAM. We'll introduce
polymorphic routines at a very basic level: design, construction and
functioning. In this article we'll study a 'pseudo-polymorphic' generator,
that is: starting from a basic routine, make detection of the virus more
difficult (even though the routine's kernel doesn't vary), depending on
your aims.

What's a PER (Polymorphic Encryption Routine)?

PERs were born to avoid detection schemes based on fixed strings of bytes.
These schemes rely on the idea that viruses always preserve a number of
stable bytes in each generation (at least in the header, when encrypted).
With PERs we try to avoid this inconvenience by always varying that header,
either:

 1. Lexically: directly substituting some hex codes for others.

        label:  2825    sub [di],ah

    turns into:

                0005    add [di],al

    In this case it would be enough to subtract 2820h from the two opcode
    bytes at label:, read in listing order (read from memory as a
    little-endian word, the difference is 2028h). Bear in mind that the
    value to use must be placed in AH or AL depending on the case, so
    that code would change too. How? Keep on reading }:-)

 2. Syntactically: changing the order of the instructions, but seeking
    the same result.

        label:  add [di],ah
                xor [di],ah

    turns into:

                xor [di],ah
                add [di],ah

    This time we must keep the order in mind during encryption, so as to
    invert it correctly.

 3. Morphologically: varying its external appearance but maintaining its
    kernel, by including garbage in between the code. To do this we can
    make use of the classics:

        90h = nop
        f8h = clc
        f9h = stc
        fah = cli
        fbh = sti
        fch = cld
        fdh = std

    Look out!
    these last two are dangerous if you are using registers SI, DI and CX
    at the same time, because you'll have to bear in mind whether you
    intend the loop to increase or decrease :-P

    Of course, you can combine them; the easiest example is for the
    typical bait files:

        dw      2000 dup (90fbh)
        mov     ax,4c00h
        int     21h

    This way we keep the virus from skipping the file because it found a
    big number of identical bytes. For the decrypting header, we would
    have a routine with the XOR and a little algorithm which would add a
    series of 'non-code'.

 4. Finally, combining the three preceding ways as you like. Further on,
    we'll see a bit of language grammar, especially Conway notation ;-)
    It would be equally possible to use BNF, but as far as I can see it
    is less clear.

Of the three basic methods we've just seen for making mutations possible,
the one I find easiest is the third, as it only affects the code size and
the 'innocent' instructions we include in it. The second is a bit more
complex, because the order of the instructions has to change even though
the instructions themselves stay the same. As for the first one, it's
necessary to know all of the opcodes and build a table (we have an example
in MtE).

I'll start with the third case; once you pick it up, you will all know
enough to jump directly to the second type as well as the first, or even
combine them all, or whatever you think of.

Let's see it with a direct example; let's take the decryption routine from
the 13th July virus (original source code by Jordi Mas, a Spanish ex-AVer
dickhead who was known to have some of his viruses in the wild...
and these viruses sucked as much as he did) }:-)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ->8
longitud        equ     fin-inicio_virus        ; Defines the size

inicio_virus:
inicio:         mov     al,cs:hasta_final       ; Decrypts the executable
                xor     al,090h                 ; viral code
                mov     si,offset hasta_final
                mov     cx,longitud
des_bl:         xor     byte ptr cs:[si],al
                inc     si
                loop    des_bl

hasta_final     db      90h                     ; Stores the random value
                                                ; it uses for the encryption
; ....                                          ; routine
; 'Normal' viral code
; ....

fin:                                            ; End of the code
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - ->8

In this case, we can see how the string of bytes that starts at 'inicio'
and goes up to 'hasta_final' (leaving the encryption value outside) will
always be the same in each infection; the rest will only have 255
different possibilities, plus one more in which the code will be visible
(00h). But these strings don't change: the virus is 100% detectable.

It can be mutated, but only by hand, for instance by fitting some of these
'innocent' bytes in between the first and the second instruction.
Nevertheless, this can be done by any lamer ;-) We are leet and can do it
better, can't we? ;-)

In this approach to polymorphism by the third method, we can notice three
characteristics:

 1. The routine's kernel doesn't change.
 2. The size does.
 3. The distance in bytes between the main instructions also varies.

How can we do it? Let's see:

Grammar in Conway notation
──────────────────────────
This kind of grammar, as some of you will already know, is mostly used to
describe the structure of programming languages; it is also valid for
spoken language, and it is used in AI for recognizing speech, or at least
that's what I think. What it aims to do is represent the morphological
structure of a language by means of a flow chart.
Take as an example:

                                   ┌──────«────────«──────┐
                                   │                      ├──»── end
 start                             │                      │
  »─┬─ article ─ subject ─┬─ verb ─┼──»── adverb ───»─────┤
    │                     │        │            ┌─ mode ──┤
    └──»── pronoun ──»────┘        └─»─ manner ─┼─ time ──┤
                                                └─ place ─┘

Let's see it in detail:

                                             ┌──────«────────«──────┐
                                             │                      ├──»── (end)
 (start)                                     │                      │
  »─ The ─┬─────────────┬─ virus ─ infected ─┼──»── always ────»────┤
          │             │                    │    ┌─ massively ─────┤
          └─»─ Zhengxi ─┘                    └─»──┼─ from 3h to 5h ─┤
                                                  └─ some PCs ──────┘

Experts on language will pardon me... this ain't a grammar lesson ;-)

If we observe and analyze the diagram, we can see that it is possible to
say the same thing with different words. E.g.:

    "The virus infected always"
    "The virus infected massively"
    "The Zhengxi virus infected from 3h to 5h"

But we're also allowed to 'toy' around. E.g.:

    "The virus infected always, always"
    "The Zhengxi virus infected massively from 3h to 5h some PCs"
    "The Zhengxi virus infected always some PCs, always from 3h to 5h,
     always massively"

Can you get the 'hidden' intention of this explanation? ;-) All the
better, because next issue we'll put all of this into practice, and it's
best to have all the concepts assimilated.

* Aníbal *
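P.S. For those who prefer code to flow charts, the sentence diagram can be
read as a small graph that is walked at random. What follows is only a
minimal sketch in Python under a few assumptions of mine: the node names
(start, subject, verb, complement, tail), the max_words cap and the
generate() function are all invented for illustration, and the loop is
taken to re-enter right after the verb, as the 'toy' sentences suggest.

```python
import random

# Hypothetical encoding of the sentence diagram: every node maps to the
# list of (word, next_node) edges that leave it. Node names are invented.
GRAMMAR = {
    "start":      [("The", "subject")],
    "subject":    [("virus", "verb"), ("Zhengxi virus", "verb")],
    "verb":       [("infected", "complement")],
    "complement": [("always", "tail"), ("massively", "tail"),
                   ("from 3h to 5h", "tail"), ("some PCs", "tail")],
    # "tail" emits no word: it either ends the sentence or loops back.
    "tail":       [(None, "end"), (None, "complement")],
}


def generate(rng, max_words=8):
    """Walk the graph from 'start' to 'end', picking one edge at random."""
    words, node = [], "start"
    while node != "end":
        if node == "tail" and len(words) >= max_words:
            break  # cap the loop so a sentence cannot grow forever
        word, node = rng.choice(GRAMMAR[node])
        if word is not None:
            words.append(word)
    return " ".join(words)


if __name__ == "__main__":
    rng = random.Random()
    for _ in range(5):
        print(generate(rng))
```

Every run prints different sentences, yet all of them share the same
skeleton ("The ... virus infected ..."); only the optional words and the
number of trips around the loop change.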