Several things to learn more (knowledge and code snippets) ---------------------------------------------------------- by The Mental Driller/29A 1) Thinks about Windows and shits that make our programs (or viruses) fail. 2) RADIX sort for DWORD arrays 3) Nothing more 1) Thinks about Windows and shits that make our programs (or viruses) fail. --------------------------------------------------------------------------- Guess what! Microsoft has bugs in the Kernel!! (oh, really????). Have you ever wonder why your polymorphic engine works fine and your virus not sometimes and what it's failing is the call to GetModuleHandle()? Well, one of the reasons can be the #@&$! bug that I realized to be even in Win2k: the direction flag (usually set to 1 with STD or cleared with CLD). Be sure that when you call to GetModuleHandle the flag is clear! This care must be special if we are coding polymorphic viruses (since it's a standard garbage instruction). What I wonder is WHY the programmers at Micro$oft relied at this fact when using LODS?/CMPS?/etc. (block instructions). Just test it: call GetModuleHandle passing "kernel32.dll" but first make STD. Exception for sure! And not an exception in our code, but in the kernel! Other dubba-dubba festipower fact: under Win9x, don't touch the thunk array at +10h in an import header! In order to "optimize" and save 0.1 seconds in the load of an executable, if the version of the system where the executable was compiled for and the version of the current system coincides, then this array is untouched, relying in the hard-coded addresses at this array. W0W! These guys amaze me more each day! Fortunatelly, Win2k doesn't seem to make this assumption. So, if you are coding some shit that requires updating the import table, just take this in account, and don't get mad if you didn't and the program fails. This fact has been used on the version 1.1 of MetaPHOR to make EPO that only works under Win9x. A real pain for emulators, which must be modified to handle this :). ----- 2) RADIX sort for DWORD arrays ------------------------------ This snippet of code was going to be used in MetaPHOR to make several things. Finally I didn't use it, but it doesn't mean it's not useful. For the ones that don't know the RADIX sort, just an explanation: this sorting method uses the elements of the array themselves as positions in other array. We can see the RADIX sort with an example: 4,6,1,4,1,2,5,3,2 --> Array to be sorted Just get an array at least as long as the biggest element of the array to sort, in this case 6. This is the Buffer1 array: ( ),( ),( ),( ),( ),( ) Now count the elements from the first position: we just initialize the Buffer1 array with 0s, and then we take each element of the array to order and increase the value at that position: 1 2 3 4 5 6 (2),(2),(1),(2),(1),(1) Now we get the array and beginning by the first element and with an addition of 0 we add the previous value to the current position: 1 2 3 4 5 6 (0),(2),(4),(5),(7),(8) Now, you can see that for every number we have the position in an ordered array! Now we get each element, look in what position must be stored, store the element in Buffer2 (the destiny array) and then we increase the position at Buffer1[Element]. The final result: 1 2 3 4 5 6 (2),(4),(5),(7),(8),(9) 0 1 2 3 4 5 6 7 8 (1),(1),(2),(3),(3),(4),(4),(5),(6) RADIX sort is the fastest sort algorithm, since it makes no comparisions and the sorting is lineal (3n, while QuickSort is logaritmic), but it requires an array at least as long as the maximum value that an element can have, something that will obligate to have a 4*4 Gb buffer array to sort a DWORD array. Then, what we do to order DWORDs? The trick is to order the array only taking in account the first byte, and then order the subarrays that have the same byte that we just ordered. With this we reduce the speed of the algorithm, but it's still faster than QuickSort and others (much faster!). Following, the code to do this: -------------------------------[cut here]-------------------------------------- .386p ; Radix sorting .model flat ; This isn't a standard radix: it can order a DWORD array only locals ; using two relatively little buffers (one of them at least as .code ; big as the original array). It was not easy! ;; I have optimized what I could. There are some flaws in optimizations, like ;; the array using and all that (moreover when the array to sort is little). ;; If you are in the mood of optimizing, do it before using this! :) extrn ExitProcess:PROC Main proc mov esi, offset VectorToSort mov edi, offset LastElement + 4 call MainRSort push 0 call ExitProcess Main endp ; ESI = Initial address ; EDI = Last address MainRSort proc mov ebp, 3 ; Sort byte 3 call PreRSort mov ebp, 2 ; Sort byte 2 call PreRSort mov ebp, 1 ; Sort byte 1 call PreRSort xor ebp, ebp ; Sort byte 0 call PreRSort ret MainRSort endp PreRSort proc push esi cmp ebp, 3 jz @@Sort3 cmp ebp, 2 jz @@Sort2 cmp ebp, 1 ; Select masks and rotation values jz @@Sort1 @@Sort0: mov eax, 0FFFFFF00h mov ebx, 0000000FFh mov ecx, 2 jmp @@Next00 @@Sort3: xor eax, eax mov ebx, 0FF000000h mov ecx, 0Ah jmp @@Next00 @@Sort2: mov eax, 0FF000000h mov ebx, 000FF0000h mov ecx, 12h jmp @@Next00 @@Sort1: mov eax, 0FFFF0000h mov ebx, 00000FF00h mov ecx, 1Ah @@Next00: mov [TempMask1], eax mov [TempMask2], ebx mov [ShiftMask], ecx @@Loop0: push edi mov ebx, esi mov edx, [ebx] xor ecx, ecx push esi mov esi, [TempMask1] and edx, esi xor ecx, ecx @@LoopCount: mov eax, [ebx] and eax, esi cmp eax, edx jnz @@Sort add ecx, 4 add ebx, 4 cmp ebx, edi jnz @@LoopCount @@Sort: pop esi cmp ecx, 4 jbe @@AlreadySorted mov edi, ebx push ebx call RSort pop ebx @@AlreadySorted: mov esi, ebx pop edi cmp esi, edi jb @@Loop0 pop esi ret PreRSort endp RSort proc mov ebx, 100h*4 - 8 mov ecx, offset VectorTemp1 fldz ; Big optimization! We store a 0 @@Loop0: fst qword ptr [ecx] ; in the FPU and overwrite with QWORDs add ecx, 8 ; instead of DWORDs. sub ebx, 8 jnz @@Loop0 fstp qword ptr [ecx] push esi mov edx, [TempMask2] mov ecx, [ShiftMask] @@Next1: mov ebx, [esi] and ebx, edx rol ebx, cl mov eax, [VectorTemp1+ebx] add eax, 4 mov [VectorTemp1+ebx], eax add esi, 4 cmp esi, edi jnz @@Next1 pop esi mov ebx, offset VectorTemp1 mov ecx, 100h xor edx, edx @@Loop1: mov eax, [ebx] mov [ebx], edx add edx, eax add ebx, 4 ; Addition of indexes dec ecx jnz @@Loop1 push esi mov edx, [TempMask2] @@Next2: mov ebx, [esi] mov ecx, [ShiftMask] mov eax, ebx and ebx, edx rol ebx, cl mov ecx, [VectorTemp1+ebx] mov [VectorTemp2+ecx], eax add ecx, 4 mov [VectorTemp1+ebx], ecx add esi, 4 cmp esi, edi jnz @@Next2 pop esi push esi mov ebx, offset VectorTemp2 @@Loop2: mov eax, [ebx] mov [esi], eax add esi, 4 add ebx, 4 cmp esi, edi ; Copy the data. This part overwrites jnz @@Loop2 ; the unsorted vector with the sorted pop esi ; one at VectorTemp2. If you don't need ret ; it, you can eliminate this. RSort endp .data TempMask1 dd 0 TempMask2 dd 0 ShiftMask dd 0 VectorTemp1 dd 256 dup (0) VectorTemp2 dd 256 dup (0) VectorToSort dd 48384782h ; Sample array of elements to sort. dd 61900345h dd 61900344h dd 17100035h dd 76189667h dd 61890000h dd 19098752h dd 88128948h dd 61253625h dd 15672950h dd 15682488h dd 61900271h dd 16832673h dd 64509385h dd 77878716h dd 22180509h dd 62728299h dd 19238930h dd 65209999h dd 51829859h dd 01748738h LastElement dd 55676134h end Main -------------------------------[cut here]-------------------------------------- 3) Nothing more Ummmhh... I can't think on any more related to virus to fill this a little :). Only a big "hello" to the creative VXers and a big "get a life" to teenage script-kiddies that fill with shit people's mailboxes (for example, VBS massmailing worms generated with Kalamar's tools). The Mental Driller/29A