/-----------------------------\ | Xine - issue #3 - Phile 107 | \-----------------------------/ Archive infectors : Generalities By Unknown MnemoniK/iKx Introduction : +--------------+ Archive infectors aren't common in the wild . Anyway scanners scan archive files because archives are the center of sotfware/data exchanges, archives walk through diskettes , cdroms , network and of course the internet . You see, a good way for spreading ? There are already some archive infectors , like Zhengxi ( the best one I think ) , but there's also a zip infector disassembled in 29A#2. Few month ago, I took my interest on it, so I began to study archives. In fact,there's no complex point in it, just understand the packet storing and the CRC32 How Compression format works in general ? +------------------------------------------+ In the 99% of archivers , organization of datas work as in this scheme : +---------------------+ | Compressed data 1 | +---------------------+ Simple eh ? So , for an infection | Compressed data 2 | you just need to drop a packet at +---------------------+ at the end of the file , fortunately | Compressed data ... | you don't need to compress the virus +---------------------+ because you can drop it without | Compressed data n | compression ,you just need to switch +---------------------+ a byte to do that . Now lets take a look on compressed data blocks ,its divided in two blocks Header datas and compressed zone , I only interest myself on the Header part,the compressed zone will be just the virus , lets how it's generally organized (for common type, I will precise how it's works for ZIP RAR and ARJ later) Type Description +---------------------+ +---------------------------------------+ | Signature | ---------- | This is just a word , in zip PK and | +---------------------+ | in ARJ 060h 0EAh , but it's sometimes | | Size of the Header | --------+ | forgetted like in RAR structure | +---------------------+ | +---------------------------------------+ | Version is used for | | +---------------------------------------+ | Extraction ,Version | +- | Compressed datas start at Signature + | | made by and anything| --------+ | Size of the Header , for decompressor | | about version | | +---------------------------------------+ +---------------------+ | +---------------------------------------+ | Minor informations | +- | In fact , this have no real use, just | | like host OS,method | | Compressor will ignore decompression | | of compression and | --------+ | of a file if VERNEED > Version Prog. | | reserved things | | +---------------------------------------+ +---------------------+ | +---------------------------------------+ | Dos Datas like Date | +- | We just fix compression method to 0 , | | and Hours of creat. | --------+ | for no compression at all | +---------------------+ | +---------------------------------------+ | Compressed Size | | +---------------------------------------+ | Non compressed Size | -----+ +- | Dos hour, Dos Date, Dos attributes... | +---------------------+ | +---------------------------------------+ | The CRC32 | --+ | +---------------------------------------+ +---------------------+ | +---- | Equal for our virus because no compss.| | Name of the file | | +---------------------------------------+ +---------------------+ +------- | Complex addition of all bytes | | +---------------------------------------+ +------------- | Name of the file ,length are often in | | Minors datas but often fixed at 13 | +---------------------------------------+ NB: After the compressed datas, there's sometimes extra datas,this is not used for the file but for comments or used for some shadow datas like zip for mac This is a common header, there's sometimes added some things that are not really important ,for faking an header , you just need to pick one or use a model , after that ,you create the name and render the new CRC32 of the virus , drop the header , drop the virus and it's done. It's appears really too simple, yeah, in rar or arj , you use this scheme but for zip , you have more complex things to resolve The Crc 32 +----------+ The Crc is one method to be sure of the integrity of the file, think that your message transit over a non-secure network, you can't be sure of the message you have received , so you will make a calculation : Message: I _ L O V E _ Y O U Hex(in Dec) 73 32 76 79 86 69 32 89 79 85 Then the crc are equal 700 The host have received this message I _ L O C K _ Y O U ,ermmm... but the secure way to see if this was the good message is to see if the CRC are the same , in this case , the CRC equal 687 , understand ? In reality , stuff are more complex , but are the same for all kinda compressors,the CRC32 calculation is done in two steps,first the creation of a table, second the calculation of each byte of datas The Table +---------+ The table is 1024 bytes length , it's built with a 32 bits algorithm. The idea of the table is great because it will be the same on each computer , lets see the code, I pick this one from PKUNZJR.COM I have disassembled . crc_table: mov di,offset starttable+1024 ;the table offset location ;remember : It begin by the end mov bp,255 ;set bp equal 255 ;255 * 4 = 1020 std ;set Direction Flag On , regresive ;decount for stosw TableHighloop: ;the major loop in the Crc table mov cx,8 ;set the minus loop to 8 mov dx,bp ;dx = bp , major counter loop xor ax,ax ;ax = zero TableLowLoop: shr ax,1 ;mov one byte of ax at right in bin rcr dx,1 ;if anything losted , put it on dx ;equal rol 1 on 32 bits reg jae anomality ;if superior or equal skip encrypt ;just to complexify things xor dx,08320h ;encrypt value by a signature xor ax,0EDB8h ;0EDB88320h in 32 bits anomality: loop TableLowLoop ;make it 8 times stosw ;write ax on table xchg dx,ax stosw ;write dx on table dec bp ;decrement the counter jnz TableHighLoop ;repeat it until bp = 0 mov word ptr [di],0 ;last value equal 0 sub di,2 mov word ptr [di],0 cld ;clear direction flag ret After that , you will have a zone of 1024 bytes wich fix datas on all computers , the method of calculation is the same for the majority of archives format ( and the most common ) The Calulation +--------------+ The calculation works with the table ,you get the current byte in the buffer , make some operations and get the dword in the table that corresponds at the location of the result . Then you XOR the CRC with this dword and when finished , you NOT the CRC , note that you start with FFFFFFFFh in the CRC counter , if you don't understand , don't panic and see code In some words : ; beware!during this operation,the bx state(and the handle)will be changed ; si = offset of the buffer you will render CRC32 ; di = offset of the CRC table ; bp = length of the buffer mov cx,0ffffh ; set CRC ( CX/DX ) at mov dx,0ffffh ; 0FFFFFFFFh xor ax,ax ; set ax = 0 Crc_loop: lodsb ; load byte in SI into al mov bx,ax ; bx <- ax xor bl,cl ; operations mov cl,ch ; blah mov ch,dl ; blah mov dl,dh ; blah mov dh,bh ; one more operation shl bx,1 ; this is interresting shl bx,1 ; shl bx,2 xor cx,word ptr [bx+di] ; xor CRC32,[bx+crctable] xor dx,word ptr [bx+di+02] ; dec bp ; decrement counter jnz Crc_loop ; do it x times not dx ; not the CRC32 not cx ret ; and come back on the prog. Infection +----------+ We can build algorithms of infection : Common Algorithm : 1ø) Open the file 2ø) Read the first Header 3ø) Generate a the virus (poly...) 4ø) Calculate Crc of the virus 5ø) Put the Crc on the header 6ø) Modify the name in the header 7ø) Seek to the end of the file 8ø) Write The header and the virus 9ø) Close the file This methods are quite okay , now lets see how to infect the most common archive like ZIP ARJ and RAR , their structures are different by two or three things , but this is a good occasion to see and understand code. go to the next chapter ! -----------------+ -- + - + + - + -- +------------------