40Hex Number 9 Volume 2 Issue 5 File 008 ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ CODE OPTIMISATION, A BEGINNER'S GUIDE ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ Written by Dark Angel ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ When writing a virus, size is a primary concern. A bloated virus carrying unnecessary baggage will run slower than its optimised counterpart and eat up more disk space. Never optimise any code before it works fully, since altering code after optimisation often messes up the optimisation and, in turn, messes up the code. After it works, the focus can shift to optimisation. Always keep a backup of the last working copy of the virus, as optimisation often leads to improperly working code. With this in mind, a few techniques of optimisation will be introduced. There are two types of optimisation: structural and local. Structural optimisation occurs when shifting the position of code or rethinking and reordering the functions of the virus shorten its length. A simple example follows: check_install: mov ax,1234h int 21h cmp bx,1234h ret install_virus: call check_install jz exit_install If this is the only instance that the procedure check_install is called, the following optimisation may be made: install_virus: mov ax,1234h int 21h cmp bx,1234h jz exit_install The first fragment wastes a total of 4 bytes - 3 for the call and 1 for the ret. Four bytes may not seem to be worth the effort, but after many such optimisations, the code size may be brought down significantly. The reverse of this optimisation, using procedures in lieu of repetitive code fragments, may work in other instances. Properly designed and well-thought out code will allow for such an optimisation. Another structural optimisation: get attributes open file read/only read file close file exit if already infected clear attributes open file read/write get file time/date write new header move file pointer to end of file concatenate virus restore file time/date close file restore attributes exit Change the above to: get attributes clear attributes open file read/write read file if infected, exit to close file get file time/date move file pointer to end of file concatenate virus move file pointer to beginning write new header restore file time/date close file restore attributes exit By using the second, an open file and a close file are eliminated while adding only one move file pointer request. This can save a healthy number of bytes. Local, or peephole, optimisation is often easier to do than structural optimisation. It consists of changing individual statements or short groups of statements to save bytes. The easiest type of peephole optimisation is a simple replacement of one line with a functional equivalent that takes fewer bytes. The 8086 instruction set abounds with such possibilities. A few examples follow. Perhaps the most widespread optimisation, replace: mov ax,0 ; this instruction is 3 bytes long mov bp,0 ; mov reg, 0 with any reg = nonsegment register takes 3 bytes with xor ax,ax ; this takes but 2 bytes xor bp,bp ; mov reg, 0 always takes 2 bytes or even sub ax,ax ; also takes 2 bytes sub bp,bp One of the easiest optimisations, yet often overlooked by novices, is the merging of lines. As an example, replace: mov bh,5h ; two bytes mov bl,32h ; two bytes ; total: four bytes with mov bx,532h ; three bytes, save one byte A very useful optimisation moving the file handle from ax to bx follows. Replace: mov bx,ax ; 2 bytes with xchg ax,bx ; 1 byte Another easy optimisation which can most easily applied to file pointer moving operations: Replace mov ax,4202h ; save one byte from "mov ah,42h / mov al,2" xor dx,dx ; saves one byte from "mov dx,0" xor cx,cx ; same here int 21h with mov ax,4202h cwd ; equivalent to "xor dx,dx" when ax < 8000h xor cx,cx int 21h Sometimes it may be desirable to use si as the delta offset variable, as an instruction involving [si] takes one less byte to encode than its equivalent using [bp]. This does NOT work with combinations such as [si+1]. Examples: mov ax,[bp] ; 3 bytes mov word ptr cs:[bp],1234h ; 6 bytes add ax,[bp+1] ; 3 bytes - no byte savings will occur mov ax,[si] ; 2 bytes mov word ptr cs:[si],1234h ; 5 bytes add ax,[si+1] ; 3 bytes - this is not smaller A somewhat strange and rather specialised optimisation: inc al ; 2 bytes inc bl ; 2 bytes versus inc ax ; 1 byte inc bx ; 1 byte A structural optimisation can also involve getting rid of redundant code. As a virus related example, consider the infection routine. In few instances is an error-trapping routine after each interrupt call necessary. A single "jc error" is needed, say after the first disk-writing interrupt, and if that succeeds, the rest should also work fine. Another possibility is to use a critical error handler instead of error checking. How about this example of optimised code: mov ax, 4300h ; get file attributes mov dx, offset filename int 21h push dx ; save filename push cx ; and attributes on stack inc ax ; ax = 4301h = set file attributes push ax ; save 4301h on stack xor cx,cx ; clear attributes int 21h ...rest of infection... pop ax ; ax = 4301h pop cx ; cx = original attributes of file pop dx ; dx-> original filename int 21h Optimisation is almost always code-specific. Through a combination of restructuring and line replacement, a good programmer can drastically reduce the size of a virus. By gaining a good feel of the 80x86 instruction set, many more optimisations may be found. Above all, good program design will aid in creating small viruses.