40Hex Number 12 Volume 3 Issue 3                                      File 003

                       Self Checking Executable Files

                                 Demogorgon
                               Phalcon/Skism

In this article I will explain a method that allows .COM files to protect themselves against simple viruses. In order to infect a .COM file, a virus must change several bytes at the beginning of the code. Before the virus returns control to the original program, it restores ('disinfects') those bytes in memory, so that the program runs as it did before infection. This disinfection process is crucial, because it means that the image on the disk will not be the same as the memory image of the program. This article describes a method by which a .COM file can perform a self-check by reading its disk image and comparing it to its memory image.

The full pathname of the program that is being executed by DOS (version 3.0 and later) is located in the environment block. The segment of the environment block can be read from the PSP, where it is stored at offset [2Ch]. The pathname follows the last environment variable, and can be located by searching for two consecutive zero bytes. The word after the two zeros holds the number of strings that follow (normally 1), which is why the code skips two more bytes. After that word is an ASCIIZ string containing the pathname of the current process.

The following code opens the file being executed:

nish:
        mov     es, word ptr ds:[2Ch]   ; segment of environment
        xor     ax, ax                  ; ax = 0, the word we search for
        mov     di, 1
loop_0:
        dec     di                      ; scasw adds 2, so step back 1
        scasw                           ; is the word at es:[di] zero?
        jne     loop_0
        mov     dx, di                  ; di is now just past the two zeros
        add     dx, 2                   ; skip the count word: start of pathname
        push    es
        pop     ds                      ; ds:dx -> pathname
        mov     ax, 3D02h               ; open, read/write access
        int     21h

Next, we must read in the file (using DOS services function 3Fh, read file or device). We can read the file into the heap space after the program, as long as we are sure we will not overwrite the stack. The sample program in this file reads itself in entirely, but remember, it is not necessary to do so. It is only necessary to read and compare the first few bytes. Also, the program could read itself in blocks instead of all at once.

If a file finds itself to be infected, it should report this to the user.
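
The block-by-block variant mentioned above can be modeled outside DOS. The following is only a sketch in Python, not the article's assembly: the function name `matches_disk_image`, the `block_size` default and the idea of passing the memory image as a byte string are all assumptions made for illustration.

```python
def matches_disk_image(memory_image: bytes, path: str, block_size: int = 512) -> bool:
    """Compare memory_image against the start of the file at path, one block at a time.

    Hypothetical helper: it models "read a block, compare it, repeat" so the
    whole file never has to sit in memory next to the program image.
    """
    offset = 0
    with open(path, "rb") as f:
        while offset < len(memory_image):
            want = min(block_size, len(memory_image) - offset)
            block = f.read(want)
            # A short read (file smaller than the image) also counts as a mismatch.
            if block != memory_image[offset:offset + want]:
                return False
            offset += want
    return True
```

Like the article's program, this sketch only looks at as many bytes as the memory image holds, so code appended to the end of the file is caught only because the first few bytes were changed to reach it.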
Remember, even though the file knows it is infected, the virus has already executed. Memory resident viruses will already have loaded themselves into memory, and direct action viruses will already have infected other files on the drive. Moreover, as the weak points below explain, a virus that employs disinfection on the fly will be able to avoid detection and removal.

Here is the full source to the self checking program:

;();();();();();();();();();();();();();();();();();();();();()
.model tiny
.code
org 100h

start:
        mov     es, word ptr ds:[2Ch]   ; dos environment block
        xor     ax, ax
        mov     di, 1
loop_0:
        dec     di
        scasw
        jne     loop_0
        mov     dx, di
        add     dx, 2                   ; <- point to current
        push    es                      ;    process name
        pop     ds
        mov     ah, 3Dh                 ; open file with handle (al is still 0: read only)
        int     21h
        jc      bad                     ; error opening file ?
        mov     bx, ax                  ; handle into bx
        push    cs
        push    cs
        pop     es
        pop     ds                      ; I am a com file.
        mov     cx, heap - start        ; length
        lea     dx, heap                ; where to read file into
        mov     ah, 3Fh                 ; read file or device
        int     21h
        jc      bad                     ; error reading file ?

        ; here, do a byte for byte compare
        lea     si, start
        lea     di, heap
        repe    cmpsb                   ; compare 'em, cx bytes
        jne     bad
        lea     dx, clean
        mov     ah, 9
        int     21h
        jmp     quit_
bad:
        mov     ah, 9
        lea     dx, infected
        int     21h
quit_:
        mov     ax, 4C00h
        int     21h

clean           db      'Self check passed.$'
infected        db      'Self check failed. Program is probably infected.$'
heap:
end start
;();();();();();();();();();();();();();();();();();();();();()

While some self checking routines opt to use a CRC or checksum error detection method, the byte for byte method is both faster and more accurate.

Weak points:

This routine will not work against a stealth virus which employs disinfection on the fly. Such viruses take over the DOS interrupt (int 21h) and disinfect all files that are opened and read from. As the routine in this article attempts to read itself into memory, the stealth virus would disinfect it and hand an uninfected copy back in RAM, so the comparison would pass. Of course, there are ways to defeat this. If this program were to use some sort of tunneling, it could bypass the stealth virus and call DOS directly.
That way, infections by even the most sophisticated viruses would be detectable.

Disinfection:

So, now you can write programs that will detect if they have been infected. How about disinfection? This too is possible. Most viruses simply replace the first three bytes of the executable file with a jump or a call, which transfers control to the virus code. Since only the first three bytes are going to be changed (in almost all cases), it will usually be possible for a program to disinfect itself by replacing the first three bytes with what is supposed to be there, and then truncating itself to the correct size. The next program writes the entire memory image to disk, rather than just the first three bytes. That way, it can be used to disinfect itself from all nonstealth viruses.

The steps to disinfect are simple. First of all, you must move the file pointer back to the beginning of the file. Use interrupt 21h, ah=42h for this. The AL register holds the move mode, which must be 00 in this case (move from beginning of file). CX:DX holds the 32 bit number of bytes to move. Naturally, this should be 0:0.

The second step is to write the memory image back to the file. Since the virus has already restored the first few bytes of our program in memory, we simply write back to the original file, starting from 100h in the current code segment, i.e.:

        mov     ah, 40h
        mov     cx, heap - start        ; bytes to write
        lea     dx, start               ; ds:dx -> memory image, handle in bx
        int     21h                     ; write file or device

Finally, we must truncate the file back to its original size. To truncate a file, we must move the file pointer to the position where the file should end and call the 'write file or device' function with cx, the bytes to write, equal to zero.
To move the pointer, do this:

        mov     ax, 4200h
        mov     cx, (heap - start) SHR 16 ; high word of file ptr
        mov     dx, (heap - start)        ; low word of file ptr
        int     21h                       ; move file pointer

Since we are dealing with .COM files here, it is safe to assume that cx, the most significant word of the file ptr, can be set to zero, because our entire file must fit into one segment. We do not need to calculate it as above.

To truncate:

        xor     cx, cx
        mov     ah, 40h
        int     21h                     ; truncate file

The full code for the self disinfecting program follows. Note that it does not need the extra seek before truncating: writing heap - start bytes from the beginning of the file already leaves the file pointer where the file should end.

;();();();();();();();();();();();();();();();();();();();();()
.model tiny
.code
org 100h

start:
        mov     es, word ptr ds:[2Ch]   ; segment of environment
        xor     ax, ax
        mov     di, 1
loop_0:
        dec     di
        scasw
        jne     loop_0
        mov     dx, di
        add     dx, 2
        push    es
        pop     ds
        mov     ax, 3D02h               ; open, read/write access
        int     21h
        jc      quit_                   ; can't open ?
        mov     bx, ax                  ; handle into bx
        push    cs
        push    cs
        pop     es
        pop     ds
        mov     cx, heap - start
        lea     dx, heap
        mov     ah, 3Fh                 ; read file or device
        int     21h
        jc      quit_                   ; can't read ?
        lea     si, start
        lea     di, heap
        repe    cmpsb                   ; byte for byte compare
        jne     bad
        lea     dx, clean               ; we are golden
        mov     ah, 9                   ; print string
        int     21h
        jmp     main_program
bad:
        mov     ah, 9                   ; we are infected
        lea     dx, infected
        int     21h
        lea     dx, disinfection
        int     21h

        ; now, disinfect. File handle is still in bx
        ; we must move the file pointer to the beginning
        xor     cx, cx
        xor     dx, dx
        mov     ax, 4200h
        int     21h                     ; move file pointer
        mov     ah, 40h                 ; 40hex!
        mov     cx, heap - start
        lea     dx, start
        int     21h                     ; write file or device
        jnc     success
        lea     dx, not__               ; prints 'not successful.'
        mov     ah, 9
        int     21h
        jmp     quit_                   ; write failed: do not report success
success:
        mov     ah, 9
        lea     dx, successful
        int     21h
        xor     cx, cx
        mov     ah, 40h                 ; 40hex!
        int     21h                     ; truncate file
main_program:
quit_:
        mov     ax, 4C00h
        int     21h

disinfection    db      0Dh, 0Ah, 'Disinfection $'
not__           db      'not '          ; no '$': runs on into 'successful.'
successful      db      'successful.$'
clean           db      'Self check passed.$'
infected        db      'Self check failed. Program is probably '
                db      'infected.$'
heap:
end start
;();();();();();();();();();();();();();();();();();();();();()

Weak points:

The same weak points that apply above also apply here.
Additionally, the program may, by writing itself back to disk, give the virus the opportunity to reinfect. Remember, any memory resident viruses will already have loaded into memory by the time the program disinfects itself. When the program tries to disinfect itself, any virus that intercepts the 'write file or device' function (int 21h, ah=40h) will see this write and can re-infect. Again, tunneling is the clear solution.
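
As a recap, the three disinfection steps described earlier (seek to the start, write the memory image, truncate) can be modeled outside DOS. This is a minimal sketch in Python rather than the article's assembly; the name `check_and_repair` and the use of a byte string for the memory image are assumptions made for illustration, and the comments map each call to the DOS function it stands in for.

```python
def check_and_repair(memory_image: bytes, path: str) -> bool:
    """Return True if the file already matched memory_image, False if it was rewritten.

    Hypothetical model of the self disinfecting program: it does nothing about
    stealth or resident viruses, exactly like the original routine.
    """
    with open(path, "r+b") as f:            # open read/write (int 21h, ax=3D02h)
        on_disk = f.read(len(memory_image))  # read file or device (ah=3Fh)
        if on_disk == memory_image:
            return True                      # self check passed
        f.seek(0)                            # step 1: pointer to start (ax=4200h, cx:dx=0:0)
        f.write(memory_image)                # step 2: write the memory image (ah=40h)
        f.truncate()                         # step 3: cut the file here (ah=40h, cx=0)
    return False
```

Calling `truncate()` with no argument cuts the file at the current position, which after the write is exactly the length of the memory image. That mirrors the reason the full assembly listing can truncate right after its write without a second seek.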