Introduction to Randomly Evolving Machinecode
                          by Tom Van Braeckel

				 
June 22nd 2001
Copyright(C)2001-2002

This is an introduction to Randomly Evolving  Machinecode (REM).

Copyright (C) 2001-2002, Tom Van Braeckel.   
Anyone may reproduce this document, in whole or in part, provided that:   
(1) any copy or republication of this whole document or a part of  this
document must show Tom Van Braeckel as the  source,  and  must  include
this notice; and (2) any other use of this material must reference this
manual and Tom  Van  Braeckel,  and  the  fact  that  the  material  is
copyright by Tom Van Braeckel and is used by permission.

Introduction to Randomly Evolving Machinecode
2001 Edition

Table of Contents

1. Introduction
1.1 What is REM ?
1.2 Link to nature
1.3 Known issues

2 Definitions and definition extensions
2.1 Randomly
2.2 Evolving
2.3 Machinecode
2.4 Natural Selection
2.5 Digital Selection

3 The use of REM
3.1 Proving Evolution
3.2 Studying accelerated evolution and natural selection
3.3 Other uses

4 Virus-like REM
4.1 Introduction to viruses
4.2 Why write virus-like REM
4.3 Introduction to Antiviruses
4.4 Introduction to Encrypted viruses
4.5 Introduction to Polymorphic viruses
4.6 Introduction to virus-like REM

5 About the Author

6 Bibliography
6.1 Evolution vs Creation
6.2 Earth Timeline
6.3 The Human Genome: A Creationist Overview
6.4 DebateDarwin.com
6.5 Understanding and Managing Polymorphic Viruses
6.6 Brain


***********************************************************************


1. Introduction

Table of Contents:

1.1 What is REM ?
1.2 Link to nature
1.3 Known issues
1.3.a Speed
1.3.b Corruption
1.3.c Lack of resources


1.1 What is REM ?

REM stands for Randomly Evolving Machinecode. 
REM is a program which replicates by copying itself, randomly changing
one or more bytes in every offspring.
After replicating, REM runs its offspring, which should start
replicating too.
When an offspring is not able to replicate, it goes extinct.


1.2 Link to nature

Most scientists believe that in nature, primitive organisms replicate
by creating offspring that seem the same at first sight, and mostly
they are the same too. But sometimes, when we examine the offspring
more closely, we notice that it has changed in some way. The change is
often just one or a few genes that have mutated. This change is most
likely to be random.
The idea of porting this sort of mutation to computers has been around
for a while; "porting this sort of mutation to computers" would mean
creating a program which replicates, but randomly changes one bit or
one byte in every offspring.
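
The copy-and-change step can be sketched on plain data. This is only a
minimal illustration, assuming the "program" is a harmless byte string
that is never executed; the name mutate and the sample bytes are made
up for this sketch and are not taken from the example programs.

```python
import random

def mutate(genome: bytes, rng: random.Random) -> bytes:
    """Return a copy of genome with exactly one randomly chosen byte replaced."""
    if not genome:
        return genome
    child = bytearray(genome)                 # the copy (the "offspring")
    position = rng.randrange(len(child))      # every position is equally likely
    old = child[position]
    # Choose a value different from the old one, so the copy always changes.
    child[position] = rng.choice([v for v in range(256) if v != old])
    return bytes(child)

parent = bytes([0x10, 0x20, 0x30, 0x40])      # made-up data, not real code
child = mutate(parent, random.Random(1))
differences = sum(a != b for a, b in zip(parent, child))
print(parent.hex(" "), "->", child.hex(" "), f"({differences} byte changed)")
```

Changing a single bit instead of a whole byte would work the same way,
with the replacement value computed as old ^ (1 << bit).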


1.3 Known Issues
1.3.a Speed

It took many millions of years for a single replicating cell to evolve
into an intelligent being, no matter how primitive the outcome was.
Unless we are very patient and we invent a way to stop aging, REM's
evolution has to go a lot faster.

Luckily, computers are known to be very fast for their price, and
companies just keep developing faster computers. The example programs
all produce many offspring per second running on a 500 MHz computer.

Problem:  Evolution is too slow.
Solution: Fast mutation.
New Problem: Corruption

Unfortunately, fast mutation leads to another problem to solve:
corruption.


1.3.b Corruption

With fast mutation, because computers work with exact digital
instructions, a minor change to a computer program is very likely to
corrupt the program completely.
In some cases, the whole computer will crash, making it impossible for
the REM to continue evolving.
Nature solves this problem by mutating very little and by leaving
large periods of time between mutations. This way, an organism first
has the chance to produce many offspring before trying a new mutation.
When a mutation corrupts an organism, the others just continue
replicating.

But no matter how bad it looks for REM right now, there is a solution
for the corruption problem: running REM on many computer systems, which
the author will call "resources" for now.

Problem:  Fast mutation leads to a great chance of corruption.
Solution: Resources; run REM on many computer systems.
New Problem: Lack of resources
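
The reasoning behind this step can be put in numbers. As a rough
sketch, assume (this figure is not from the text) that the REM running
on one computer is lost to corruption with probability p_lost, and that
computers fail independently of each other. The chance that every one
of n computers loses its REM is then p_lost to the power n.

```python
def chance_some_lineage_survives(p_lost: float, computers: int) -> float:
    """Probability that at least one of the independent lineages is not lost."""
    return 1.0 - p_lost ** computers

# p_lost = 0.99 is an assumed value, chosen only for illustration.
for computers in (1, 10, 100):
    print(computers, round(chance_some_lineage_survives(0.99, computers), 3))
```

Under these assumptions the chance of keeping at least one working
lineage grows quickly with the number of computers.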

Unfortunately, resources lead to another problem to solve: the lack of
them.


1.3.c Lack of resources

Say there are 100 computers in the whole world running REM.
That's very few, and the computers are not likely to keep REM running
for years, so even IF the REM evolved spectacularly, the world would
never get to see it.

The solution seems easy: program the first version of REM to email its
offspring to a few other people owning a computer. That way, it will
run on the other computers too, mutating and replicating.
But this way of solving the problem is illegal, because it is
considered theft of computer resources, AND there are people
constantly writing programs to stop such code from spreading: the
Virus-Experts.

Problem: Lack of resources
Solution: Write virus-like REM
New Problem: Virus-Experts, Antiviruses

Luckily, Virus-Experts and Antiviruses are only a minor problem,  since
current virus detection methods  don't  stand  a  chance  against  REM.

Read more about virus-like REM in 4: Virus-like REM.


-----------------------------------------------------------------------


2  Definitions and extensions

Table of Contents:

2.1 Randomly
2.2 Evolving
2.3 Machinecode
2.4 Natural Selection
2.5 Digital Selection

2.1 Randomly

ran·dom

- Having no specific pattern, purpose, or objective: random  movements. 
- Relating to an event in which all outcomes are equally likely, as  in
the testing of  a  blood  sample  for  the  presence  of  a  substance.

In REM, we use the word randomly because every byte is equally likely
to be changed.

NOTE: Even though the evolution seems random, the chance of the outcome
being a non-working program which hasn't reproduced for generations is
very small, because if a generation is not able to reproduce, it will
probably go extinct.
This process is called "natural selection", see 2.4

But, can a computer generate random numbers ? 
Some say it can't,  because the only thing it can do is calculate,  and
in theory, the outcome of every calculation is predictable.
When we refer to random numbers in computers,  we refer  to  a  number,
which is the result of a calculation using numbers we  can  predict  in
theory, but which we cannot predict in practice.

Example: 
--------

IN AX,0x40		; Read a 16-bit value from I/O port 0x40
			; into AX. On an IBM PC compatible (x86),
			; this port belongs to the system timer
			; chip, whose counter changes constantly.

			; AH is the high (left) byte of AX
			; AL is the low (right) byte of AX

			; Example:                 AX = 0x4540
			; 			=> AH = 0x45
			;			=> AL = 0x40

			; So now AH and AL are both hard to predict.

MOV AH,0x0E		; BIOS teletype output: print the
MOV BX,0000h		; character in AL on display page 0.

INT 0x10		; Now we print the value of AL to the screen.
			; In the example this would be character
			; 0x40, which is an '@'.
			

			; Try to predict the output.
			; It is very hard.

There are more complicated ways to generate even more  random  numbers,
but for REM, this will do.
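
The earlier point, that a computed "random" number is predictable in
theory, can be illustrated with a seeded pseudo-random generator. This
is a small sketch using Python's standard random module; the seed 2001
is an arbitrary choice. Whoever knows the seed and the algorithm can
reproduce the whole sequence.

```python
import random

first = random.Random(2001)    # two generators started from the same seed
second = random.Random(2001)

a = [first.randrange(256) for _ in range(5)]    # five "random" byte values
b = [second.randrange(256) for _ in range(5)]

print(a)
print(a == b)    # True: same seed, same sequence
```

Reading a hardware timer, as in the example above, makes the starting
value hard to know in practice, which is all REM needs.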


Can a human think of a random number ?
Everything we do is the result of electrical impulses in our brain.
If someone or something could measure all those impulses and would know
exactly what impulse does what,  it could -theoretically-  predict  the
random number you are going to choose.

Can anything be random ?
If someone or something could predict what you are going to do, and if
the same machine could predict what everyone on this planet is going to
do, it could just speed them up on some virtual planet, and predict
what is going to happen 100 years from now.

When you flip a coin, the random number depends on the force with which
you do it,  the wind,  gravity and so on.  Well,  if something knew all 
these factors,  it could predict the  number (0 or 1) even  before  the
coin falls back into the palm of your hand.


2.2 Evolving

e·volv·ing

- To develop (a characteristic) by evolutionary processes.

ev·o·lu·tion

- A gradual process in which something changes  into  a  different  and
usually more complex or better form.
- The process of developing.
- Change in  the genetic composition of a  population during successive
generations, as  a  result of  natural selection acting on  the genetic
variation among individuals, and resulting in  the  development of  new
species.

Looking at the first definition of evolution, we notice that, to be
able to say REM is evolving, it needs to change into a different form,
which is accomplished by changing a random byte at a random location.
Changing into a more complex and/or better form happens automatically
due to Natural Selection (see 2.4 Natural Selection).


2.3 Machinecode

ma·chine code

- A set  of  instructions  for  a  specific  central  processing  unit,
 designed  to  be  usable  by  a  computer  without  being  translated.
- Also called Machine Language.

Machinecode is often confused with sourcecode. 
It is NOT:

MOV AH,3Eh
MOV BX,sourcefile_handle
INT 21h
JC ERROR3

Machinecode is a sequence of bytes, usually displayed as hexadecimal
values, placed in a specific order.
Open an executable file in a hex editor (or your favorite text editor)
to see what machinecode looks like.
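
For comparison, here is a small sketch of what two of the sourcecode
lines above become as machinecode on an x86 CPU: MOV AH,3Eh assembles
to the bytes B4 3E, and INT 21h to CD 21. The other two lines depend on
addresses chosen by the assembler and are left out. The helper name
hex_dump is made up for this sketch.

```python
# Machinecode for two of the instructions shown above (x86, 16-bit).
machine_code = bytes([0xB4, 0x3E,    # MOV AH,3Eh
                      0xCD, 0x21])   # INT 21h

def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes the way a hex editor shows them: offset, then hex values."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        lines.append(f"{offset:08X}  {chunk.hex(' ').upper()}")
    return "\n".join(lines)

print(hex_dump(machine_code))    # 00000000  B4 3E CD 21
```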

2.4 Natural Selection

nat·u·ral se·lec·tion

- The process in nature by  which,  according  to  Darwin's  theory  of
evolution, only the organisms best adapted to their environment tend to
survive  and  transmit  their  genetic  characteristics  in  increasing
numbers to succeeding generations, while those less adapted tend to  be
eliminated.

'Natural Selection' contains the word natural, which means it happens
in nature. Because REM cannot exist in nature, there is a new term
which CAN be used: Digital Selection (2.5).


2.5 Digital Selection

dig·i·tal se·lec·tion

- The process in a digital environment by which, according to Tom Van
Braeckel's theory of Randomly Evolving Machinecode, only the Randomly
Evolving Machinecode best adapted to its environment tends to survive
and to reproduce, that way transmitting its characteristics in
increasing numbers to succeeding generations, while those less adapted
tend to be eliminated.

Example: Suppose REM has been running on x86  computers  for  about  10
-------- years.  CPU capabilities have continued to increase and people
decide to start using 64-bit computers.  This is great  for  CPU making
companies but terrible for REM, as it can only run on 32-bit computers.

But, as it evolves, there will be REM offspring that CAN run on 64-bit
computers, and most likely ONLY on 64-bit computers.

When a 32-bit REM creates a 64-bit REM and emails it through, the
64-bit offspring will just continue to evolve on the new platform.

In time, the 32-bit REMs will go extinct or evolve into 64-bit REMs.
This is called Digital Selection.

But such a big change might not even be needed.
Maybe new operating systems will include an advanced emulator which
allows 32-bit code to be run on a 64-bit computer. In this case, the
REM could slowly adapt to the new processors.
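
Digital Selection can be tried out safely as a toy simulation on plain
data. In this sketch the "genomes" are byte strings that are never
executed, and can_replicate is a made-up stand-in for "this offspring
still works in its environment" (here: the first byte must be 0x01).
All names and numbers are assumptions chosen for illustration.

```python
import random

def can_replicate(genome: bytes) -> bool:
    # Hypothetical environment rule, standing in for "the offspring still runs".
    return genome[0] == 0x01

def mutate(genome: bytes, rng: random.Random) -> bytes:
    child = bytearray(genome)
    child[rng.randrange(len(child))] = rng.randrange(256)  # change one byte
    return bytes(child)

def evolve(population, rng, generations=20, offspring_per_parent=3, limit=100):
    for _ in range(generations):
        # Digital Selection: genomes that cannot replicate leave no offspring.
        parents = [g for g in population if can_replicate(g)]
        children = [mutate(g, rng) for g in parents
                    for _ in range(offspring_per_parent)]
        population = parents + children
        if len(population) > limit:             # limited resources
            population = rng.sample(population, limit)
    return [g for g in population if can_replicate(g)]

survivors = evolve([bytes([0x01, 0, 0, 0])], random.Random(7))
print(len(survivors), "working genomes,", len(set(survivors)), "distinct variants")
```

Changing the rule inside can_replicate halfway through a run plays the
role of the switch to a new platform: only variants that happen to fit
the new rule keep reproducing.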



-----------------------------------------------------------------------


3 The use of REM

Table of Contents:
3.1 Proving Evolution
3.2 Studying accelerated evolution and natural selection
3.3 Other uses

3.1 Proving Evolution

In his 'Creation vs Evolution', Mr. Wiebe states:

"[..] Most of us understand that the information  that  represents  the
data and instructions for a computer program  has  a  particular  code,
designed specifically by the software engineer.

What would we expect to happen if, [..] we zapped the binary image from
which it was executing with a random change of some data bit?

In most cases,  the program would probably crash or seriously  fail  to
accomplish anything useful.
In some cases, the program might continue on oblivious to  the  change.
In a very  few  cases,  the  program  might  exhibit  some  interesting
aberrant behavior. 
But in no cases would we expect to get a  more  complex  program  or  a
program of a totally different kind."
			-- Mr. Wiebe in his 'Creation vs Evolution'

Here Mr. Wiebe is talking about what he would EXPECT.
If Mr. Wiebe had seen REM evolving he would  have  been  able  to  talk
about what he would really SEE.

Mr. Wiebe doesn't expect the program to become more complex or totally
different.
I don't expect it to become totally different either.
But I DO expect the program to become A LITTLE different, and possibly
A LITTLE more complex.

Mr. Wiebe continues:

"So it is with random genetic mutations. 
Life forms are more complex than any computer program that we have ever
designed. 
Random genetic mutations are bad.
When   they   have  an  observable  effect  (i.e.,  are  phenotypically
expressed),  they  are  almost always to the detriment of the organism, 
killing it, maiming it, making it sterile, etc. [..]"
			-- Mr. Wiebe in his 'Creation vs Evolution'

Mr. Wiebe uses the image of a randomly changed program, which is what
REM is, to explain that he doesn't expect a mutation to produce
something more complex or of a totally different kind.
Mr. Wiebe says mutations are bad, because he never expects them to
improve an organism.

Well what if REM WOULD sometimes IMPROVE by mutation ?
I'd say, if a simulation of an organism CAN improve  by  mutation,  why
can't monera have done the same millions (even billions) of years ago ?


3.2 Studying accelerated evolution and natural selection

Or to be more correct: 
"Studying accelerated evolution, fast mutation and digital selection."

A human's life is way too short to study evolution at normal speed. But
using REM, we can simulate evolution at hundreds of times that speed,
and far more when many REMs run in parallel.

Study this:

Typical moneran: 	  1 offspring / second
Typical REM: 		  500 times as fast ( 500 offspring / second)
1 000 000 typical REMs:   500 000 000 times as fast

Suppose it took a moneran 500 million years to evolve into an
intelligent being. A little math shows us it should take one year for
1 000 000 REMs to evolve into an intelligent being.
Whoever thinks I am missing something here, feel free to contact me.
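
The "little math" can be written out as follows. It uses only the
figures from the table above and rests on the same simplification as
the text: that evolutionary progress scales in direct proportion to
the number of offspring produced per second. The variable names are
made up for this sketch.

```python
moneran_rate = 1            # offspring per second, from the table
rem_rate = 500              # offspring per second for one REM
rem_count = 1_000_000       # number of REMs running in parallel

speedup = rem_rate * rem_count // moneran_rate
years_for_moneran = 500_000_000
years_for_rems = years_for_moneran / speedup

print(speedup, years_for_rems)    # 500000000 1.0
```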

3.3 Other uses

Instead of using viruses to spread REM (reason for this at 1.3.c), one
could use REM to spread viruses.
More on REM to spread viruses in '4: Virus-like REM'.


-----------------------------------------------------------------------


4 Virus-like REM 

Table of Contents:

4.1 Introduction to viruses
4.2 Why write virus-like REM
4.3 Introduction to Antiviruses
4.4 Introduction to Encrypted viruses
4.5 Introduction to Polymorphic viruses
4.5.a Polymorphic Detection
4.5.a.a Generic Decryption
4.5.a.b Heuristic-Based Generic Decryption
4.5.a.c The Striker System
4.6 Introduction to virus-like REM


4.1 Introduction to viruses

A virus is a cracker program  that  searches  out  other  programs  and
"infects" them by embedding a copy of itself  in  them,  so  that  they
become Trojan horses.
When these programs are executed, the embedded virus is  executed  too,
thus propagating the "infection".

Unlike  a  worm,  a virus  cannot  infect   other   computers   without
assistance. It is propagated by vectors such as humans trading programs
with their friends.  The virus may do  nothing but propagate itself and
then allow the program to run normally.  

Notice that 'Virii' is not the official plural for virus, but the term
is widely used.
The official plural for virus is viruses.

Instead of virus-like REM, one could even write worm-like REM, which
would mean the REM spreads on its own, by -for example- emailing
itself to other people.

A worm is a program that propagates itself over a network,  reproducing
itself as it goes. 
Nowadays the term has negative connotations, as it is assumed that only
crackers write worms.

To keep it simple, when referring to viruses, I mean viruses and worms.


4.2 Why write virus-like REM

The use of virus-like REM is explained in  "1.3.c : Lack of resources".

In short: When running REM on multiple computers, there is a much
greater chance an interesting offspring will be created. There's a very
small chance people will voluntarily run REM on their computers, and
one might see using virus-like REM as a great way to run REM on many
computers.


4.3 Introduction to Antiviruses

An Antivirus is a software program  designed  to  identify  and
remove a known or potential computer virus.  
Most antivirus programs include an auto-update feature that enables the
program to download profiles of new viruses so that it  can  check  for
the new viruses as soon as they are discovered.

"A simple virus that merely replicates itself is the easiest to detect.
If a user launches an infected program,  the virus gains control of the
computer and attaches  a  copy  of  itself  to  another  program  file.

After it spreads, the virus transfers control back to the host program,
which functions normally.  Yet no  matter how many times a simple virus
infects a  new file or floppy disk,  for example,  the infection always
makes an exact copy of itself. Anti-virus software need only search, or
scan, for a tell-tale sequence of bytes known as a signature  found  in
the virus.

In response,  virus authors began  encrypting viruses.  The idea was to
hide  the  fixed  signature  by  scrambling  the   virus,   making   it
unrecognizable to a virus scanner.
These viruses are called encrypted viruses."
                             -- Carey Nachenberg for Symantec, see 6.5


4.4 Introduction to Encrypted viruses

An encrypted virus looks like this:

1 Decryption Routine : A routine which decrypts the encrypted virus
---------------------- using the encryption key.
2 Encryption Key : The "key" to decrypt the encrypted virus. Without
------------------ this key, the encrypted virus cannot be decrypted.
  The encryption key changes every time the virus replicates.
3 Encrypted Virus : The encrypted virus. Because the key changes every
------------------- time the virus replicates, so does the encrypted
virus.

This might look like a hard-to-detect virus, but it is not, because the
Decryption Routine never changes. The virus scanner just looks for the
routine, and when it finds a file containing it, it reports a virus.
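
From the scanner's side, looking for a part that never changes is a
plain byte search. The sketch below is a simplified illustration, not
how any particular antivirus product works; the signature bytes and
the two sample "files" are made-up placeholders.

```python
def contains_signature(data: bytes, signature: bytes) -> bool:
    """Return True if the fixed byte sequence occurs anywhere in the data."""
    return signature in data

# Hypothetical placeholder bytes; not taken from any real program.
SIGNATURE = bytes.fromhex("de ad be ef 01 02")

clean_file = bytes(range(32))
flagged_file = b"\x00" * 10 + SIGNATURE + b"\xff" * 10

print(contains_signature(clean_file, SIGNATURE))      # False
print(contains_signature(flagged_file, SIGNATURE))    # True
```

However the rest of the file is scrambled, the search succeeds as long
as the fixed part is stored unchanged.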


4.5 Introduction to Polymorphic viruses

"In retaliation, virus authors developed the polymorphic virus. Like an
encrypted virus, a polymorphic virus includes a  scrambled  virus  body
and a decryption routine that first gains control of the computer, then
decrypts the virus body.

However, a polymorphic virus adds to these two components  a  third:  a
mutation engine that  generates  randomized  decryption  routines  that
change each time a virus infects a new program.

With no fixed signature to scan for, and no fixed  decryption  routine,
no two infections look alike. The result  is  a  formidable  adversary.

The Tequila and Maltese Amoeba  viruses  caused  the  first  widespread
polymorphic infections in 1991.
In 1992, Dark  Avenger,  author  of  Maltese  Amoeba,  distributed  the
Mutation Engine, also  known  as  MtE,  to  other  virus  authors  with
instructions on how  to  use  it  to  build  still  more  polymorphics.

Today, anti-virus researchers report that polymorphic viruses  comprise
about five percent of the more than 8,000 known viruses."
                             -- Carey Nachenberg for Symantec, see 6.5

4.5.a Polymorphic Detection

Anti-virus researchers first fought back by creating special  detection
routines  designed  to  catch  each  polymorphic  virus,  one  by  one. 

This  approach  proved   inherently  impractical,  time-consuming,  and
costly. Each new polymorphic requires its own detection program. 


4.5.a.a Generic Decryption

"A scanner that uses generic  decryption  relies  on  this  behavior to
detect polymorphics. It loads this file into a  self-contained  virtual
computer created from RAM.
Inside this virtual computer,  program files execute as if running on a 
real computer. The scanner monitors and controls the program file as it
executes inside the virtual computer. 
A polymorphic virus running inside  the  virtual  computer  can  do  no
damage because it is isolated from the real computer.

The key problem with generic decryption is speed. 
Generic decryption is of no practical  use  if  it  spends  five  hours
waiting  for  a   polymorphic  virus  to  decrypt  inside  the  virtual
computer."
                             -- Carey Nachenberg for Symantec, see 6.5


4.5.a.b Heuristic-Based Generic Decryption

"To solve this problem, generic decryption employs heuristics, a
generic set of rules that helps differentiate non-virus from virus
behavior.

As an example, a typical nonvirus program will in all likelihood use
the results from math computations it makes as it runs inside the
virtual computer.
On the other hand, a polymorphic virus may perform similar
computations, yet throw away the results because those results are
irrelevant to the virus.
Heuristic-based generic decryption looks for such inconsistent
behavior.
An inconsistency increases the likelihood of infection and prompts a
scanner that relies on heuristic-based rules to extend the length of
time a suspect file executes inside the virtual computer, giving a
potentially infected file enough time to decrypt itself and expose a
lurking virus.


Inhibitor Rules:
• If the contents of a register are destroyed before being
used, increase VirusProbability by 1.2%.
• If a NOP instruction is encountered, then increase
VirusProbability by .5%.
• If the program does no memory writes within 100
executed instructions, decrease VirusProbability by 5%.
• If the program generates DOS interrupts, decrease
VirusProbability by 15%.

Unfortunately, heuristics demand continual research and updating.
Heuristic rules tuned to detect 500 viruses, for example, may  miss  10
of those viruses when altered to detect 5 new viruses."
                             -- Carey Nachenberg for Symantec, see 6.5
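
The four quoted rules can be read as a running score. The sketch below
is one possible reading, not the actual scanner: it assumes the
percentages are added or subtracted as percentage points, that the
score is kept between 0 and 100, and that the emulator reports
observations as the made-up event names used here.

```python
# Rule values from the quoted list; event names are hypothetical.
RULES = {
    "register_destroyed_before_use": +1.2,
    "nop_instruction": +0.5,
    "no_memory_write_in_100_instructions": -5.0,
    "dos_interrupt": -15.0,
}

def virus_probability(events, start=0.0):
    """Apply each observed event's rule to the score, in percentage points."""
    score = start
    for event in events:
        score += RULES.get(event, 0.0)    # unknown events change nothing
    return max(0.0, min(100.0, score))    # assumed clamp to 0..100

observed = ["nop_instruction"] * 4 + ["register_destroyed_before_use"]
print(virus_probability(observed, start=10.0))
```

A scanner built on such rules would keep emulating a file while the
score stays high and give up early when it drops.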


4.5.a.c The Striker System

"Symantec’s Striker system provides anti-virus researchers  with a  new
weapon to detect polymorphics.
Like generic decryption, each time it scans a new program file, Striker
loads this file into a self-contained  virtual  computer  created  from
RAM.
The  program executes in this virtual computer as if it were running on
a real computer.

However,  Striker   does   not  rely  on  heuristic  guesses  to  guide
decryption. 
Instead, it relies on virus profiles or rules that are specific to each
virus, not a generic set of  rules  that  differentiate  nonvirus  from
virus behavior.

When scanning a new file, Striker first attempts  to  exclude  as  many
viruses as possible from consideration, just as a doctor rules out  the
possibility of chicken pox if an examination fails to detect scabs on a
patient’s body.

For  example,   different  viruses  infect  different  executable  file
formats.  Some  infect  only .COM files. Others infect only .EXE files.
Some viruses infect both.  Very few infect  .SYS files. As a result, as 
it scans an .EXE file, Striker ignores polymorphics  that  infect  only
.COM and .SYS files. If all viruses are eliminated from  consideration,
then the file is deemed clean. 
Striker closes it and advances to scan the next file.

To date, generic decryption has proved to be the single most  effective
method of detecting polymorphics. Striker improves  on  this  approach.

Yet it is only a matter of time before virus authors design  some  new,
insidious type of virus that evades current methods of detection."
                             -- Carey Nachenberg for Symantec, see 6.5
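
The exclusion step described in the quote (ruling out viruses that
cannot infect the file type being scanned) can be sketched as a simple
lookup. The profile names and their file types below are invented for
illustration and do not describe real viruses or the real Striker
data.

```python
from pathlib import PurePath

# Hypothetical profiles: name -> file extensions that virus can infect.
PROFILES = {
    "poly_a": {".com"},
    "poly_b": {".exe"},
    "poly_c": {".com", ".exe"},
    "poly_d": {".sys"},
}

def candidates_for(filename: str) -> list:
    """Return the profiles that still need checking for this file."""
    extension = PurePath(filename).suffix.lower()
    return sorted(name for name, exts in PROFILES.items() if extension in exts)

print(candidates_for("GAME.EXE"))      # ['poly_b', 'poly_c']
print(candidates_for("README.TXT"))    # []
```

An empty list corresponds to the case in the quote where all viruses
are eliminated from consideration and the file is deemed clean.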

4.6 Introduction to virus-like REM

Virus-like REM looks like this:

REM: ----- 1 Mutation Unit:  The part of the REM which loads itself and
      \    ----------------  changes one or more bytes. 
       \                     The Reproduction Unit is also changed.
        \
         - 2 Reproduction Unit: The part of the REM  which  makes  sure
           -------------------- the mutated REM will spread.  This part
				will infect .EXE files or, for example,
                                simply  email  itself  to  other users.

The Mutation Unit (which is the first part of the code) changes the REM
a little. 

The Reproduction Unit takes care of other things, such as emailing
itself to other users, or disabling the Antivirus Auto-Update Function.

Currently, the author knows of no generic detection method against
virus-like REMs and REM-like viruses, so antivirus experts would have
to write a special detection module for every REM-like virus or
virus-like REM.


-----------------------------------------------------------------------


5 About the author

It's very hard to write a few lines about yourself that someone else is
going to read. Here's something that fits me:

I'm interested in every aspect of almost everything.

Anyone who has a comment, a question, an answer, an idea, a compliment,
and anyone who  doesn't  agree  with  something  I  mentioned  in  this
introduction, feel free to send an e-mail to tom@coders.be .
Needless to say, every email gets a reply.

One could also check my homepage at: http://t-Omicr0n.hexyn.be/ to find
out what I've been doing lately.

-----------------------------------------------------------------------


6 Bibliography

Table of Contents:

6.1 Evolution vs Creation
6.2 Earth Timeline
6.3 The Human Genome: A Creationist Overview
6.4 DebateDarwin.com
6.5 Understanding and Managing Polymorphic Viruses
6.6 Brain


6.1 Evolution vs Creation

A very impressive and must-read paper written by Garth D. Wiebe on  why
he believes in creation instead of evolution.
Written in English.

Document URL: http://www.ultranet.com/~wiebe/e.htm


6.2 Earth Timeline

A simple image showing what happened on Earth, and when, a long time
ago.
It starts 600 Million years ago.
Commented in Dutch.

Image URL: http://montessori-infosite.kennisnet.nl/1emens/1emensplaat/tijdlijngr.JPG


6.3 The Human Genome Project

A Creationist explains the Human Genome Project:
Document URL: http://www.icr.org/headlines/humangenomemap.html

The Human Genome Project: Genome 'treasure trove' 
Document URL: http://newsvote.bbc.co.uk/hi/english/sci/tech/newsid_1164000/1164839.stm


6.4 DebateDarwin.com

"The monkey trial is still with us. Evolution is being attacked, and is
fighting back. It's  a  lively  debate  full  of  sound  and  fury  and
signifying something, but what?"

This site is to give a  hearing  to  the  actors  in  this  play  about
origins.  That   covers  quite  a  bit  of  ground  ---  evolutionists,
creationists, intelligent designers,...

Document URL: http://www.debatedarwin.com/


6.5 Understanding and Managing Polymorphic Viruses

Another excellent paper, by Carey Nachenberg for Symantec. It deals
with Polymorphic Viruses and is easy to read, even for beginners.

Document URL: http://www.norton.com/


6.6 Brain

Not to be confused with intelligence.
One who has brains is not automatically intelligent.

The portion of the vertebrate central nervous system that  is  enclosed
within the cranium,  continuous with the spinal cord,  and composed  of
gray matter and white matter.

It is the primary center for  the  regulation  and  control  of  bodily
activities,  receiving   and   interpreting   sensory   impulses,   and
transmitting information to the muscles and body organs.

It is also the seat of consciousness,  thought,  memory,  and  emotion.
Note: Homepage and contact info are outdated.
