Introduction to Randomly Evolving Machinecode
by Tom Van Braeckel
June 22nd, 2001
Copyright (C) 2001-2002
This is an introduction to Randomly Evolving Machinecode (REM).
Copyright (C) 2001-2002, Tom Van Braeckel.
Anyone may reproduce this document, in whole or in part, provided that:
(1) any copy or republication of this whole document or a part of this
document must show Tom Van Braeckel as the source, and must include
this notice; and (2) any other use of this material must reference this
manual and Tom Van Braeckel, and the fact that the material is
copyright by Tom Van Braeckel and is used by permission.
Introduction to Randomly Evolving Machinecode
2001 Edition
Table of Contents
1. Introduction
1.1 What is REM ?
1.2 Link to nature
1.3 Known issues
2 Definitions and definition extensions
2.1 Randomly
2.2 Evolving
2.3 Machinecode
2.4 Natural Selection
2.5 Digital Selection
3 The use of REM
3.1 Proving Evolution
3.2 Studying accelerated evolution and natural selection
3.3 Other uses
4 Virus-like REM
4.1 Introduction to viruses
4.2 Why write virus-like REM
4.3 Introduction to Antiviruses
4.4 Introduction to Encrypted viruses
4.5 Introduction to Polymorphic viruses
4.6 Introduction to virus-like REM
5 About the Author
6 Bibliography
6.1 Evolution vs Creation
6.2 Earth Timeline
6.3 The Human Genome: A Creationist Overview
6.4 DebateDarwin.com
6.5 Understanding and Managing Polymorphic Viruses
6.6 Brain
***********************************************************************
1. Introduction
Table of Contents:
1.1 What is REM ?
1.2 Link to nature
1.3 Known issues
1.3.a Speed
1.3.b Corruption
1.3.c Lack of resources
1.1 What is REM ?
REM stands for Randomly Evolving Machinecode.
REM is a program which replicates by copying itself, randomly changing
one or more bytes in every offspring.
After replicating, REM runs its offspring, which should start
replicating too.
When an offspring is not able to replicate, it goes extinct.
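The copy-and-change step can be tried safely on plain data. The sketch
below is only an illustration in Python, not one of the example
programs: the name make_offspring, the fixed seed and the 16-byte
stand-in "parent" are all assumptions, and the bytes are never executed.

```python
import random

def make_offspring(parent: bytes, rng: random.Random, changes: int = 1) -> bytes:
    """Return a copy of `parent` with `changes` randomly chosen bytes overwritten."""
    child = bytearray(parent)                  # copy; the parent stays untouched
    for _ in range(changes):
        position = rng.randrange(len(child))   # every position is equally likely
        child[position] = rng.randrange(256)   # new value may equal the old one
    return bytes(child)

rng = random.Random(2001)                      # fixed seed (assumption) for repeatable runs
parent = bytes(range(16))                      # stand-in data, not a real program
child = make_offspring(parent, rng)
differing = sum(a != b for a, b in zip(parent, child))
print(len(child), differing)
```

The offspring always has the same length as its parent and differs from
it in at most `changes` positions.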
1.2 Link to nature
Most scientists believe that in nature, primitive organisms replicate
by creating offspring that look the same at first sight, and mostly
they are the same. But sometimes, when we examine the offspring more
closely, we notice that it has changed in some way. The change is often
just one or a few genes that have mutated. This change is most likely
to be random.
The idea of porting this sort of mutation to computers has been around
for a while; "porting this sort of mutation to computers" would mean to
create a program which replicates, but randomly changes one bit or one
byte in every offspring.
1.3 Known Issues
1.3.a Speed
It took many millions of years for a single replicating cell to evolve
into an intelligent being, no matter how primitive the outcome was.
Unless we are very patient and invent a way to stop aging, REM's
evolution has to go a lot faster.
Luckily, computers are known to be very fast for their price, and
companies just keep developing faster computers. The example programs
all produce many offspring per second running on a 500 MHz computer.
Problem: Evolution is too slow.
Solution: Fast mutation.
New Problem: Corruption
Unfortunately, fast mutation leads to another problem: corruption.
1.3.b Corruption
With fast mutation, because computers work purely with exact digital
calculations, a minor change to a computer program is very likely to
corrupt the program completely.
In some cases, the whole computer will crash, making it impossible for
the REM to continue evolving.
Nature solves this problem by mutating very little and mutating with
large periods of time between every mutation. This way, an organism
first has the chance to produce many offsprings before trying a new
mutation. When the mutation corrupts the organism, the others just
continue replicating.
But no matter how bad it looks for REM right now, there is a solution
for the corruption problem: running REM on many computer systems, which
the author will call "resources" for now.
Problem: Fast mutation leads to a great chance of corruption.
Solution: Resources; run REM on many computer systems.
New Problem: Lack of resources
Unfortunately, resources lead to another problem to solve: the lack of
them.
1.3.c Lack of resources
Say there are 100 computers in the whole world running REM.
That's very few, and the computers are not likely to keep REM running
for years, so even IF the REM evolved spectacularly, the world would
never get to see it.
The solution is easy; program the first version of REM to email its
offsprings to a few other people owning a computer. That way, it will
run on the other computers too, mutating and replicating.
But, this way of solving the problem is illegal because it is
considered the theft of computer resources, AND there are people
constantly writing programs to stop this code from spreading, the
Virus-Experts.
Problem: Lack of resources
Solution: Write virus-like REM
New Problem: Virus-Experts, Antiviruses
Luckily, Virus-Experts and Antiviruses are only a minor problem, since
current virus detection methods don't stand a chance against REM.
Read more about virus-like REM in 4: Virus-like REM.
-----------------------------------------------------------------------
2 Definitions and extensions
Table of Contents:
2.1 Randomly
2.2 Evolving
2.3 Machinecode
2.4 Natural Selection
2.5 Digital Selection
2.1 Randomly
ran·dom
- Having no specific pattern, purpose, or objective: random movements.
- Relating to an event in which all outcomes are equally likely, as in
the testing of a blood sample for the presence of a substance.
In REM, we use the word "randomly" because every byte is equally likely
to be changed.
NOTE: Even though the evolution seems random, the chance of the outcome
being a non-working program which hasn't reproduced for generations is
very small, because if a generation is not able to reproduce, it will
probably go extinct.
This process is called "natural selection", see 2.4.
But, can a computer generate random numbers ?
Some say it can't, because the only thing it can do is calculate, and
in theory, the outcome of every calculation is predictable.
When we refer to random numbers in computers, we refer to a number,
which is the result of a calculation using numbers we can predict in
theory, but which we cannot predict in practice.
Example:
--------
IN AX,0x40 ; Read a word from I/O port 0x40 into AX.
; On a PC-compatible x86 machine (Intel, AMD,...),
; this port belongs to the system timer.
; AH is the left (high) byte of AX
; AL is the right (low) byte of AX
; Example: AX = 0x4540
; => AH = 0x45
; => AL = 0x40
; So now AH and AL are both hard to predict.
MOV AH,0x0E ; Select BIOS teletype output (this overwrites AH).
MOV BX,0000h ; BH = display page 0.
INT 0x10 ; Now we print the value of AL to the screen.
; In the example this would be character 0x40,
; which is an '@'.
; Try to predict the output.
; It is very hard.
There are more complicated ways to generate even more random numbers,
but for REM, this will do.
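For readers without a DOS machine at hand, the same idea can be tried
in Python. This is only a rough analogue of the timer read above, and
the name timer_byte is made up for this sketch: it keeps the low 8 bits
of a nanosecond counter. The result is hard to predict in practice, but
it is not suitable for anything security-related.

```python
import time

def timer_byte() -> int:
    """Return the low 8 bits of a nanosecond counter, a value from 0 to 255."""
    return time.perf_counter_ns() & 0xFF

sample = [timer_byte() for _ in range(8)]   # eight hard-to-predict bytes
print(sample)
```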
Can a human think of a random number ?
Everything we do is the result of electrical impulses in our brain.
If someone or something could measure all those impulses and would know
exactly what impulse does what, it could -theoretically- predict the
random number you are going to choose.
Can anything be random ?
If someone or something could predict what you are going to do, and if
the same machine could predict what everyone on this planet is going to
do, it could just speed them up in some virtual planet, and predict
what is going to happen 100 years from now.
When you flip a coin, the random number depends on the force with which
you do it, the wind, gravity and so on. Well, if something knew all
these factors, it could predict the number (0 or 1) even before the
coin falls back into the palm of your hand.
2.2 Evolving
e·volv·ing
- To develop (a characteristic) by evolutionary processes.
ev·o·lu·tion
- A gradual process in which something changes into a different and
usually more complex or better form.
- The process of developing.
- Change in the genetic composition of a population during successive
generations, as a result of natural selection acting on the genetic
variation among individuals, and resulting in the development of new
species.
Looking at the first definition of evolution, we see that, to be able
to say REM is evolving, it needs to change into a different form, which
is accomplished by changing a random byte at a random location.
Changing into a more complex and/or better form happens automatically
due to Natural Selection (see 2.4 Natural Selection).
2.3 Machinecode
ma·chine code
- A set of instructions for a specific central processing unit,
designed to be usable by a computer without being translated.
- Also called Machine Language.
Machinecode is often confused with sourcecode.
It is NOT:
MOV AH,3Eh
MOV BX,sourcefile_handle
INT 21h
JC ERROR3
Machinecode is a sequence of byte values in a specific order, usually
written down in hexadecimal.
Open an executable file in your favorite text editor (or, better, a
hex editor) to see what machinecode looks like.
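To make the difference concrete: on x86, the sourcecode line
MOV AH,3Eh assembles to the two bytes B4 3E, and INT 21h to CD 21. The
other two lines depend on addresses chosen at assembly time, so they
are left out here. The short Python sketch below only prints those four
bytes the way a hex editor would show them; the variable names are
illustrative.

```python
# Byte values for "MOV AH,3Eh" followed by "INT 21h" on x86.
machine_code = bytes([0xB4, 0x3E, 0xCD, 0x21])

hex_view = machine_code.hex(" ")   # separator argument needs Python 3.8+
print(hex_view)                    # prints: b4 3e cd 21
```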
2.4 Natural Selection
nat·u·ral se·lec·tion
- The process in nature by which, according to Darwin's theory of
evolution, only the organisms best adapted to their environment tend to
survive and transmit their genetic characteristics in increasing
numbers to succeeding generations, while those less adapted tend to be
eliminated.
'Natural Selection' contains the word natural, which means it happens
in nature. Because REM cannot exist in nature, there is a new term
which CAN be used: Digital Selection (2.5).
2.5 Digital Selection
dig·i·tal se·lec·tion
- The process in a digital environment by which, according to Tom Van
Braeckel's theory of Randomly Evolving Machinecode, only the Randomly
Evolving Machinecode best adapted to its environment tends to survive
and to reproduce, that way transmitting its characteristics in
increasing numbers to succeeding generations, while those less adapted
tend to be eliminated.
Example:
--------
Suppose REM has been running on x86 computers for about 10 years. CPU
capabilities have continued to increase and people decide to start
using 64-bit computers. This is great for CPU-making companies but
terrible for REM, as it can only run on 32-bit computers.
But, as it evolves, there will be REM offspring that CAN run on 64-bit
computers, and most likely ONLY on 64-bit computers.
When a 32-bit REM creates a 64-bit REM and emails it through, the
64-bit offspring will just continue to evolve on the new platform.
In time, the 32-bit REMs will go extinct or evolve into 64-bit REMs.
This is called Digital Selection.
But such a big change might not even be needed.
Maybe new OSes will include an advanced emulator which allows 32-bit
code to be run on a 64-bit computer. In this case, the REM could slowly
adapt to the new processors.
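Digital Selection can be imitated with a toy population of byte
strings. Everything in the sketch below is an assumption made for
illustration: the first byte of each "genome" stands in for the
platform it can run on (32 or 64), each survivor leaves two mutated
copies, and the population is capped at 200. Nothing is executed; the
environment is just a set of accepted platform tags.

```python
import random

def mutate(genome: bytes, rng: random.Random) -> bytes:
    """Copy a genome and overwrite one randomly chosen byte."""
    child = bytearray(genome)
    child[rng.randrange(len(child))] = rng.randrange(256)
    return bytes(child)

def next_generation(population, viable_tags, rng, cap=200):
    """Each individual leaves two mutated copies; non-viable copies are dropped."""
    offspring = [mutate(g, rng) for g in population for _ in range(2)]
    survivors = [g for g in offspring if g[0] in viable_tags]
    rng.shuffle(survivors)
    return survivors[:cap]                    # limited room in the environment

rng = random.Random(64)                       # fixed seed (assumption)
population = [bytes([32, 0, 0, 0])] * 50      # everyone starts on the "32" platform
for _ in range(200):                          # transition: both platforms exist
    population = next_generation(population, {32, 64}, rng)
tags_before = {g[0] for g in population}
for _ in range(20):                           # the old platform has disappeared
    population = next_generation(population, {64}, rng)
print(sorted(tags_before), len(population))
```

Whatever survives the second phase carries the tag 64. If no offspring
happened to reach that tag during the transition, the population simply
dies out.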
-----------------------------------------------------------------------
3 The use of REM
Table of Contents:
3.1 Proving Evolution
3.2 Studying accelerated evolution and natural selection
3.3 Other uses
3.1 Proving Evolution
In his 'Creation vs Evolution', Mr. Wiebe states:
"[..] Most of us understand that the information that represents the
data and instructions for a computer program has a particular code,
designed specifically by the software engineer.
What would we expect to happen if, [..] we zapped the binary image from
which it was executing with a random change of some data bit?
In most cases, the program would probably crash or seriously fail to
accomplish anything useful.
In some cases, the program might continue on oblivious to the change.
In a very few cases, the program might exhibit some interesting
aberrant behavior.
But in no cases would we expect to get a more complex program or a
program of a totally different kind."
-- Mr. Wiebe in his 'Creation vs Evolution'
Here Mr. Wiebe is talking about what he would EXPECT.
If Mr. Wiebe had seen REM evolving he would have been able to talk
about what he would really SEE.
Mr. Wiebe doesn't expect the program to become more complex or totally
different.
I don't expect it to become totally different either.
But I DO expect the program to become A LITTLE different, and possibly
A LITTLE more complex.
Mr. Wiebe continues:
"So it is with random genetic mutations.
Life forms are more complex than any computer program that we have ever
designed.
Random genetic mutations are bad.
When they have an observable effect (i.e., are phenotypically
expressed), they are almost always to the detriment of the organism,
killing it, maiming it, making it sterile, etc. [..]"
-- Mr. Wiebe in his 'Creation vs Evolution'
Mr. Wiebe describes what is essentially REM to explain that he doesn't
expect a mutation to produce something more complex or of a totally
different kind.
Mr. Wiebe says mutations are bad, because he never expects them to
improve an organism.
Well, what if REM WOULD sometimes IMPROVE by mutation ?
I'd say, if a simulation of an organism CAN improve by mutation, why
can't monera have done the same millions (even billions) of years ago ?
3.2 Studying evolution and natural selection
Or to be more correct:
"Studying accelerated evolution, fast mutation and digital selection."
A human's life is way too short to study evolution at normal speed. But
using REM, we can simulate evolution at a much higher speed.
Study this:
Typical moneran:         1 offspring / second
Typical REM:             500 times as fast (500 offspring / second)
1 000 000 typical REMs:  500 000 000 times as fast
Suppose it took a moneran 500 million years to evolve into an
intelligent being. A little math shows us it should take one year for
1 000 000 REMs to evolve into an intelligent being.
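The "little math" can be written out as follows. The figures are the
ones from the text; the variable names are made up, and the calculation
rests on the assumption that the time needed shrinks in direct
proportion to the number of offspring produced per second.

```python
MONERAN_RATE = 1              # offspring per second (figure from the text)
REM_RATE = 500                # offspring per second for one REM (from the text)
REM_COUNT = 1_000_000         # REMs running at the same time
NATURAL_YEARS = 500_000_000   # supposed duration of natural evolution

speedup = REM_RATE * REM_COUNT // MONERAN_RATE   # how many times as fast
rem_years = NATURAL_YEARS / speedup              # assumes time scales as 1/speed
print(speedup, rem_years)                        # prints: 500000000 1.0
```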
Whoever thinks I am missing something here, feel free to contact me.
3.3 Other uses
Instead of using viruses to spread REM (reason for this at 1.3.c), one
could use REM to spread viruses.
More on REM to spread viruses in '4: Virus-like REM'.
-----------------------------------------------------------------------
4 Virus-like REM
Table of Contents:
4.1 Introduction to viruses
4.2 Why write virus-like REM
4.3 Introduction to Antiviruses
4.4 Introduction to Encrypted viruses
4.5 Introduction to Polymorphic viruses
4.5.a Polymorphic Detection
4.5.a.a Generic Decryption
4.5.a.b Heuristic-Based Generic Decryption
4.5.a.c The Striker System
4.6 Introduction to virus-like REM
4.1 Introduction to viruses
A virus is a cracker program that searches out other programs and
"infects" them by embedding a copy of itself in them, so that they
become Trojan horses.
When these programs are executed, the embedded virus is executed too,
thus propagating the "infection".
Unlike a worm, a virus cannot infect other computers without
assistance. It is propagated by vectors such as humans trading programs
with their friends. The virus may do nothing but propagate itself and
then allow the program to run normally.
Note that 'virii' is not the official plural of virus, but the term is
widely used.
The official plural of virus is viruses.
Instead of virus-like REM, one could even write worm-like REM, which
would mean the REM spreads on its own, by -for example- emailing
itself to other people.
A worm is a program that propagates itself over a network, reproducing
itself as it goes.
Nowadays the term has negative connotations, as it is assumed that only
crackers write worms.
To keep it simple, when referring to viruses, I mean viruses and worms.
4.2 Why write virus-like REM
The use of virus-like REM is explained in "1.3.c : Lack of resources".
In short: When running REM on multiple computers, there is a much
greater chance an interesting offspring will be created. There's a very
small chance people will voluntarily run REM on their computers, and
one might see using virus-like REM as a great way to run REM on many
computers.
4.3 Introduction to Antiviruses
An Antivirus is a software program designed to identify and
remove a known or potential computer virus.
Most antivirus programs include an auto-update feature that enables the
program to download profiles of new viruses so that it can check for
the new viruses as soon as they are discovered.
"A simple virus that merely replicates itself is the easiest to detect.
If a user launches an infected program, the virus gains control of the
computer and attaches a copy of itself to another program file.
After it spreads, the virus transfers control back to the host program,
which functions normally. Yet no matter how many times a simple virus
infects a new file or floppy disk, for example, the infection always
makes an exact copy of itself. Anti-virus software need only search, or
scan, for a tell-tale sequence of bytes known as a signature found in
the virus.
In response, virus authors began encrypting viruses. The idea was to
hide the fixed signature by scrambling the virus, making it
unrecognizable to a virus scanner.
These viruses are called encrypted viruses."
-- Carey Nachenberg for Symantec, see 6.5
4.4 Introduction to Encrypted viruses
An encrypted virus looks like this:
1 Encryption Routine : A program which decrypts the encrypted virus
---------------------- using the encryption key.
2 Encryption Key : The "key" to decrypt the encrypted virus. Without
------------------ this key, the encrypted virus cannot be decrypted.
The encryption key changes every time the virus replicates.
3 Encrypted Virus : The encrypted virus. Because the key changes every
------------------- time the virus replicates, so does the encrypted
virus.
This might look like a hard-to-detect virus, but it is not, because the
Encryption Routine never changes. The virus scanner just looks for the
routine, and when it finds a file containing it, it flags that file as
infected.
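The scanner's side of this can be sketched in a few lines of Python.
Everything here is a stand-in chosen for illustration: ROUTINE is a
made-up four-byte marker, the "body" is inert text, and XOR with a
one-byte key is assumed only to show that the scrambled part differs
from copy to copy while the fixed routine does not.

```python
def contains_signature(data: bytes, signature: bytes) -> bool:
    """Report whether the fixed byte sequence occurs anywhere in the data."""
    return signature in data

def xor_bytes(body: bytes, key: int) -> bytes:
    """Scramble (or unscramble) a body with a single-byte XOR key."""
    return bytes(b ^ key for b in body)

ROUTINE = bytes.fromhex("deadbeef")     # made-up marker for the fixed routine
body = b"inert placeholder text"        # harmless stand-in for the scrambled part

sample_a = ROUTINE + bytes([0x11]) + xor_bytes(body, 0x11)
sample_b = ROUTINE + bytes([0x5A]) + xor_bytes(body, 0x5A)
clean = b"just an ordinary file"

results = [contains_signature(f, ROUTINE) for f in (sample_a, sample_b, clean)]
print(results)                          # prints: [True, True, False]
```

The two samples share no scrambled bytes in common positions, yet both
are caught by the same four-byte search.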
4.5 Introduction to Polymorphic viruses
"In retaliation, virus authors developed the polymorphic virus. Like an
encrypted virus, a polymorphic virus includes a scrambled virus body
and a decryption routine that first gains control of the computer, then
decrypts the virus body.
However, a polymorphic virus adds to these two components a third: a
mutation engine that generates randomized decryption routines that
change each time a virus infects a new program.
With no fixed signature to scan for, and no fixed decryption routine,
no two infections look alike. The result is a formidable adversary.
The Tequila and Maltese Amoeba viruses caused the first widespread
polymorphic infections in 1991.
In 1992, Dark Avenger, author of Maltese Amoeba, distributed the
Mutation Engine, also known as MtE, to other virus authors with
instructions on how to use it to build still more polymorphics.
Today, anti-virus researchers report that polymorphic viruses comprise
about five percent of the more than 8,000 known viruses."
-- Carey Nachenberg for Symantec, see 6.5
4.5.a Polymorphic Detection
Anti-virus researchers first fought back by creating special detection
routines designed to catch each polymorphic virus, one by one.
This approach proved inherently impractical, time-consuming, and
costly. Each new polymorphic requires its own detection program.
4.5.a.a Generic Decryption
"A scanner that uses generic decryption relies on this behavior to
detect polymorphics. It loads this file into a self-contained virtual
computer created from RAM.
Inside this virtual computer, program files execute as if running on a
real computer. The scanner monitors and controls the program file as it
executes inside the virtual computer.
A polymorphic virus running inside the virtual computer can do no
damage because it is isolated from the real computer.
The key problem with generic decryption is speed.
Generic decryption is of no practical use if it spends five hours
waiting for a polymorphic virus to decrypt inside the virtual
computer."
-- Carey Nachenberg for Symantec, see 6.5
4.5.a.b Heuristic-Based Generic Decryption
"To solve this problem, generic decryption employs heuristics, a
generic set of rules that helps differentiate non-virus from virus
behavior.
As an example, a typical nonvirus program will in all likelihood use
the results from math computations it makes as it runs inside the
virtual computer.
On the other hand, a polymorphic virus may perform similar
computations, yet throw away the results because those results are
irrelevant to the virus.
Heuristic-based generic decryption looks for such inconsistent
behavior.
An inconsistency increases the likelihood of infection and prompts a
scanner that relies on heuristic-based rules to extend the length of
time a suspect file executes inside the virtual computer, giving a
potentially infected file enough time to decrypt itself and expose a
lurking virus.
Inhibitor Rules:
• If the contents of a register are destroyed before being
used, increase VirusProbability by 1.2%.
• If a NOP instruction is encountered, then increase
VirusProbability by .5%.
• If the program does no memory writes within 100
executed instructions, decrease VirusProbability by 5%.
• If the program generates DOS interrupts, decrease
VirusProbability by 15%.
Unfortunately, heuristics demand continual research and updating.
Heuristic rules tuned to detect 500 viruses, for example, may miss 10
of those viruses when altered to detect 5 new viruses."
-- Carey Nachenberg for Symantec, see 6.5
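The four rules quoted above can be read as a simple running score. The
Python sketch below is one possible reading, not Symantec's
implementation: the event names, the starting value of 50 and the
treatment of each percentage as percentage points are all assumptions
made for illustration.

```python
# Adjustments in percentage points, taken from the four quoted rules.
RULES = {
    "register_destroyed_before_use": +1.2,
    "nop_instruction": +0.5,
    "no_memory_write_in_100_instructions": -5.0,
    "dos_interrupt": -15.0,
}

def virus_probability(events, start=50.0):
    """Apply one adjustment per observed event and clamp to 0..100."""
    score = start                       # starting value is an assumption
    for event in events:
        score += RULES.get(event, 0.0)  # unknown events change nothing
    return max(0.0, min(100.0, score))

observed = ["nop_instruction", "nop_instruction", "dos_interrupt"]
print(virus_probability(observed))      # 50 + 0.5 + 0.5 - 15 = 36.0
```

A scanner built this way would keep emulating a file for longer while
the score stays high, and give up early once it drops.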
4.5.a.c The Striker System
"Symantec’s Striker system provides anti-virus researchers with a new
weapon to detect polymorphics.
Like generic decryption, each time it scans a new program file, Striker
loads this file into a self-contained virtual computer created from
RAM.
The program executes in this virtual computer as if it were running on
a real computer.
However, Striker does not rely on heuristic guesses to guide
decryption.
Instead, it relies on virus profiles or rules that are specific to each
virus, not a generic set of rules that differentiate nonvirus from
virus behavior.
When scanning a new file, Striker first attempts to exclude as many
viruses as possible from consideration, just as a doctor rules out the
possibility of chicken pox if an examination fails to detect scabs on a
patient’s body.
For example, different viruses infect different executable file
formats. Some infect only .COM files. Others infect only .EXE files.
Some viruses infect both. Very few infect .SYS files. As a result, as
it scans an .EXE file, Striker ignores polymorphics that infect only
.COM and .SYS files. If all viruses are eliminated from consideration,
then the file is deemed clean.
Striker closes it and advances to scan the next file.
To date, generic decryption has proved to be the single most effective
method of detecting polymorphics. Striker improves on this approach.
Yet it is only a matter of time before virus authors design some new,
insidious type of virus that evades current methods of detection."
-- Carey Nachenberg for Symantec, see 6.5
4.6 Introduction to virus-like REM
Virus-like REM looks like this:
REM: ----- 1 Mutation Unit: The part of the REM which loads itself and
\ ---------------- changes one or more bytes.
\ The Reproduction Unit is also changed.
\
- 2 Reproduction Unit: The part of the REM which makes sure
-------------------- the mutated REM will spread. This part
will infect .EXE files or, for example,
simply email itself to other users.
The Mutation Unit (which is the first part of the code) changes the REM
a little.
The Reproduction Unit takes care of other things, such as emailing
itself to other users, or disabling the Antivirus Auto-Update Function.
Currently, there is no detection method against Virus-like REM's and
REM-like viruses, so antivirus-experts have to write a special
detection module for every REM-like virus or virus-like REM.
-----------------------------------------------------------------------
5 About the author
It's very hard to write a few lines about yourself that someone else
is going to read, so here's something that fits me:
I'm interested in every aspect of almost everything.
Anyone who has a comment, a question, an answer, an idea, a compliment,
and anyone who doesn't agree with something I mentioned in this
introduction, feel free to send an e-mail to tom@coders.be .
Needless to say, every email gets a reply.
One could also check my homepage at: http://t-Omicr0n.hexyn.be/ to find
out what I've been doing lately.
-----------------------------------------------------------------------
6 Bibliography
Table of Contents:
6.1 Evolution vs Creation
6.2 Earth Timeline
6.3 The Human Genome: A Creationist Overview
6.4 DebateDarwin.com
6.5 Understanding and Managing Polymorphic Viruses
6.6 Brain
6.1 Evolution vs Creation
A very impressive and must-read paper written by Garth D. Wiebe on why
he believes in creation instead of evolution.
Written in English.
Document URL: http://www.ultranet.com/~wiebe/e.htm
6.2 Earth Timeline
A simple image showing what happened when on Earth, a long time ago.
It starts 600 Million years ago.
Commented in Dutch.
Image URL: http://montessori-infosite.kennisnet.nl/1emens/1emensplaat/tijdlijngr.JPG
6.3 The Human Genome Project
A Creationist explains the Human Genome Project:
Document URL: http://www.icr.org/headlines/humangenomemap.html
The Human Genome Project: Genome 'treasure trove'
Document URL: http://newsvote.bbc.co.uk/hi/english/sci/tech/newsid_1164000/1164839.stm
6.4 DebateDarwin.com
"The monkey trial is still with us. Evolution is being attacked, and is
fighting back. It's a lively debate full of sound and fury and
signifying something, but what?"
This site is to give a hearing to the actors in this play about
origins. That covers quite a bit of ground --- evolutionists,
creationists, intelligent designers,...
Document URL: http://www.debatedarwin.com/
6.5 Understanding and Managing Polymorphic Viruses
Another excellent paper by Carey Nachenberg for Symantec. It deals with
Polymorphic Viruses and is easy to read, even for beginners.
Document URL: http://www.norton.com/
6.6 Brain
Not to be confused with intelligence.
One who has brains is not automatically intelligent.
The portion of the vertebrate central nervous system that is enclosed
within the cranium, continuous with the spinal cord, and composed of
gray matter and white matter.
It is the primary center for the regulation and control of bodily
activities, receiving and interpreting sensory impulses, and
transmitting information to the muscles and body organs.
It is also the seat of consciousness, thought, memory, and emotion.
Note: Homepage and contact
info are outdated.