/---------------------------------------\
-| Formulas for Random Number Generators |----------------------------------
 \---------------------------------------/ /-----------------------------\
-------------------------------------------| by The Mental Driller / 29A |--
                  /----------------------\ \-----------------------------/
------------------| written for Matrix#3 |----------------------------------
                  \----------------------/

When a polymorphic engine is done, we need a function to generate random
numbers, but not always all the number generation routines are well done,
because sometimes the results fall in a value loop that always return a
value very near to the last one, or a numeric sequence easily predictable,
which can make that our polymorphic engine generates not very polymorphic
code.

Many VX people have researched this subject because the random number
generator (RNG) of the engine is one of the main bricks to make it. In
this article, I'll show some nearly-to-perfect RNGs that can be used in
a polymorphic/metamorphic engine to generate that precious random
number that we always wanted to have :P. In this article, I'll explain
some routines without entering very deep in the mathematical aspects of
them, since many of you hate maths, I know :).

The first one thing that we must/should do is to initialize the seed of
the RNG. We can initialize with the date of the system, for example, to
make slow poly/metamorphism, since the RNG will generate the same sequence
of random numbers in that day. If we don't care about that, one way of
getting a perfect random seed is to use the instruction RDTSC, which will
return a 64 bits value in EDX:EAX, being this number the clock count of
the processor since the computer was switched on. After getting these
values, we put them in the variables that we will use in our RNG.

The routines that I expose in every point are only examples to show the
technique. Since making a poly/metamorphic engine (if you use them to
make this) is very personal, you maybe want to make even the RNG, but
you can freely use this formulas, of course.

0 - Simple RNGs to call few times
---------------------------------

There are situations where you need to get a sequence of pseudo-random
values but you don't need an special or perfect routine, but only one that
satisfies you in a single routine (for example, a payload). A simple but
effective RNG is the next formula:

Random:         add     eax, [ebp+RndSeed]
                add     [epb+RndSeed], eax
                ret

Don't use this to make a polymorphism engine! This shitty RNG is very useful
for little viruses that only need to make a random number eventually, but it
is quite predictable, at least when generating code.


1 - Linear-congruent RNG
------------------------

Some polymorphic viruses use this type of RNG. The basic idea is to
multiply the last generated number by a large odd integer and then add an
odd number to break a possible linearity. Let's see the RNG in GriYo's
Dengue:

get_rnd32:      push ecx
                push edx
                mov eax,dword ptr [ebp+rnd32_seed]
                mov ecx,41C64E6Dh
                mul ecx
                add eax,00003039h
                and eax,7FFFFFFFh             ;; (this is optional)
                mov dword ptr [ebp+rnd32_seed],eax
                pop edx
                pop ecx
                ret

This gets the last number generated by the routine and multiplies it by
41C64E6Dh, being this number a random fixed one (just be sure it's odd and
has randomized actived bits). After the multiplication, an odd number is
added to this, to break a possible linearity or loop. This formula is very
good to get a global random, but the sequence of the LSB (less significant
bit) is easily predictable (1, 0, 1, 0, 1...) since the multiplication of
an odd number by an odd number always results in an odd number, and an even
number by an odd number results in an even number. Since the fixed addition
converts last number from odd to even and viceversa every time this function
is called, this is not valid to take the low bit as a random YES or NO.
Well, we can use higher bits. The obvious advantadge of this algorith is
its size, giving a good ratio of randomness in a few lines of code.


2 - Pseudo-linear-congruent RNGs based in basic operations
----------------------------------------------------------

This RNGs are the most common used, and the most easy to make, but it's
difficult to accurate them to give quasi-real random numbers. The next
formula is used in my viruses, and it's a derivation of a more basic
one, with intermiddle operations to break possible linearities for
every operation that is taken. This routine can be taken as an example
of a well made RNG without having to learn maths or investigate about
close dispersions and all that:

Random: push    ecx
        mov     eax, [ebp+RndSeed_1]
        dec     dword ptr [ebp+RndSeed_1]
        xor     eax, [ebp+RndSeed_2]
        mov     ecx, eax
        rol     dword ptr [ebp+RndSeed_1], cl
        add     [ebp+RndSeed_1], eax
        adc     eax, [ebp+RndSeed_2]
        add     eax, ecx
        ror     eax, cl
        not     eax
        sub     eax, 3
        xor     [ebp+RndSeed_2], eax
        xor     eax, [ebp+RndSeed_3]
        rol     dword ptr [ebp+RndSeed_3], 1
        sub     dword ptr [ebp+RndSeed_3], ecx
        sbb     dword ptr [ebp+RndSeed_3], 4
        inc     dword ptr [ebp+RndSeed_2]
        pop     ecx
        ret

In this routine, we have three seeds to generate the number. RndSeed_1 is
a variable that acts as a "main seed", being the most variable one.
RndSeed_2 is a counter which increases every time we call to this RNG,
which acts as a linearity breaker. RndSeed_3 is modified with the result of
XORing RndSeed_1 and RndSeed_2, plus being ROLed once and added a value
which can be 4 or 5 depending on the result after subtracting (RndSeed_1
XOR RndSeed_2), which avoids the prediction of the LSB.

Going step by step, we load the RndSeed_1 in EAX and then we decrement it.
The decrement will contribute to avoid future loops in the values of the
generated numbers. After that, we XOR that value in EAX with RndSeed_2, a
so called "basic operation". XOR operation is very used in RNG because it
mixes the bits of two values very efficiently. Then, we save that value in
ECX for later use (concretely, to modify RndSeed_3) and we use the value in
CL to rotate RndSeed_1 CL times (this operation has no mathematical meaning,
it's just for garbling things a bit), and we modify again RndSeed_1 adding
the result of the XOR operation to have them interconnected and vary them
more randomly than if we use a fixed value to modify RndSeed_2 (but be
careful: abusing of this feature can make a loop in the returned values since
its an "endomogame" (ahem :) variation of the variables).

The next line will add RndSeed_2 to EAX, which was RndSeed_1 XOR RndSeed_2.
This operation will only multiply the value in EAX by 2 or will put again
the same value in EAX sometimes, depending on the bit disposition of
RndSeed_1 at start, but in the majority of cases will clear some bits and
set others, since XOR and ADD performs the same result very often (XOR is a
bit-by-bit addition without carry), or just XOR and SUB. With ADC instead of
ADD (using the Carry Flag set or cleared by the previous ADD), we use a
slightly form of this ADD that will break some possible linearities, since
it's hard to predict the value of the CF.

After this, we use some operations to garble the obtained value. ADD, XOR,
SUB, etc. In this case, an ADD (which makes the functionality explained
before, making the same effect as we made to RndSeed_2), and we ROR the
value to randomize. Then, we make NOT to break a possible proximity with
previous values (if the value was 3000h, now it's -3000h). This proximity
break is more strong when the value is average, and more weak when the value
is near the edges of a DWORD (0, 80000000h), but we can live with that :).
Since NOT is weak sometimes, we break (again) a possible linearity adding
an odd fixed number (3) which will derive (after the subsequent operations)
in a great modification of the initial number. We again use this value to
modify RndSeed_2 (this is optional).

After this garbling experience :), we use finally RndSeed_3, not used before
and therefore makes a significative change to the current value. Then, we
modify RndSeed_3 using the previous saved value in ECX and some things more
(as it was explained before), and that's all. Optionally, I added an
increment on RndSeed_2 to be sure that it changes and doesn't fall in an
"endomogamic" loop of values.

This is an example of an advanced modification of random seeds to get a
random value. This time the routine has been improved from an initial
RNG to make that the values were more random-look-like. Some lines can be
put or not, it's up to you, and one of this RNG can resemble very little
with other, although the functionality of both is very good. Just use these
tricks exposed here to get a good RNG if you don't want to use MULs or
operations like that.


3 - PRIDE-based RNG
-------------------
This quasi-perfect RNG is based on the idea exposed in "Advanced polymorphic
engine construction" in 29A#5, extrapolating the Pseudo-Random Index
DEcryption to random number generation.

The fact is that we can use the PRIDE formula (exposed below) to generate
random numbers as if we were going to access to a buffer (initially, PRIDE
was conceived to do that). With little modifications, the formula will
generate all possible numbers between 0 and a given range, without repeating
any number.

This time, we are going to modify the formula and generate a sequence that
assures that, if you call it 2^64 times, you'll get a sequence of 2^64 values
that contains each value 2^32 times (2^32 possible values * 2^32 times =
2^64 values). This means that, with the appropiated values, is a perfect RNG,
something that it isn't easy to get.

First, the formula (already adapted to this subject):

RndSeed_1       dq      ?      ; Qword (double dword)
RndSeed_2       dq      ?
RndSeed_3       dq      ?  ; This values are randomized with RDTSC or getting
                           ; the system time/date


Random:         push    edx
                mov     eax, [ebp+RndSeed_1]
                mov     edx, [ebp+RndSeed_1+4]
                xor     eax, [ebp+RndSeed_2]
                xor     edx, [ebp+RndSeed_2+4]
                shrd    eax, edx, 11h
                push    eax
                mov     eax, [ebp+RndSeed_3]
                mov     edx, [ebp+RndSeed_3+4]
                and     eax, 0FFFFFFFEh
                add     [ebp+RndSeed_1], eax
                adc     [ebp+RndSeed_1+4], edx
                inc     dword ptr [ebp+RndSeed_2]
                adc     dword ptr [ebp+RndSeed_2+4]
                pop     edx
                ret

This is the formula. Very little, although powerful. The disadvantadge of
this routine is the very big random seeds, which makes difficult to get
enough true random sources to fill that data (6 dwords), and moreover it's
recommended to have an average seed value (not very near to value edges) to
avoid getting values in subsets (although this subsets are dispersed among
them). The functionality of the formula is based on a linear modification
of a modification value (which ends with an exponential modification). This
formula wouldn't be valid if we try to get absolute numbers (not limited by
the DWORD/QWORD size), but taking the fact that this performs an implicit
MOD, the formula works great.

Maybe you wonder why I use the SHRD EAX,EDX,11h: it's because the same
reason we have to avoid the less significant bit on linear-congruent RNGs,
since this formula fails misses the same point. So, we use the fact that
it's a 64 bits value to take a high part of the value to return those 32 bits
and obtain a random value. The fact is that, if we use RndSeeds of 32 bits
and we eliminate the instructions that use EDX (and the SHRD), we get a
formula with the same functionality, better constructed (2^32 calls will
generate a fully garbled sequence of the 2^32 possible return values), but
the LSB will be 1,0,1,0,1,0..., thing that sometimes we can allow, sometimes
not. It's up to you.

The whole explanation of the formula isn't very difficult (as you see
before), but it has its little tricks. For example, that AND EAX,-2. If we
don't put that, we only generate even or odd numbers, and the result for
2^32 calls will be 2^31 values two times (different sequences in principle
but always even or odd numbers). The explanation comes from the XOR between
RndSeed_1 and Rnd_Seed2: Since RndSeed_2 is a counter, the LSB will alternate
the value (0,1,0,1,...) so the LSB of RndSeed_1 will be 0,1,0,1,... or
1,0,1,0,... (the same or inverted). Since RndSeed_3 is fixed (i.e. it's not
modified when this function is called), when making the addition of RndSeed_3
to RndSeed_1 and after increasing RndSeed_2, the operation will render in a
no-modification of the LSB in RndSeed_1, thus making always that value. So,
we anulate the last bit and at least the bit value alternates. With this
method of getting a random (using SHRD to use from bit 17 and up) this
adjust isn't needed at all, but well, the basic formula is in that way,
and I didn't want to change it very much to leave it being adaptable easily.


OK, this examples and techniques can be used to make a good RNG. Obviously,
if you are going to call the RNG only once or twice, it's not worthy to
make a RNG, since you can call directly to RDTSC (for example), or get the
system time in milliseconds. At least, I hope this things will be useful
for you when you make your RNG and your poly/metamorphism engine.

-/--------------------------\-----------------------------------------------
-|- The Mental Driller/29A -|-----------------------------------------------
-\--------------------------/-----------------------------------------------