VX Heaven

Library Collection Sources Engines Constructors Simulators Utilities Links Forum

Formulas for Random Number Generators

The Mental Driller
Matrix Zine [3]
April 2001

[Back to index] [Comments]

When a polymorphic engine is done, we need a function to generate random numbers, but not always all the number generation routines are well done, because sometimes the results fall in a value loop that always return a value very near to the last one, or a numeric sequence easily predictable, which can make that our polymorphic engine generates not very polymorphic code.

Many VX people have researched this subject because the random number generator (RNG) of the engine is one of the main bricks to make it. In this article, I'll show some nearly-to-perfect RNGs that can be used in a polymorphic/metamorphic engine to generate that precious random number that we always wanted to have :P. In this article, I'll explain some routines without entering very deep in the mathematical aspects of them, since many of you hate maths, I know :).

The first one thing that we must/should do is to initialize the seed of the RNG. We can initialize with the date of the system, for example, to make slow poly/metamorphism, since the RNG will generate the same sequence of random numbers in that day. If we don't care about that, one way of getting a perfect random seed is to use the instruction RDTSC, which will return a 64 bits value in EDX:EAX, being this number the clock count of the processor since the computer was switched on. After getting these values, we put them in the variables that we will use in our RNG.

The routines that I expose in every point are only examples to show the technique. Since making a poly/metamorphic engine (if you use them to make this) is very personal, you maybe want to make even the RNG, but you can freely use this formulas, of course.

0 - Simple RNGs to call few times

There are situations where you need to get a sequence of pseudo-random values but you don't need an special or perfect routine, but only one that satisfies you in a single routine (for example, a payload). A simple but effective RNG is the next formula:

Random:         add     eax, [ebp+RndSeed]
                add     [epb+RndSeed], eax

Don't use this to make a polymorphism engine! This shitty RNG is very useful for little viruses that only need to make a random number eventually, but it is quite predictable, at least when generating code.

1 - Linear-congruent RNG

Some polymorphic viruses use this type of RNG. The basic idea is to multiply the last generated number by a large odd integer and then add an odd number to break a possible linearity. Let's see the RNG in GriYo's Dengue:

get_rnd32:      push ecx
                push edx
                mov eax,dword ptr [ebp+rnd32_seed]
                mov ecx,41C64E6Dh
                mul ecx
                add eax,00003039h
                and eax,7FFFFFFFh             ;; (this is optional)
                mov dword ptr [ebp+rnd32_seed],eax
                pop edx
                pop ecx

This gets the last number generated by the routine and multiplies it by 41C64E6Dh, being this number a random fixed one (just be sure it's odd and has randomized actived bits). After the multiplication, an odd number is added to this, to break a possible linearity or loop. This formula is very good to get a global random, but the sequence of the LSB (less significant bit) is easily predictable (1, 0, 1, 0, 1...) since the multiplication of an odd number by an odd number always results in an odd number, and an even number by an odd number results in an even number. Since the fixed addition converts last number from odd to even and viceversa every time this function is called, this is not valid to take the low bit as a random YES or NO. Well, we can use higher bits. The obvious advantadge of this algorith is its size, giving a good ratio of randomness in a few lines of code.

2 - Pseudo-linear-congruent RNGs based in basic operations

This RNGs are the most common used, and the most easy to make, but it's difficult to accurate them to give quasi-real random numbers. The next formula is used in my viruses, and it's a derivation of a more basic one, with intermiddle operations to break possible linearities for every operation that is taken. This routine can be taken as an example of a well made RNG without having to learn maths or investigate about close dispersions and all that:

Random: push    ecx
        mov     eax, [ebp+RndSeed_1]
        dec     dword ptr [ebp+RndSeed_1]
        xor     eax, [ebp+RndSeed_2]
        mov     ecx, eax
        rol     dword ptr [ebp+RndSeed_1], cl
        add     [ebp+RndSeed_1], eax
        adc     eax, [ebp+RndSeed_2]
        add     eax, ecx
        ror     eax, cl
        not     eax
        sub     eax, 3
        xor     [ebp+RndSeed_2], eax
        xor     eax, [ebp+RndSeed_3]
        rol     dword ptr [ebp+RndSeed_3], 1
        sub     dword ptr [ebp+RndSeed_3], ecx
        sbb     dword ptr [ebp+RndSeed_3], 4
        inc     dword ptr [ebp+RndSeed_2]
        pop     ecx

In this routine, we have three seeds to generate the number. RndSeed_1 is a variable that acts as a "main seed", being the most variable one. RndSeed_2 is a counter which increases every time we call to this RNG, which acts as a linearity breaker. RndSeed_3 is modified with the result of XORing RndSeed_1 and RndSeed_2, plus being ROLed once and added a value which can be 4 or 5 depending on the result after subtracting (RndSeed_1 XOR RndSeed_2), which avoids the prediction of the LSB.

Going step by step, we load the RndSeed_1 in EAX and then we decrement it. The decrement will contribute to avoid future loops in the values of the generated numbers. After that, we XOR that value in EAX with RndSeed_2, a so called "basic operation". XOR operation is very used in RNG because it mixes the bits of two values very efficiently. Then, we save that value in ECX for later use (concretely, to modify RndSeed_3) and we use the value in CL to rotate RndSeed_1 CL times (this operation has no mathematical meaning, it's just for garbling things a bit), and we modify again RndSeed_1 adding the result of the XOR operation to have them interconnected and vary them more randomly than if we use a fixed value to modify RndSeed_2 (but be careful: abusing of this feature can make a loop in the returned values since its an "endomogame" (ahem :) variation of the variables).

The next line will add RndSeed_2 to EAX, which was RndSeed_1 XOR RndSeed_2. This operation will only multiply the value in EAX by 2 or will put again the same value in EAX sometimes, depending on the bit disposition of RndSeed_1 at start, but in the majority of cases will clear some bits and set others, since XOR and ADD performs the same result very often (XOR is a bit-by-bit addition without carry), or just XOR and SUB. With ADC instead of ADD (using the Carry Flag set or cleared by the previous ADD), we use a slightly form of this ADD that will break some possible linearities, since it's hard to predict the value of the CF.

After this, we use some operations to garble the obtained value. ADD, XOR, SUB, etc. In this case, an ADD (which makes the functionality explained before, making the same effect as we made to RndSeed_2), and we ROR the value to randomize. Then, we make NOT to break a possible proximity with previous values (if the value was 3000h, now it's -3000h). This proximity break is more strong when the value is average, and more weak when the value is near the edges of a DWORD (0, 80000000h), but we can live with that :). Since NOT is weak sometimes, we break (again) a possible linearity adding an odd fixed number (3) which will derive (after the subsequent operations) in a great modification of the initial number. We again use this value to modify RndSeed_2 (this is optional).

After this garbling experience :), we use finally RndSeed_3, not used before and therefore makes a significative change to the current value. Then, we modify RndSeed_3 using the previous saved value in ECX and some things more (as it was explained before), and that's all. Optionally, I added an increment on RndSeed_2 to be sure that it changes and doesn't fall in an "endomogamic" loop of values.

This is an example of an advanced modification of random seeds to get a random value. This time the routine has been improved from an initial RNG to make that the values were more random-look-like. Some lines can be put or not, it's up to you, and one of this RNG can resemble very little with other, although the functionality of both is very good. Just use these tricks exposed here to get a good RNG if you don't want to use MULs or operations like that.

3 - PRIDE-based RNG

This quasi-perfect RNG is based on the idea exposed in "Advanced polymorphic engine construction" in 29A#5, extrapolating the Pseudo-Random Index DEcryption to random number generation.

The fact is that we can use the PRIDE formula (exposed below) to generate random numbers as if we were going to access to a buffer (initially, PRIDE was conceived to do that). With little modifications, the formula will generate all possible numbers between 0 and a given range, without repeating any number.

This time, we are going to modify the formula and generate a sequence that assures that, if you call it 2^64 times, you'll get a sequence of 2^64 values that contains each value 2^32 times (2^32 possible values * 2^32 times = 2^64 values). This means that, with the appropiated values, is a perfect RNG, something that it isn't easy to get.

First, the formula (already adapted to this subject):

RndSeed_1       dq      ?      ; Qword (double dword)
RndSeed_2       dq      ?
RndSeed_3       dq      ?  ; This values are randomized with RDTSC or getting
                           ; the system time/date

Random:         push    edx
                mov     eax, [ebp+RndSeed_1]
                mov     edx, [ebp+RndSeed_1+4]
                xor     eax, [ebp+RndSeed_2]
                xor     edx, [ebp+RndSeed_2+4]
                shrd    eax, edx, 11h
                push    eax
                mov     eax, [ebp+RndSeed_3]
                mov     edx, [ebp+RndSeed_3+4]
                and     eax, 0FFFFFFFEh
                add     [ebp+RndSeed_1], eax
                adc     [ebp+RndSeed_1+4], edx
                inc     dword ptr [ebp+RndSeed_2]
                adc     dword ptr [ebp+RndSeed_2+4]
                pop     edx

This is the formula. Very little, although powerful. The disadvantadge of this routine is the very big random seeds, which makes difficult to get enough true random sources to fill that data (6 dwords), and moreover it's recommended to have an average seed value (not very near to value edges) to avoid getting values in subsets (although this subsets are dispersed among them). The functionality of the formula is based on a linear modification of a modification value (which ends with an exponential modification). This formula wouldn't be valid if we try to get absolute numbers (not limited by the DWORD/QWORD size), but taking the fact that this performs an implicit MOD, the formula works great.

Maybe you wonder why I use the SHRD EAX,EDX,11h: it's because the same reason we have to avoid the less significant bit on linear-congruent RNGs, since this formula fails misses the same point. So, we use the fact that it's a 64 bits value to take a high part of the value to return those 32 bits and obtain a random value. The fact is that, if we use RndSeeds of 32 bits and we eliminate the instructions that use EDX (and the SHRD), we get a formula with the same functionality, better constructed (2^32 calls will generate a fully garbled sequence of the 2^32 possible return values), but the LSB will be 1,0,1,0,1,0..., thing that sometimes we can allow, sometimes not. It's up to you.

The whole explanation of the formula isn't very difficult (as you see before), but it has its little tricks. For example, that AND EAX,-2. If we don't put that, we only generate even or odd numbers, and the result for 2^32 calls will be 2^31 values two times (different sequences in principle but always even or odd numbers). The explanation comes from the XOR between RndSeed_1 and Rnd_Seed2: Since RndSeed_2 is a counter, the LSB will alternate the value (0,1,0,1,...) so the LSB of RndSeed_1 will be 0,1,0,1,... or 1,0,1,0,... (the same or inverted). Since RndSeed_3 is fixed (i.e. it's not modified when this function is called), when making the addition of RndSeed_3 to RndSeed_1 and after increasing RndSeed_2, the operation will render in a no-modification of the LSB in RndSeed_1, thus making always that value. So, we anulate the last bit and at least the bit value alternates. With this method of getting a random (using SHRD to use from bit 17 and up) this adjust isn't needed at all, but well, the basic formula is in that way, and I didn't want to change it very much to leave it being adaptable easily.

OK, this examples and techniques can be used to make a good RNG. Obviously, if you are going to call the RNG only once or twice, it's not worthy to make a RNG, since you can call directly to RDTSC (for example), or get the system time in milliseconds. At least, I hope this things will be useful for you when you make your RNG and your poly/metamorphism engine.

[Back to index] [Comments]
By accessing, viewing, downloading or otherwise using this content you agree to be bound by the Terms of Use! aka