herm1t

*Virus Writing Bulletin [1]*

*January 2011*

[email protected]

http://vx.netlux.org/herm1t/

It is desireable feature for a polymorphic virus to avoid linear memory access during decryption. In 2000 The Mental Driller proposed an algorithm, known as PRIDE [1] which could easily produce a large number of permutations. In this paper I would present another technique to randomize the memory access within decryption loop, based on the linear and inversive congruential random numbers generators which properties are well characterized.

Several years ago *whale* asked me is it possible to construct a pair of pseudo-random numbers generators which could produce the same sequence in a direct and reverse order [2]. At that time I was unable to answer the question, and suggested to generate a pseudo-random sequence in *encryption* part of the engine, store it to memory and use the resulting array from top to down; but when I recently flipped through the archives looking for some e-mail address I saw this conversation and decided to check the topic. I quickly found the effective and extremely simple solution to generate a pretty large number of RNGs with desired properties.

The oldest and most widely used type of RNG algorithms is *linear congruential generator*:

where *m* is the "modulus", *a* - "multiplier", *c* - "increment" and is the "seed" are values that specify the generator. For example, many of the Z0MBiE's viruses including Mistfall [3] are using LCG with a = 214013, c = 2531011, (modulus with the size of machine word will simplify the arithmetics), while the initial seed for a generator being choosed randomly (virus writes prefer to seed the generators with the values returned by `GetTickCount()` function and/or `RDTSC` instruction.

There are inverse of LCG which is called *inversive congruential generator*:

where is a *modular multiplicative inverse* of *a* modulo *m*:

and that is exactly what we need to reverse the sequence of "random" numbers.

Suppose that we have and multiply it by , , since this expression is equal to . All neccessary details once could find in Wiki [4].

The multiplicative inverse for the given *a* and *m* exists **if, and only if**, *a* and *m* are coprime (which is to say they have no common factors or that their greatest common divisor is equal to 1). Sadly, the number is composite (product of 3, 5, 17, 257, 65537), so if we choose it as a modulus, we'll need to test all possible values of *a* whether it is divisible by the factors listed above. To avoid the tests we could select the prime *m* which mean that we could use any value of *a* and it will have no common divisors with *m* by definition.

Surely it will reduce the overall number of possible generators, but the remaining generators are still numerous. We could even estimate our losses. The prime counting function proposed by Gauss and Legendre approximates the number of primes below any given *x* as . So if we choose our *m* from 1024...2048 we'll get about 120 possible values for *m*. Since those numbers are small, this value could be counted exactly, but that's not the issue. The point here is that we'll get enough candidates anyway.

Finally, we could find the inverse by using *extended Euclidian algorithm*:

uint32_t xgcd(uint32_t a, uint32_t b)

{

uint32_t x, y, u, v, m, n, q, r, b0;

x = 0; y = 1; u = 1; v = 0; b0 = b;

while (a != 0) {

q = b / a; r = b % a;

m = x - u * q; n = y - v * q;

while (m > b0) m += b0;

while (n > b0) n += b0;

b = a; a = r; x = u;

y = v; u = m; v = n;

}

return x % b0;

}

//...

r = xgcd(a, m);

{

uint32_t x, y, u, v, m, n, q, r, b0;

x = 0; y = 1; u = 1; v = 0; b0 = b;

while (a != 0) {

q = b / a; r = b % a;

m = x - u * q; n = y - v * q;

while (m > b0) m += b0;

while (n > b0) n += b0;

b = a; a = r; x = u;

y = v; u = m; v = n;

}

return x % b0;

}

//...

r = xgcd(a, m);

We could check the obtained value by testing whether `(a * r) % m == 1`

, or by modifying the xgcd() function: `return b == 1 ? x % b0 : 0;`

. But this check is redundant, since we knew that for any *a < m* and prime *m*, there would be always an inverse *r* and that it is always unique in the multiplicative group .

Now we could traverse the sequence of pseudo-random numbers back and forth with:

uint32_t lcg(uint32_t x)

{

return (x * a + c) % m;

}

uint32_t icg(int32_t x)

{

x -= c;

while (x < 0)

x += m;

return (r * x) % m;

}

{

return (x * a + c) % m;

}

uint32_t icg(int32_t x)

{

x -= c;

while (x < 0)

x += m;

return (r * x) % m;

}

So, for the prime *m* (1024..2048) and *a* (1..m) (*a* could not be greater than *m*) we got 208 850 possible RNGs (this number must be multiplied by *c* to take into account all possible increments. Not so bad and still simple. One could use the *sieve of Eratosthenes* to find the prime numbers (with a bit of optimization the memory requirements for the numbers below could be reduced down to 60Kb):

/* http://www.fpx.de/fp/Software/Sieve.html */

#define TEST(f,x) (*(f+(x)/16)&(1<<(((x)%16L)/2)))

#define SET(f,x) *(f+(x)/16)|=1<<(((x)%16L)/2)

uint8_t *make_primes(int max)

{

uint8_t *feld;

int size = (max >> 4) + 1, teste = 1, mom;

if ((feld = (uint8_t*)malloc(size)) == NULL)

return NULL;

bzero(feld, size);

while ((teste += 2) < max)

if (! TEST(feld, teste))

for (mom = 3L * teste; mom < max; mom += teste << 1)

SET (feld, mom);

return feld;

}

int is_prime(int n, uint8_t *feld)

{

if (n % 2 == 0)

return 0;

if (!TEST(feld, n))

return 1;

return 0;

}

int find_nearest_prime(int n, uint8_t *feld, int max)

{

int m = n - 2;

if (n % 2 == 0)

m++;

while ((m += 2) < max)

if (! TEST(feld, m))

return m;

return n;

}

#define TEST(f,x) (*(f+(x)/16)&(1<<(((x)%16L)/2)))

#define SET(f,x) *(f+(x)/16)|=1<<(((x)%16L)/2)

uint8_t *make_primes(int max)

{

uint8_t *feld;

int size = (max >> 4) + 1, teste = 1, mom;

if ((feld = (uint8_t*)malloc(size)) == NULL)

return NULL;

bzero(feld, size);

while ((teste += 2) < max)

if (! TEST(feld, teste))

for (mom = 3L * teste; mom < max; mom += teste << 1)

SET (feld, mom);

return feld;

}

int is_prime(int n, uint8_t *feld)

{

if (n % 2 == 0)

return 0;

if (!TEST(feld, n))

return 1;

return 0;

}

int find_nearest_prime(int n, uint8_t *feld, int max)

{

int m = n - 2;

if (n % 2 == 0)

m++;

while ((m += 2) < max)

if (! TEST(feld, m))

return m;

return n;

}

As I already told in the description of Lacrimae virus, a max distance between any given *x* and nearest prime number in the range from one to would not exceed 113 [5].

- The Mental Driller "Advanced polymorphic engine construction", 29A#5, December 2000
- thread in pvt.virii newsgroup, 2003
- Z0mbie, Mistfall virus
- Wikipedia "Modular multiplicative inverse"
- herm1t "Code integration on Linux: Cooking the PIE", EOF-DR-RRLF, 2008

By accessing, viewing, downloading or otherwise using this content you agree to be bound by the Terms of Use! vxer.org aka vx.netlux.org