Delta Offset

Tokugawa Ieyasu
http://toku.es/2010/05/delta-offset/
May 2010


Distributed under a CC Attribution-Share Alike 3.0 Unported License.

1. The problem

We'll begin with something that, sooner or later, every virus written in assembler will need. First of all, let me show you a somewhat unusual program:

        global  main

        section .text

        msg     db      'Hello, World!',0x0A
len     equ     $ - msg

main:   mov     edx, len
        lea     ecx, [msg]
        mov     ebx, 0x01
        mov     eax, 0x04 ; __NR_write
        int     0x80

        mov     ebx, 0x00
        mov     eax, 0x01 ; __NR_exit
        int     0x80
 

It's unusual in the sense that the data lives inside the code section, but that is how viruses are laid out. In order to infect other files, both the code and the data must be placed sequentially in the same section, so they can be copied into a new host with a simple loop (ok, I know that this is not always true, but it's not important right now).

Take a look at how the above code appears inside an executable:

 004004D0:			'Hello, World!',0x0A
 004004DE: BA 0E 00 00 00	mov	edx, 0x0E
 004004E3: 8D 0C 25 D0 04 40 00	lea	ecx, [0x004004D0]
 004004EA: BB 01 00 00 00	mov	ebx, 0x01
 004004EF: B8 04 00 00 00	mov	eax, 0x04
 004004F4: CD 80		int	0x80
 004004F6: BB 00 00 00 00	mov	ebx, 0x00
 004004FB: B8 01 00 00 00	mov	eax, 0x01
 00400500: CD 80		int	0x80

As the lea instruction shows, the address of msg is hard-coded, so the code will only work when loaded at that same address. This is not a problem most of the time, but a virus is executed from a new address within every new host, and needs to be written in a way that lets it access its code and data in such situations.
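The failure described above can be checked with a little arithmetic. What follows is only a minimal sketch in Python, not part of the original program: the helper name `lea_is_valid` is made up for illustration, and 0x00500000 is an assumed example of a new load address.

```python
ASSEMBLED_BASE = 0x004004D0   # where the block starts in the listing above
HARDCODED_MSG  = 0x004004D0   # operand baked into the lea instruction
MSG_OFFSET     = HARDCODED_MSG - ASSEMBLED_BASE   # msg is the first byte: 0

def lea_is_valid(load_base):
    """True if the hard-coded operand still points at msg for this load base."""
    actual_msg = load_base + MSG_OFFSET
    return HARDCODED_MSG == actual_msg

print(lea_is_valid(0x004004D0))   # loaded where it was assembled -> True
print(lea_is_valid(0x00500000))   # copied elsewhere (assumed address) -> False
```

The bytes of the lea never change when the block is copied, so the operand keeps pointing at the old location.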

2. The solution

Taking advantage of the fact that the call instruction doesn't use fixed addresses to do its work, the problem can be solved as shown below:

        global  main

        section .text

        msg     db      'Hello, World!',0x0A
len     equ     $ - msg

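main:   call    delta           ; push the address of the next instruction (delta)
delta:  pop     rbp             ; rbp = address of delta at run time
        sub     rbp, delta      ; rbp = run-time address - assembled address

        mov     edx, len
        lea     ecx, [rbp + msg] ; assembled address of msg + delta offset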
        mov     ebx, 0x01
        mov     eax, 0x04 ; __NR_write
        int     0x80

        mov     ebx, 0x00
        mov     eax, 0x01 ; __NR_exit
        int     0x80
 

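But before seeing how the trick works, we must understand how call works. This instruction pushes the address of the instruction that follows it (the return address, that is, the value the instruction pointer holds once the call has been fetched: EIP in 32-bit code, RIP in the 64-bit code used here) onto the stack and jumps to the address given by the operand. In this particular case the operand is a relative offset: the CPU adds it to the instruction pointer and simply goes on with the execution at the new location.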

Now let's see how the above code looks inside executables:

 004004D0:			'Hello, World!',0x0A
 004004DE: E8 00 00 00 00	call	dword 0x004004E3
 004004E3: 5D			pop	rbp
 004004E4: 48 81 ED E3 04 40 00	sub	rbp, 0x004004E3
 004004EB: BA 0E 00 00 00	mov	edx, 0x0E
 004004F0: 8D 8D D0 04 40 00	lea	ecx, [rbp + 0x004004D0]
 004004F6: BB 01 00 00 00	mov	ebx, 0x01
 004004FB: B8 04 00 00 00	mov	eax, 0x04
 00400500: CD 80		int	0x80
 00400502: BB 00 00 00 00	mov	ebx, 0x00
 00400507: B8 01 00 00 00	mov	eax, 0x01
 0040050C: CD 80		int	0x80

NOTE: Don't be confused by the call address in the disassembled version of the code. As I explained before, the call adds the immediate value 0 to the instruction pointer (see the four zero bytes after the E8 opcode), but the disassembler shows the address where the execution of the code will continue. It is not a fixed address.
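To make the note concrete, here is a small Python sketch that decodes the five bytes of the call the way a disassembler would. The function name `call_target` is illustrative only, and the second load address (0x0050000E) is an assumed example of where the same bytes could end up.

```python
import struct

def call_target(code, load_addr):
    """Decode a 5-byte `E8 rel32` call placed at load_addr; return its target."""
    assert code[0] == 0xE8 and len(code) == 5
    (rel32,) = struct.unpack("<i", code[1:5])   # signed little-endian displacement
    return load_addr + len(code) + rel32        # relative to the next instruction

encoded = bytes.fromhex("E800000000")           # the bytes from the listing
print(hex(call_target(encoded, 0x004004DE)))    # 0x4004e3
print(hex(call_target(encoded, 0x0050000E)))    # 0x500013
```

The encoded bytes are identical in both cases; only the address printed by the disassembler changes.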

2.1. Detailed explanation

It should be clear by now how this trick works, but for the sake of clarity we'll go through a practical example. Let's imagine that the code is moved to the address 0x00500000:

 00500000:			'Hello, World!',0x0A
 0050000E: E8 00 00 00 00 	call	dword 0x00500013
 00500013: 5D			pop	rbp
 00500014: 48 81 ED E3 04 40 00	sub	rbp, 0x004004E3
 0050001B: BA 0E 00 00 00	mov	edx, 0x0E
 00500020: 8D 8D D0 04 40 00	lea	ecx, [rbp + 0x004004D0]
 00500026: BB 01 00 00 00	mov	ebx, 0x01
 0050002B: B8 04 00 00 00	mov	eax, 0x04
 00500030: CD 80		int	0x80
 00500032: BB 00 00 00 00	mov	ebx, 0x00
 00500037: B8 01 00 00 00	mov	eax, 0x01
 0050003C: CD 80		int	0x80

After the call and the pop, the rbp register holds the new address of that pop (which is now 0x00500013). If we subtract the original address from this value, we get the number of bytes by which the code has been moved: the delta offset. And if we add this quantity to every old address, we get the correct new address wherever one is needed.

 0x00500013 - 0x004004E3 = 0x000FFB30
 0x000FFB30 + 0x004004D0 = 0x00500000 ('Hello, World!',0x0A)
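The same two steps can be written as a short Python sketch, which makes it easy to try other addresses. The constant and function names (`LINK_DELTA`, `LINK_MSG`, `relocate`) are illustrative and not part of the original code; the numeric values are the ones from the listings above.

```python
LINK_DELTA = 0x004004E3   # address of `delta` when assembled (operand of sub)
LINK_MSG   = 0x004004D0   # address of `msg` when assembled (displacement in lea)

def relocate(popped_rbp, link_addr):
    """Return the run-time address of a symbol, given the value popped into rbp."""
    delta_offset = popped_rbp - LINK_DELTA   # sub rbp, delta
    return delta_offset + link_addr          # lea ecx, [rbp + msg]

print(hex(relocate(0x004004E3, LINK_MSG)))   # not moved: 0x4004d0
print(hex(relocate(0x00500013, LINK_MSG)))   # moved copy: 0x500000
```

When the code runs at its original location the delta offset is 0, so the same instructions work unchanged in both cases.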

3. Final words

The code presented here has been in use since the very beginning, and you can find it in almost any assembler virus. Sometimes it's as simple as what you see here, sometimes it's trickier (to fool heuristic analyzers), but the goal is always the same. You also need it, and your viruses will probably get detected because of it (if you don't hide it).

I should also say that the code is not optimized at all, and is intended to run under a 2.6.x Linux kernel using software interrupts (int 0x80) to implement system calls. I'm going to prepare articles about these two subjects as soon as possible.

3.1. About the code

OS: Ubuntu 10.04 with Linux Kernel 2.6.32-22-generic x86_64

CPU: Intel Core 2 Duo
