Lord Julus

*VX-tasy [articles]*

* 1998*

- Before F0reW0rd - W0rd ;-)
- Foreword
- First approach
- Anti - emulator code
- The FPU attack
- Creating fooling code
- Creating very complicated decryptors
- Creating self modifying code
- Tips & Tricks
- The envelope of the matrix method
- P-mode attack
- Final Word

The following document is a study. Its only purpose is to be used in virus research. The author of this article is not responsible for any misuse of the things written in this document. Most of the things published here are already public and they represent what the author gathered across the years.

The author is not responsible for the use of any of this information in any kind of virus.

Lord Julus

Due to the fact that I was very anxious to release this, and the fact that while writing it my computer got burned, and that, anyway I was sick and tired of looking at it anymore, I released it in a, let's say for now Version 1.0. As soon as I'll feel again ready to write, I shall come with more ideas and stuff. For now just read this and don't kick me if you find any mistakes I didn't have time to correct... Anyway, during the writing of this I kinda felt a little more on the encryption side, which actually is the basis of a good fight with an AV. You got an unbeatable encryption, you rule! So, don't be frightened by the math involved here: everything is explained. Secondly, also while writing this article I got involved in Win32 programing. This made me leave the mortal's world for a while ;-) and go in higher circles. So, just read along...

Well, my dearest friends and enemies (;-)), here I am again, not really having much to do these days but work and code... Alongside these, I would name, "high" things to do, I still have time to study, analyze and check various stuff around. Since a little while back, I started a campaign of writing anti-anti-viral programs. These would be like memory TSR's bypassers and memory patchers and searchers. Well, I looked deep and now I decided it's time I put it down in words, black on white (or more like white on blue, as I see it now ;-)).

Anyway, for those of you who know jam about what's debugging and emulating, I will try to make a short description here in the foreword on the debug process, emulating process and some other stuffs.

Here come the descriptions of the terms about to be used here (definitions taken out of the Webster, and additional explanations by me):

- DEBUG
- "To detect and remove defects or errors from smth."
- DEBUGGER
- (in comp. sense) = A tool to debug code

In common language, this 'debug' term has enlarged the specter, no longer meaning only detecting and removing errors, but also simply looking over the code. We'll take a look later at the most common debuggers.

- EMULATE
- "Try to equal or excel; imitate with effort to equal or surpass"
- EMULATION
- "Effort or desire to equal or excel with others"
- EMULATOR
- (in comp. sense) = A tool to emulate code

This term also has a different connotation in the computer business. It doesn't really mean making a program that imitates someone else's program. This would be, of course, stupid. If you want to copy, the best tool is 'copy`n`paste' ;-)). Anyway, the emulator is a piece of code, usually very complicated, actually much more complicated than the code to be emulated itself, which has the capability to take a program instruction by instruction and imitate what that program would do if it were run. BUT, the emulator will never allow a program to really do what it should do. It only tries to work out that program's goal and guess it. This leads to the next definition:

- HEURISTIC
- "stimulating interest as means of furthering investigation"
Actually, the best definition for this term can be found in the polymorphic tutorial written by The Black Baron. It reads: 'heuristic = A set of well defined rules to apply to a problem in the hope of achieving a known result'. Hope you got it...

Anyway, since the beginning of the viral activity, somewhere in 1987, the anti-virus writers had to use some powerful tools. We all know that it's much harder to build, restore or repair something than to destroy or damage it (take a real-life example: hit your TV screen with a hammer... it only takes a second... then try to repair it ;-))), and also it's always much easier to prevent something bad than to repair its damage afterwards. Just like Confucius said: "Those who do not see the danger coming shall surely suffer from the urge approaching". Therefore, the antiviral community has started to build certain types of tools in order to enter a not so fair fight (thousands of virus writers and a couple of AV guys ;-)).

Mainly, the developed tools are these:

- TSR blockers/checkers
- String scanners for memory/files/places on disks
- Heuristic scanners / code emulators

Let's define them quickly and start the real thing:

TSR blockers/checkers. This category of AV utilities is largely used and was made famous by VSAFE (one of the best known TSR utilities). Their main purpose is prevention. They do not clean viruses, but stay there in memory and before any operation is done they check... If something strange pops up (like a write-to-disk, an executable change) they have the nasty habit of flashing a (usually) red window on the screen warning about the danger. Others are blockers for different viruses. Usually a virus checks whether it's resident or not by calling an interrupt with certain values and waiting for something in return. A TSR blocker will simulate the virus by returning the 'already resident' values. In this way, even if your files are infected, the virus will never go resident.

String scanners. Any virus, like any piece of code actually, is made of those tiny, little 0's and 1's called bits, which form those pretty nice 8 bit thingies called bytes, which put in pairs form those really nifty words (and no, I don't think you're stupid ;-)). Anyway, a piece of code has this thing called a signature. A string scanner will search in a file/memory or anywhere else for a set of bytes and will decide whether it's a virus or not. Smart scanners allow smart wildcards, like scan x bytes, then jump over the next 3 bytes, scan another 2, and so on... Anyway, even with the growing popularity of the polymorphic viruses, most of the viruses around can be detected with a signature.

Heuristic scanners / code emulators. Let's imagine you write a new virus... Of course, there is no one who knows any signature for your virus for the simple reason that it's new. Here the heuristic analyzers come around. These 'look' into your code and set some levels of danger for that particular code. For example if a heuristic scanner finds a check for 'load & execute' command, it will probably warn the user. The code emulator does more. It simulates the code execution by putting 'by hand' values into the registers and trying to 'see' what the code does. This method is essential for new viruses and for polymorphic viruses.

Ok, now we have defined our 'environment', sort of speak, so let's start talking a little deeper about each of the above AV-types.

The TSR-blocker... Yeah... This one is the easiest type of AV to go around. There are a lot of TSR-blockers out there... If you feel threatened by any one of them, simply disassemble the darn thing and check out the method it uses in order to check. There are several ways they use. The most common is to monitor interrupts 21h, 13h, 76h, 25h, 26h. All these can be overridden by simple tunneling/tracing routines. But many TSR shits have some anti-tunneling routines that might warn the user about tunneling in progress. That, however, is not a matter for this article.

Another largely used method is the monitoring of the INT 03/01. This means that the TSR checks every command your program does and decides if a couple of commands are dangerous or not. These are usually crap, because they slow down everything. However, this type of TSR blocker gets killed by the Prefetching Queue.

Another method is the monitoring of the INT 08 (the clock interrupt). Using this interrupt the AV checks various things. A simple CS:IP check type tunneling and you override it's checks. However, be careful to leave it like it would work. You may need to disassemble it and find particular bytes to patch (some of them use INT 08 also to trigger keyboard event, like the pop-up of the options menu; and if the options menu doesn't work, the user may notice).

Anyway, the TSR blockers can be easily killed as you saw. However, the AV guy might create a specific TSR-blocker for your specific virus (for example by making a TSR that returns your virus' `already resident` signature, but this means your virus is really good or you really pissed off some guy...)

In order to kill the string scanner all you need to do is to put in your virus a well random-oriented polymorphic decryptor. In this way you're safe from any kind of string scanning. Right after the polymorphic decryptor has finished its job, it's time for another decryptor. You have got to create a well balanced set of decryptors, kinda like this:

- the more complicated the poly decryptor is, the less polymorphic it is
- the longer the poly decryptor is, the harder it is for the emulator to go thru (don't forget that there exist code emulators + scan strings; they can go thru your poly decryptor and scan string the second decryptor)
- the more complicated the second decryptor is, the more difficult it becomes for the emulator to break it

So, you need a balance. A well scan-string based AV will not have a very good anti-poly routine. This because the loading of the scan strings and searching for them takes long time. The same for the emulator. In order to create a good poly decryptor, check out the article I wrote on poly decryptors at http://members.tripod.com/~lordjulus.

So, the scan strings go down the drain too...

And finally I reached where I wanted to... The heuristic scanners and the code emulators. These are the most dangerous AV ever and they seem to be written by some smart guys (some of them ;-))... Anyway, the main disadvantage for the code emulator (called CE from here) is its speed. As it must 'emulate' each instruction, it has to kinda do what the CPU would do, still however using the CPU... Therefore it's slow. Also, a lot of instructions are not emulated at all and some of them are emulated incorrectly. Further on I will try to put up a set of methods I recommend you use inside the second decryptor. In case the CE goes past your poly decryptor, it *must* hang in the second decryptor, otherwise your virus is disclosed.

One of the best methods in the fight with the AV code emulator is to find out pieces of code that generate a certain known result. The same result must be retrieved through another method, which also can be or can be not emulated by the AV. Having the two results one should do some operations with them over the most important registers. The idea around this is that if the AV is not capable to emulate one of the ways you retrieve a result, it will for sure use it's own result and will render to a fault. However, beware of comparisons and conditional jumps. What I'm referring to is this: say you made two routines and one of them is for sure impossible to emulate. The two routines return the same parameter. You put them in the AX and BX registers. If you do something like:

CMP AX, BX JNE I_AM_BEING_EMULATED

The code emulator for sure will jump to the conditional jump address, as it cannot compute one of the arguments correctly. One could think that this solves the problem. No way ! As we think of devious methods, so do the AV writers. Anytime a conditional jump is encountered a good emulator will save it's place. If the condition is met it will jump there. The above jump would probably be one to an infinite loop, or program terminate, or halt or stuff like that. The good emulator will not stop, but will return to the prior conditional jump and will try to continue emulating the code like if the condition was not met. This gives the emulator tremendous powers.

However, we can solve that too. Instead of comparing AX with BX we can add both of them let's say to a register that holds the key for the encryption. Or we can subtract one from the other and increment the DS data segment with the amount. Normally, if the code is executed correctly, in the first case the key would be incremented with a known number and in the second case the DS will remain unchanged. BUT if the emulator fails in computing one of the two values, or even worse both, of them, the emulating process will fail altogether.

Now let's see some ways we can fool the emulator.

As we all know, there are certain interrupts that respond by returning a known value. I'm saying 'known' meaning you can get it somewhere else too. Let's see:

The equipment list is a word that hold specific and very useful information about your computer. However, this word can also be found in BIOS at address 0000:0410h. Here goes the code:

XOR AX, AX MOV DS, AX MOV BX, WORD PTR [0410] INT 11H SUB AX, BX ADD <KEY>, AX

If the emulator skips INT 11, AX will be different from BX, so the *key* value will be corrupted.

This word can also be found in BIOS at 0000:0413h. The anti-emu code is the same as above, but the Int value and the BIOS source.

Now here we can do a lot of things. The very first and very good is this one:

MOV AX, 1686H INT 2FH

This one returns 0 in AX if the CPU is in protected mode and something else if it's in real mode. But, we also have this instruction:

This instruction (Store Machine Status Word) will put MSW in BX. The MSW (Machine Status Word) is a word with a lot of info on it. For our specific example we are only interested in the first bit. This bit is 0 if the CPU is in real mode and 1 if the CPU is in protected mode. Do you start to see a pattern here? First of all int 2fh is not emulated by many emulator around and the SMSW instruction either...

Ok, nowadays whoever still owns a 286 or less (duh!!) is considered to be owning a pocket calculator. Whoever has a 386 kinda like enters the human kind (;-)), BUT. There's always a but. If he does not possess an FPU (Floating Point Unit) he is also considered obsolete as a human ;-). In other words, whoever doesn't have an FPU in his computer could just skip all this stuff and go watch a movie or something ;-)))

Anyway, the FPU is a very powerful thing that wanders around your CPU helping with the math calculation speed. Plus, its 'floating' prefix gives you an idea about its main purpose: making floating point calculations. No more integer numbers only, now you can calculate using decimals also. Is this gonna help us ? Well, I tell you: A LOT ! Why ? The first argument is this: no code analyzer / emulator I know about (except probably Dr.Web which emulates a couple of instructions) is able to emulate FPU instructions. Some of them, like TBAV, hang while emulating the code. Some of them just jump over the FPU instructions, hoping they are only junk or the program is no virus at all. Actually there are very few viruses out there using the FPU instructions and the explanation for this is that people want to see their viruses spreading. The FPU instructions pose a threat: on computers not equipped with an FPU the code will hang ! In the same way, the anti-virus product writers didn't attempt to emulate FPU instructions as 99% of the viruses in the wild don't use them. Also, as you read above about how the instructions are emulated, emulating the FPU instructions would probably triple the time the emulator needs to go through the code and, as I said, the slower the emulator goes, the worse the AV product is. Combine an FPU oriented decryptor with a huge polymorph generated decryptor and the code emulator will be lost in it.

First of all, in order not to crash the program currently running, one may want to check whether a coprocessor unit is installed. This is done easily by taking the MSW using the instruction SMSW AX, and checking the __ bit. If it's set we have a coprocessor. If it's not set and your virus uses the FPU in decryptors, then it's a lost cause: get out with an error or something. If you just use the FPU to fool the emulators that stop over FPU instructions, just skip the part.

We shall assume that we have a computer that has an installed coprocessor (387, 487, etc...).

First, let's talk about IEEE standard 754. This is the standard Intel uses in order to make the coprocessor 'understand' floating point numbers. Basically, these numbers are coded like this:

S, E, F, where:

- S
- sign
- E
- exponent
- F
- fraction part

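The length of S is one bit (0 if the number is positive, and 1 if it's negative). The length of E depends on the format: 8 bits for a 32-bit (single precision) number, 11 bits for a 64-bit (double precision) number and 15 bits for the 80-bit format the FPU uses internally.

For the 32-bit and 64-bit formats, F has a length equal to All_bits - E_length - 1.

Let's see for example how we code a Double Word (32-bit) floating point number:

    S -  1 bit
    E -  8 bits
    F - 23 bits
    -----------
      = 32 bits

So, usually the floating point number is expressed like this:

    (-1)^S * 2^(E - bias) * 1.F

where the bias is 127 for the 32-bit format and 1.F means the fraction bits placed after an implied leading 1.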
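To make the S, E, F split concrete, here is a small sketch in Python (not part of the original article) that takes a number, rounds it to the 32-bit format and pulls the three fields apart. The function names `decode_single` and `value_of` are made up for this illustration, and `value_of` only handles normal numbers (no zeros, denormals, infinities or NaNs).

```python
import struct

def decode_single(x):
    """Split x, rounded to a 32-bit IEEE 754 single, into (S, E, F)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 bits after the implied leading 1
    return sign, exponent, fraction

def value_of(sign, exponent, fraction):
    """Rebuild the value of a normal single from its three fields."""
    return (-1) ** sign * 2.0 ** (exponent - 127) * (1 + fraction / 2 ** 23)

print(decode_single(1.0))     # fields of 1.0
print(decode_single(-2.5))    # fields of -2.5
print(value_of(*decode_single(-2.5)))
```

Feeding the fields back through `value_of` should give the number you started with, which is a quick way to convince yourself the layout is right.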

What is very nice about the coprocessor unit is that you really don't need to put up with this crap way of storing the floating point numbers. The FPU provides its "stack" in order to help you out. The stack has eight registers and looks like this:

ST(0), ST(1), ... , ST(7)

The ST's are holders for the floating point numbers (let us understand each other: a floating point number can also hold an integer value; as you will see this is very important in our business).

So, basically you have the loading instructions. These instructions allow you to load a number from a certain place. The number will be placed at the top of the stack and all the other numbers will be pushed one position down. This goes like this:

    load m : ST(0) = m ; ST(1) = empty ; ST(2) = empty ...
    load n : ST(0) = n ; ST(1) = m     ; ST(2) = empty ...
    load p : ST(0) = p ; ST(1) = n     ; ST(2) = m ...
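If the pushing is still fuzzy, the following toy model in Python (an illustration only, not how the hardware is built) mimics the load and pop behaviour of the register stack. The class name `FpuStack` and its methods are invented here, and a real FPU signals stack overflow through its status word instead of raising an exception.

```python
class FpuStack:
    """Toy model of the eight-register FPU stack; None marks an empty slot."""

    SIZE = 8

    def __init__(self):
        self.regs = [None] * self.SIZE

    def load(self, value):
        """Push value: it becomes ST(0), everything else moves one slot down."""
        if self.regs[-1] is not None:
            raise OverflowError("FPU stack overflow")
        self.regs = [float(value)] + self.regs[:-1]

    def pop(self):
        """Remove and return ST(0); the old ST(1) becomes the new ST(0)."""
        value = self.regs[0]
        if value is None:
            raise IndexError("FPU stack underflow")
        self.regs = self.regs[1:] + [None]
        return value

    def st(self, i):
        """Return the content of ST(i)."""
        return self.regs[i]


stack = FpuStack()
for number in (10, 20, 30):      # load m, load n, load p
    stack.load(number)
print(stack.st(0), stack.st(1), stack.st(2))
```

After the three loads the last number sits in ST(0) and the first one has sunk to ST(2), exactly like the m, n, p listing above.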

I hope this clears it up. The most used stack register is ST(0). This is because we have special instructions that use other stack registers as a second operand. First take a look at the FPU instructions in a very nice table I ripped off from TechHelp and then I shall explain more with some examples:

Data Transfer and Constants | |
---|---|

FLD src | Load real: st(0) <- src (mem32/mem64/mem80) |

FILD src | Load integer: st(0) <- src (mem16/mem32/mem64) |

FBLD src | Load BCD: st(0) <- src (mem80) |

FLDZ | Load zero: st(0) <- 0.0 |

FLD1 | Load 1: st(0) <- 1.0 |

FLDPI | Load pi: st(0) <- pi |

FLDL2T | Load log2(10): st(0) <- log2(10) |

FLDL2E | Load log2(e): st(0) <- log2(e) |

FLDLG2 | Load log10(2): st(0) <- log10(2) |

FLDLN2 | Load loge(2): st(0) <- loge(2) |

FST dest | Store real: dest <- st(0) (mem32/mem64) |

FSTP dest | dest <- st(0) (mem32/mem64/mem80); pop stack |

FIST dest | Store integer: dest <- st(0) (mem16/mem32) |

FISTP dest | dest <- st(0) (mem16/mem32/mem64); pop stack |

FBST dest | Store BCD: dest <- st(0) (mem80) |

FBSTP dest | dest <- st(0) (mem80); pop stack |

Compare | |

FCOM | Compare real: Set flags as for st(0) - st(1) |

FCOM op | Set flags as for st(0) - op (mem32/mem64) |

FCOMP op | Compare st(0) to op (reg/mem); pop stack |

FCOMPP | Compare st(0) to st(1); pop stack twice |

FICOM op | Compare integer: Set flags as for st(0) - op (mem16/mem32) |

FICOMP op | Compare st(0) to op (mem16/mem32); pop stack |

FTST | Test for zero: Compare st(0) to 0.0 |

FUCOM st(i) | Unordered Compare: st(0) to st(i) [486] |

FUCOMP st(i) | Compare st(0) to st(i) and pop stack |

FUCOMPP | Compare st(0) to st(1) and pop stack twice |

FXAM | Examine: Eyeball st(0) (set condition codes) |

Arithmetic | |

FADD | Add real: st(0) <- st(0) + st(1) |

FADD src | st(0) <- st(0) + src (mem32/mem64) |

FADD st(i),st | st(i) <- st(i) + st(0) |

FADDP st(i),st | st(i) <- st(i) + st(0); pop stack |

FIADD src | Add integer: st(0) <- st(0) + src (mem16/mem32) |

FSUB | Subtract real: st(0) <- st(0) - st(1) |

FSUB src | st(0) <- st(0) - src (reg/mem) |

FSUB st(i),st | st(i) <- st(i) - st(0) |

FSUBP st(i),st | st(i) <- st(i) - st(0); pop stack |

FSUBR st(i),st | Subtract Reversed: st(i) <- st(0) - st(i) |

FSUBRP st(i),st | st(i) <- st(0) - st(i); pop stack |

FISUB src | Subtract integer: st(0) <- st(0) - src (mem16/mem32) |

FISUBR src | Subtract Rvrsd int: st(0) <- src - st(0) (mem16/mem32) |

FMUL | Multiply real: st(0) <- st(0) * st(1) |

FMUL st(i) | st(0) <- st(0) * st(i) |

FMUL st(i),st | st(i) <- st(0) * st(i) |

FMULP st(i),st | st(i) <- st(0) * st(i); pop stack |

FIMUL src | Multiply integer: st(0) <- st(0) * src (mem16/mem32) |

FDIV | Divide real: st(0) <- st(0) ÷ st(1) |

FDIV st(i) | st(0) <- st(0) ÷ st(i) |

FDIV st(i),st | st(i) <- st(i) ÷ st(0) |

FDIVP st(i),st | st(i) <- st(i) ÷ st(0); pop stack |

FIDIV src | Divide integer: st(0) <- st(0) ÷ src (mem16/mem32) |

FDIVR st(i),st | Divide Rvrsd real: st(i) <- st(0) ÷ st(i) |

FDIVRP st(i),st | st(i) <- st(0) ÷ st(i); pop stack |

FIDIVR src | Divide Rvrsd int: st(0) <- src ÷ st(0) (mem16/mem32) |

FSQRT | Square Root: st(0) <- sqrt st(0) |

FSCALE | Scale by power of 2: st(0) <- st(0) * 2 ^ INT( st(1) ) |

FXTRACT | Extract exponent: st(0) <- exponent of st(0); then the significand is pushed, so st(1) = exponent and st(0) = significand |

FPREM | Partial remainder: st(0) <- st(0) MOD st(1) |

FPREM1 | Partial Remainder (IEEE): same as FPREM, but in IEEE standard [486] |

FRNDINT | Round to nearest int: st(0) <- INT( st(0) ); depends on RC flag |

FABS | Get absolute value: st(0) <- ABS( st(0) ); removes sign |

FCHS | Change sign: st(0) <- -st(0) |

Transcendental | |

FCOS | Cosine: st(0) <- COS( st(0) ) |

FPTAN | Partial tangent: st(0) <- TAN( st(0) ); then 1.0 is pushed, so st(1) = TAN and st(0) = 1.0 |

FPATAN | Partial Arctangent: st(1) <- ATAN( st(1) ÷ st(0) ); pop stack |

FSIN | Sine: st(0) <- SIN( st(0) ) |

FSINCOS | Sine and Cosine: st(0) <- SIN( st(0) ); then COS is pushed, so st(1) = SIN and st(0) = COS |

F2XM1 | Calculate (2 ^ x)-1: st(0) <- (2 ^ st(0)) - 1 |

FYL2X | Calculate Y * log2(X): st(0) is X; st(1) is Y; st(1) <- st(1) * log2( st(0) ); pop stack, so the result ends up in st(0) |

FYL2XP1 | Calculate Y * log2(X+1): st(0) is X; st(1) is Y; st(1) <- st(1) * log2( st(0)+1 ); pop stack, so the result ends up in st(0) |

Processor Control | |

FINIT | Initialize FPU |

FSTSW AX | Store status word: AX <- FPU status word |

FSTSW dest | dest <- FPU status word (mem16) |

FLDCW src | Load control word: FPU CW <- src (mem16) |

FSTCW dest | Store control word: dest <- FPU CW |

FCLEX | Clear exceptions |

FSTENV dest | Store environment: store status, control and tag words and exception pointers into memory at dest |

FLDENV src | Load environment: load environment from memory at src |

FSAVE dest | Store FPU state: store FPU state into 94-bytes at dest |

FRSTOR src | Load FPU state: restore FPU state as saved by FSAVE |

FINCSTP | Increment FPU stack ptr: st(0)<-old st(1); st(1)<-old st(2),...,st(7)<-old st(0); no data is moved and no register is freed |

FDECSTP | Decrement FPU stack ptr: st(1)<-old st(0); st(2)<-old st(1),...,st(0)<-old st(7); no data is moved |

FFREE st(i) | Mark reg st(i) as unused |

FNOP | No operation: st(0) <- st(0) |

WAIT/FWAIT | Synchronize FPU & CPU: Halt CPU until FPU finishes current opcode. |

Along these instructions I can add here the

FXCH - exchange instruction st(0) <- st(1) st(1) <- st(0)

which is very useful sometimes.

So, as you saw, mainly all you should use are registers ST(0) and ST(1) because you can use the shorter form of the instruction. Let's imagine we want to compute something like this:

cos(((a+b)*(c+d))/f)

I will give you a table with the instructions and the state of the stack in the same time so you can understand:

    fild  word ptr [a]      ; ST(0) = a
    fild  word ptr [b]      ; ST(0) = b ; ST(1) = a
    fadd                    ; ST(0) = a + b
    fistp word ptr [temp]   ; save result and pop it off the stack
    fild  word ptr [c]      ; ST(0) = c
    fild  word ptr [d]      ; ST(0) = d ; ST(1) = c
    fadd                    ; ST(0) = c + d
    fild  word ptr [temp]   ; ST(0) = a + b ; ST(1) = c + d
    fmul                    ; ST(0) = (a+b)*(c+d)
    fild  word ptr [f]      ; ST(0) = f ; ST(1) = (a+b)*(c+d)
    fdiv                    ; ST(0) = (a+b)*(c+d)/f
    fcos                    ; ST(0) = cos((a+b)*(c+d)/f)

See, it's much easier to make calculations using the FPU. And the conversion from normal registers is done like this:

    mov  word ptr [x], ax
    fild word ptr [x]

You will ask me why I used FILD (load integer) instead of FLD (load float) ? Easy, that's because I didn't get the time to fully explain the IEEE 754 standard, and the FLD instructions expect the source to contain a number in the IEEE 754 format. E.g., if you have at address [this_Address] the number 1111h, if you do a FILD you will have 1111h in ST(0), but if you do FLD you will have 1.89793e-40... Kinda nasty. Plus, you must use the form FLD DWORD PTR [x] to load a floating number. Anyway, as I said, these insides are of less importance. The most important thing is to know how to use them and have a good algorithm to use them. Anyway, for those of you who want to study more on the floating point storage way, I have included alongside this article a program in ZIP form copyright by Borland International. Use it wisely... May the FPU be with you ;-) Look there at the ways you have in order to retrieve the mantissa, the exponent and so on...
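The difference between the two loads is easy to try out away from the FPU. The Python sketch below (an illustration, not the article's code) reads the same bytes once as an integer, the way FILD does, and once as a 32-bit IEEE 754 number, the way FLD DWORD PTR does. The helper names are invented, and the upper word 0002h is only a guess at what could have been sitting next to the 1111h word to produce the figure quoted above.

```python
import struct

def as_integer(raw):
    """What FILD sees: the bytes read as a signed little-endian integer."""
    return int.from_bytes(raw, "little", signed=True)

def as_single(raw):
    """What FLD DWORD PTR sees: the same 4 bytes read as an IEEE 754 single."""
    return struct.unpack("<f", raw)[0]

word = (0x1111).to_bytes(2, "little")

# FILD WORD PTR: the value is simply 1111h
print(hex(as_integer(word)))

# FLD DWORD PTR reads two more bytes. With zeros there, the result is a
# tiny denormal; with a (hypothetical) 0002h there, it is another one.
print(as_single(word + b"\x00\x00"))
print(as_single(word + b"\x02\x00"))
```

Either way the float reading has nothing to do with 1111h, which is the whole point: integers go in through FILD.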

Oh, one more thing. I forgot to tell you, those who don't like to read all those .DOC files ;-) that in order to use the FPU with the TASM assembler, you need to use this kind of header for your files (actually this is the header I always use):

    .386
    .387
    .model TPascal
    .code
    org 0

In this way you can safely use 32 bit registers and FPU instructions. I can say it's the best way to compile an ASM file...

Now, let's go down to business. We'll come back to the same good old method. We'll try to create a set of instructions that when normally executed will render to a known goal, but when emulated by a code emulator, they will generate a completely different result.

Of course, talking about math coprocessor instructions, we're gonna have to use a lot of math, but not high math, just common, ok, don't get scared.

Ok, let's take a peek at common math. As we all know, an odd number added to an even number always gives as a result an odd number (e.g. 3 + 4 = 7). What do we obtain if we divide an odd number by 4 ? Let's make a table:

X | X/4 | (X+1)/2 |
---|---|---|

1 | 0.25 | 1 |

3 | 0.75 | 2 |

5 | 1.25 | 3 |

7 | 1.75 | 4 |

9 | 2.25 | 5 |

11 | 2.75 | 6 |

13 | 3.25 | 7 |

15 | 3.75 | 8 |

17 | 4.25 | 9 |

19 | 4.75 | 10 |

21 | 5.25 | 11 |

23 | 5.75 | 12 |

25 | 6.25 | 13 |

27 | 6.75 | 14 |

29 | 7.25 | 15 |

In the first column we have a series of odd numbers, in the second column we have that number divided by 4. As you can see, the decimals alternate: .25, .75, .25, etc. The last column is calculated like this: (X+1)/2, where X is the number in the first column on the same line (e.g. (7+1)/2 = 4). What do we notice ? We notice that every time a .25 appears, next to it in the third column we have an odd number and every time a .75 appears, next to it in the third column we have an even number. The reason is simple: if X = 4k+1 then X/4 ends in .25 and (X+1)/2 = 2k+1 is odd, and if X = 4k+3 then X/4 ends in .75 and (X+1)/2 = 2k+2 is even. Ok, now that we have established some rules let's go specific:
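If you don't trust the table, the rule can be checked by brute force. This is a small Python sketch, nothing more; the function names are invented for the illustration and exact fractions are used so that no rounding gets in the way.

```python
from fractions import Fraction

def quarter_fraction(x):
    """Fractional part of x / 4, kept exact."""
    return Fraction(x, 4) % 1

def half_of_successor(x):
    """(x + 1) / 2 for an odd x, which is always a whole number."""
    return (x + 1) // 2

def rule_holds(x):
    """True if '.25 goes with odd, .75 goes with even' holds for odd x."""
    is_quarter = quarter_fraction(x) == Fraction(1, 4)
    is_odd = half_of_successor(x) % 2 == 1
    return is_quarter == is_odd

for x in range(1, 30, 2):
    print(x, float(quarter_fraction(x)), half_of_successor(x), rule_holds(x))
```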

First you must generate 2 random words. After that, be sure one of them is odd and one of them is even. To do that, for the odd number set its lowest bit:

OR <reg1>, 0001h

And for the even number reset its lowest bit:

AND <reg2>, 0FFFEh

Now add the two numbers into *reg3*. Ok, now we know that we have a random odd number in the register *reg3*.

Next step, make a floating point division by 4 on this odd number, take the fractional part (the decimals, 25 or 75) and save it somewhere (reg4 for ex.). Then take the number again, increase it by 1 and make a floating point division by 2. Now, as we divided an even number, the result will be an integer. This integer might be odd or even. To check its parity we test its lowest bit (the parity flag is no good here: it counts the set bits of the low byte, and MOV doesn't touch the flags anyway):

    mov  ax, number
    test ax, 1
    jnz  Odd_number
    Even_number:
    ...
    Odd_number:
    ...

If the number is even we are sure that reg4 contains 75, otherwise, if it's odd we know that reg4 contains 25. Somewhere you should have an address that holds a double word that equals 25. If reg4 has 75, then negate the number at that address (a good way to do it in order to use more FPU instructions would be to make a floating point subtract by subtracting 25 from 0, obtaining -25). Now that we have this, simply add the double word to the number we have. The two possibilities are:

    25 + 25 = 50
    75 - 25 = 50

So, starting from two absolutely random numbers (which, BTW, can be DWORDS or QWORDS or whatever so you can use more FPU instructions), we obtained a fixed number, i.e. 50. Of course, this 50 will be placed either in a ST(?) register or at a double word address. The only thing left to do is crop its upper part and just keep the 50 in the CL register.

Now, simply add a 6 to CL. In this way we shall have 56 in the CL register. And here comes the nice part:

ROL <key>, CL

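I hope you got it. As CL is 56 and 56 is divisible by 8, an 8-bit key register will roll all the way around 7 times, but still remain the same... Mind that this only holds when the key sits in an 8-bit register. On a 286 or later the CPU first masks the count to 5 bits, which turns 56 into 24, still a multiple of 8; a 16-bit key would end up rotated by 8 bits, i.e. with its two bytes swapped.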
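The rotate arithmetic is easy to get wrong, so here is a little Python sketch of it (an illustration only). The function `rol` is invented for this purpose; it assumes the count masking of a 286 or later CPU and models nothing else about the instruction (no flags).

```python
def rol(value, count, width):
    """Rotate value left by count inside a width-bit register."""
    count &= 31                  # a 286 or later masks the count to 5 bits
    count %= width               # full turns change nothing
    mask = (1 << width) - 1
    value &= mask
    return ((value << count) | (value >> (width - count))) & mask

print(hex(rol(0xA5, 56, 8)))      # 8-bit register, count 56
print(hex(rol(0x1234, 56, 16)))   # 16-bit register, count 56
print(hex(rol(0xA5, 57, 8)))      # 8-bit register, odd count
```

The first call leaves the byte alone, the second swaps the two bytes of the word, and the third shows what an odd count does to the same byte.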

Now, what do you think a code emulator will do ? Before you do all the above stuff, put a random number at the address where the 50 will be. Be sure that number is odd. If a code emulator will simply jump over the FPU instructions, as most of them do, at the end it will retract in the CL register the odd random number, which means that the ROL instruction will permanently damage the key making it impossible for the emulator to correctly decrypt the encrypted code.

This is just an idea. You can think of more. For example try dividing by 2. Any odd number divided by 2 gives a .5 decimal. Also you could obtain the 6 in the same devious manner. Let's take an example:

FLDPI is an FPU instruction that takes no operand and loads the PI number (3.141592654) into the ST(0) register. Now, we all know that PI*2 = 6.283185307. Which only leaves us to take the integer part of this and we have the 6!!

This is a simple method. We can think of more complicated ones, like:

Compute ArcTangent(1). This gives us a result equal to PI/4 (0.785398163). FMUL it by 4 and we have PI. Then square the number and we have PI*PI (9.869604401), and then FSUB a PI from it and we have 6.728011748. Now just remove the fractional part and you'll have 6! (Truncate, don't round: rounding to the nearest integer would give 7.)
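Both routes to the 6 can be checked with ordinary double precision math. The Python sketch below is only a sanity check of the numbers quoted above; the function names are invented, and it shows why truncation (not rounding) is the operation you want on the second route.

```python
import math

def two_pi_integer_part():
    """Integer part of 2 * PI."""
    return math.trunc(2 * math.pi)

def arctan_route():
    """PI from 4 * atan(1), then PI*PI - PI, truncated and rounded."""
    pi = 4 * math.atan(1.0)
    value = pi * pi - pi
    return value, math.trunc(value), round(value)

print(two_pi_integer_part())
print(arctan_route())
```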

I tell you, there are hundreds of methods to do that.

Of course, all the above sequences will result in some very easily recognizable signatures. Therefore, these sequences should only be used in a second level decryptor. That means that you have your virus protected by a poly generated decryptor which kills any string scan possibility, but which cannot hold enough auto-generated anti-emulating and anti-debugging routines. After the poly decryptor finishes its work, it should give control to a second decryptor. Here you can insert all the anti-debugging and anti-emulating stuff you like.

Well, now I'm gonna go more deep. Do you remember Taylor's formula? This formula hides a lot of things we can play with. Let's see. If we have a function and we want to compute the value of that function for a particular value, sometimes it's impossible to do it without Taylor's formula. However, I will use it on a not so difficult function and that is EXP(X), or e to the power of x, where e is 2.718281828...

The general formula for the Taylor series is:

$$f(x) = f(a) + \frac{(x-a)^1}{1!} f'(a) + \frac{(x-a)^2}{2!} f''(a) + \dots + \frac{(x-a)^n}{n!} f^{(n)}(a)$$

where a is a chosen constant.

A less difficult approach to this is the MacLaurin series, which is almost the same as Taylor's, with the difference that the constant a is 0. So we have:

$$f(x) = f(0) + \frac{x^1}{1!} f'(0) + \frac{x^2}{2!} f''(0) + \dots + \frac{x^n}{n!} f^{(n)}(0)$$

And we know that EXP(0) = 1 and that EXP(x) is its own derivative, which means that f(0) and all the derivatives f'(0), f''(0), ... are 1 and disappear from the formula. So the formula remains like this:

$$\mathrm{EXP}(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots + \frac{x^n}{n!}$$

The problem is, how deep are we gonna go in the search for the real result of the calculation ? I mean, which should be the value of the n number ? Let's look at this example table for x=3.

| x^n | n | n! | x^n/n! |
|---|---|---|---|
| 1.00 | 0 | 1.00 | 1.00 |
| 3.00 | 1 | 1.00 | 3.00 |
| 9.00 | 2 | 2.00 | 4.50 |
| 27.00 | 3 | 6.00 | 4.50 |
| 81.00 | 4 | 24.00 | 3.38 |
| 243.00 | 5 | 120.00 | 2.03 |
| 729.00 | 6 | 720.00 | 1.01 |
| 2,187.00 | 7 | 5,040.00 | 0.43 |
| 6,561.00 | 8 | 40,320.00 | 0.16 |
| 19,683.00 | 9 | 362,880.00 | 0.05 |
| 59,049.00 | 10 | 3,628,800.00 | 0.02 |
| 177,147.00 | 11 | 39,916,800.00 | 0.00 |
| 531,441.00 | 12 | 479,001,600.00 | 0.00 |
| 1,594,323.00 | 13 | 6,227,020,800.00 | 0.00 |
| TOTAL | | | 20.08 |

As you can see, starting from n=11 we have only 0 (to two decimals) in the last column. This means we can safely stop after the n=10 term. Now, using Taylor's formula we have computed that EXP(3) = 20.08. Now, take a calculator and calculate it. Yes, I know, it's 20.09, or 20.086, depending on the calculator. Anyway, what we are interested in is the integer part. But first, let's look at a way to compute all this:

We need:

- A factorial routine
- A power routine
- A divide function
- An add function

This is a somewhat optimized factorial routine (it doesn't take N=0 into account, and since the result comes back in the 16-bit AX, it is only good up to 7! = 5040):

    ; we enter with CX filled with the N number and we exit with AX filled
    ; with N!
    Factorial proc near
            fild  word ptr [m]        ; load 1
            fild  word ptr [m]        ; three times
            fild  word ptr [m]
    repeat:
            fmul  st(1), st           ; multiply by the base
            fadd  st, st(2)           ; increase the base
            loop  repeat              ; and repeat
            fincstp                   ; move ST(1) to ST(0)
            fistp word ptr [m]        ; store the result
            mov   ax, word ptr [m]    ; and get it into AX
            ret
    m       dd 1
    Factorial endp

For the power routine we'll use the simple method of consecutive multiplication, as we only have 10 steps to go and the power we raise to is an integer number. The procedure will raise AX to the power CX:

    Power Proc Near
            mov   word ptr [m], ax
            fild  word ptr [m]        ; ST(1) = base
            fild  word ptr [m]        ; ST(0) = running product
            dec   cx                  ; CX-1 multiplications are needed
    repeat:
            fmul  st, st(1)           ; product = product * base
            loop  repeat
            fistp word ptr [m]        ; store the result
            mov   ax, word ptr [m]    ; and get it into AX
            ret
    m       dd 1
    Power Endp

Of course, the above procedures do not handle the exceptions (like 0!, or x^0). For the complete program, look at the TAYLOR.ASM file included in this tutorial.
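To see the numbers from the table without writing the assembly first, here is a small Python sketch of the same MacLaurin sum. The function name `exp_maclaurin` and the way each term is built from the previous one are my own choices for illustration; this is not what TAYLOR.ASM does internally:

```python
import math

def exp_maclaurin(x, terms):
    """Sum the first `terms` terms of 1 + x + x^2/2! + ... + x^n/n!."""
    total = 0.0
    term = 1.0                      # the n = 0 term: x^0 / 0! = 1
    for n in range(terms):
        total += term
        term *= x / (n + 1)         # next term: x^(n+1) / (n+1)!
    return total

if __name__ == "__main__":
    approx = exp_maclaurin(3.0, 11)     # terms n = 0 .. 10, as in the table
    print(approx, int(approx), math.exp(3.0))
```

Building each term from the previous one sidesteps the separate factorial and power routines, and the n=0 case needs no special handling.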

And here comes the fun part:

ADD <KEY register>, AX

So, AX holds 20. If the CPU/FPU executed all the instructions as described above, the register that holds the key is increased by 20. If the debugger or code emulator didn't compute one of the instructions correctly, then a random number gets added to the key, killing the decryption process completely (of course, don't forget to set AX to some big random number before running the Taylor procedure).

Here is, however, one of my favorite decryptors I have ever thought of. Its main background is the property of three numbers, known as Pythagorean numbers. These numbers (a, b, c) verify the following formula:

a^2 = b^2 + c^2

Now, all you have to do is find 3 numbers that have this property, like for example a=5, b=4, c=3. In order to do that, you must choose two random numbers (let's call them m and n) and apply the following formulas:

    a = m*m + n*n
    b = 2*m*n
    c = |m*m - n*n|

The main property of the Pythagorean numbers is that if they are used as a triangle's sides, then the angle opposite the a side will always be 90°:

      |\
      | \
    b |  \ a
      |   \
      |____\
        c

Therefore, given the fact that a triangle's angles sum to 180°, we can say that angles B and C summed give 90° (where B is the angle made by a and c, and C is the angle made by a and b).

We also know how to compute these angles, as:

    cos(B) = c / a  ==>  B = arccos(c/a)   (1)
    cos(C) = b / a  ==>  C = arccos(b/a)   (2)
    sin(B) = b / a  ==>  B = arcsin(b/a)   (3)
    sin(C) = c / a  ==>  C = arcsin(c/a)   (4)

and B + C = 90°, which leads us to our main formula:

    sin(B + C) = sin(90°) = 1

(cos(90°) would give 0, so it is the sine we want here), and for sin(B+C) we have the following formula:

    sin(B+C) = sin(B) * cos(C) + cos(B) * sin(C) = 1, so,

using (1), (2), (3) and (4), we have that:

    sin(B+C) = sin(arcsin(b/a))*cos(arccos(b/a)) + cos(arccos(c/a))*sin(arcsin(c/a))
             = (b/a)^2 + (c/a)^2 = 1 (approx.)  ==>

we must round sin(B+C) in order to have exactly 1, because the FPU result may be off by a tiny rounding error.

So, we chose 2 random numbers (which, BTW, don't have to be integers, they may also be floating point numbers) which led us to 3 other numbers that have the Pythagorean property, and using the last formula we are sure we'll obtain a result that equals 1.

As you can see, we are forced to use the ArcSin and ArcCos functions. Unfortunately, the FPU doesn't have these functions. However, it has the FPATAN instruction, which computes the ArcTangent. In order to obtain the arcsin and arccos we can use the following formulas (valid for 0 < x < 1):

    ArcSin(x) = ArcTan(x / sqrt(1 - sqr(x)))
    ArcCos(x) = ArcTan(sqrt(1 - sqr(x)) / x)

Let me take a brief example:

    m = 1
    n = 2
    a = 1*1 + 2*2 = 1 + 4 = 5
    b = 2*1*2 = 4
    c = |1*1 - 2*2| = |1 - 4| = |-3| = 3

    verification: a^2     = 5*5       = 25
                  b^2+c^2 = 4*4 + 3*3 = 16 + 9 = 25

Now let's compute angles (in radians):

    B = arccos(c/a) = arccos(3/5) = 0.927295218
    C = arccos(b/a) = arccos(4/5) = 0.643501109
    B = arcsin(b/a) = arcsin(4/5) = 0.927295218
    C = arcsin(c/a) = arcsin(3/5) = 0.643501109

These have been computed using the ArcTan formula presented above.

    sin(arcsin(4/5)) = sin(0.927295218) = 0.8
    cos(arccos(4/5)) = cos(0.643501109) = 0.8
    cos(arccos(3/5)) = cos(0.927295218) = 0.6
    sin(arcsin(3/5)) = sin(0.643501109) = 0.6

so:

    sin(B+C) = 0.8 * 0.8 + 0.6 * 0.6 = 0.64 + 0.36 = 1 (approx. on the FPU)

    ==> round(sin(B+C)) = 1 (bingo ! ;-))

As I said, the m and n numbers may be floating point numbers which will lead to floating point a, b, c's... Much nicer to handle them.
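Here is a quick numerical check of the triple and of the angle sum, as a hedged Python sketch. It relies on B + C = 90°, so that sin(B+C) = sin(B)cos(C) + cos(B)sin(C) = 1, and it assumes m and n are positive and different (otherwise c = 0 and the ArcCos formula divides by zero). The function names are made up for this illustration; PYT.ASM may do things differently:

```python
import math

def triple(m, n):
    """Build (a, b, c) with a^2 = b^2 + c^2 from two numbers m and n."""
    return m * m + n * n, 2 * m * n, abs(m * m - n * n)

def arcsin_via_atan(x):
    # ArcSin(x) = ArcTan(x / sqrt(1 - x^2)), valid for 0 < x < 1
    return math.atan(x / math.sqrt(1.0 - x * x))

def arccos_via_atan(x):
    # ArcCos(x) = ArcTan(sqrt(1 - x^2) / x), valid for 0 < x < 1
    return math.atan(math.sqrt(1.0 - x * x) / x)

def sin_of_angle_sum(a, b, c):
    """sin(B + C) for the right triangle; should come out very close to 1."""
    big_b = arccos_via_atan(c / a)      # same angle as arcsin(b/a)
    big_c = arccos_via_atan(b / a)      # same angle as arcsin(c/a)
    return (math.sin(big_b) * math.cos(big_c)
            + math.cos(big_b) * math.sin(big_c))

if __name__ == "__main__":
    a, b, c = triple(1, 2)              # gives (5, 4, 3)
    print(a, b, c, round(sin_of_angle_sum(a, b, c)))
```

With floating point m and n the result is still within rounding error of 1, which is why the final rounding step matters.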

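Let's see which FPU instructions we need:

- FMUL
- FSIN
- FCOS
- FPATAN (the arctangent)
- FDIV
- FADD
- FRNDINT (the rounding)
- FMUL ST, ST(0) (the squaring, as there is no dedicated square instruction)
- FSQRT
- FSUB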

I would say rather plenty (not counting the loading and storing instructions...). Taking into consideration the quickness of the FPU, the above formula is completed very quickly. I want to see an emulator emulating it!

What do we do with the 1 we obtained ? We can use it to increase the pointer in the code to be decrypted, we can use it to increase the encryption key, or anything we can think of.

Included alongside this article you have a demonstration of the above calculations in the PYT.ASM file.

Also, both methods are used in the CTAYLOR.ASM and CPYT.ASM files, whose purpose is to demonstrate the use of the two methods presented, which I called FPU.Taylor.Crypt and FPU.Pythagoras.Crypt. Basically, each program displays a text on the screen, then displays it scrambled and then unscrambled again. You can see the speed of the procedures there. I doubt that any code emulator written so far can emulate that code!

Another nice way to use FPU instructions is to create self modifying code. Basically this is done like this:

- make a FPU calculus with a known result
- store the result on the following dword

For example, we know how to obtain the number 00001234h. That's 17185 - 12525 (4660 in decimal), for example.

We'll make this:

            mov   al, 13h
            mov   si, 0
            lea   bx, patch
            add   bx, 4
            finit
            fild  word ptr [b]
            fild  word ptr [a]
            fsub                        ; ST(0) = b - a
            fist  dword ptr [patch]
    patch:
            sub   ax, 14h
            nop
            nop
            js    patch
            ...
    a       dw 12525
    b       dw 17185

The moment the integer number is stored over the 'patch' address, the instruction sub ax, 14h changes to:

    xor al, 12h
    add [bx+si], al

This means that after the XOR, AL will turn to 1.

[bx+si] points to patch+4. The first NOP was already overwritten by the stored dword, and by doing the ADD [bx+si], the NOP at patch+4 (90h) changes into XCHG AX, CX (91h). This instruction will put 1 into CX. Further on, you can use the number 1 in CX in your code. If a code emulator skips the FPU instructions, the whole code goes to hell... This is because the SUB instruction will get executed and AX will stay negative for a reeeeeeally long time, which leads us into a very long loop with a conditional jump. This particular kind of jump kills many code emulators which pretend to return to the place where the condition happened and go on with the code... But what do you do when the code goes infinite?

Ok, everybody sometimes thinks that he discovered something marvelous. He is so happy... until he finds out that someone else discovered the same thing like a few years ago... ;-) It doesn't mean you are an illiterate, but you just didn't read that particular book... Well, this happened to me too. I thought I found out something really neat, but it seems that another guy, a great coder named .... made this up way before I even thought about FPU's. It's called 'moving memory using FPU'. The message about this showed up on my virus mailing-list and I give full credit to its author, but still I will present it here as I think it's a great idea.

So, the basic idea behind this is that we have a load function in the FPU and a store function too. So:

    ; make DS:ESI point to the source code
    ; make ES:EDI point to the destination code
    ; ECX = length of code to be moved
    ; the code length is calculated in 16 bytes chunks
    mov_loop:
            fild  qword ptr [esi]         ; load the first 8 bytes
            fild  qword ptr [esi+8]       ; load the next 8 bytes
            fxch                          ; bring the first qword back on top
            fistp qword ptr es:[edi]
            fistp qword ptr es:[edi+8]
            add   edi, 16
            add   esi, 16
            sub   ecx, 16
            ja    mov_loop                ; loop while bytes remain (JNS would
                                          ; copy one extra chunk when ECX hits 0)

So, this procedure moves memory very quickly and is undetectable for now by any AV or code-emulator. Hope you will use it smartly...

These would be some thoughts and ideas about how you can play with the FPU instructions, but I repeat, there are thousands and thousands of ways to do it. And, as I said, almost no emulator or real debugger can break it. If you can, you should use more than one method just to be sure, because some AV's have already started emulating a couple of these instructions.

I called this method this way exactly because we are about to use a matrix in order to obtain our encryption. Ok, so the usual encryption algorithm just takes bytes or words or dwords or whatever one by one, applies a math operation over them and then stores the result. This is a linear encryption which can be broken very easily by a good programmer. However, if we create a devious encryption method that is hard to understand once coded, we have a good chance. So, let's start. Let's say we have to encrypt a part of a file that looks like this:

a1, a2, a3, ... , an

where each ak is an encryption unit (byte, word, dword, qword, tbyte,..)

We will then take the first 25 units and arrange them in a square matrix like this:

    a11 a12 a13 a14 a15
    a21 a22 a23 a24 a25
    a31 a32 a33 a34 a35
    a41 a42 a43 a44 a45
    a51 a52 a53 a54 a55

Ok, now let's define what 'giving a roll to the matrix' is. Imagine that the above matrix is a piece of paper. A square. And you want to fold it over the first diagonal. You would obtain this result:

    a _________            _________
     |        /|           |        /
     |      /  |  ------>  |      /
     |    /    |           |    /
     |  /      |           |  /
     |/________|           |/
                b
    (we took corner b over corner a)

Now, the same thing is what we shall do with our matrix above. We shall take each value from beneath the first diagonal (the one running from a15 down to a51) and bring it over the opposite value. We'll do this by applying a math formula. First we are gonna apply an 'ADD-ROLL', which means that each element beneath the first diagonal will be added to its pair above the diagonal. Let's see what we get:

    a11+a55  a12+a45  a13+a35  a14+a25  a15
    a21+a54  a22+a44  a23+a34  a24      a25      (FD - ADD-ROLL)
    a31+a53  a32+a43  a33      a34      a35      (First Diagonal Add Roll)
    a41+a52  a42      a43      a44      a45
    a51      a52      a53      a54      a55

So, I think it's clear enough. All elements above the first diagonal had the elements beneath the first diagonal added to them. In the second step we shall apply a SD-SUB-ROLL (second diagonal subtract roll), which means that we are going to take the left-down corner and put it over the right-up corner, and the math operation between the elements will be subtraction. I'm not drawing another matrix because I hope it's clear. Then we are going to apply a H-XOR-ROLL (horizontal xor roll), which means that we are taking all elements beneath the horizontal middle line of the matrix and xor them over their opposite elements above the horizontal line. Finally we apply a V-ADD-ROLL (vertical add roll), which means we add every element from the left side of the vertical center of the matrix to its opposite element on the right side.
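To make the folding less abstract, here is a minimal Python sketch of the first roll only (the FD add roll) together with its inverse. It assumes the units are bytes, so the additions wrap modulo 256, and the function names are my own. The other three rolls follow the same pattern with a different pairing and operation:

```python
def fd_add_roll(m, mod=256):
    """Add each element beneath the a15..a51 diagonal onto its mirror above it."""
    n = len(m)
    out = [row[:] for row in m]
    for i in range(n):
        for j in range(n):
            if i + j < n - 1:                       # strictly above the diagonal
                out[i][j] = (m[i][j] + m[n - 1 - j][n - 1 - i]) % mod
    return out

def fd_add_unroll(s, mod=256):
    """Undo fd_add_roll: the lower elements are unchanged, so subtract them back."""
    n = len(s)
    out = [row[:] for row in s]
    for i in range(n):
        for j in range(n):
            if i + j < n - 1:
                out[i][j] = (s[i][j] - s[n - 1 - j][n - 1 - i]) % mod
    return out

if __name__ == "__main__":
    matrix = [[5 * r + c + 1 for c in range(5)] for r in range(5)]   # 1..25
    rolled = fd_add_roll(matrix)
    print(rolled[0])                    # first row: a11+a55, a12+a45, ...
    print(fd_add_unroll(rolled) == matrix)
```

The roll is reversible precisely because one half of each pair is left untouched, which is also why some elements end up unencrypted after all four rolls.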

After all these are done, we can say that our initial matrix is pretty messed up. Let's call the final scrambled matrix A, and define it like this:

    A11 A12 A13 A14 A15
     :               :
    A51 ........... A55

So, the final formulas after applying the above rollings are:

Encryption formulas (I noted the XOR operation with '|'):

    A11 = (a11+a55)|a51
    A12 = (a12+a45-a21-a54)|a52
    A13 = (a13+a35-a31-a53)|a53
    A14 = (a14+a25-a41-a52)|a54 + (a12+a45-a21-a54)|a52
    A15 = (-a51-a15)|a55 + (a11+a55)|a51
    A21 = (a21+a54)|(a41+a52)
    A22 = (a22+a44)|a42
    A23 = (a23+a34-a32-a43)|a43
    A24 = (-a24-a42)|a44 + (a22+a44)|a42
    A25 = (-a25-a52)|(-a45-a54) + (a21+a54)|(a41+a52)
    A31 = a31+a53
    A32 = a32+a43
    A33 = a33
    A34 = -a34-a43+a32+a43
    A35 = -a35-a53+a31+a53
    A41 = a41+a52
    A42 = a42
    A43 = a43
    A44 = a44+a42
    A45 = -a45-a54+a41+a52
    A51 = a51
    A52 = a52
    A53 = a53
    A54 = a54+a52
    A55 = a55+a51

So, now we have our scrambled matrix. Of course, as you can see, there are still a few elements (A33, A42, A43, A51, A52 and A53) that didn't get encrypted at all. No problem ! So, we have 25 elements. Let's see:

    if the a's are bytes we have  8*25 = 200 bits
    if the a's are words we have 16*25 = 400 bits

Anyway, the total number of bits is divisible by ten. Now here is the thing. You should create a 10 bit long key. Why ? Because this is most unusual. Put the first 8 bits in the register Al, for ex., and the other 2 bits in register Bl, like this:

    aaaaaaaabb000000
    |  al  ||  bl  |

Now, we put our scrambled matrix like this:

A11, A12, ... , A21, A22, ... , A55

And we look at it at the bit level. First apply a XOR over the beginning of the elements. Then shift the key like this:

    000000aaaaaaaabb
    |  al  ||  bl  |

This is easily done by shifting with carry. Then advance the pointer by one byte and apply again. It will be like this:

    bits to scramble:  xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx ...
    key:               aaaaaaaa bb000000 aaaaaaaa bb000000
                                000000aa aaaaaabb 000000aa aaaaaabb ...

I hope you get it. You scramble at bit level with a ten bit key, with an interleaved algorithm, leaving four unscrambled bits every 12 bits. I would say it's a rather peculiar encryption. The decryption is very easy. The only nifty thing is that if someone sees the disassembled code that does the above, he will have to work his butt off for a few days to figure out the encryption.

OK! Now we have our matrix completely encrypted and placed in the encrypted code place. We return to the unencrypted code. We first took 25 units. Now we must take another 25 units and continue with the algorithm. And so we do, until fewer than 25 units remain, at which point we create a matrix with the last elements padded with zero. And there you have the Envelope of the matrix encryption method.

You will ask, well how the hell do I decrypt it ? Thought I'd leave you here? ;-)

So, first, when decrypting you must start again by retrieving 25 units from the encrypted code. Then you undo the 10 bit key encryption and you have the original A11...A55 matrix. Here are the formulas to decrypt the matrix in order to obtain the original a11, a12,..., a55 elements. And no, they are not in a random order. They are in the exact order in which you CAN decrypt them! Here we go:

     a33 = A33
     a42 = A42
     a43 = A43
     a51 = A51
     a52 = A52
     a53 = A53
     a54 = A54 - a52
     a55 = A55 - a51
     a44 = A44 - a42
     a41 = A41 - a52
    -a45 = A45 + a54-a41-a52
     a31 = A31 - a53
    -a35 = A35 + a53-a31-a53
     a32 = A32 - a43
    -a34 = A34 + a43-a32-a43
     a11 = A11|a51 - a55
     a21 = A21|(a41+a52)-a54
     a22 = A22|a42 - a44
     a23 = A23|a43 - a34+a32+a43
    -a24 = (A24 - (a22+a44)|a42)|a44 + a42
     a12 = A12|a52 - a45+a21+a54
     a13 = A13|a53 - a35+a31+a53
    -a15 = (A15 - (a11+a55)|a51)|a55 + a51
    -a25 = (A25 - (a21+a54)|(a41+a52))|(-a45-a54) + a52
     a14 = (A14 - (a12+a45-a21-a54)|a52)|a54 - a25+a41+a52

You must negate a45, a35, a34, a24, a15 and a25 as soon as each is computed, since later formulas use their real values.

That's it! Applying these formulas you have the original state of the matrix, which you put back in its place in the decrypted string, unit after unit.

And as we spoke so much about FPU, I think I don't have to mention that all the above calculations may be done using the faster fpu instructions. I doubt any code emulator will be able to 'violate your mail' by ripping your envelopes ;-)))

I sure hope I'll get the time to write an example on this. Unfortunately, I didn't...

What I am going to tell you now is something I am not quite sure will work on all systems, but if you read it, you may at least try it. The main idea is this: switch to P-Mode in a strange manner, do something and then switch back to real-mode... A code emulator will be dead by then...

Here is the main thing we are interested in:

Some multiplex interrupts. First, the interrupt that tells us if we have DPMI present and if so, what is the address we need in order to switch to it. It goes like this:

    Expects: AX = 1687h
    Returns: AX     0000h = successful
                    else  = no DPMI host present
             BX     flags: bit 0: 0 = 32-bit programs are not supported
                                  1 = 32-bit programs are supported
                           bits 1-15: not used
             CL     processor type: 02H = 80286
                                    03H = 80386
                                    04H = 80486
                                    05H = Pentium
                                    >5  = reserved for future Intel CPUs
             DX     DPMI major + minor version number (e.g., 010aH = 1.10)
             SI     number of 16-byte paragraphs needed for DPMI host private data
             ES:DI  entry address to call to enter Protected Mode

SI on return, this is an amount of real-mode memory, in 16-byte paragraphs, that you must supply when you process the switch (see below). It might be 0000H, indicating no memory needed.

ES:DI on return, this is the Entry Address you must call (via a FAR CALL) in order to switch to protected mode. The calling parameters are:

    Entry:   AX = 0000H = you'll be running as a 16-bit application
                  0001H = you'll be running as a 32-bit application
             ES = the segment of the memory you'll be supplying to the DPMI host.
                  If SI was 0 after INT 2fH 1687H, then ES is ignored.
    Return:  CF set (CY) if switch to protected mode failed
                  (and AX is a DPMI Error Code)
             CS = selector for your code segment (64K limit)
             SS = selector for your stack segment (64K limit)
             DS = selector for your data segment (64K limit)
             ES = selector for your program's PSP (256-byte limit)
             FS = 0 (on 80386+ CPUs)

There's no need to list the DPMI error codes here... Either you can or you cannot enter PMODE. Let's check a little program that should (or at least I hope) go into PMODE and back:

    start:
            mov   ax, 1687h               ; We call the multiplex int
            int   2fh                     ;
            cmp   ax, 0                   ; do we have DPMI ?
            jne   no_dpmi                 ; no...
            mov   switchcall, di          ; if so, save the switch call address
            mov   switchcall+2, es        ; offset and segment
            cmp   si, 0                   ; check if we need memory in 16 byte chunks
            je    no_mem                  ; no...
            mov   bx, si                  ; otherwise allocate memory
            mov   ah, 48h                 ; using DOS
            int   21h                     ;
            jc    error                   ; if this occurs you have no memory...
                                          ; so you might need to shrink mem using 4Ah first...
    no_mem:
            mov   es, ax                  ; put the new segment in ES
            mov   ax, 0                   ; choose 16 bit application
            call  dword ptr [switchcall]  ; far call: switch to PMODE
            jc    cantswitch              ; error ?
            mov   ax, 0400h               ; try to use PMODE interrupt 31h
            int   31h                     ;
                                          ;
            mov   ax, 4c00h               ; switch back to REAL mode
            int   21h                     ;
                                          ;
    error:                                ;
    no_dpmi:                              ;
    cantswitch:                           ;
            mov   ax, 4c00h               ; and quit
            int   21h                     ;
                                          ;
    switchcall dw 0, 0                    ; call switch address

So, as you could see, we have the interrupt 31h we can use. In order to use it, you must have a real grip on what a selector, descriptor, etc. is, so better check a DOS32 documentation. The useful functions are these:

| AX | Function |
|---|---|
| 0000H | allocate LDT descriptors |
| 0001H | free LDT descriptor |
| 0002H | segment to descriptor |
| 0003H | query selector increment value |
| 0006H | query segment base address |
| 0007H | set segment base address |
| 0008H | set segment limit |
| 0009H | set descriptor access rights |
| 000aH | create alias descriptor |
| 000bH | query descriptor |
| 000cH | set descriptor |
| 000dH | allocate specific descriptor |
| 000eH | query multiple descriptors |
| 000fH | set multiple descriptors |
| 0100H | allocate DOS memory block |
| 0101H | free DOS memory block |
| 0102H | resize DOS memory block |
| 0200H | query real-mode interrupt vector |
| 0201H | set real-mode interrupt vector |
| 0202H | query processor exception handler vector |
| 0203H | set processor exception handler vector |
| 0204H | query protected-mode interrupt vector |
| 0205H | set protected-mode interrupt vector |
| 0300H | simulate real-mode interrupt |
| 0301H | call real-mode for FAR RET return |
| 0302H | call real-mode for IRET return |
| 0303H | allocate real-mode callback address |
| 0304H | free real-mode callback address |
| 0305H | query state save/restore addresses |
| 0306H | query raw mode switch address |
| 0400H | query DPMI version |
| 0401H | query DPMI capabilities |
| 0500H | query free memory information |
| 0501H | allocate memory block |
| 0502H | free memory block |
| 0503H | resize memory block |
| 0504H | allocate linear memory block |
| 0506H | query page attributes |
| 0507H | set page attributes |
| 0508H | map device in memory block |
| 0509H | map conventional memory in memory block |
| 050aH | query memory block size and base |
| 050bH | query memory information |
| 0600H | lock linear region |
| 0601H | unlock linear region |
| 0602H | mark real-mode region as pageable |
| 0603H | relock real-mode region |
| 0604H | get page size |
| 0700H | mark page as demand paging candidate |
| 0701H | discard page contents |
| 0800H | physical address mapping |
| 0801H | free physical address mapping |
| 0900H | disable virtual interrupt state |
| 0901H | enable virtual interrupt state |
| 0a00H | query vendor-specific API entry address |
| 0b00H | set debug watchpoint |
| 0b01H | clear debug watchpoint |
| 0b02H | query state of debug watchpoint |
| 0b03H | reset debug watchpoint |
| 0c00H | setup DPMI TSR callback |
| 0c01H | protected-mode terminate and stay resident |
| 0d00H | allocate shared memory |
| 0d01H | free shared memory |
| 0d02H | serialize on shared memory |
| 0d03H | free serialization on shared memory |
| 0e00H | query coprocessor status |
| 0e01H | set coprocessor emulation |

As you can read in the descriptions, there are quite a few interesting things out there. But, as I said, I don't have time to write on this anymore, so just go ahead and try using some of the above functions. I think it would be really neat to allocate DOS memory from protected mode and then return to real mode and use it... although I didn't try it ;-)).

So, as I said, I left this behind me now and I am going towards Win32 programming. In order to do that I spent a lot of time studying, I had to read my butt off and gather utilities and tutorials and tools and everything, so I kinda left this away... So, I guess my next article will be on Win95/98...

Write me anytime with suggestions, ideas or anything at:

From time to time check my page at:

http://members.tripod.com/~lordjulus

If you are interested in virii news and info, you may try to join my virus list by sending a blank e-mail to:

All the Best!

Lord Julus - 1998 (c)

I would like to thank the following: Qark, Quantum, RockSteady, DarkAngel, Hellraiser, MrSandman, Darkman, VirtualDaemon, JackyQwerty, Azrael, B0z0, Neurobasher, NowhereMan, TheUnforgiven, LiquidJesus, a.s.o... Lord Julus
