In this thread I will try to present the most sophisticated methods that malware uses to avoid detection by antivirus companies. I will start with a broad theoretical introduction to polymorphism, metamorphism and the EPO technique; then we will illustrate the theory with practical examples and analyze the results to draw conclusions.
First of all, this is not a tutorial on making malware undetectable; the information is treated objectively, for the awareness and cyber-defense training of readers. Needless to say, the goal is not to advocate infecting anyone with malware.
I should also clarify that the entry will become quite technical as it progresses; it is not a text for a general audience.
With that clarified, we will start with polymorphic malware.
Polymorphism in malware is not new. It has a long history, going back to 1989, when the developer Mark Washburn created the malware known as 1260 or V2PX as a proof of concept to demonstrate the deficiencies of string-based signatures. This PoC would later evolve into a new virus known as V2P2.
But before going into concrete examples, the first thing is to know what it means for a virus to be polymorphic. To clarify this concept I quote a paragraph from the book Viruses Revealed
by David Harley, Robert Slade and Urs Gattiker (a book that I recommend 100%; it is one of the best I have read on this subject):
"Another word that inspires panic in the press is polymorphism, a concept poorly understood by instant experts and generally overestimated in its long-term impact on the malware problem in general. A polymorphic ("many shaped") virus attempts to make detection of its presence more difficult by changing its "shape" from one infection to another. (The mechanisms for achieving such a shape-shifting will be considered later.) This is often mistaken (meaning not only by the press, but by writers of low-grade books on security and / or viruses) as meaning that the virus becomes a different virus or virus variant at each infection. This is not the case. A polymorphic remains the same virus but can not be detected by looking for a characteristic scanning string within the possibly infected file (or other infectable object). The code remains essentially the same, but the expression is different, so that the same program is represented by a different sequence of bytes. This is not meant to indicate, however, that polymorphic viruses are undetectable, although the first examples contributed to the disappearance of a number of early anti-virus products published by vendors who could not handle the problem. It does not mean that the programmer has to think about grep-like scanning of infectable objects using pattern matching with regular expressions."
" In relation to computer viruses a polymorphic code or polymorphism is one that uses a polymorphic motor to mutate itself while keeping its original algorithm intact. This technique is commonly used by computer viruses and worms to hide their presence. "
Well, now that the concept of polymorphism is clear, let's continue with some historical examples.
After the publication of 1260 and later V2P2, other researchers developed tools for building polymorphic malware. One example was the polymorphic engine created by "Dark Avenger", which made it possible to generate polymorphic viruses.
Another example, perhaps one of the best-known pieces of malware because of its exponential propagation and high infection rate, was VBS/LoveLetter, which infected 50 million computers in 2000. Written in VBScript and spread through email, this malware was a work of art, something never seen before. It infected computers at the Pentagon, the CIA, the British Parliament and numerous multinationals, plus millions of private computers. This case is discussed in depth in chapter 14 of the book mentioned above.
Now that the concept of polymorphism is clear and we have seen some examples, we can move on to the next point.
To explain metamorphism I quote a good description by Kaspersky Lab:
"A metamorphic virus is one that can be transformed according to its ability to translate, edit and rewrite its own code. It is considered the most infectious computer virus and, if not quickly detected, can cause serious damage to a system. In addition, it rewrites itself and reprograms itself every time it infects a computer system. Due to its complexity, the creation of metamorphic viruses requires extensive programming knowledge."
The most famous historical example is the Black Baron's SMEG.
SMEG stands for "Simulated Metamorphic Encryption EnGine" (in Spanish, "motor de cifrado metamórfico simulado", since "encriptar" is not recognized by the RAE). Returning to the example, two variants by the Black Baron stand out: SMEG.Pathogen and SMEG.Queeg. Both worked in a very similar way: they were DOS viruses that infected .exe files, generally a maximum of 32 per machine, to make them harder to study. In later versions, these two sub-variants of SMEG only infected files between 17:00 and 18:00 on Mondays, typical behavior of a logic bomb (https://en.wikipedia.org/wiki/Logic_bomb).
Now that we know what metamorphism and polymorphism are, let's look at their differences, because a priori they may look very similar in concept, but they are not.
Well, now I will show some basic code with a very didactic example, for which I will use a scripting language called AutoIt.
For this example I will use something very, very basic: replacing the letters of a string with variables, in this case to show a MsgBox:
As we can see, the example is easy to understand: instead of writing the command MsgBox(0, "Example polymorphism", "sadfuk") directly, we replace the letters of "sadfuk" with variables. This is very easy for anyone to follow, and even more so for an analyst, yet inexplicably some antivirus products fail against this method. Polymorphism is about taking as many detours as possible before finally executing the original code: the more detours we take, the better for the malware and the harder for security solutions.
With metamorphism things get more complicated; metamorphic malware bears a slight resemblance to logic bombs. Basically, what metamorphic malware does is modify its behavior according to the environment it is in. As an example I will use another AutoIt script that detects whether it is being executed inside a sandbox and, depending on the result, executes one thing or another:
Finding an example of metamorphic code is very difficult, since you have to go into specific cases. Another example, explained in words, could be:
If the operating system is Windows 7, do X
If the operating system is Windows 10, do Y
The possibilities for both polymorphism and metamorphism are endless; it is all a matter of imagination and ingenuity.
Let's move on to the next technique, which is not as well known.
The EPO technique, short for "Entry Point Obscuring", has something in common with logic bombs: the malware is activated when a specific condition is met. But it also has a big difference: the condition depends on user interaction, not on the environment.
Imagine a scenario in which we have a binary; we run it, use it, and apparently nothing happens. It may be that we simply have not triggered the function that starts the malware. For example, imagine a payload hardcoded inside a legitimate program that only runs when the user clicks the close button (the X). Hence the name: we do not really know when it is going to run, and when it does, we do not know how or why. Why on closing and not on opening? Why when a certain button is pressed?
Well, I think the clearest way to explain this is with an example.
We start from a scenario where we have created a payload with Metasploit and have the clean shellcode:
The first step is to insert our shellcode into some area inside the legitimate software, wherever we want, without altering its operation. For the example I will use a commercial program whose name I will obviously not give, but it can be done with any.
We open the binary in OllyDbg and look for an area with room for the shellcode
We copy the shellcode in binary mode
Now we have to decide the condition that must be met for the shellcode to run. It will run when the legitimate software is closed, so the reference to look for (Ctrl+N) is "ExitProcess"
Once we have found it, pressing Enter gives us the call references to the function. We just need to redirect them with a JMP to the offset where the shellcode starts so that the payload executes, and put another JMP to the original ExitProcess at the end of the shellcode so that the original software remains fully functional.
As we can see, this technique helps the malware go unnoticed, but not for long: any analyst who opens the file with IDA or any similar tool will see, or should see, that there is a payload embedded in the middle of the code.
A catch of this method is that if the payload itself is detected, the modified software will inherit that detection; but if it is not, it will go unnoticed until it is hunted down.
Well, now that we have seen examples of EPO and metamorphism and the concepts of polymorphism are clear, we will go through practical examples to evaluate the effectiveness of polymorphism.
This is the point I really wanted to reach, but it is worth nothing without having read the previous sections and, more importantly, having understood the concepts presented.
If you have skipped directly here, you will get lost, and besides, this costly entry will stay in your memory as a simple tutorial or guide to polymorphic malware development, which above all I want to avoid.
Having said that, from here on we will show polymorphic malware methodologies in different programming languages, all of them scripting languages, which is where it is best seen and where this malware carries the most weight.
To start with something simple, these languages will be batch, then VBScript, and we will finish with a deep analysis of polymorphic malware in AutoIt.
Batch was already covered in the entry "ANTIVIRUS EVASION IN BATCH FILES", but here we will carry out a statistical analysis, looking at the level of difference, or rate of polymorphism, that we can reach between different files with different methods.
Let's go.
For these examples I will use a tool that I developed a long time ago to identify patterns or tricks used when modding malware to make it undetected. This tool compares two files bit by bit and shows the changes made.
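The tool itself is not shown in this entry, so as a rough idea of the kind of measurement involved, here is a minimal Python sketch. It assumes a plain position-by-position comparison at byte level, where any position that exists in only one of the two files counts as different; the function names are made up for the illustration and the real tool may measure things differently.

```python
def percent_different(a: bytes, b: bytes) -> float:
    """Return the percentage (0-100) of byte positions at which a and b differ.

    Assumption for this sketch: positions beyond the end of the shorter
    buffer count as different.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    # zip() stops at the shorter buffer, so only overlapping positions can match.
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * (longest - same) / longest


def compare_files(path_a: str, path_b: str) -> float:
    """Read two files in binary mode and return their percentage of difference."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return percent_different(fa.read(), fb.read())


if __name__ == "__main__":
    # One byte out of four differs.
    print(percent_different(b"abcd", b"abxd"))  # 25.0
```

A positional comparison like this is very sensitive to insertions: a single extra byte at the start shifts everything after it, which is worth keeping in mind when reading percentages such as the ones discussed in this entry.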
Again I will use the same script as in the previous entry, and the same tool.
The scan of the original script is as follows:
To begin with, I compare this script with one generated by a polymorphic engine that creates a random variable for each lowercase letter and substitutes it in the text. Example:
set kgxsqwm=a
set pqdbren=b
set yjfhlil=c
set hlpxqnf=d
set taeehog=e
set uhlfkzu=f
set rtbjcpp=g
set mxpirhb=h
set dlveneg=i
set apdxstg=j
set sgmgxoj=k
set amepvto=l
set vzzyxlv=m
set paklefi=n
set cwkyvft=o
set ptcjoef=p
set vqspkla=q
set lneutiz=r
set mdtejui=s
set tnthaax=t
set xzusnjj=u
set qdnymdh=v
set fnpgabm=w
set zztsauu=x
set hllstzi=y
set wmfquxs=z
Therefore the first line, which is @echo off, would look like this:
@%taeehog%%yjfhlil%%mxpirhb%%cwkyvft% %cwkyvft%%uhlfkzu%%uhlfkzu%
As we can see, it is a very basic polymorphism method. Obviously each variable name is generated differently every time; that is what makes it a polymorphic engine rather than simply obfuscated code.
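Seen from the analyst's side, this substitution is mechanical to undo, whatever names the engine picks. The following Python sketch is only an illustration of that point, not a tool used in this entry: it assumes declarations of the form set name=value with no spaces around the equals sign, and references of the form %name%; the function name and regular expressions are my own.

```python
import re

# "set name=value" declarations (batch variable names are case-insensitive).
SET_LINE = re.compile(r"^\s*set\s+([A-Za-z0-9_]+)=(.*)$", re.IGNORECASE)
# "%name%" references.
VAR_REF = re.compile(r"%([A-Za-z0-9_]+)%")


def deobfuscate(script: str) -> str:
    """Collect the set declarations and expand every %name% reference."""
    variables = {}
    output = []
    for line in script.splitlines():
        match = SET_LINE.match(line)
        if match:
            variables[match.group(1).lower()] = match.group(2)
            continue  # declarations are dropped from the recovered script
        # Unknown references are left untouched.
        expanded = VAR_REF.sub(
            lambda m: variables.get(m.group(1).lower(), m.group(0)), line
        )
        output.append(expanded)
    return "\n".join(output)


if __name__ == "__main__":
    sample = "\n".join([
        "set taeehog=e",
        "set yjfhlil=c",
        "set mxpirhb=h",
        "set cwkyvft=o",
        "set uhlfkzu=f",
        "@%taeehog%%yjfhlil%%mxpirhb%%cwkyvft% %cwkyvft%%uhlfkzu%%uhlfkzu%",
    ])
    print(deobfuscate(sample))  # @echo off
```

This handles a single pass only. A sample that has gone through the engine several times would need the expansion repeated, and declarations that are themselves obfuscated would not match the simple pattern used here.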
Well, with this method we managed to lower the detection ratio to 1
I think that with this small example you can already sense some of the power a polymorphic engine can have, and how dangerous it is if it falls into the wrong hands.
Being a polymorphic engine, every time we generate a new sample it has a certain percentage of difference from the original script and from the other samples. And here is the important thing, where the danger of polymorphic malware lies. Let's look at it with data.
We will first compare the original sample with the obfuscated one to see how different they are. The red parts represent the zones that differ, and the blue parts the zones that coincide
In this case the result is the following:
The sample created by the polymorphic engine from the original sample is 100% different. Since this is a very basic obfuscation method, it would not be a problem for an analyst, but it is a very good figure; we will look at it in more detail later.
Now we are going to compare two samples generated with the polymorphic engine
The result indicates that the two samples generated by the polymorphic engine are 98% different. But what does this imply? From the point of view of a malware analyst, it is a headache: there is only the 2% in common in which to place a signature that detects all the samples generated by this polymorphic engine. But of course, engines do not have just one parameter; in fact, we can obfuscate the already obfuscated sample again with exactly the same procedure, that is, iterate twice.
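To make the analyst's problem concrete: the small part that two samples share is the only place a common signature can sit. A rough Python sketch of locating the longest byte run shared by two samples, using the standard difflib module, could look like this. The function name and the sample bytes are invented for the illustration, and SequenceMatcher is slow on large inputs, so this is only practical for small scripts.

```python
from difflib import SequenceMatcher


def longest_common_run(a: bytes, b: bytes) -> bytes:
    """Return the longest contiguous run of bytes present in both a and b."""
    # autojunk=False disables difflib's heuristic that ignores very frequent
    # elements, which would otherwise distort results on longer inputs.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


if __name__ == "__main__":
    # Made-up sample contents sharing one fixed fragment.
    print(longest_common_run(b"xxHELLOyy", b"zzzHELLOw"))  # b'HELLO'
```

A run found this way is only a candidate. It still has to appear in every sample the engine produces and be rare enough in clean files not to cause false positives.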
Now, if we compare the new sample, configured with the same option but iterated twice, with the sample iterated only once, we get a 100% difference. Here the analyst is left with nothing to work with and can only resort to manually de-obfuscating the code to try to get back to the original.
When iterating twice, this is what the variable declarations from the 1st iteration look like
De-obfuscating that would not be difficult, just laborious. But we have said that there are more parameters, so we will try another one, combined with the first option iterated twice.
We get this: https://pastebin.com/MwwkL85k Obviously no analyst is going to wade into that and, although it may not look like it, it works.
Comparing the original file with this last sample again gives us 100% polymorphism
And against the sample from option 1 iterated twice, more of the same
As we can see, with a polymorphic engine we can start from a highly detected malware sample and generate infinite samples, all different and functional, which is quite dangerous and very complicated to detect.
But just as I say that, I also say that it is not so easy. For example, with the last method, which changes the character encoding, if we pass the same sample through the polymorphic engine twice, the two new samples are almost entirely the same
This happens because they are .bat batch files and are treated as text, but in the end we will get to AutoIt binaries, and things will get more complicated.
Now that we have seen the easy example, which many people could program without spending too much time, and we have (hopefully) taken in what a polymorphic engine implies, we will move on to VBScript, which is widely used in malware and in botnets such as Safeloader or Gorynych (DiamondFox)
As I said before, VBS malware is in fashion. The reason? It is executed through wscript and therefore has immense advantages for staying undetected.
For the tests of this polymorphic engine we will use a Houdini worm payload, a popular piece of malware: loader type, lightweight, easy to make undetectable and stable.
As we can see, it is more than well signed by the antivirus products
- The AVs that do not detect it do not know what world they live in; anyway
With the polymorphic engine we will handle now we get into more serious matters, with more than 30 possible configurations. Let's see what it offers us
Let's first test an option that places a delimiter of our choice between each character and makes a series of modifications stored in random variables, so that at run time the code removes this character and reassembles the whole code. We scan:
Here the antivirus products are basically useless, except Kaspersky, but not for long.
Let's go to the central part of this and see how different this sample is from the original
Well, there is the result: totally different. With only one option in the polymorphic engine, not only do we get most antivirus products not to even smell it; in this case it is VERY SERIOUS because, being VBS, heuristic procedures will not detect it, given the privileges it has by running as a Microsoft component.
Let's generate another sample, this time changing the delimiter.
Being just another sample, it may seem the same as before, but the potential that VBS malware has is something batch files do not have, and the fact that heuristic and proactive defenses do not control wscript makes it very dangerous
Now we are going to take it a step further: we configure it so that it chops the code into X parts and stores each character in a slot of an array, then does all of that at run time through a recursive function. As if that were not enough, we encrypt the result of the first case with Rot-N (the same as the Caesar cipher)
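On the Rot-N layer specifically, it is worth noting that it changes the bytes but adds no real secrecy, because there are only 26 possible shifts to try. A small Python sketch of that idea follows. It assumes the rotation applies only to ASCII letters, which is a guess, since the engine's exact alphabet is not shown here; the function names are illustrative.

```python
import string


def rot_n(text: str, n: int) -> str:
    """Rotate ASCII letters by n positions, wrapping around the alphabet.

    Digits, punctuation and whitespace are left unchanged (an assumption
    of this sketch).
    """
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    shift = n % 26
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    return text.translate(table)


def all_shifts(ciphertext: str) -> list:
    """Return the 26 candidate plaintexts, one per possible shift."""
    return [rot_n(ciphertext, -n) for n in range(26)]


if __name__ == "__main__":
    encoded = rot_n("MsgBox", 7)
    # The original text is always among the 26 candidates.
    print("MsgBox" in all_shifts(encoded))  # True
```

In practice an analyst would look through the 26 candidates for recognizable keywords of the scripting language, so this layer costs the defender very little once it has been identified.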
Again we get a 100% variation in the sample
It may seem that I am always copying the same image, but no; that is simply how polymorphism is, which is why it is so effective.
Now we do not obfuscate anything; we simply add garbage
As we can see, the detections change. But we have only tried 3 combinations, and this engine has more than 30 possible. That may seem like a lot, but the most powerful engine is still to come.
And the thing is, if there is a preferred language for developing malware, it is AutoIt, for its ease, its undetectability, its macros, its versatility, its compatibility across machines...
And this is where I am going to go into more depth. In analyzing AutoIt and how it works, we will use a polymorphic engine with around 50 options, which offers a power I had never seen before. What you are going to see next is an authentic work of art, and I do not say so because I participated in its development but objectively: for the moment, for me and for many people, it is the perfect polymorphic engine. The project is in principle an obfuscator, since AutoIt is a language that is as easy to decompile as it is to compile, and no one likes having their code stolen; but in this entry we will see how it can be applied as a polymorphic engine with very good results.
To clarify, although AutoIt's terms do not allow the development of malware, it is widely used for it, and many antivirus products sign binaries simply for being AutoIt, for example K7; but we will see that later.
AutoIt is programmed in .au3 files, which can later be converted to .exe but also to .a3x, which would be the part without the interpreter, and those .a3x files are the ones we are going to analyze.
I want to emphasize that the project was developed as a tool to protect code, and that none of those involved in the project are responsible for the use that is made of it.
For the tests we will use a script with some detections
Well, some detections, not many, but what matters here is the polymorphic engine, in this case an obfuscator used as a polymorphic engine.
Since there are many options and it is impossible to test all the combinations, we will try the ones that come by default
And see what results they give us
We have 6 samples:
Original in .a3x and .exe
Obfuscated1 and Obfuscated2 in .a3x and .exe
Both obfuscated with the same parameters
We see that, despite being obfuscated with the same configuration, the results change. Let's see how much the files change
Original - Obfuscated1
More than 99% difference, a very good result. The part in which they coincide is the interpreter, as we will see later
Original - Obfuscated2
Obfuscated1 - Obfuscated2
On this occasion the same result, but it seems that the generated blocks were bigger: the size increases and the location of the interpreter moves
Okay, a quick aside before continuing. The difference between .a3x and .exe is that when compiling to .exe, AutoIt includes all the includes and all the functions, even the unused ones, which creates much heavier executables. That gets in our way when trying to analyze the polymorphism between two files: too much code we do not want, garbage. That is why we have to compare the files with the .a3x extension
Original - Obfuscated1
Here, without all that identical garbage in both files, we get the result that counts: a totally different file
Original - Obfuscated2
A perfect result as well, as expected
Now comes the interesting part, where the power of this engine is demonstrated. Same parameters:
Completely different samples
The goal of this post was to show that the most advanced evasion techniques used by malware to date are nothing new, but have been perfected to such an extent that antivirus products are useless. Imagine a ransomware: pass it through a polymorphic engine, generate 2000 samples, all different, all undetectable, and distribute them. Who stops that? No one; it would simply be impossible. It is clear that antivirus products are not up to the task. Some are investing in heuristic detection, like NOD32 with HIPS, but it is not enough; they are far behind.
I often ask myself the same question: if I can do this, what will cybercriminals not do, what will a person with formal programming studies not be able to do?
And after more than 5 hours writing this entry, all that is left is to say goodbye until the next one, but not before saying that if anyone wants or is interested in the tools used, I will consider it. Some, like the last one, are open source, and anyone is welcome to join the project, always remembering that it is a proof of concept against reverse engineering and for protecting intellectual property. The project will in principle be maintained until September 25, when the vacation ends :_(
I posted all this last year on tomecados, but I think it is in the past now, haha