Nowadays malware authors use a lot of techniques to hide malicious payloads in order to bypass security products and to make malware analyst life harder and fun. There are many tools that you can use to extract content from malware and there is not a standard process, you can use different tools, different techniques and different approaches to solve the same problem.
During this post I am going to quickly describe three (well, actually kind of four) of the main flows that takes me in succeed to unpack malware. But let me repeat that there are many ways to perform such a topic, I simply want to share some personal notes on my favorite flows, without pretending to write a full course material on how to Unpack Malware, which it worth of a full university class.
NB: there is a lot to say about packers, how they are, how they behave, there is much to say even on how many packers family are known, but this is not the place for that. What I am doing here is to mostly focusing on quick shot-cuts useful when you are on rush but not such powerful as debugging the entire process.
Method 0: Just Unpack It, I don’t care more
Well, if you are on rush and you just need to try to unpack a sample as quickest as possible, if you don’t care about what is going on, well Sergei Frankoff (@herrcore) and Sean Wilson (@seanmw) did a great job in releasing Unpac.ME. A web application that tries to unpack your sample, there is a limited free plan for using it, it works most of the times especially with known malware families
Method 1: The quick way
One of the quickest way to simply unpack malware is to try to figure out what packer has been used to pack your sample. Once you have the used packer you just need to run the relative un-packer and that’s it, you have done. Detect it Easy or bettern known as DiE would help you in performing such research. It has a wide signature database tracking hundreds different packers. The following image shows DiE spotting a simple (and very didactic, not really real) UPX packer.
Once you know it has been packed through UPX 3.91, just go and grab the used packer (in such case go to https://upx.github.io/) take the relative unpacker and run it against your original sample, you would see a new PE file.
Method 2: The slow but fun way to do !
This is my favorite method since it’s definitely faster than using debug and performing every step by yourself but quite powerful as well getting you the control of many actions happening into memory. Before going into this method you need to know the following main assumptions.
- The packer would performs some operations on bytes (read from external file or from the same file or taken from the network) then it will aggregate such a bytes and later on it will pass execution flow (EIP) to those bytes. We call those bytes the “payload“.
- Injecting control flows is the main strategy used by packers.
- Intercepting the injection flow will abstract us from the used packer
It is now interesting to understand how injection happens on Windows machine. Once we nailed it, we would agree that a quick way to unpack malware is just to grab content from the allocated and injected memory before the main sample (or stub) will make a change of control by passing EIP and Stack to new code.
Main Injection techniques to look for
Fortunately there are not thousands of different possibilities to inject shellcode into memory, so let take a closer look to the main ones. The most used is named process injection.
The process injection schema follows these main steps:
- OpenProcess – The OpenProcess function returns a handle of an existing process object.
- VirtualAllocEX – The VirtualAllocEx function is used to allocate the memory and grant the access permissions to the memory address.
- WriteProcessMemory – The WriteProcessMemory function writes data to an area of memory in a specified process.
- CreateRemoteThread – The CreateRemoteThread function creates a thread that runs in the virtual address space of another process.
Another very used technique is the DLL Injection which follows these steps:
- OpenProcess to Obtain the handle of the target process in which we intend to inject our DLL.
- Find the address of the LoadLibraryA function using GetProcAddress & GetModuleHandleA functions. LoadLibraryA function is used for loading the DLL into the calling process.
- VirtualAllocEX to allocate the memory space for the DLL path from where we will be loading the DLL.
- WriteProcessMemory for writing the DLL path into the allocated memory space.
- CreateRemoteThread for creating a new thread and passed the address of LoadLibraryA as the start address and the address of the DLL file as the parameter for LoadLibraryA function.
Process Hollowing is a nice and very used trick to evade endpoint security and to inject control floes. The main idea is to build a suspended process within un-mapped memory. Then replace the un-mapped memory section with the shellcode and later on map and start the process. The steps follows:
- Create a new target process in suspended state. This can be achieve by passing Create_Suspended value in dwCreationFlags parameter of CreateProcess Windows API.
- Once the process is created in suspended state we will create a new executable section. It wont be bind to any process. This can be done by using ZwCreateSection function.
- We need to locate the base address of the target process. This can be done by querying the target process using ZwQueryInformationProcess function. We can find the address of the process environment block (PEB) and then use ReadProcessMemory function to read the PEB. Once the PEB is read ReadProcessMemory function is used once again to locate the entry point from the buffer.
- We need to bind the section to the target process in order to copy the shellcode in it. To achieve this we need to map the section into current process. This can be done by using ZwMapViewOfSection function and passing handle of the current process by using GetCurrentProcess function.
- Now we will copy each byte of the shellcode into the mapped section which is created in Step 2.
- Once the shellcode is copied we can proceed to map the section into the target process. This can be done by using ZwMapViewOfSection function and passing handle of the target process.
- Once the section is mapped we will locate and construct the patch for the target process so that it can our malicious shellcode instead of the original application code.
- Once the patch is constructed we will use WriteProcessMemory to write the constructed patch into the target process entry point.
- After writing the constructed patch to the target process entry point we need to resume the thread. This can be achieve by using ResumeThread function.
Abusing the Asynchronous Procedure Call (APC) is another way to inject shellcode into processes. The way to exploit this Microsoft functionality follows theses teps:
- Create a new target process in suspended state. This can be achieve by passing Create_Suspended value in dwCreationFlags parameter of CreateProcess Windows API.
- Once the process is created obtain the handle of the target process using OpenProcess Windows API.
- Allocate the memory space for our shellcode in the target process using VirtualAllocEX Windows API.
- Write the shellcode in the allocated memory space using WriteProcessMemory Windows API.
- Obtain the handle of the primary thread from the target process using OpenThread Windows API.
- After obtaining the handle of the thread from the target process we will add a user-mode asynchronous procedure call (APC) object to the APC queue of the specified thread using QueueUserAPC Windows API which will point to the memory address of our shellcode.
- To trigger our shellcode we will resume the suspended thread using ResumeThread Windows API.
The last method that I’am going to describe in my personal notes (but there are many more out there) is called: Process Doppelgänging. Quite a recent technique it uses a very little known API for NTFS transactions.
Briefly speaking, we can create a file inside a transaction, and for no other process this file is visible, as long as our transaction is not committed. It can be used to drop and run malicious payloads in an unnoticed way. If we roll back the transaction in an appropriate moment, the operating system behaves like our file was never created.hasherezade
The process Doppelgänging is a similar technique used to inject control and to evade common AV. It follows these steps:
- Create a new transaction, using the API CreateTransaction.
- Create a dummy file to store our payload (CreateFileTransacted).
- It is used to create a section (a buffer in a special format), which makes a base for our new process.
- Now it’s time to close it and roll back the transaction (RollbackTransaction).
All these methods are useful to inject payload into memory and to run them keeping a very low rate of detection. Our goal is to intercepts those techniques and to dump the just injected paylaod.
Intercepts these techniques and drop the payload
Now we know the main techniques used by malware to unpack themselves into memory, so we are ready to understand how to hook such functions in order to grab the payload (holding the real behavior). Again there are many techniques to perform that memory extractions, I did change at least 4 workflows until now, but the one I prefer so far is using PE-sieve (download from HERE) to extract injected objects. PE-Sieve is not able to judge the dropped file (are they malicious or not?), so you cannot consider every extracted artifact as a malicious one, you rather need to manually analyze them and express your own assumptions on them.
But let’s start with a practical example. The following image represents a PE file pretending to be a PNG image.
Looking for sections and import table (IAT) we might observe the samples imports only some of the well-known functions we ‘ve just seen in the previous section (VirtualProtect, GetProcAddress, MoveMemory, etc..) and very often used to unpack malware in memory without touching hard-drive.
Even the embedded resources are quite “heavy” which would probably hide some piece of code (??). So … we have a PE file which pretends to be an image, it only imports suspicious functions and it has got a quite heavy resource. Would it be a Malware ?
Well we do have ideas and suspects but let’s see if it injects pieces of code into the memory and let’s see what they do. Here PE-sieve comes to help us. First of all you need to sacrifice a system :D. Yep, really… you need to run on your target the sample and on the other side you need to run pe-sieve by giving the PID of the suspicious sample. PE-sieve will hook and monitor the previous injection patterns and as soon as it find the right pattern it will drop whatsoever (good files, malicious implant, etc etc) the sample injects. The following image shows the found implants running that sample.
The dropped files are placed into a directory named with the monitored PID.
We get some files into that directory. We do have .json report in order to automate results and to wrap them into external projects without using the provided PE-sieve.dll. We have a couple of shellcode (.shc) and three PE. Interesting the
400000.cursor.exe since has 600KB of code and it is executable, and a new ICO different from the original one. Let’s check it’s own property (following image)
Now, let’s roll back our scarified VM and run this new file on it. Now let’s check its memory to see if something more is happening there.
It looks like we have clear text, no additional encryption/packing stage as shown in memory. We now can follow with classic malware analyses techniques by staging static and dynamic analysis. And, yes, since you are re-scarify your virtual machine, let maximize your effort to grab network traffic and see where it tries to communicate with.
We are facing a nice example of TrickBot version: 1000512 tag: tot793 . The following image shows the same information but coming from the internal systemcall rather then network traces.
So we nailed it. We’ve just extracted the real payload and later on we figured out it was a TrickBot.
Method 3: The old fashion way (debugger)
Everything can be done from the debugger. You can find the above API patterns by yourself and then follow the System calls and stop and copy whenever you want. you can extract or modify the sample behavior on fly and decide to re-run it as many times you need. Yes, you can, but this would take you a lot of time. Time runs against the economy. More time you need to perform your anlaysis more expensive you are, more expensive you are less customers you could have in both ways: money-wise (expensive = for few ~ cheap = for many) and time-wise (sine you have 24h a day, after that hours you cannot accept more customers). So you would need to mediate between quality/fun and time.
If you are following me since time you would probably remember that I was used to this method years ago, before such a great tools were realized (just few examples: IDA Pro Universal Unpacker or All In Memory CryptoWorm or New way to detect Packers etc..) but today I would not suggest you this method unless you are a student or not a professional Malware analyst.
Today I am so happy to announce a big improvement in the threats observatory (available for here). The main improvement sees the introduction of clustering stereotypes for each tracked malware family in three different behaviors: Domains, Files and Processes.
Every malware does specific actions on domains, files and processes realms by meaning that every sample contacts several domain names, spawns specific processes and eventually saves file on HD (file-less malware are a separate topic here). Collecting everything coming from their execution and clustering on strings similitude would highlight several stereotypes that would be interesting for further studies or similitude blocking lists. The following image shows the current deployment state.
What you find
According to shared information, the Cyber Threats Observatory Dashboard is composed by the following sections:
- Malware Families Trends. Detection distribution over time. In other words what are time-frames in where specific families are most active respect to others.
- Malware Families. Automatic Yara rules classify samples into families. Many samples were not classified in terms of families, this happens when no signatures match the samples or if multiple family signatures match the same sample. In both ways I am not sure where the sample belong with, so it would be classified as “unknown” and not visualized on this graph. Missing slice of the cake is attributed to “unknown”.
- Distribution Types. Based on the magic file bytes this graph would track the percentages of file types that Malware used as carrier.
- Threat Level Distribution. From 0 to 3 is getting more and more dangerous. It would be interesting to understand the threat level of unknown families as well, in order to understand if hidden in unknown families Malware or false positives would hide. For such a reason a dedicated graph named Unknown Families Threat Level Distribution has created.
- Stereotypes. Studying stereotypes would be useful to analyze similarities in clusters. In other words, it could be nice to see what are the patterns used by malware in both: domain names, file names and process names. It would be important for detection and even for preemptive blocking. Due to a vast amount of data, only the last (in term of recent) 10000 entries are included.
- TOP domains, TOP processes and TOP File Names. With a sliding window of 300 last analyzed samples, the backend extracts the TOP (in terms of frequency) contacted domains, spawned processes and utilized file names. Again, there is no filter and no post-processing analysis in that fields, by meaning you could probably find as TOP domain “google.com” or “microsoft update”, which is fine, since if the sample queried them before performing its malicious intent, well, it is simply recorded and took to your attention. Same cup of tea with processes and file names.Indeed those fields are include the term “involved” into their title, if something is involved it does not mean that it is malicious , but that it is accounted to be in a malicious chain.
A simple example
Let’s assume we want to investigate LokiBot. According with any.run: Lokibot, also known as Loki-bot or Loki bot, is an information stealer malware that collects data from most widely used web browsers, FTP, email clients and over a hundred software tools installed on the infected machine.
But let’s start digging a little bit on the Cyber Threats Dashboard and see what we can find. First of all from the Malware Families section we see the overall detection rate. Today, we might easily say that LokiBit has low rate detection percentage 0.32388 if compared to different families such as GrandCrab, Emotet or TrickBot.
From the Family Distribution Over Time section (the following image) we might appreciate the detection distribution rate. By deselecting the unwanted malware families it is possible to track the distribution of the desire one (on our case LokiBot) over the time. In the following case all families but not LokiBot have been disable (by clicking on the Malware name directly from the graph legend). We might appreciate a compelling increment of LokiBot detection on 2020-04-28 and from 2020-04-30 to 2020-05-02. It looks like to be the most active observed period for this well documented family during the 2020. This observation perfectly fits the public mainstream information which sees many security magazines and many vendors observing such an increment as well. Mostly spread over COVID#19 malspam for example: SecurityAffairs, BankInfoSecurity, ThreatPOST, FortiNet.
Digging a little bit into the specific case, we might observe the domain stereotypes. It’s nice to see that many domains stereotypes (in other words the representatives of a wide set of similar domains) have as the Top Level Domain
.cf (Central Africa Republic) and some of them are quire similar: broken1.cf, broken2.cf, and so on and so forth. Something not very original to be blocked such as:
Following on the diagram we might observe one more domain stereotype having as TLD
.ICU, in the particular
frenchman.icu (generic TLD targeting entrepreneurs and business owners) and following on this path one more domain stereotype having
.co.ke (referring to Kenya). Now let’s try to focus a little bit on “Files” and check if there are some patterns in “File section”. So let’s check the following diagram.
The linearity of the composition (every stereotype gets the same score, in that case 3) looks like the malware equally uses the different group of files, by meaning that if it starts on a victim machine it reads/creates/writes every single file at least one time per run. We might appreciate a nice pattern in the temporary file names, but it wont help us in detection since default windows temporary file pattern. However we might associate the presence of such a temporary files to the direct usage of
mrsys.exe and even
explorer.exe. Even if many false positive could be triggered it would be nice to give it a try and see where it takes !
Most interested would be the presence of a specific file
([a..z][0.9]).lck that would be a nice keypoint to check its presence (by using files detection)
In this post I’ve introduced a big improvement of the Cyber Threat Observatory showing up a quick and dirty analysis on LokiBot through stereotypes. Aim of this project is not to give detailed analyses on Malware but rather focusing on general patterns and macro stereotypes in order to perform massive data analysis.
Hope you might find it useful, if so please share it with your fellows.
Trends are interesting since they could tell you where things are going.
I do believe in studying history and behaviors in order to figure out where things are going on, so that every Year my colleagues from Yoroi and I spend several weeks to study and to write what we observed during the past months writing the Yoroi Cybersecurity Annual Report (freely downloadable from here: Yoroi Cybersecurity Report 2019).
The Rise of Targeted Ransomware
2019 was a breakthrough year in the cyber security of the European productive sector. The peculiarity of this year is not strictly related to the number of hacking attempts or in the malware code spread all over the Internet to compromise Companies assets and data but in the evolution and the consolidation of a new, highly dangerous kind of cyber attack. In 2019, we noticed a deep change in a consistent part of the global threat landscape, typically populated by States Sponsored actors, Cyber-Criminals and Hack-tivists, each one having some kind of attributes, both in motivations, objectives, methods and sophistications.
During the 2019 we observed a rapid evolution of Cyber Crime ecosystems hosting a wide range of financially motivated actors. We observed an increased volume of money-driven attacks compared to previous years. But actors are also involved in cyber-espionage, CEO frauds, credential stealing operations, PII (Personally Identifiable Information) and IP (Intellectual Property) theft, but traditionally much more active in the so called “opportunistic” cyber attacks. Attacks opportunistically directed to all the internet population, such as botnets and crypto-miners infection waves, but also involved in regional operations, for instance designed to target European countries like Italy or Germany as branches of major global-scale operations, as we tracked since 2018 with the sLoad case and even earlier with the Ursnif malware propagations waves.
In 2019 like what happened in 2018, Ransomware attacks played a significant role in the cyber arena. In previous years the whole InfoSec community observed the fast increase in o the Ransomware phenomenon, both in term of newborn ransomware families and also in the ransom payment options, driven by the consolidation of the digital cryptocurrencies market that made the traditional tracking techniques – operated by law enforcement agencies – l less effective due to new untrackable crypto currencies. But these increasing volumes weren’t the most worrying aspect we noticed.
Before 2019, most ransomware attacks were conducted in an automated, mostly opportunistic fashion: for instance through drive by download attacks and exploit kits, but also very frequently using the email vector. In fact, the “canonical” ransomware attacks before 2019 were characterized by an incoming email luring the victim to open up an attachment, most of the times an Office Document, carefully obfuscated to avoid detection and weaponized to launch some ransomware malware able to autonomously encrypt local user files and shared documents.
During 2019, we monitored a deep change in this trend. Ransomware attacks became more and more sophisticated. Gradually, even major cyber-criminal botnet operators, moved into this emerging sector leveraging their infection capabilities, their long term hacking experience and their bots to monetize their actions using new malicious business models. Indeed, almost every major malware family populating the cyber criminal landscape was involved in the delivery of follow up ransomware within infected hosts. A typical example is the Gandcrab ransomware installation operated by Ursnif implants during most of 2019. But some criminal groups have gone further. They set the threat level to a new baseline.
Many major cyber criminal groups developed a sort of malicious “RedTeam” units, lest call them “DarkTeams”. These units are able to manually engage high value targets such as private companies or any kind of structured organization, gaining access to their core and owning the whole infrastructure at once, typically installing ransomware tools all across the network just after ensuring the deletion of the backup copies. Many times they are also using industry specific knowledge to tamper with management networks and hypervisors to reach an impressive level of potential damage.
Actually, this kind of behaviour is not new to us. Such methods of operations have been used for a long time, but not by such a large number of actors and not with such kind of objectives. Network penetration was in fact a peculiarity of state sponsored groups and specialized cyber criminal gangs, often threatening the banking and retail sectors, typically referenced as Advanced Persistent Threats and traditionally targeting very large enterprises and organizations.
During 2019, we observed a strong game change in the ransomware attacks panorama.
The special “DarkTeams” replicated advanced intrusion techniques from APT playbooks carrying them into private business sectors which were not traditionally prepared to deal with such kinds of threats. Then, they started to hit organizations with high impact business attacks modeled to be very effective for the victim context. We are facing the evolution of ransomware by introducing Targeted Ransomware Attacks.
We observed and tracked many gangs consolidating the new Targeted Ransomware Attacks model. Many of them have also been cited by mainstream media and press due to the heavy impact on the business operation of prestigious companies, such as the LockerGoga and Ryuk ransomware attacks, but they only were the tip of the iceberg. Many other criminal groups have consolidated this kind of operations such as DoppelPaymer, Nemty, REvil/Sodinokibi and Maze, definitely some of the top targeted ransomware players populating the threat landscape in the last half of 2019.
In the past few months we also observed the emergence of a really worrisome practice by some of these players: the public shame of their victims. Maze was one of the first actors pionering this practice in 2019: the group started to disclose the name of the private companies they hacked into along with pieces of internal data stolen during the network intrusions.
The problem rises when the stolen data includes Intellectual Property and Personal Identifiable Information. In such a case the attacker leaves the victim organization with an additional, infaust position during the cyber-crisis: handling of the data breach and the fines disposed by the Data Protection Authorities. During 2020 we expect these kinds of practices will be more and more common into the criminal criminal ecosystems. Thus, adopting a proactive approach to the Cyber Security Strategy leveraging services like Yoroi’s Cyber Security Defence Center could be crucial to equip the Company with proper technology to acquire visibility on targeted ransomware attacks, knowledge, skills and processes to spot and handle these kind of new class of threats.
Well Known threats are always easier to be recognized and managed since components and intents are very often clear. For example a Ransomware, as known today, performs some standard operations such as (but not limited to): reading file, encrypting file and writing back that file. An early discovery of known threat families would help analysts to perform quick and precise analyses, while unknown threats are always difficult to manage since analysts would need to discover firstly the intentions and then bring back behaviour to standard operations. This is why we track Zero-Day Malware. Yoroi’s technology captures and collects samples before processing them on Yoroi’s shared threat intelligence platform trying to attribute them to known threats.
As part of the automatic analysis pipeline, Yoroi’s technology reports if the malicious files are potentially detected by Anti-Virus technologies during the detection time. This specific analogy is mainly done to figure-out if the incoming threat would be able to bypass perimetral and endpoint defences. As a positive side effect we collect data on detected threats related to their notoriety. In other words we are able to see if a Malware belonging to a
threat actor or related to specific operation (or incident) is detected by AV, Firewall, Next Generation X and used endpoints.
In this context, we shall define what we mean for Zero-Day Malware. We call Zero-Day malware every sample that turns out to be an unknown variant of arbitrary malware families. The following image (Fig:1) shows how most of the analyzed Malware is unknown from the InfoSec community and from common Antivirus vendors. This finding supports the even evolving Malware panorama in where attackers start from a shared code base but modify it depending on their needed to be stealth.
The reported data are collected during the first propagation of the malicious files across organizations. It means Companies are highly exposed to the risk of Zero-Day malware. Detection and response time plays a central role in such cases where the attack becomes stealth for hours or even for days.
Along with the Zero-Day malware observation, most of the known malware at time of delivery have not so high chances of being blocked by security controls. The 8% of the malware is detected by few AV engines and only 33% is actually well identified at time of attack. Even the so-called “known malware” is still a relevant issue due to its capability to maintain a low detection rate during the first infection steps. Indeed only less than 20% of analyzed samples belonging to “not Zero-Day” are detected by more than 15 AV engines.
Drilling down and observing the behavioural classification of the intercepted samples known by less than 5 AntiVirus engines at detection time, we might appreciate that the “Dropper” behaviour (i.e. the downloading or unpacking of other malicious stages or component) lead the way with 54% of cases, slightly decreasing since the 2018. One more interesting trend in the analyzed data is the surprising decrease of Ransomware behaviour, dropping from 17% of 2018 to the current 2%, and the bullish raise of “Trojan” behaviours up to 35% of times, more than doubled respect to the 15% of 2018.
This trend endorses the evidence that ransomware attacks in 2019 begun to follow a targeted approach as described in the “The Rise of Targeted Ransomware” section.
A reasonable interpretation of the darkling changes on these data, could actually conform with the sophistication of the malware infection chain discussed in the previous section. As a matter of fact, many of the delivered malware are actually a single part of a more complex infection chain. A chain able to install even multiple families of malware threats, starting from simple pieces of code behaving like droppers and trojan horses to grant access to a wider range of threats.
This trend gets another validation even in the Zero-Day malware data set: the samples likely unknown to Info.Sec. community – at the time of delivery – substantially shifted their distribution from previous years. In particular, Ransomware behaviour detections dropped from 29% to 7% in 2019, and Trojan raised from 28% to 52% of cases, showing similar macro variations.
If you want to read more details on “DarkTeams” and on what we observed during the past months, please feel free to download the full report HERE.