Category Archives: security intelligence

Patching as a social responsibility

In the wake of the devastating (Not)Petya attack, Microsoft set out to understand why some customers weren’t applying cybersecurity hygiene, such as security patches, which would have helped mitigate this threat. We were particularly concerned with why patches hadn’t been applied, as they had been available for months and had already been used in the WannaCrypt worm—which clearly established a ”real and present danger.”

We learned a lot from this journey, including how important it is to build clearer industry guidance and standards on enterprise patch management. To help make it easier for organizations to plan, implement, and improve an enterprise patch management strategy, Microsoft is partnering with the U.S. National Institute of Standards and Technology (NIST) National Cybersecurity Center of Excellence (NCCoE).

NIST and Microsoft are extending an invitation for you to join this effort if you’re a:

  • Vendor—Any vendor who has technology offerings to help with patch management (scan, report, deploy, measure risk, etc.).
  • Organization or individual—All those who have tips and lessons learned from a successful enterprise management program (or lessons learned from failures, challenges, or any other situations).

If you have pertinent learnings that you can share, please reach out to cyberhygiene@nist.gov.

During this journey, we also worked closely with additional partners and learned from their experience in this space, including the:

  • Center for Internet Security (CIS)
  • U.S. Department of Homeland Security (DHS) Cybersecurity
  • Cybersecurity and Infrastructure Security Agency (CISA) (formerly US-CERT / DHS NCCIC)

A key part of this learning journey was to sit down and listen directly to our customer’s challenges. Microsoft visited a significant number of customers in person (several of which I personally joined) to share what we learned—which became part of the jointly endorsed mitigation roadmap—and to have some really frank and open discussions to learn why organizations really aren’t applying security patches.

While the discussions mostly went in expected directions, we were surprised at how many challenges organizations had on processes and standards, including:

  • “What sort of testing should we actually be doing for patch testing?”
  • “How fast should I be patching my systems?”

This articulated need for good reference processes was further validated by observing that a common practice for “testing” a patch before a deployment often consisted solely of asking whether anyone else had any issues with the patch in an online forum.

This realization guided the discussions with our partners towards creating an initiative in the NIST NCCoE in collaboration with other industry vendors. This project—kicking off soon—will build common enterprise patch management reference architectures and processes, have relevant vendors build and validate implementation instructions in the NCCoE lab, and share the results in the NIST Special Publication 1800 practice guide for all to benefit.

Applying patches is a critical part of protecting your system, and we learned that while it isn’t as easy as security departments think, it isn’t as hard as IT organizations think.

In many ways, patching is a social responsibility because of how much society has come to depend on technology systems that businesses and other organizations provide. This situation is exacerbated today as almost all organizations undergo digital transformations, placing even more social responsibility on technology.

Ultimately, we want to make it easier for everyone to do the right thing and are issuing this call to action. If you’re a vendor that can help or if you have relevant learnings that may help other organizations, please reach out to cyberhygiene@nist.gov. Now!

The post Patching as a social responsibility appeared first on Microsoft Security.

In hot pursuit of elusive threats: AI-driven behavior-based blocking stops attacks in their tracks

Our experience in detecting and blocking threats on millions of endpoints tells us that attackers will stop at nothing to circumvent protections. Even one gap in security can be disastrous to an organization.

At Microsoft, we don’t stop finding new ways to fill in gaps in security. We go beyond strengthening existing defenses by introducing new and innovative layers of protection. While our industry-leading endpoint protection platform stops threats before they can even run, we continue improving protections for instances where sophisticated adversarial attacks manage to slip through.

Multiple layers of protection mean multiple hurdles that attackers need to overcome to perpetrate attacks. We continuously innovate threat and malware prevention engines on the client and in the cloud to add more protection layers that detect and block sophisticated and evasive threats before they can even run.

In recent months, we introduced two machine learning protection features within the behavioral blocking and containment capabilities in Microsoft Defender Advanced Threat Protection. In keeping with the defense in depth strategy, coupled with the “assume breach” mindset, these new protection engines specialize in detecting threats by analyzing behavior, and adding new layers of protection after an attack has successfully started running on a machine:

  • Behavior-based machine learning identifies suspicious process behavior sequences and advanced attack techniques observed on the client, which are used as triggers to analyze the process tree behavior using real-time machine learning models in the cloud
  • AMSI-paired machine learning uses pairs of client-side and cloud-side models that integrate with Antimalware Scan Interface (AMSI) to perform advanced analysis of scripting behavior pre- and post-execution to catch advanced threats like fileless and in-memory attacks

The figure below illustrates how the two behavior-based machine learning protections enrich post-breach detections:

Figure 1. Pre and post-execution detection engines in Microsoft Defender ATP’s antivirus capabilities

The pre-execution and post-execution detection engines make up two important components of comprehensive threat and malware prevention. They reflect the defense in depth principle, which entails multiple layers of protection for thorough, wide-range defense.

In detecting post-execution behavior, using machine learning is critical. Many attack techniques are also used by legitimate applications. For example, a very common, documented method used by both clean applications and malware is creating a service for persistence.

To distinguish between malicious and clean applications when an attack technique is observed, Windows Defender Antivirus monitors and sends suspicious behaviors and process trees to the cloud protection service for real-time classification by machine learning. Cloud-based post-execution detection engines isolate known good behaviors from malicious intent to stop attacks in real time.

Within milliseconds of an attack technique or suspicious script execution being observed, machine learning classifiers return a verdict and the client blocks the threat. The pre-execution models then learn from these malicious blocks afterwards to protect Microsoft Defender ATP customers before attacks can begin executing new cycles of infection.

How behavioral blocking and containment protected 100 organizations from credential theft

In early July, attackers launched a highly targeted credential theft attack against 100 organizations around the world, primarily in the United Arab Emirates, Germany, and Portugal. The goal of the attack was to install the notorious info-stealing backdoor Lokibot and to exfiltrate sensitive data.

Behavioral blocking and containment capabilities in Microsoft Defender ATP detected and foiled the attack in its early stages, protecting customers from damage.

Spear-phishing emails carrying lure documents were sent to the target organizations; in one instance, three distinct highly targeted emails with the same lure document were delivered to a single pharmaceutical ingredient supplier. The attacker used pharmaceutical industry jargon to improve the credibility of the email and in one case requested a quote on an ingredient that the target company was likely to produce.

Figure 2. Multiple spear-phishing emails attempted to deliver the same lure document to the same target

The lure document itself didn’t host any exploit code but used an external relationship to a document hosted on a compromised WordPress website. If recipients opened the attachment, the related remote document, which contained the exploit, was also automatically loaded. This allowed the remote document to take advantage of the previously fixed CVE-2017-11882 vulnerability in Equation Editor and execute code on the computer.

Figure 3. The lure document contains an external reference to the exploit document is hosted on a compromised WordPress website.

Upon successful exploitation, the attack downloaded and loaded the Lokibot malware, which stole credentials, exfiltrated stolen data, and waited for further instructions from a command-and-control (C&C) server.

The behavior-based machine learning models built into Microsoft Defender ATP caught attacker techniques at two points in the attack chain. The first detection layer spotted the exploit behavior. Machine learning classifiers in the cloud correctly identified the threat and immediately instructed the client to block the attack. In cases where the attack had proceeded past this layer of defense to the next stage of the attack, process hollowing would have been attempted. This, too, was detected by behavior-based machine learning models, which instructed the clients to block the attack, marking the second detection layer. As the attacks are blocked, the malicious processes and corresponding files are remediated, protecting targets from credential theft and further backdoor activities.

Figure 4. Credential theft attack chain showing multiple behavior-based protection layers that disrupted the attack

The behavior-based blocking raised an “Initial Access” alert in Microsoft Defender Security Center, the console for SecOps teams that gives complete visibility into their environments and across the suite of Microsoft Defender ATP tools that protect their endpoints:

Figure 5. Alert and process tree on Microsoft Defender Security Center for this targeted attack

This attack demonstrates how behavior-based machine learning models in the cloud add new layers of protection against attacks even after they have started running.

In the next sections, we will describe in detail the two machine learning protection features in behavioral blocking and containment capabilities in Microsoft Defender ATP.

Behavior-based machine learning protection

The behavior engine in the Windows Defender Antivirus client monitors more than 500 attack techniques as triggers for analyzing new and unknown threats. Each time one of the monitored attack techniques is observed, the process tree and behavior sequences are constructed and sent to the cloud, where behavior-based machine learning models classify possible threats. Figure 4 below illustrates a more detailed view of our process tree classification path:

Figure 6. Process tree classification path

Behavior-based detections are named according to the MITRE ATT&CK matrix to help identify the attack stage where the malicious behavior was observed:

 

Tactic Detection threat name
Initial Access Behavior:Win32/InitialAccess.*!ml
Execution Behavior:Win32/Execution.*!ml
Persistence Behavior:Win32/Persistence.*!ml
Privilege Escalation Behavior:Win32/PrivilegeEscalation.*!ml
Defense Evasion Behavior:Win32/DefenseEvasion.*!ml
Credential Access Behavior:Win32/CredentialAccess.*!ml
Discovery Behavior:Win32/Discovery.*!ml
Lateral Movement Behavior:Win32/LateralMovement.*!ml
Collection Behavior:Win32/Collection.*!ml
Command and Control Behavior:Win32/CommandAndControl.*!ml
Exfiltration Behavior:Win32/Exfiltration.*!ml
Impact Behavior:Win32/Impact.*!ml
Uncategorized Behavior:Win32/Generic.*!ml

Since deployment, the behavior-based machine learning models have blocked attacker techniques like the following used by attacks in the wild:

  • Credential dumping from LSASS
  • Cross-process injection
  • Process hollowing
  • UAC bypass
  • Tampering with antivirus (such as disabling it or adding the malware as exclusion)
  • Contacting C&C to download payloads
  • Coin mining
  • Boot record modification
  • Pass-the-hash attacks
  • Installation of root certificate
  • Exploitation attempt for various vulnerabilities

These blocked behaviors show up as alerts in Microsoft Defender Security Center.

Figure 7. Alert for malicious behavior in Microsoft Defender Security Center

Machine learning protection for scripting engines with AMSI

Through the AMSI integration with scripting engines on Windows 10 and Office 365, Windows Defender Antivirus gains rich insight into the execution of PowerShell, VBScript, JavaScript and Office Macro VBA scripts to cut through obfuscation, protect against fileless attacks, and provide robust defenses against malicious script behavior.

To assist with fileless and evasive script attacks, scripting engines are instrumented to provide both behavior calls and dynamic content calls to the antivirus product. The type of integrations available varies based on the scripting engine. Table 1 below illustrates the current support with the Windows 10 and Office 365, and Figure 5 illustrates an example of the scripting engine dynamic script content and behavior calls for malicious scripts.

 

Microsoft AMSI integration point Dynamic script content calls Behavior calls
PowerShell Y
VBScript Y Y
JavaScript Y Y
Office VBA macros Y
WMI Y
MSIL .NET Y

Figure 8. Example dynamic script content and behavior calls for malicious scripts monitored by AMSI

Our scripting machine learning protection design can be seen in Figure 6 below. We deployed paired machine learning models for various scripting scenarios. Each pair of classifiers is made up of (1) a performance-optimized lightweight classifier that runs on the Windows Defender Antivirus client, and (2) a heavy classifier in the cloud. The role of the client-based classifier is to inspect the script content or behavior log to predict whether a script is suspicious. For scripts that are classified as suspicious, metadata describing the behavior or content is featurized and sent up to the cloud for real-time classification; the metadata that describes the content includes expert features, features selected by machine learning, and fuzzy hashes.

Figure 9. AMSI-paired models classification path

The paired machine learning model in the cloud then analyzes the metadata to decide whether the script should be blocked or not. If machine learning decides to block the file, the running script is aborted. This paired model architecture is used to offload the overhead of running intensive machine learning models to the cloud, and to make use of the global information available about the content through the Microsoft Intelligent Security Graph.

Malicious scripts blocked by AMSI-paired machine models are reported in Microsoft Defender Security Center using threat names like the following:

  • Trojan:JS/Mountsi.A!ml
  • Trojan:Script/Mountsi.A!ml
  • Trojan:O97M/Mountsi.A!ml
  • Trojan:VBS/Mountsi.A!ml
  • Trojan:PowerShell/Mountsi.A!ml

Behavioral blocking and containment for disrupting advanced attacks

The two new cloud-based post-execution detection engines we described in this blog are part of the behavioral blocking and containment capabilities that enabled Microsoft Defender ATP to protect the 100 organizations targeted in the credential theft attack we discussed earlier. Recently, we also documented how behavior-based protections are important components of the dynamic protection against the multi-stage, fileless Nodersok campaign.

These engines add to the many layers of machine learning-driven protections in the cloud and add protection against threats after they have begun running. To further illustrate how these behavior-based protections work, here’s a diagram that shows the multiple protection layers against an Emotet attack chain:

Figure 10. Multiple layers of behavior-based protection in Windows Defender Antivirus while executing an Emotet attack (SHA-256: ee2bbe2398be8a1732c0afc318b797f192ce898982bff1b109005615588facb0)

As part of our defense in depth strategy, these new layers of antivirus protection not only expand detection and blocking capabilities; they also provide even richer visibility into malicious behavior sequences, giving security operations more signals to use in investigating and responding to attacks through Microsoft Defender ATP capabilities like endpoint detection and response, threat and vulnerability management, and automated investigation and remediation.

Within milliseconds of an attack technique or suspicious script execution being observed, machine learning classifiers return a verdict and the client blocks the threat. Our pre-execution models then learn from these malicious blocks afterwards to protect Microsoft Defender ATP customers before the threats even begin executing.

Figure 11. Multiple layers of malware and threat prevention engines on the client and in the cloud

The impact of the continuous improvements in antivirus capabilities further show up in Microsoft Threat Protection, Microsoft’s comprehensive security solution for identities, endpoints, email and data, apps, and infrastructure. Through signal-sharing across Microsoft services, the richer machine learning-driven protection in Microsoft Defender ATP is amplified throughout protections for various attack surfaces.

 

Geoff McDonald
with Saad Khan
Microsoft Defender ATP Research

The post In hot pursuit of elusive threats: AI-driven behavior-based blocking stops attacks in their tracks appeared first on Microsoft Security.

Bring your own LOLBin: Multi-stage, fileless Nodersok campaign delivers rare Node.js-based malware

We’ve discussed the challenges that fileless threats pose in security, and how Microsoft Defender Advanced Threat Protection (Microsoft Defender ATP) employs advanced strategies to defeat these sophisticated threats. Part of the slyness of fileless malware is their use of living-off-the-land techniques, which refer to the abuse of legitimate tools, also called living-off-the-land binaries (LOLBins), that already exist on machines through which malware can persist, move laterally, or serve other purposes.

But what happens when attackers require functionality beyond what’s provided by standard LOLBins? A new malware campaign we dubbed Nodersok decided to bring its own LOLBins—it delivered two very unusual, legitimate tools to infected machines:

  • Node.exe, the Windows implementation of the popular Node.js framework used by countless web applications
  • WinDivert, a powerful network packet capture and manipulation utility

Like any LOLBin, these tools are not malicious or vulnerable; they provide important capabilities for legitimate use. It’s not uncommon for attackers to download legitimate third-party tools onto infected machines (for example, PsExec is often abused to run other tools or commands). However, Nodersok went through a long chain of fileless techniques to install a pair of very peculiar tools with one final objective: turn infected machines into zombie proxies.

While the file aspect of the attack was very tricky to detect, its behavior produced is a visible footprint that stands out clearly for anyone who knows where to look. With its array of advanced defensive technologies, Microsoft Defender ATP, defeated the threat at numerous points of dynamic detection throughout the attack chain.

Attack overview

The Nodersok campaign has been pestering thousands of machines in the last several weeks, with most targets located in the United States and Europe. The majority of targets are consumers, but about 3% of encounters are observed in organizations in sectors like education, professional services, healthcare, finance, and retail.

 

Figure 1. Distribution of Nodersok’s enterprise targets by country and by sector

The campaign is particularly interesting not only because it employs advanced fileless techniques, but also because it relies on an elusive network infrastructure that causes the attack to fly under the radar. We uncovered this campaign in mid-July, when suspicious patterns in the anomalous usage of MSHTA.exe emerged from Microsoft Defender ATP telemetry. In the days that followed, more anomalies stood out, showing up to a ten-fold increase in activity:

Figure 2. Trending of Nodersok activity from August to September, 2019

After a process of tracking and analysis, we pieced together the infection chain:

Figure 3. Nodersok attack chain

Like the Astaroth campaign, every step of the infection chain only runs legitimate LOLBins, either from the machine itself (mshta.exe, powershell.exe) or downloaded third-party ones (node.exe, Windivert.dll/sys). All of the relevant functionalities reside in scripts and shellcodes that are almost always coming in encrypted, are then decrypted, and run while only in memory. No malicious executable is ever written to the disk.

This infection chain was consistently observed in several machines attacked by the latest variant of Nodersok. Other campaigns (possibly earlier versions) with variants of this malware (whose main JavaScript payload was named 05sall.js or 04sall.js) were observed installing malicious encoded PowerShell commands in the registry that would end up decoding and running the final binary executable payload.

Initial access: Complex remote infrastructure

The attack begins when a user downloads and runs an HTML application (HTA) file named Player1566444384.hta. The digits in the file name differ in every attack. Analysis of Microsoft Defender ATP telemetry points to compromised advertisements as the most likely infection vector for delivering the HTA files. The mshta.exe tool (which runs when an HTA file runs) was launched with the -embedding command-line parameter, which typically indicates that the launch action was initiated by the browser.

Furthermore, immediately prior to the execution of the HTA file, the telemetry always shows network activity towards suspicious advertisement services (which may vary slightly across infections), and a consistent access to legitimate content delivery service Cloudfront. Cloudfront is not a malicious entity or service, and it was likely used by the attackers exactly for that reason: because it’s not a malicious domain, it won’t likely raise alarms. Examples of such domains observed in several campaigns are:

  • d23cy16qyloios[.]cloudfront[.]net
  • d26klsbste71cl[.]cloudfront [.]net
  • d2d604b63pweib[.]cloudfront [.]net
  • d3jo79y1m6np83[.]cloudfront [.]net
  • d1fctvh5cp9yen[.]cloudfront [.]net
  • d3cp2f6v8pu0j2[.]cloudfront[.]net
  • dqsiu450ekr8q[.]cloudfront [.]net

It’s possible that these domains were abused to deliver the HTA files without alerting the browser. Another content delivery service abused later on in the attack chain is Cdn77. Some examples of observed URLs include:

  • hxxps://1292172017[.]rsc [.]cdn77 [.]org/images/trpl[.]png
  • hxxps://1292172017[.]rsc.cdn77[.]org/imtrack/strkp[.]png

This same strategy was also used by the Astaroth campaign, where the malware authors hosted their malware on the legitimate storage.googleapis.com service.

First-stage JavaScript

When the HTA file runs, it tries to reach out to a randomly named domain to download additional JavaScript code. The domains used in this first stage are short-lived: they are registered and brought online and, after a day or two (the span of a typical campaign), they are dropped and their related DNS entries are removed. This can make it more difficult to investigate and retrieve the components that were delivered to victims. Examples of domains observed include:

  • Du0ohrealgeek[.]org – active from August 12 to 14
  • Hi5urautopapyrus[.]org – active from April 21 to 22
  • Ex9ohiamistanbul[.]net – active from August 1 to 2
  • Eek6omyfilmbiznetwork[.]org – active from July 23 to 24

This stage is just a downloader: it tries to retrieve either a JavaScript or an extensible style language (XSL) file from the command-and-control (C&C) domain. These files have semi-random names like 1566444384.js and 1566444384.xsl, where the digits are different in every download. After this file is downloaded and runs, it contacts the remote C&C domain to download an RC4-encrypted file named 1566444384.mp4 and a decryption key from a file named 1566444384.flv. When decrypted, the MP4 file is an additional JavaScript snippet that starts PowerShell:

Interestingly, it hides the malicious PowerShell script in an environment variable named “deadbeef” (first line), then it launches PowerShell with an encoded command (second line) that simply runs the contents of the “deadbeef” variable. This trick, which is used several times during the infection chain, is usually employed to hide the real malicious script so that it does not appear in the command-line of a PowerShell process.

Second-stage PowerShell

Nodersok’s infection continues by launching several instances of PowerShell to download and run additional malicious modules. All the modules are hosted on the C&C servers in RC4-encrypted form and are decrypted on the fly before they run on the device. The following steps are perpetrated by the various instances of PowerShell:

  • Download module.avi, a module that attempts to:
    • Disable Windows Defender Antivirus
    • Disable Windows updates
    • Run binary shellcode that attempts elevation of privilege by using auto-elevated COM interface
  • Download additional modules trpl.png and strkp.png hosted on a Cdn77 service
  • Download legitimate node.exe tool from the official nodejs.org website
  • Drop the WinDivert packet capture library components WinDivert.dll, WinDivert32.sys, and WinDivert64.sys
  • Execute a shellcode that uses WinDivert to filter and modify certain outgoing packets
  • Finally, drop the JavaScript payload along with some Node.js modules and libraries required by it, and run it via node.exe

This last JavaScript is the actual final payload written for the Node.js framework that turns the device into a proxy. This concludes the infection, at the end of which the network packet filter is active and the machine is working as a potential proxy zombie. When a machine turns into a proxy, it can be used by attackers as a relay to access other network entities (websites, C&C servers, compromised machines, etc.), which can allow them to perform stealthy malicious activities.

Node.js-based proxy engine

This is not the first threat to abuse Node.js. Some cases have been observed in the past (for example this ransomware from early 2016). However, using Node.js is a peculiar way to spread malware. Besides being clean and benign, Node.exe also has a valid digital signature, allowing a malicious JavaScript to operate within the context of a trusted process. The JavaScript payload itself is relatively simple: it only contains a set of basic functions that allows it to act as a proxy for a remote entity.

Figure 4. A portion of the malicious Node.js-based proxy

The code seems to be still in its infancy and in development, but it does work. It has two purposes:

  1. Connect back to the remote C&C, and
  2. Receive HTTP requests to proxy back to it

It supports the SOCKS4A protocol. While we haven’t observed network requests coming from attackers, we wrote what the Node.js-based C&C server application may look like: a server that sends HTTP requests to the infected clients that connect back to it, and receives the responses from said clients. we slightly modified the malicious JavaScript malware to make it log meaningful messages, ran a JavaScript server, ran the JavaScript malware, and it proxied HTTP requests as expected:

Figure 5.The debug messages are numbered to make it easier to follow the execution flow

The server starts, then the client starts and connects to it. In response, the server sends a HTTP request (using the Socks4A protocol) to the client. The request is a simple HTTP GET. The client proxies the HTTP request to the target website and returns the HTTP response (200 OK) and the HTML page back to the server. This test demonstrates that it’s possible to use this malware as a proxy.

05sall.js: A variant of Nodersok

As mentioned earlier, there exist other variants of this malware. For example, we found one named 05sall.js (possibly an earlier version). It’s similar in structure to the one described above, but the payload was not developed in Node.js (rather it was an executable). Furthermore, beyond acting as a proxy, it can run additional commands such as update, terminate, or run shell commands.

Figure 6. The commands that can be processed by the 05sall.js variant.

The malware can also process configuration data in JSON format. For example, this configuration was encoded and stored in the registry in an infected machine:

Figure 7. Configuration data exposing component and file names

The configuration is an indication of the modular nature of the malware. It shows the names of two modules being used in this infection (named block_av_01 and all_socks_05).

The WinDivert network packet filtering

At this point in the analysis, there is one last loose end: what about the WinDivert packet capture library? We recovered a shellcode from one of the campaigns. This shellcode is decoded and run only in memory from a PowerShell command. It installs the following network filter (in a language recognized by WinDivert):

This means Nodersok is intercepting packets sent out to initiate a TCP connection. Once the filter is active, the shellcode is interested only in TCP packets that match the following specific format:

Figure 8. Format of TCP packets that Nodersok is interested in

The packet must have standard Ethernet, IP, and 20 bytes TCP headers, plus an additional 20 bytes of TCP extra options. The options must appear exactly in the order shown in the image above:

  • 02 04 XX XX – Maximum segment size
  • 01 – No operation
  • 03 03 XX – Windows Scale
  • 04 02 – SACK permitted
  • 08 0A XX XX XX XX XX XX XX XX – Time stamps

If packets matching this criterion are detected, Nodersok modifies them by moving the “SACK Permitted” option to the end of the packet (whose size is extended by four bytes), and replacing the original option bytes with two “No operation” bytes.

Figure 9. The format of TCP packets after Nodersok has altered it: the “SACK permitted” bytes (in red) have been moved to the end of the packet, and their original location has been replaced by “No operation” (in yellow)

It’s possible that this modification benefits the attackers; for example, it may help evade some HIPS signatures.

Stopping the Nodersok campaign with Microsoft Defender ATP

Both the distributed network infrastructure and the advanced fileless techniques allowed this campaign fly under the radar for a while, highlighting how having the right defensive technologies is of utmost importance in order to detect and counter these attacks in a timely manner.

If we exclude all the clean and legitimate files leveraged by the attack, all that remains are the initial HTA file, the final Node.js-based payload, and a bunch of encrypted files. Traditional file-based signatures are inadequate to counter sophisticated threats like this. We have known this for quite a while, that’s why we have invested a good deal of resources into developing powerful dynamic detection engines and delivering a state-of-the-art defense-in-depth through Microsoft Defender ATP:

Figure 10. Microsoft Defender ATP protections against Nodersok

Machine learning models in the Windows Defender Antivirus client generically detects suspicious obfuscation in the initial HTA file used in this attack. Beyond this immediate protection, behavioral detection and containment capabilities can spot anomalous and malicious behaviors, such as the execution of scripts and tools. When the behavior monitoring engine in the client detects one of the more than 500 attack techniques, information like the process tree and behavior sequences are sent to the cloud, where behavior-based machine learning models classify files and identify potential threats.

Meanwhile, scripts that are decrypted and run directly in memory are exposed by Antimalware Scan Interface (AMSI) instrumentation in scripting engines, while launching PowerShell with a command-line that specifies encoded commands is defeated by command-line scanning. Tamper protection in Microsoft Defender ATP protects systems modifications that attempt to disable Windows Defender Antivirus.

These multiple layers of protection are part of the threat and malware prevention capabilities in Microsoft Defender ATP. The complete endpoint protection platform provides multiple capabilities that empower security teams to defend their organizations against attacks like Nodersok. Attack surface reduction shuts common attack surfaces. Threat and vulnerability management, endpoint detection and response, and automated investigation and remediation help organizations detect and respond to cyberattacks. Microsoft Threat Experts, Microsoft Defender ATP’s managed detection and response service, further helps security teams by providing expert-level monitoring and analysis.

With Microsoft Threat Protection, these endpoint protection capabilities integrate with the rest of Microsoft security solutions to deliver comprehensive protection for comprehensive security for identities, endpoints, email and data, apps, and infrastructure.

 

Andrea Lelli
Microsoft Defender ATP Research

The post Bring your own LOLBin: Multi-stage, fileless Nodersok campaign delivers rare Node.js-based malware appeared first on Microsoft Security.

Are students prepared for real-world cyber curveballs?

With a projected “skills gap” numbering in the millions for open cyber headcount, educating a diverse workforce is critical to corporate and national cyber defense moving forward. However, are today’s students getting the preparation they need to do the cybersecurity work of tomorrow?

To help educators prepare meaningful curricula, the National Institute of Standards and Technology (NIST) has developed the National Initiative for Cybersecurity Education (NICE) Cybersecurity Workforce Framework. The U.S. Department of Energy (DOE) is also doing its part to help educate our future cybersecurity workforce through initiatives like the CyberForce Competition,™ designed to support hands-on cyber education for college students and professionals. The CyberForce Competition™ emulates real-world, critical infrastructure scenarios, including “cyber-physical infrastructure and lifelike anomalies and constraints.”

As anyone who’s worked in cybersecurity knows, a big part of operational reality are the unexpected curveballs ranging from an attacker’s pivot while escalating privileges through a corporate domain to a request from the CEO to provide talking points for an upcoming news interview regarding a recent breach. In many “capture the flag” and “cyber-range exercises,” these unexpected anomalies are referred to as “injects,” the curveballs of the training world.

For the CyberForce Competition™ anomalies are mapped across the seven NICE Framework Workforce Categories illustrated below:

Image showing seven categories of cybersecurity: Operate and Maintain, Oversee and Govern, Collect and Operate, Securely Provision, Analayze, Protect and Defend, and Investigate.

NICE Framework Workforce categories, NIST SP 800-181.

Students were assessed based on how many and what types of anomalies they responded to and how effective/successful their responses were.

Tasks where students excelled

  • Threat tactic identification—Students excelled in identifying threat tactics and corresponding methodologies. This was shown through an anomaly that required students to parse through and analyze a log file to identify aspects of various identifiers of insider threat; for example, too many sign-ins at one time, odd sign-in times, or sign-ins from non-standard locations.
  • Log file analysis and review—One task requires students to identify non-standard browsing behavior of agents behind a firewall. To accomplish this task, students had to write code to parse and analyze the log files of a fictitious company’s intranet web servers. Statistical evidence from the event indicates that students are comfortable writing code to parse log file data and performing data analysis.
  • Insider threat investigations—Students seemed to gravitate towards the anomalies and tasks connected to insider threat identification that maps to the Security Provision pillar. Using log analysis techniques described above, students were able to determine at a high rate of success individuals with higher than average sign-in failure rates and those with anomalous successful logins, such as from many different devices or locations.
  • Network forensics—The data indicated that overall the students had success with the network packet capture (PCAP) forensics via analysis of network traffic full packet capture streams. They also had a firm grasp on related tasks, including file system forensic analysis and data carving techniques.
  • Trivia—Students were not only comfortable with writing code and parsing data, but also showed they have solid comprehension and intelligence related to cybersecurity history and trivia. Success in this category ranked in the higher percentile of the overall competition.

Pillar areas for improvement

  • Collect and Operate—This pillar “provides specialized denial and deception operations and collection of cybersecurity information that may be used to develop intelligence.” Statistical analysis gathered during the competition indicated that students had hesitancies towards the activities in this pillar, including for some tasks that they were successful with in other exercises. For example, some fairly simple tasks, such as analyzing logs for specific numbers of entries and records on a certain date, had a zero percent completion rate. Reasons for non-completion could be technical inability on the part of the students but could also have been due to a poorly written anomaly/task or even an issue with sign-ins to certain lab equipment.
  • Investigate—Based on the data, the Investigate pillar posed some challenges for the students. Students had a zero percent success rate on image analysis and an almost zero percent success rate on malware analysis. In addition, students had a zero percent success rate in this pillar for finding and identifying a bad file in the system.

Key takeaways

Frameworks like NIST NICE and competitions like the DOE CyberForce Competition™ are helping to train up the next generation of cybersecurity defenders. Analysis from the most recent CyberForce Competition™ indicates that students are comfortable with tasks in the “Protect and Defend” pillar and are proficient in many critical tasks, including network forensics and log analysis. The data points to areas for improvement especially in the “Collect and Operate” and “Investigate” pillars, and for additional focus on forensic skills and policy knowledge.

Bookmark the Security blog to keep up with our expert coverage on security matters. Also, follow us at @MSFTSecurity for the latest news and updates on cybersecurity.

The CyberForce work was partially supported by the U.S. Department of Energy Office of Science under contract DE-AC02-06CH11357.

The post Are students prepared for real-world cyber curveballs? appeared first on Microsoft Security.

Deep learning rises: New methods for detecting malicious PowerShell

Scientific and technological advancements in deep learning, a category of algorithms within the larger framework of machine learning, provide new opportunities for development of state-of-the art protection technologies. Deep learning methods are impressively outperforming traditional methods on such tasks as image and text classification. With these developments, there’s great potential for building novel threat detection methods using deep learning.

Machine learning algorithms work with numbers, so objects like images, documents, or emails are converted into numerical form through a step called feature engineering, which, in traditional machine learning methods, requires a significant amount of human effort. With deep learning, algorithms can operate on relatively raw data and extract features without human intervention.

At Microsoft, we make significant investments in pioneering machine learning that inform our security solutions with actionable knowledge through data, helping deliver intelligent, accurate, and real-time protection against a wide range of threats. In this blog, we present an example of a deep learning technique that was initially developed for natural language processing (NLP) and now adopted and applied to expand our coverage of detecting malicious PowerShell scripts, which continue to be a critical attack vector. These deep learning-based detections add to the industry-leading endpoint detection and response capabilities in Microsoft Defender Advanced Threat Protection (Microsoft Defender ATP).

Word embedding in natural language processing

Keeping in mind that our goal is to classify PowerShell scripts, we briefly look at how text classification is approached in the domain of natural language processing. An important step is to convert words to vectors (tuples of numbers) that can be consumed by machine learning algorithms. A basic approach, known as one-hot encoding, first assigns a unique integer to each word in the vocabulary, then represents each word as a vector of 0s, with 1 at the integer index corresponding to that word. Although useful in many cases, the one-hot encoding has significant flaws. A major issue is that all words are equidistant from each other, and semantic relations between words are not reflected in geometric relations between the corresponding vectors.

Contextual embedding is a more recent approach that overcomes these limitations by learning compact representations of words from data under the assumption that words that frequently appear in similar context tend to bear similar meaning. The embedding is trained on large textual datasets like Wikipedia. The Word2vec algorithm, an implementation of this technique, is famous not only for translating semantic similarity of words to geometric similarity of vectors, but also for preserving polarity relations between words. For example, in Word2vec representation:

Madrid – Spain + Italy ≈ Rome

Embedding of PowerShell scripts

Since training a good embedding requires a significant amount of data, we used a large and diverse corpus of 386K distinct unlabeled PowerShell scripts. The Word2vec algorithm, which is typically used with human languages, provides similarly meaningful results when applied to PowerShell language. To accomplish this, we split the PowerShell scripts into tokens, which then allowed us to use the Word2vec algorithm to assign a vectorial representation to each token .

Figure 1 shows a 2-dimensional visualization of the vector representations of 5,000 randomly selected tokens, with some tokens of interest highlighted. Note how semantically similar tokens are placed near each other. For example, the vectors representing -eq, -ne and -gt, which in PowerShell are aliases for “equal”, “not-equal” and “greater-than”, respectively, are clustered together. Similarly, the vectors representing the allSigned, remoteSigned, bypass, and unrestricted tokens, all of which are valid values for the execution policy setting in PowerShell, are clustered together.

Figure 1. 2D visualization of 5,000 tokens using Word2vec

Examining the vector representations of the tokens, we found a few additional interesting relationships.

Token similarity: Using the Word2vec representation of tokens, we can identify commands in PowerShell that have an alias. In many cases, the token closest to a given command is its alias. For example, the representations of the token Invoke-Expression and its alias IEX are closest to each other. Two additional examples of this phenomenon are the Invoke-WebRequest and its alias IWR, and the Get-ChildItem command and its alias GCI.

We also measured distances within sets of several tokens. Consider, for example, the four tokens $i, $j, $k and $true (see the right side of Figure 2). The first three are usually used to represent a numeric variable and the last naturally represents a Boolean constant. As expected, the $true token mismatched the others – it was the farthest (using the Euclidean distance) from the center of mass of the group.

More specific to the semantics of PowerShell in cybersecurity, we checked the representations of the tokens: bypass, normal, minimized, maximized, and hidden (see the left side of Figure 2). While the first token is a legal value for the ExecutionPolicy flag in PowerShell, the rest are legal values for the WindowStyle flag. As expected, the vector representation of bypass was the farthest from the center of mass of the vectors representing all other four tokens.

Figure 2. 3D visualization of selected tokens

Linear Relationships: Since Word2vec preserves linear relationships, computing linear combinations of the vectorial representations results in semantically meaningful results. Below are a few interesting relationships we found:

high – $false + $true ≈’ low
‘-eq’ – $false + $true ‘≈ ‘-neq’
DownloadFile – $destfile + $str ≈’ DownloadString ‘
Export-CSV’ – $csv + $html ‘≈ ‘ConvertTo-html’
‘Get-Process’-$processes+$services ‘≈ ‘Get-Service’

In each of the above expressions, the sign ≈ signifies that the vector on the right side is the closest (among all the vectors representing tokens in the vocabulary) to the vector that is the result of the computation on the left side.

Detection of malicious PowerShell scripts with deep learning

We used the Word2vec embedding of the PowerShell language presented in the previous section to train deep learning models capable of detecting malicious PowerShell scripts. The classification model is trained and validated using a large dataset of PowerShell scripts that are labeled “clean” or “malicious,” while the embeddings are trained on unlabeled data. The flow is presented in Figure 3.

Figure 3 High-level overview of our model generation process

Using GPU computing in Microsoft Azure, we experimented with a variety of deep learning and traditional ML models. The best performing deep learning model increases the coverage (for a fixed low FP rate of 0.1%) by 22 percentage points compared to traditional ML models. This model, presented in Figure 4, combines several deep learning building blocks such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN). Neural networks are ML algorithms inspired by biological neural systems like the human brain. In addition to the pretrained embedding described here, the model is provided with character-level embedding of the script.

Figure 4 Network architecture of the best performing model

Real-world application of deep learning to detecting malicious PowerShell

The best performing deep learning model is applied at scale using Microsoft ML.Net technology and ONNX format for deep neural networks to the PowerShell scripts observed by Microsoft Defender ATP through the AMSI interface. This model augments the suite of ML models and heuristics used by Microsoft Defender ATP to protect against malicious usage of scripting languages.

Since its first deployment, this deep learning model detected with high precision many cases of malicious and red team PowerShell activities, some undiscovered by other methods. The signal obtained through PowerShell is combined with a wide range of ML models and signals of Microsoft Defender ATP to detect cyberattacks.

The following are examples of malicious PowerShell scripts that deep learning can confidently detect but can be challenging for other detection methods:

Figure 5. Heavily obfuscated malicious script

Figure 6. Obfuscated script that downloads and runs payload

Figure 7. Script that decrypts and executes malicious code

Enhancing Microsoft Defender ATP with deep learning

Deep learning methods significantly improve detection of threats. In this blog, we discussed a concrete application of deep learning to a particularly evasive class of threats: malicious PowerShell scripts. We have and will continue to develop deep learning-based protections across multiple capabilities in Microsoft Defender ATP.

Development and productization of deep learning systems for cyber defense require large volumes of data, computations, resources, and engineering effort. Microsoft Defender ATP combines data collected from millions of endpoints with Microsoft computational resources and algorithms to provide industry-leading protection against attacks.

Stronger detection of malicious PowerShell scripts and other threats on endpoints using deep learning mean richer and better-informed security through Microsoft Threat Protection, which provides comprehensive security for identities, endpoints, email and data, apps, and infrastructure.

 

Shay Kels and Amir Rubin
Microsoft Defender ATP team

 

Additional references:

The post Deep learning rises: New methods for detecting malicious PowerShell appeared first on Microsoft Security.