Category Archives: artificial intelligence

Fighting Fire With Fire: AI’s Role in Cybersecurity

Last year, it was predicted by Gartner that Artificial Intelligence (AI) will be implemented in almost every new software product by 2020. As AI capabilities grow at a rapid pace, such technologies

The post Fighting Fire With Fire: AI’s Role in Cybersecurity appeared first on The Cyber Security Place.

Seizing the Benefits of AI and IoT Through Data Security

As Artificial Intelligence (AI) and Internet of Things (IoT) applications are becoming more prominent, businesses must consider how to best process and analyze such data. Think of the countless business

The post Seizing the Benefits of AI and IoT Through Data Security appeared first on The Cyber Security Place.

Short on security analysts? AI can help

Most of us are aware of the skills shortage plaguing the cyber security industry, whether you are experiencing it first-hand or have read countless headlines. The 2017 Global Information Security Workforce

The post Short on security analysts? AI can help appeared first on The Cyber Security Place.

Enhancing Office 365 Advanced Threat Protection with detonation-based heuristics and machine learning

Email, coupled with reliable social engineering techniques, continues to be one of the primary entry points for credential phishing, targeted attacks, and commodity malware like ransomware and, increasingly in the last few months, cryptocurrency miners.

Office 365 Advanced Threat Protection (ATP) uses a comprehensive and multi-layered solution to protect mailboxes, files, online storage, and applications against a wide range of threats. Machine learning technologies, powered by expert input from security researchers, automated systems, and threat intelligence, enable us to build and scale defenses that protect customers against threats in real-time.

Modern email attacks combine sophisticated social engineering techniques with malicious links or non-portable executable (PE) attachments like HTML or document files to distribute malware or steal user credentials. Attackers use non-PE file formats because these can be easily modified, obfuscated, and made polymorphic. These file types allow attackers to constantly tweak email campaigns to try slipping past security defenses. Every month, Office 365 ATP blocks more than 500,000 email messages that use malicious HTML and document files that open a website with malicious content.

Figure 1. Typical email attack chain

Detonation-based heuristics and machine learning

Attackers employ several techniques to evade file-based detection of attachments and blocking of malicious URLs. These techniques include multiple redirections, large dynamic and obfuscated scripts, HTML for tag manipulation, and others.

Office 365 ATP protects customers from unknown email threats in real-time by using intelligent systems that inspect attachments and links for malicious content. These automated systems include a robust detonation platform, heuristics, and machine learning models.

Detonation in controlled environments exposes thousands of signals about a file, including behaviors like dropped and downloaded files, registry manipulation for persistence and storing stolen information, outbound network connections, etc. The volume of detonated threats translate to millions of signals that need to be inspected. To scale protection, we employ machine learning technologies to sort through this massive amount of information and determine a verdict for analyzed files.

Machine learning models examine detonation artifacts along with various signals from the following:

  • Static code analysis
  • File structure anomaly
  • Phish brand impersonation
  • Threat intelligence
  • Anomaly-based heuristic detections from security researchers

Figure 2. Classifying unknown threats using detonation, heuristics, and machine learning

Our machine learning models are trained to find malicious content using hundreds of thousands of samples. These models use raw signals as features with small modifications to allow for grouping signals even when they occur in slightly different contexts. To further enhance detection, some models are built using three-gram models that use raw signals sorted by timestamps recorded during detonation. The three-gram models tend to be more sparse than raw signals, but they can act as mini-signatures that can then be scored. These types of models fill in some of the gaps, resulting in better coverage, with little impact to false positives.

Machine learning can capture and expose even uncommon threat behavior by using several technologies and dynamic featurization. Features like image similarity matching, domain reputation, web content extraction, and others enable machine learning to effectively separate malicious or suspicious behavior from the benign.

Figure 3. Machine learning expands on traditional detection capabilities

Over time, as our systems automatically process and make a verdict on millions of threats, these machine learning models will continue to improve. In the succeeding sections, well describe some interesting malware and phishing campaigns detected recently by Office 365 ATP machine learning models.

Phishing campaigns: Online banking credentials

One of the most common types of phishing attacks use HTML and document files to steal online banking credentials. Gaining access to online bank accounts is one of the easiest ways that attackers can profit from illicit activities.

The email messages typically mimic official correspondence from banks. Phishers have become very good at crafting phishing emails. They can target global banks but also localize email content for local banks.
The HTML or document attachment are designed to look like legitimate sign-in pages or forms. Online banking credentials and other sensitive information entered into these files or websites are sent to attackers. Office 365s machine learning models detect this behavior, among other signals, to determine that such attachments are malicious and block offending email messages.

Figure 4. Sample HTML files that mimic online banking sign in pages. (Click to enlarge)

Phishing campaigns: Cloud storage accounts

Another popular example of phishing campaigns uses HTML or document attachments to steal cloud storage or email account details. The email messages imply that the recipient has received a document hosted in a cloud storage service. In order to supposedly open the said document, the recipient has to enter the cloud storage or email user name and password.

This type of phishing is very rampant because gaining access to either email or cloud storage opens a lot of opportunities for attackers to access sensitive documents or compromise the victims other accounts.

Figure 5. Sample HTML files that pose as cloud storage sign in pages. (Click to enlarge)

Tax-themed phishing and malware attacks

Tax-themed social engineering attacks circulate year-round as cybercriminals take advantage of the different country and region tax schedules. These campaigns use various messages related to tax filing to convincer users to click a link or open an attachment. The social engineering messages may say the recipient is eligible for tax refund, confirm that tax payment has been completed, or declare that payments are overdue, among others.

For example, one campaign intercepted by Office 365 ATP using machine learning implied that the recipient has not completed tax filing and is due for penalty. The campaign targeted taxpayers in Colombia, where tax filing ended in October. The email message aimed to alarm taxpayers by suggesting that they have not filed their taxes.

Figure 6. Tax-themed email campaign targeting taxpayers in Colombia. The subject line translates to: You have been fined for not filing your income tax returns

The attachment is a .rar file containing an HTML file. The HTML file contains the logo of Direccin de Impuestos y Aduanas Nacionales (DIAN), the Colombianes tax and customs organization, and a link to download a file.

Figure 7. Social engineering document with a malicious link

The link points to a shortened URL hxxps://bit[.]ly/2IuYkcv that redirects to hxxp://dianmuiscaingreso[.]com/css/sanci%C3%B3n%20declaracion%20de%20renta.doc, which downloads a malicious document.

Figure 8: Malicious URL information

The malicious document carries a downloader macro code. When opened, Microsoft Word issues a security warning. In the document are instructions to Enable content, which executes the embedded malicious VBA code.

Figure 9: Malicious document with malicious macro code

If the victim falls for this social engineering attack, the macro code downloads and executes a file from hxxp:// The downloaded executable file (despite the file name) is a file injector and password-stealing malware detected by Windows Defender AV as Trojan:Win32/Tiggre!rfn.

Because Office 365 ATP machine learning detects the malicious attachment and blocks the email, the rest of the attack chain is stopped, protecting customers at the onset.

Artificial intelligence in Office 365 ATP

As threats rapidly evolve and become increasingly complex, we continuously invest in expanding capabilities in Office 365 Advanced Threat Protection to secure mailboxes from attacks. Using artificial intelligence and machine learning, Office 365 ATP can constantly scale coverage for unknown and emerging threats in-real time.

Office 365 ATPs machine learning models leverage Microsofts wide network of threat intelligence, as well as seasoned threat experts who have deep understanding of malware, cyberattacks, and attacker motivation, to combat a wide range of attacks.

This enhanced protection from Office 365 ATP contributes to and enriches the integrated Microsoft 365 threat protection, which provides intelligent, integrated, and secure solution for the modern workplace. Microsoft 365 combines the benefits and security technologies of Office 365, Windows, and Enterprise Mobility Suite (EMS) platforms.

Office 365 ATP also shares threat signals to the Microsoft Intelligent Security Graph, which uses advanced analytics to link threat intelligence and security signals across Office 365, the Windows Defender ATP stack of defenses, and other sensors. For example, when a malicious file is detected by Office 365 ATP, that threat can also be blocked on endpoints protected by Windows Defender ATP and vice versa. Connecting security data and systems allows Microsoft security technologies like Office 365 ATP to continuously improve threat protection, detection, and response.



Office 365 Threat Research

The AI’ker’s Guide to the (cybersecurity) Galaxy

As a security veteran I find myself from time to time having to explain to newbies the importance of adopting a hacker’s way of thinking’, and the difference between hacker’s and builder’s thinking.

If you can’t think like an attacker, how are you going to build solutions to defend against them? For the last 4 years I was involved in several research projects (most successful but some not so much) aimed at incorporating AI technology into Imperva products. The most significant challenge we had to cope with was making sure that our use of AI worked safely in adversarial settings; assuming that adversaries are out there, investing their brain power in understanding our solutions and adapting to them, polluting training data and trying to sneak through security mechanisms.

In recent years we’ve seen a surge in Artificial Intelligence technology being incorporated into almost every aspect of our lives. Having made remarkable leaps forward in areas like visual object recognition, semantic segmentation, and speech recognition, it’s only natural to see other industries race to adopt AI as their solution to, well… everything really.

The Security Lifecycle

Unsurprisingly, most vendors using AI don’t think of security. I remember one of the most interesting DefCon sessions I’ve seen: a research on the security of smart traffic sensors. Researcher Cesar Cerrudo saw a scene in a movie where hackers cleared a route of green lights through traffic and wondered whether this was possible. The answer was, of course, yes. In probing this ecosystem for security mechanisms to circumvent, Cerrudo wasn’t able to find any.  What he did find, however, was a disturbingly easy way to take control of sensors buried in the roads and how to disable them.

I remember this session not because of the sophisticated hacking techniques used, but because it reminded me of a meeting with a large car manufacturer I had a few years earlier. As they started venturing further into the digitization of automotive systems – moving from purely mechanical systems into a wired/wireless network of digital devices, connected to external entities like garages and service centers – they discovered severe vulnerabilities and new cyber threats to the industry, with potentially lethal results.

This is the unfortunate, but inevitable security lifecycle of new technologies – closely tied to the Gartner hype-cycle. In its early days, the community talks innovation and opportunity, expectations and excitement couldn’t be higher. Then comes disillusionment. Once security researchers find ways to make the system do things it wasn’t supposed to, in particular, if the drops of vulnerabilities turn into a flood, the excitement is replaced with FUD – fear, uncertainty and doubt around the risk associated with the new technology.

Just like automotive systems and smart cities, there are no security exceptions when it comes to AI.

When it comes to security, AI is no different than other technologies. Analyzing a system designed without considering what attackers look for, it’s likely that the attacker will find ways to make that system do things it’s not supposed to. Some of you might be familiar with Google research from 2015, where they added human-invisible noise to an image of a school bus and had the AI classify it as an ostrich; more recently, research into the same field produced some pretty interesting new applications based on the 2015 exercise.

The Houdini framework was able to fool pose estimation, speech recognition and semantic segmentation AI. Facial recognition, used in many control and surveillance systems – like those used in airports –, was completely confused by colorful glasses with deliberately embedded patterns meant to puzzle the system. Next-Gen Anti-Virus software using AI for malware detection was circumvented by another AI in a super-cool bot vs. bot research, presented at Blackhat last year. Several bodies of research show that in many cases, AI deception has transferability, where deceptive samples for model A, are found to be effective against another model ‘A’ that solves the same problem.

It’s not all doom and gloom

That was the bad news, the good news is that AI can be used safely. Our product portfolio uses AI technology is used extensively to improve protection against a variety of threats against web applications and data systems. We do this effectively and, perhaps most important, safely. Based on our experience, I’ve gathered some guidelines for safe usage of AI.

While these are not binary rules, we find these guidelines effective in estimating the risk associated with using AI due to adversarial behavior.

  1. Opt for robust models – when using non-robust AI models, small insignificant differences in the input may have a significant impact on the model decision. Using non-robust models allows attackers to generate same-essence different-look input, which is essential to most attacks.
  2. Choose explainable models – in many cases AI This situation is optimal from the attacker’s perspective. The deception is expressed only in the decision, reducing the chances of someone noticing that something doesn’t look right with the decision.
  3. Training data sanitization – data used for training the model must be sanitized, assuming that an attacker may have control over part of it. Sanitization in most cases means filtering out suspicious data.
  4. Internal use – Input: consider reducing the influence of the attacker on the data entering the AI
  5. Internal usage – Output: opt for AI in functions where the output is not exposed to the attackers, reducing the attacker’s ability to learn whether a deception attempt was successful.
  6. Threat Detection – opt for positive security: when using AI for threat detection, choose a positive security model. Negative security models exonerate everything except what was identified as a known attack, usually based on specific attack patterns. This model is susceptible to deception if the attacker is able to modify the input looks without impacting its essence. Positive security models by nature are more robust, giving the attacker much less slackness and reducing his chances to find a same-essence different-look input that will get undetected.
  7. Threat Detection — combine with other mechanisms: when using in a negative security model, use AI as another detection layer, aimed at detecting threats that pass other “less intelligent” security mechanisms, and not as the sole measure.