Category Archives: data

Improving privacy of a global genomic data sharing network

A Case Western Reserve University computer and data sciences researcher is working to shore up privacy protections for people whose genomic information is stored in a vast global collection of vital, personal data. Erman Ayday pursued novel methods for identifying and analyzing privacy vulnerabilities in the genomic data sharing network known commonly as “the Beacons.” Personal genomic data refers to each person’s unique genome, his or her genetic makeup, information that can be gleaned from … More

The post Improving privacy of a global genomic data sharing network appeared first on Help Net Security.

Ongoing and initial costs top list of barriers to 5G implementation

5G is set to deliver higher data transfer rates for mission-critical communications and will allow massive broadband capacities, enabling high-speed communication across various applications such as the Internet of Things (IoT), robotics, advanced analytics and artificial intelligence. According to a study from CommScope, only 46% of respondents feel their current network infrastructure is capable of supporting 5G, but 68% think 5G will have a significant impact on their agency operations within one to four years. … More

The post Ongoing and initial costs top list of barriers to 5G implementation appeared first on Help Net Security.

Mobile messengers expose billions of users to privacy attacks

Popular mobile messengers expose personal data via discovery services that allow users to find contacts based on phone numbers from their address book, according to researchers. When installing a mobile messenger like WhatsApp, new users can instantly start texting existing contacts based on the phone numbers stored on their device. For this to happen, users must grant the app permission to access and regularly upload their address book to company servers in a process called … More

The post Mobile messengers expose billions of users to privacy attacks appeared first on Help Net Security.

GAIA-X to strengthen European digital infrastructure sovereignty

The GAIA-X Initiative announced that it is one step closer to its goal of a trustworthy, sovereign digital infrastructure for Europe, with the official signing of incorporation papers for GAIA-X AISBL, a non-profit association that will take the project to the next level. GAIA-X: A vision for Europe The initiative’s twenty-two founding members signed the documents in Brussels to create an association for securing funding and commitment from members to fulfill the initiative’s vision for … More

The post GAIA-X to strengthen European digital infrastructure sovereignty appeared first on Help Net Security.

What are the most vulnerable departments and sectors to phishing attacks?

While cyber attackers chase down system vulnerabilities and valuable data each passing day, the business world has taken measures against them. The latest trends and cybersecurity statistics reveal that data from various sources, especially mobile and IoT devices, is targeted and attacked. Organizations face the risk of data loss due to unprotected data and weak cyber security practices. In the first half of last year, 4.1 billion data records were exposed, while the … More

The post What are the most vulnerable departments and sectors to phishing attacks? appeared first on Help Net Security.

Seven questions to ask before selecting a cloud provider

Cloud adoption is essential for digital transformation, but too often, there are unwanted surprises in the process. How can this be avoided? Choosing the right cloud provider to meet your business needs is key. With an increasing number of companies offering a growing menu of solutions, that can be a challenge. To sort it out,…

The post Seven questions to ask before selecting a cloud provider first appeared on IT World Canada.

Devices and Distancing: What Digital Data Says About Life From Home

With millions of us keeping life closer to home in these past months, what can our devices and apps tell us about how we’ve passed that time? Plenty.

Usage stats, location data, app downloads, and daily active users, all drawn from anonymized data, are common statistics that get reported on a regular basis. What makes them particularly insightful this year is seeing how they’ve increased, decreased, or remained steady as nations and communities have put distancing measures in place. How are we living differently, and what role are our devices playing in that?

That’s a rather large question, and different data sets, measurements, and methodologies will point to different insights. However, looking at a few of them together can help us associate some figures with the way our day-to-day experience has changed and continues to evolve.

Our own data shows people are using their desktop and laptop computers more

Using the McAfee PC app, which is always running and protecting our customers in the background, we’re able to look at general PC use. The inference here is that increased use of a desktop or laptop PC (especially during weekdays) indicates an uptick in people engaging in remote work, learning, or play. Our figures are drawn from pseudonymized or anonymized device records aggregated to a country level, with at least 1,000 devices counted.

What did our numbers specifically show? You can visit our Safer Together page and take a country-by-country view of the data, which starts in February. (See our interactive heat map at the bottom of the page.) A quick capsule summary of select nations is below:

PC Usage by Month


Unsurprisingly, the most marked jump in home PC use occurs during the stretch that measures March to April, which marks the period when stay at home guidance rolled into place for many. From there, those increases held relatively steady. Looking at the change from April to May, it appears that people largely stayed at home as well.

Beyond that, June’s week-by-week trends saw usage in Australia and India both increase steadily. The U.S., UK, and Germany also trended upward overall, while France and Italy trended downward.

Other apps and technologies point to other trends

Dating apps saw a big spike in downloads and usage during the same stretch of time. According to dating app Bumble, the end of March saw an 84% increase in the number of its video calls and voice chats. On March 29th, the Tinder dating app reported the highest number of swipes ever in one day up to that point—some 3 billion. As we shared in an article earlier this year about safely dating from home, perhaps this shouldn’t come as any surprise because dating apps are designed to bring people together. In periods of isolation, it follows that people would use them to reach out and make connections where they can.

There’ve been plenty of similar stories (and some surprises) in the news in recent weeks, as various firms, publications, and service providers share some of the digital trends they’ve spotted, such as:

  • In April, online analysis firm Apptopia reported a marked decrease in mobile phone screen time and an increase in time on desktop browsers as people switched to bigger screens. They also tracked a major spike in the download of home improvement retailer apps in the U.S., such as Lowe’s, Home Depot, and Menards—up 69% year-over-year.
  • PC Magazine reports that internet usage surged 47% in January-March of this year. One statistic that underscores this increase is the percentage of people who consume more than 1TB of data in a month. This went from 4.2% of subscribers at the start of 2019 to 10% in the first quarter of 2020. That’s a more than 2x increase in so-called power users.
  • The same report shared further insights, such as collaboration tool Microsoft Teams setting a record for 2.7 billion meeting minutes in a single day and collaboration platform Slack seeing an 80% increase in paid customers over the previous quarter. Likewise, video conferencing tool Zoom saw its daily participants increase by 2,900% in the quarter compared to December 2019.
  • OpenTable, which provides online restaurant reservations across nearly 60,000 restaurants globally and seats 134 million diners monthly, has put out its own data as well. Its “State of the Restaurant Industry” figures offer few surprises as to how hard-hit restaurants around the world have been. Week-to-week comparisons between 2019 and 2020 show that seatings in early June are down roughly 75% globally compared to last year. Later in the month, they are still down 63% compared to the same time last year.

 

Looking ahead: more working from home?

While these statistics each provide their own snapshot of life during lockdown in retrospect, what remains to be seen is how the time we’ve spent at home will shape the way we work, learn, socialize, and entertain ourselves in the months to come. At least right now, it seems that people want or expect to see change. A new study from McAfee surveyed 1,000 working adults in the U.S. between the ages of 18 and 74 in May 2020 and found that nearly half (47%) of employees do not want to go back to working how they were before stay-at-home measures were put in place.

However that plays out in the future, it’s important to protect ourselves today while we continue to rely on our devices so heavily. Comprehensive security protection, like McAfee Total Protection, can help protect devices against malware, phishing attacks, and other threats. Additionally, it includes McAfee WebAdvisor that can help identify malicious websites.

And one last stat: according to Nielsen, there was an 85% increase in American streaming rates in the first three weeks of March this year compared to March 2019 reports. Again, no surprise. Yet one thing to be on the lookout for is phishing and malware attacks associated with movies and shows that are offered for a “free” stream or download. It’s a common method of attack, and we’ve compiled our Top 10 U.S. List of TV and Movie Titles That Could Lead You to a Dangerous Download. Give the article a look. Not only does it name the titles, it offers you great advice for keeping safe.

Stay Updated 

To stay updated on all things McAfee and for more resources on staying secure from home, follow @McAfee_Home on Twitter, listen to our podcast Hackable?, and ‘Like’ us on Facebook.

 

The post Devices and Distancing: What Digital Data Says About Life From Home appeared first on McAfee Blogs.

SCANdalous! (External Detection Using Network Scan Data and Automation)

Real Quick

In case you’re thrown by that fantastic title, our lawyers made us change the name of this project so we wouldn’t get sued. SCANdalous—a.k.a. Scannah Montana a.k.a. Scanny McScanface a.k.a. “Scan I Kick It? (Yes You Scan)”—had another name before today that, for legal reasons, we’re keeping to ourselves. A special thanks to our legal team who is always looking out for us, this blog post would be a lot less fun without them. Strap in folks.

Introduction

Advanced Practices is known for using primary source data obtained through Mandiant Incident Response, Managed Defense, and product telemetry across thousands of FireEye clients. Regular, first-hand observations of threat actors afford us opportunities to learn intimate details of their modus operandi. While our visibility from organic data is vast, we also derive value from third-party data sources. By looking outwards, we extend our visibility beyond our clients’ environments and shorten the time it takes to detect adversaries in the wild—often before they initiate intrusions against our clients.

In October 2019, Aaron Stephens gave his “Scan’t Touch This” talk at the annual FireEye Cyber Defense Summit (slides available on his Github). He discussed using network scan data for external detection and provided examples of how to profile command and control (C2) servers for various post-exploitation frameworks used by criminal and intelligence organizations alike. However, manual application of those techniques doesn’t scale. It may work if your role focuses on one or two groups, but Advanced Practices’ scope is much broader. We needed a solution that would enable us to track thousands of groups, malware families and profiles. In this blog post we’d like to talk about that journey, highlight some wins, and for the first time publicly, introduce the project behind it all: SCANdalous.

Pre-SCANdalous Case Studies

Prior to any sort of system or automation, our team used traditional profiling methodologies to manually identify servers of interest. The following are some examples. The success we found in these case studies served as the primary motivation for SCANdalous.

APT39 SSH Tunneling

After observing APT39 in a series of intrusions, we determined they frequently created Secure Shell (SSH) tunnels with PuTTY Link to forward Remote Desktop Protocol connections to internal hosts within the target environment. Additionally, they preferred using BitVise SSH servers listening on port 443. Finally, they were using servers hosted by WorldStream B.V.

Independent isolation of any one of these characteristics would produce a lot of unrelated servers; however, the aggregation of characteristics provided a strong signal for newly established infrastructure of interest. We used this established profile and others to illuminate dozens of servers we later attributed to APT39, often before they were used against a target.
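To illustrate why the aggregate is a stronger signal than any single characteristic, here is a minimal Python sketch. It is not our production tooling: the record layout (`product`, `port`, `org`), the sample IP addresses, and the "Example Hosting" organization are all hypothetical and exist only to show the idea.

```python
# Hypothetical scan records; field names and values are illustrative only.
SCAN_RECORDS = [
    {"ip": "192.0.2.10", "product": "Bitvise SSH Server", "port": 443, "org": "WorldStream B.V."},
    {"ip": "192.0.2.11", "product": "Bitvise SSH Server", "port": 22, "org": "WorldStream B.V."},
    {"ip": "192.0.2.12", "product": "OpenSSH", "port": 443, "org": "WorldStream B.V."},
    {"ip": "192.0.2.13", "product": "Bitvise SSH Server", "port": 443, "org": "Example Hosting"},
]

# Each characteristic on its own is a weak signal.
PROFILE = {
    "product": lambda r: "bitvise" in r["product"].lower(),
    "port": lambda r: r["port"] == 443,
    "org": lambda r: r["org"] == "WorldStream B.V.",
}


def matches_single(records, name):
    """Return IPs that satisfy one characteristic of the profile."""
    check = PROFILE[name]
    return [r["ip"] for r in records if check(r)]


def matches_profile(records):
    """Return IPs that satisfy every characteristic of the profile."""
    return [r["ip"] for r in records if all(check(r) for check in PROFILE.values())]


if __name__ == "__main__":
    for name in PROFILE:
        print(name, matches_single(SCAN_RECORDS, name))
    print("full profile", matches_profile(SCAN_RECORDS))
```

In this toy data set each individual check returns three servers, while the combined profile narrows the result to one.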

APT34 QUADAGENT

In February 2018, an independent researcher shared a sample of what would later be named QUADAGENT. We had not observed it in an intrusion yet; however, by analyzing the characteristics of the C2, we were able to develop a strong profile of the servers to track over time. For example, our team identified the server 185.161.208[.]37 and domain rdppath[.]com within hours of it being established. A week later, we identified a QUADAGENT dropper with the previously identified C2. Additional examples of QUADAGENT are depicted in Figure 1.


Figure 1: QUADAGENT C2 servers in the Shodan user interface

Five days after the QUADAGENT dropper was identified, Mandiant was engaged by a victim that was targeted via the same C2. This activity was later attributed to APT34. During the investigation, Mandiant uncovered APT34 using RULER.HOMEPAGE. This was the first time our consultants observed the tool and technique used in the wild by a real threat actor. Our team developed a profile of servers hosting HOMEPAGE payloads and began tracking their deployment in the wild. Figure 2 shows a timeline of QUADAGENT C2 servers discovered between February and November of 2018.


Figure 2: Timeline of QUADAGENT C2 servers discovered throughout 2018

APT33 RULER.HOMEPAGE, POSHC2, and POWERTON

A month after that aforementioned intrusion, Managed Defense discovered a threat actor using RULER.HOMEPAGE to download and execute POSHC2. All the RULER.HOMEPAGE servers were previously identified due to our efforts. Our team developed a profile for POSHC2 and began tracking their deployment in the wild. The threat actor pivoted to a novel PowerShell backdoor, POWERTON. Our team repeated our workflow and began illuminating those C2 servers as well. This activity was later attributed to APT33 and was documented in our OVERRULED post.

SCANdalous

Scanner, Better, Faster, Stronger

Our use of scan data was proving wildly successful, and we wanted to use more of it, but we needed to innovate. How could we leverage this dataset and methodology to track not one or two, but dozens of active groups that we observe across our solutions and services? Even if every member of Advanced Practices was dedicated to external detection, we would still not have enough time or resources to keep up with the amount of manual work required. But that’s the key word: Manual. Our workflow consumed hours of individual analyst actions, and we had to change that. This was the beginning of SCANdalous: An automated system for external detection using third-party network scan data.

A couple of nice things about computers: They’re great at multitasking, and they don’t forget. The tasks that were taking us hours to do—if we had time, and if we remembered to do them every day—were now taking SCANdalous minutes if not seconds. This not only afforded us additional time for analysis, it gave us the capability to expand our scope. Now we not only look for specific groups, we also search for common malware, tools and frameworks in general. We deploy weak signals (or broad signatures) for software that isn’t inherently bad, but is often used by threat actors.

Our external detection was further improved by automating additional collection tasks, executed by SCANdalous upon a discovery—we call them follow-on actions. For example, if an interesting open directory is identified, acquire certain files. These actions ensure the team never misses an opportunity during “non-working hours.” If SCANdalous finds something interesting on a weekend or holiday, we know it will perform the time-sensitive tasks against the server and in defense of our clients.
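The general shape of signatures paired with follow-on actions can be sketched as follows. This is an assumption-laden illustration, not the SCANdalous implementation: the `Signature` class, the `open_directory` field, and the `queue_file_acquisition` action are hypothetical names, and the action only returns a description of what it would do rather than touching the network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Signature:
    """A named predicate over a scan record, plus actions to run on a hit."""
    name: str
    predicate: Callable[[Dict], bool]
    follow_on: List[Callable[[Dict], str]] = field(default_factory=list)


def queue_file_acquisition(hit: Dict) -> str:
    # Placeholder action: a real one would fetch files from the server.
    return f"queue file acquisition from {hit['ip']}"


def run_signatures(signatures: List[Signature], records: List[Dict]):
    """Apply every signature to every record and run follow-on actions on hits."""
    results = []
    for record in records:
        for sig in signatures:
            if sig.predicate(record):
                actions = [action(record) for action in sig.follow_on]
                results.append((sig.name, record["ip"], actions))
    return results


# Hypothetical signature and scan records for demonstration.
SIGNATURES = [
    Signature(
        name="interesting-open-directory",
        predicate=lambda r: r.get("open_directory", False),
        follow_on=[queue_file_acquisition],
    ),
]
RECORDS = [
    {"ip": "192.0.2.20", "open_directory": True},
    {"ip": "192.0.2.21", "open_directory": False},
]

if __name__ == "__main__":
    for hit in run_signatures(SIGNATURES, RECORDS):
        print(hit)
```

Because the actions are attached to the signature, they run whenever a hit occurs, with no analyst needing to be online.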

The data we collect not only helps us track things we aren’t seeing at our clients, it allows us to provide timely and historical context to our incident responders and security analysts. Taking observations from Mandiant Incident Response or Managed Defense and distilling them into knowledge we can carry forward has always been our bread and butter. Now, with SCANdalous in the mix, we can project that knowledge out onto the Internet as a whole.

Collection Metrics

Looking back on where we started with our manual efforts, we’re pleased to see how far this project has come, which is perhaps best illustrated by the numbers. Today, SCANdalous holds over five thousand signatures across multiple sources, covering dozens of named malware families and threat groups, and these numbers continue to grow as we write. Since its inception, SCANdalous has produced over two million hits, every single one of them a piece of contextualized data that helps our team make analytical decisions. Of course, raw volume isn’t everything, so let’s dive a little deeper.

When an analyst discovers that an IP address has been used by an adversary against a named organization, they denote that usage in our knowledge store. While the time at which this observation occurs does not always correlate with when it was used in an intrusion, knowing when we became aware of that use is still valuable. We can cross-reference these times with data from SCANdalous to help us understand the impact of our external detection.

Looking at the IP addresses marked by an analyst as observed at a client in the last year, we find that 21.7% (more than one in five) were also found by SCANdalous. Of that fifth, SCANdalous has an average lead time of 47 days. If we only consider the IP addresses that SCANdalous found first, the average lead time jumps to 106 days. Going even deeper and examining this data month-to-month, we find a steady upward trend in the percentage of IP addresses identified by SCANdalous before being observed at a client (Figure 3).
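For readers who want to compute similar metrics on their own data, the following Python sketch shows one way the overlap percentage and the two lead-time averages could be calculated. The dictionaries and dates are invented for illustration and do not reproduce our figures; we also assume here that "lead time" means the analyst observation date minus the first scan hit date, so it is negative when the scan hit came later.

```python
from datetime import date

# Hypothetical dates: when an analyst marked the IP as observed at a client.
OBSERVED = {
    "198.51.100.1": date(2020, 3, 1),
    "198.51.100.2": date(2020, 3, 1),
    "198.51.100.3": date(2020, 4, 1),
    "198.51.100.4": date(2020, 4, 15),
    "198.51.100.5": date(2020, 5, 1),
}

# Hypothetical dates: when the scan-based system first flagged the IP.
SCAN_HITS = {
    "198.51.100.1": date(2020, 1, 1),
    "198.51.100.2": date(2020, 3, 11),
    "192.0.2.77": date(2020, 2, 2),
}


def lead_time_metrics(observed, scan_hits):
    """Compute overlap percentage and average lead times in days."""
    overlap = [ip for ip in observed if ip in scan_hits]
    # Positive when the scan hit predates the analyst observation.
    lead_days = [(observed[ip] - scan_hits[ip]).days for ip in overlap]
    found_first = [d for d in lead_days if d > 0]
    return {
        "overlap_pct": 100.0 * len(overlap) / len(observed) if observed else 0.0,
        "avg_lead_all": sum(lead_days) / len(lead_days) if lead_days else 0.0,
        "avg_lead_found_first": sum(found_first) / len(found_first) if found_first else 0.0,
    }


if __name__ == "__main__":
    print(lead_time_metrics(OBSERVED, SCAN_HITS))
```

Averaging over all overlapping addresses includes cases where the scan hit came after the observation, which is why restricting to "found first" addresses raises the average.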


Figure 3: Percentage of IP addresses found by SCANdalous before being marked as observed at a client by a FireEye analyst

A similar pattern can be seen for SCANdalous’ average lead time over the same data (Figure 4).


Figure 4: Average lead time in days for SCANdalous over the same data shown in Figure 3

As we continue to create signatures and increase our external detection efforts, we can see from these numbers that the effectiveness and value of the resulting data grow as well.

SCANdalous Case Studies

Today in Advanced Practices, SCANdalous is a core element of our external detection work. It has provided us with a new lens through which we can observe threat activity on a scale and scope beyond our organic data, and enriches our workflows in support of Mandiant. Here are a few of our favorite examples:

FIN6

In early 2019, SCANdalous identified a Cobalt Strike C2 server that we were able to associate with FIN6. Four hours later, the server was used to target a Managed Defense client, as discussed in our blog post, Pick-Six: Intercepting a FIN6 Intrusion, an Actor Recently Tied to Ryuk and LockerGoga Ransomware.

FIN7

In late 2019, SCANdalous identified a BOOSTWRITE C2 server and automatically acquired keying material that was later used to decrypt files found in a FIN7 intrusion worked by Mandiant consultants, as discussed in our blog post, Mahalo FIN7: Responding to the Criminal Operators’ New Tools and Techniques.

UNC1878 (financially motivated)

Some of you may also remember our recent blog post on UNC1878. It serves as a great case study for how we grow an initial observation into a larger set of data, and then use that knowledge to find more activity across our offerings. Much of the early work that went into tracking that activity (see the section titled “Expansion”) happened via SCANdalous. The quick response from Managed Defense gave us just enough information to build a profile of the C2 and let our automated system take it from there. Over the next couple months, SCANdalous identified numerous servers matching UNC1878’s profile. This allowed us to not only analyze and attribute new network infrastructure, it also helped us observe when and how they were changing their operations over time.

Conclusion

There are hundreds more stories to tell, but the point is the same. When we find value in an analytical workflow, we ask ourselves how we can do it better and faster. The automation we build into our tools allows us to not only accomplish more of the work we were doing manually, it enables us to work on things we never could before. Of course, the conversion doesn’t happen all at once. Like all good things, we made a lot of incremental improvements over time to get where we are today, and we’re still finding ways to make more. Continuing to innovate is how we keep moving forward – as Advanced Practices, as FireEye, and as an industry.

Example Signatures

The following are example Shodan queries; however, any source of scan data can be used.

Used to Identify APT39 C2 Servers

  • product:"bitvise" port:"443" org:"WorldStream B.V."

Used to Identify QUADAGENT C2 Servers

  • "PHP/7.2.0beta2"

RULER.HOMEPAGE Payloads

  • html:"clsid:0006F063-0000-0000-C000-000000000046"

Excelerating Analysis, Part 2 — X[LOOKUP] Gon’ Pivot To Ya

In December 2019, we published a blog post on augmenting analysis using Microsoft Excel for various data sets for incident response investigations. As we described, investigations often include custom or proprietary log formats and miscellaneous, non-traditional forensic artifacts. There are, of course, a variety of ways to tackle this task, but Excel stands out as a reliable way to analyze and transform a majority of data sets we encounter.

In our first post, we discussed summarizing verbose artifacts using the CONCAT function, converting timestamps using the TIME function, and using the COUNTIF function for log baselining. In this post, we will cover two additional versatile features of Excel: LOOKUP functions and PivotTables.

For this scenario, we will use a dataset of logon events for an example Microsoft Office 365 (O365) instance to demonstrate how an analyst can enrich information in the dataset. Then we will demonstrate some examples of how to use PivotTables to summarize information and highlight anomalies in the data quickly.

Our data contains the following columns:

  • Description – Event description
  • User – User’s name
  • User Principle Name – email address
  • App – such as Office 365, Sharepoint, etc.
  • Location – Country
  • Date
  • IP address
  • User agent (simplified)
  • Organization – associated with IP address (as identified by O365)


Figure 1: O365 data set

LOOKUP for Data Enrichment

It may be useful to add more information to the data that could help us in analysis that isn’t provided by the original log source. A step FireEye Mandiant often performs during investigations is to take all unique IP addresses and query threat intelligence sources for each IP address for reputation, WHOIS information, connections to known threat actor activity, etc. This grants more information about each IP address that we can take into consideration in our analysis.

While FireEye Mandiant is privy to historical engagement data and Mandiant Threat Intelligence, if security teams or organizations do not have access to commercial threat intelligence feeds, there are numerous open source intelligence services that can be leveraged.

We can also use IP address geolocation services to obtain latitude and longitude related to each source IP address. This information may be useful in identifying anomalous logons based on geographical location.

After taking all source IP addresses, running them against threat intelligence feeds and geolocating them, we have the following data added to a second sheet called “IP Address Intel” in our Excel document:


Figure 2: IP address enrichment

We can already see before we even dive into the logs themselves that we have suspicious activity: The five IP addresses in the 203.0.113.0/24 range in our data are known to be associated with activity connected to a fictional threat actor tracked as TMP.OGRE.

To enrich our original dataset, we will add three columns to our data to integrate the supplementary information: “Latitude,” “Longitude,” and “Threat Intel” (Figure 3). We can use the VLOOKUP or XLOOKUP functions to quickly retrieve the supplementary data and integrate it into our main O365 log sheet.


Figure 3: Enrichment columns

VLOOKUP

The traditional way to look up particular data in another array is by using the VLOOKUP function. We will use the following formula to reference the “Latitude” values for a given IP address:


Figure 4: VLOOKUP formula for Latitude

There are four parts to this formula:

  1. Value to look up:
    • This dictates what cell value we are going to look up more information for. In this case, it is cell G2, which is the IP address.
  2. Table array:
    • This defines the entire array in which we will look up our value and from which we will return data. The first column in the array must contain the value being looked up. In the aforementioned example, we are searching in 'IP Address Intel'!$A$2:$D$15. In other words, we are looking in the other sheet in this workbook we created earlier titled “IP Address Intel”, then in that sheet, searching in the cell range of A2 to D15.

      Figure 5: VLOOKUP table array

      Note the use of the “$” to ensure these are absolute references and will not be updated by Excel if we copy this formula to other cells.
  3. Column index number:
    • This identifies the column number from which to return data. The first column is considered column 1. We want to return the “Latitude” value for the given IP address, so in the aforementioned example, we tell Excel to return data from column 2.
  4. Range lookup (match type)
    • This part of the formula tells Excel what type of matching to perform on the value being looked up. Excel defaults to “Approximate” matching, which assumes the data is sorted and will match the closest value. We want to perform “Exact” matching, so we put “0” here (“FALSE” is also accepted).

With the VLOOKUP function complete for the “Latitude” data, we can use the fill handle to update this field for the rest of the data set.

To get the values for the “Longitude” and “Threat Intel” columns, we repeat the process by using a similar function, adjusting the column index number to reference the appropriate columns, and then using the fill handle to fill in the rest of the column in our O365 data sheet:

  • For Longitude:
    • =VLOOKUP(G2,'IP Address Intel'!$A$2:$D$15,3,0)
  • For Threat Intel:
    • =VLOOKUP(G2,'IP Address Intel'!$A$2:$D$15,4,0)
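For readers who prefer to see the lookup logic spelled out, the four parts of the formula map onto a short Python function. This is only a sketch of exact-match behavior: the `vlookup` helper and the coordinate values in `INTEL_ROWS` are made up for illustration and are not the data from our sheet.

```python
# Rows stand in for 'IP Address Intel'!$A$2:$D$15:
# IP address, latitude, longitude, threat intel. Values are invented.
INTEL_ROWS = [
    ("198.51.100.126", 10.0, 20.0, ""),
    ("203.0.113.5", 30.0, 40.0, "TMP.OGRE"),
]


def vlookup(value, table, col_index):
    """Exact-match lookup, similar to VLOOKUP(value, table, col_index, 0).

    The first column of each row must hold the value being looked up,
    and col_index is 1-based, as in Excel.
    """
    for row in table:
        if row[0] == value:
            return row[col_index - 1]
    return "#N/A"  # Excel's result when an exact match is not found


if __name__ == "__main__":
    ip = "203.0.113.5"
    print(vlookup(ip, INTEL_ROWS, 2))  # column 2: Latitude
    print(vlookup(ip, INTEL_ROWS, 3))  # column 3: Longitude
    print(vlookup(ip, INTEL_ROWS, 4))  # column 4: Threat Intel
```

As in the formulas above, only the column index changes between the three lookups; the lookup value and table stay the same.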

Bonus Option: XLOOKUP

The XLOOKUP function in Excel is a more efficient way to reference the threat intelligence data sheet. XLOOKUP is a newer function introduced to Excel to replace the legacy VLOOKUP function and, at the time of writing this post, is only available to “O365 subscribers in the Monthly channel”, according to Microsoft. In this instance, we will also leverage Excel’s dynamic arrays and “spilling” to fill in this data more efficiently, instead of making an XLOOKUP function for each column.

NOTE: To utilize dynamic arrays and spilling, the data we are seeking to enrich cannot be in the form of a “Table” object. Instead, we will apply filters to the top row of our O365 data set by selecting the “Filter” option under “Sort & Filter” in the “Home” ribbon:


Figure 6: Filter option

To reference the threat intelligence data sheet using XLOOKUP, we will use the following formula:


Figure 7: XLOOKUP function for enrichment

There are three parts to this XLOOKUP formula:

  1. Value to lookup:
    • This dictates what cell value we are going to look up more information for. In this case, it is cell G2, which is the IP address.
  2. Array to look in:
    • This will be the array of data in which Excel will search for the value to look up. Excel does exact matching by default for XLOOKUP. In the aforementioned example, we are searching in 'IP Address Intel'!$A$2:$A$15. In other words, we are looking in the other sheet in this workbook titled “IP Address Intel”, then in that sheet, searching in the cell range of A2 to A15:

      Figure 8: XLOOKUP array to look in

      Note the use of the “$” to ensure these are absolute references and will not be updated by Excel if we copy this formula to other cells.
  3. Array of data to return:
    • This part will be the array of data from which Excel will return data. In this case, Excel will return the data contained within the absolute range of B2 to D15 from the “IP Address Intel” sheet for the value that was looked up. In the aforementioned example formula, it will return the values in the row for the IP address 198.51.100.126:

      Figure 9: Data to be returned from ‘IP Address Intel’ sheet

      Because this is leveraging dynamic arrays and spilling, all three cells of the returned data will populate, as seen in Figure 4.

Now that our dataset is completely enriched by either using VLOOKUP or XLOOKUP, we can start hunting for anomalous activity. As a quick first step, since we know at least a handful of IP addresses are potentially malicious, we can filter on the “Threat Intel” column for all rows that match “TMP.OGRE” and reveal logons with source IP addresses related to known threat actors. Now we have timeframes and suspected compromised accounts to pivot off of for additional hunting through other data.
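The XLOOKUP enrichment and the follow-up filter can be sketched in Python as well. Everything here is an illustrative assumption: the `xlookup`, `enrich`, and `filter_threat` helpers, the sample rows, and the coordinates are invented, and the user-to-IP pairing is not taken from our data set. The point is only that one lookup returns all three enrichment values at once, much like a spilled result.

```python
# Stand-in for the "IP Address Intel" sheet; all values are made up.
INTEL_SHEET = [
    ("198.51.100.126", 10.0, 20.0, ""),
    ("203.0.113.5", 30.0, 40.0, "TMP.OGRE"),
]
LOOKUP_ARRAY = [row[0] for row in INTEL_SHEET]   # like $A$2:$A$15
RETURN_ARRAY = [row[1:] for row in INTEL_SHEET]  # like $B$2:$D$15


def xlookup(value, lookup_array, return_array, if_not_found=(None, None, "")):
    """Exact-match lookup that returns the whole matching row of return_array.

    Excel returns #N/A by default when nothing matches; this sketch uses
    an empty placeholder instead so the caller can always unpack three values.
    """
    for key, result in zip(lookup_array, return_array):
        if key == value:
            return result
    return if_not_found


# Hypothetical log rows, reduced to the fields needed here.
LOG_ROWS = [
    {"user": "Ginger Breadman", "ip": "198.51.100.126"},
    {"user": "William Brody", "ip": "203.0.113.5"},
]


def enrich(rows):
    """Add latitude, longitude, and threat intel to each log row."""
    enriched = []
    for row in rows:
        lat, lon, intel = xlookup(row["ip"], LOOKUP_ARRAY, RETURN_ARRAY)
        enriched.append({**row, "latitude": lat, "longitude": lon, "threat_intel": intel})
    return enriched


def filter_threat(rows, actor):
    """Keep only rows whose threat intel column matches the given actor."""
    return [r for r in rows if r["threat_intel"] == actor]


if __name__ == "__main__":
    for row in filter_threat(enrich(LOG_ROWS), "TMP.OGRE"):
        print(row)
```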

PIVOT! PIVOT! PIVOT!

One of the most useful tools for highlighting anomalies by summarizing data, performing frequency analysis and quickly obtaining other statistics about a given dataset is Excel’s PivotTable function.

Location Anomalies

Let’s utilize a PivotTable to perform frequency analysis on the location from which users logged in. This type of technique may highlight activity where a user account logged in from a location which is unusual for them.

To create a PivotTable for our data, we can select any cell in our O365 data and select the entire range with Ctrl+A. Then, under the “Insert” tab in the ribbon, select “PivotTable”:


Figure 10: PivotTable selection

This will bring up a window, as seen in Figure 11, to confirm the data for which we want to make a PivotTable (Step 1 in Figure 11). Since we selected our O365 log data set with Ctrl+A, this should be automatically populated. It will also ask where we want to put the PivotTable (Step 2 in Figure 11). In this instance, we created another sheet called “PivotTable 1” to place the PivotTable:


Figure 11: PivotTable creation

Now that the PivotTable is created, we must select how we want to populate the PivotTable using our data. Remember, we are trying to determine the locations from which all users logged in. We will want a row for each user and a sub-row for each location the user has logged in from. Let’s add a count of how many times they logged in from each location as well. We will use the “Date” field to do this for this example:


Figure 12: PivotTable field definitions

Examining this table, we can immediately see there are two users with source location anomalies: Ginger Breadman and William Brody have a small number of logons from “FarFarAway”, which is abnormal for these users based on this data set.

We can add more data to this PivotTable to get a timeframe of this suspicious activity by adding two more “Date” fields to the “Values” area. Excel defaults to “Count” of whatever field we drop in this area, but we will change this to the “Minimum” and “Maximum” values by using the “Value Field Settings”, as seen in Figure 13.


Figure 13: Adding min and max dates

Now we have a PivotTable that shows us anomalous locations for logons, as well as the timeframe in which the logons occurred, so we can hone our investigation. For this example, we also formatted all cells with timestamp values to reflect the format FireEye Mandiant typically uses during analysis by selecting all the appropriate cells, right-clicking and choosing “Format Cells”, and using a “Custom” format of “YYYY-MM-DD HH:MM:SS”.


Figure 14: PivotTable with suspicious locations and timeframe
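The same summary (count, earliest, and latest logon per user and location) can be reproduced outside Excel. The sketch below is a hedged illustration using invented timestamps and a placeholder "HomeCountry" location; only the user names and "FarFarAway" come from our example, and the helper name `pivot_user_location` is our own invention for this post.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # mirrors the YYYY-MM-DD HH:MM:SS cell format

# Hypothetical logon rows: user, location, timestamp.
LOGONS = [
    ("Ginger Breadman", "HomeCountry", "2020-01-06 08:15:00"),
    ("Ginger Breadman", "HomeCountry", "2020-01-07 09:02:11"),
    ("Ginger Breadman", "FarFarAway", "2020-01-08 02:41:37"),
    ("Ginger Breadman", "FarFarAway", "2020-01-08 03:05:09"),
    ("William Brody", "HomeCountry", "2020-01-06 10:30:00"),
    ("William Brody", "FarFarAway", "2020-01-09 01:12:45"),
]


def pivot_user_location(logons):
    """Group by (user, location) and track count, first, and last logon."""
    table = {}
    for user, location, stamp in logons:
        when = datetime.strptime(stamp, FMT)
        key = (user, location)
        if key not in table:
            table[key] = {"count": 1, "first": when, "last": when}
        else:
            entry = table[key]
            entry["count"] += 1
            entry["first"] = min(entry["first"], when)
            entry["last"] = max(entry["last"], when)
    return table


if __name__ == "__main__":
    for (user, location), stats in sorted(pivot_user_location(LOGONS).items()):
        print(user, location, stats["count"],
              stats["first"].strftime(FMT), stats["last"].strftime(FMT))
```

Rows with a low count for a given user stand out in the same way they do in the PivotTable, and the first and last timestamps bound the window to investigate.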

IP Address Anomalies

Geolocation anomalies may not always be valuable. However, using a similar configuration as the previous example, we can identify suspicious source IP addresses. We will add “User Principle Name” and “IP Address” fields as Rows, and “IP Address” as Values. Let’s also add the “App” field to Columns. Our field settings and resulting table are displayed in Figure 15:


Figure 15: PivotTable with IP addresses and apps

With just a few clicks, we have a summarized table indicating which IP addresses each user logged in from, and which app they logged into. We can quickly identify two users logged in from IP addresses in the 203.0.113.0/24 range six times, and which applications they logged into from each of these IP addresses.
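A comparable summary can be built in a few lines of Python. This sketch assumes hypothetical logon rows (the example.com addresses and the specific IPs are invented) and uses the standard library's `ipaddress` module to pick out entries in the 203.0.113.0/24 range.

```python
import ipaddress
from collections import Counter

# Hypothetical rows: user principal name, source IP, app.
LOGONS = [
    ("ginger@example.com", "203.0.113.10", "Office 365"),
    ("ginger@example.com", "203.0.113.10", "SharePoint"),
    ("ginger@example.com", "198.51.100.126", "Office 365"),
    ("william@example.com", "203.0.113.25", "Office 365"),
    ("william@example.com", "198.51.100.40", "Office 365"),
    ("william@example.com", "198.51.100.40", "Office 365"),
]


def pivot_ip_app(logons):
    """Count logons per (user, IP address, app) combination."""
    return Counter(logons)


def logons_in_range(pivot, cidr):
    """Return the pivot entries whose source IP falls inside the CIDR range."""
    network = ipaddress.ip_network(cidr)
    return {
        key: count
        for key, count in pivot.items()
        if ipaddress.ip_address(key[1]) in network
    }


if __name__ == "__main__":
    pivot = pivot_ip_app(LOGONS)
    for (upn, ip, app), count in sorted(logons_in_range(pivot, "203.0.113.0/24").items()):
        print(upn, ip, app, count)
```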

While these are just a couple use cases, there are many ways to format and view evidence using PivotTables. We recommend trying PivotTables on any data set being reviewed with Excel and experimenting with the Rows, Columns, and Values parameters.

We also recommend adjusting the PivotTable options, which can help reformat the table itself into a format that might fit requirements.

Conclusion

These Excel functions are used frequently during investigations at FireEye Mandiant and are considered important forensic analysis techniques. The examples we give here are just a glimpse into the utility of LOOKUP functions and PivotTables. LOOKUP functions can be used to reference a multitude of data sources and can be applied in other situations during investigations such as tracking remediation and analysis efforts.

PivotTables may be used in a variety of ways as well, depending on what data is available, and what sort of information is being analyzed to identify suspicious activity. Employing these techniques, alongside the ones we highlighted previously, on a consistent basis will go a long way in "excelerating" forensic analysis skills and efficiency.