Author Archives: Richard Bejtlich

Dissecting Weird Packets

I was investigating traffic in my home lab yesterday, and noticed that about 1% of the traffic was weird. Before I describe the weird, let me show you a normal frame for comparison's sake.


This is a normal frame with Ethernet II encapsulation. It begins with 6 bytes of the destination MAC address, 6 bytes of the source MAC address, and 2 bytes of an Ethertype, which in this case is 0x0800, indicating that an IPv4 packet follows the Ethernet header. There is no TCP payload, as this is an ACK segment.

You can also see this in Tshark.

$ tshark -Vx -r frame4238.pcap

Frame 1: 66 bytes on wire (528 bits), 66 bytes captured (528 bits)
    Encapsulation type: Ethernet (1)
    Arrival Time: May  7, 2019 18:19:10.071831000 UTC
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1557253150.071831000 seconds
    [Time delta from previous captured frame: 0.000000000 seconds]
    [Time delta from previous displayed frame: 0.000000000 seconds]
    [Time since reference or first frame: 0.000000000 seconds]
    Frame Number: 1
    Frame Length: 66 bytes (528 bits)
    Capture Length: 66 bytes (528 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:ethertype:ip:tcp]
Ethernet II, Src: IntelCor_12:7d:bb (38:ba:f8:12:7d:bb), Dst: Ubiquiti_49:e0:10 (fc:ec:da:49:e0:10)
    Destination: Ubiquiti_49:e0:10 (fc:ec:da:49:e0:10)
        Address: Ubiquiti_49:e0:10 (fc:ec:da:49:e0:10)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: IntelCor_12:7d:bb (38:ba:f8:12:7d:bb)
        Address: IntelCor_12:7d:bb (38:ba:f8:12:7d:bb)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 192.168.4.96, Dst: 52.21.18.219
    0100 .... = Version: 4
    .... 0101 = Header Length: 20 bytes (5)
    Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
        0000 00.. = Differentiated Services Codepoint: Default (0)
        .... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    Total Length: 52
    Identification: 0xd98c (55692)
    Flags: 0x4000, Don't fragment
        0... .... .... .... = Reserved bit: Not set
        .1.. .... .... .... = Don't fragment: Set
        ..0. .... .... .... = More fragments: Not set
        ...0 0000 0000 0000 = Fragment offset: 0
    Time to live: 64
    Protocol: TCP (6)
    Header checksum: 0x553f [validation disabled]
    [Header checksum status: Unverified]
    Source: 192.168.4.96
    Destination: 52.21.18.219
Transmission Control Protocol, Src Port: 38828, Dst Port: 443, Seq: 1, Ack: 1, Len: 0
    Source Port: 38828
    Destination Port: 443
    [Stream index: 0]
    [TCP Segment Len: 0]
    Sequence number: 1    (relative sequence number)
    [Next sequence number: 1    (relative sequence number)]
    Acknowledgment number: 1    (relative ack number)
    1000 .... = Header Length: 32 bytes (8)
    Flags: 0x010 (ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgment: Set
        .... .... 0... = Push: Not set
        .... .... .0.. = Reset: Not set
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
        [TCP Flags: ·······A····]
    Window size value: 296
    [Calculated window size: 296]
    [Window size scaling factor: -1 (unknown)]
    Checksum: 0x08b0 [unverified]
    [Checksum Status: Unverified]
    Urgent pointer: 0
    Options: (12 bytes), No-Operation (NOP), No-Operation (NOP), Timestamps
        TCP Option - No-Operation (NOP)
            Kind: No-Operation (1)
        TCP Option - No-Operation (NOP)
            Kind: No-Operation (1)
        TCP Option - Timestamps: TSval 26210782, TSecr 2652693036
            Kind: Time Stamp Option (8)
            Length: 10
            Timestamp value: 26210782
            Timestamp echo reply: 2652693036
    [Timestamps]
        [Time since first frame in this TCP stream: 0.000000000 seconds]
        [Time since previous frame in this TCP stream: 0.000000000 seconds]

0000  fc ec da 49 e0 10 38 ba f8 12 7d bb 08 00 45 00   ...I..8...}...E.
0010  00 34 d9 8c 40 00 40 06 55 3f c0 a8 04 60 34 15   .4..@.@.U?...`4.
0020  12 db 97 ac 01 bb e3 42 2a 57 83 49 c2 ea 80 10   .......B*W.I....
0030  01 28 08 b0 00 00 01 01 08 0a 01 8f f1 de 9e 1c   .(..............
0040  e2 2c   

You can see Wireshark understands what it is seeing. It decodes the IP header and the TCP header.
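If you just want those three Ethernet header fields, Tshark can extract them directly. Here's a quick sketch against the same file, using the standard Wireshark field names; it should print the two MAC addresses followed by 0x0800:

$ tshark -r frame4238.pcap -T fields -e eth.dst -e eth.src -e eth.type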

So far so good. Here is an example of the weird traffic I was seeing.



Here is what Tshark thinks of it.

$ tshark -Vx -r frame4241.pcap
Frame 1: 66 bytes on wire (528 bits), 66 bytes captured (528 bits)
    Encapsulation type: Ethernet (1)
    Arrival Time: May  7, 2019 18:19:10.073296000 UTC
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1557253150.073296000 seconds
    [Time delta from previous captured frame: 0.000000000 seconds]
    [Time delta from previous displayed frame: 0.000000000 seconds]
    [Time since reference or first frame: 0.000000000 seconds]
    Frame Number: 1
    Frame Length: 66 bytes (528 bits)
    Capture Length: 66 bytes (528 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: eth:llc:data]
IEEE 802.3 Ethernet
    Destination: Ubiquiti_49:e0:10 (fc:ec:da:49:e0:10)
        Address: Ubiquiti_49:e0:10 (fc:ec:da:49:e0:10)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: IntelCor_12:7d:bb (38:ba:f8:12:7d:bb)
        Address: IntelCor_12:7d:bb (38:ba:f8:12:7d:bb)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Length: 56
        [Expert Info (Error/Malformed): Length field value goes past the end of the payload]
            [Length field value goes past the end of the payload]
            [Severity level: Error]
            [Group: Malformed]
Logical-Link Control
    DSAP: Unknown (0x45)
        0100 010. = SAP: Unknown
        .... ...1 = IG Bit: Group
    SSAP: LLC Sub-Layer Management (0x02)
        0000 001. = SAP: LLC Sub-Layer Management
        .... ...0 = CR Bit: Command
    Control field: U, func=Unknown (0x0B)
        000. 10.. = Command: Unknown (0x02)
        .... ..11 = Frame type: Unnumbered frame (0x3)
Data (49 bytes)
    Data: 84d98d86b5400649eec0a80460341512db97ac0d0be3422a...
    [Length: 49]

0000  fc ec da 49 e0 10 38 ba f8 12 7d bb 00 38 45 02   ...I..8...}..8E.
0010  0b 84 d9 8d 86 b5 40 06 49 ee c0 a8 04 60 34 15   ......@.I....`4.
0020  12 db 97 ac 0d 0b e3 42 2a 57 83 49 c2 ea c8 ec   .......B*W.I....
0030  01 28 17 6f 00 00 01 01 08 0a 01 8f f1 de ed 7f   .(.o............
0040  a5 4a                                             .J

What's the problem? This frame begins with 6 bytes of the destination MAC address and 6 bytes of the source MAC address, as we saw before. However, the next two bytes are 0x0038, not the Ethertype of 0x0800 we saw earlier. By convention, a value of 0x0600 (decimal 1536) or greater in this field is an Ethertype, while a value of 1500 or less is a length. 0x0038 is decimal 56, which would therefore be interpreted as a length, although it doesn't add up: the 802.3 length field counts the bytes that follow it, and this 66-byte frame carries only 52 bytes after the 14-byte header. That is why Wireshark flags the length field as malformed.

Wireshark therefore treats this frame not as Ethernet II, but as IEEE 802.3 Ethernet. I had to refer to appendix A of my first book to see what this meant.

For comparison, here is the frame format for Ethernet II (page 664):

This was what we saw with frame 4238 earlier -- Dst MAC, Src MAC, Ethertype, then data.

Here is the frame format for IEEE 802.3 Ethernet.


This is much more complicated: Dst MAC, Src MAC, length, and then DSAP, SSAP, Control, and data.

It turns out that this format doesn't seem to fit what is happening in frame 4241, either. While the length field appears to be in the ballpark, Wireshark's assumption that the next bytes are DSAP, SSAP, Control, and data doesn't fit. The clue for me was seeing that 0x45 followed the presumed length field. I recognized 0x45 as the beginning of an IP header, with 4 meaning IPv4 and 5 meaning 5 32-bit words (20 bytes) in the IP header.
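Incidentally, this suggests a quick way to find the other weird frames in a larger capture: filter on the presence of the 802.3 length field, which Wireshark names eth.len. A sketch, assuming the excerpt file mentioned later in this post:

$ tshark -r excerpt.pcap -Y "eth.len" -T fields -e frame.number -e frame.len -e eth.len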

If we take a manual byte-by-byte comparative approach, we can better understand what may be happening with these two frames. (I broke the 0x45 byte into two "nibbles" in one case.)

Note the parts of each frame that are exactly the same; they are marked as identical in the comparison below.
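Reconstructed from the two hex dumps above, with frame 4238 on the left and frame 4241 on the right:

Frame 4238    Frame 4241    Field (as decoded in frame 4238)
fcecda49e010  fcecda49e010  Destination MAC (identical)
38baf8127dbb  38baf8127dbb  Source MAC (identical)
0800          0038          Ethertype vs. apparent length
45            45            IP version (4) and header length (5 words, 20 bytes) (identical)
00            02            DSCP/ECN
0034          0b84          IP total length
d98c          d98d          IP ID (incremented by one)
4000          86b5          IP flags and fragment offset
4006          4006          TTL and protocol (TCP) (identical)
553f          49ee          IP header checksum
c0a80460      c0a80460      Source IP 192.168.4.96 (identical)
341512db      341512db      Destination IP 52.21.18.219 (identical)
97ac          97ac          Source port 38828 (identical)
01bb          0d0b          Destination port 443 vs. 3339
e3422a57      e3422a57      TCP sequence number (identical)
8349c2ea      8349c2ea      TCP acknowledgment number (identical)
8010          c8ec          TCP header length and flags
0128          0128          Window size 296 (identical)
08b0          176f          TCP checksum
0000          0000          Urgent pointer (identical)
0101080a      0101080a      NOP, NOP, timestamp option kind/length (identical)
018ff1de      018ff1de      Timestamp value 26210782 (identical)
9e1ce22c      ed7fa54a      Timestamp echo reply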


This analysis shows that these two frames are very similar, especially in places where I would not expect them to be similar. This caused me to hypothesize that frame 4241 was a corrupted version of frame 4238.

I can believe that the frames would share MAC addresses, IP addresses, and certain IP and TCP defaults. However, it is unusual for them to have the same high source port (38828) but not the same destination ports (443 and 3339). Very telling is the fact that they have the same TCP sequence and acknowledgement numbers. They also share the same source timestamp (TSval 26210782).

Notice one field that is not identical: the IP ID value. Frame 4238 has 0xd98c and frame 4241 has 0xd98d. The perfectly incremented IP ID prompted me to believe that frame 4241 is a corrupted retransmission, at the IP layer, of the same TCP segment.

However, I really don't know what to think. These frames were captured in an Ubuntu 16.04 VirtualBox VM by netsniff-ng. Is this a problem with netsniff-ng, or Linux, or VirtualBox, or the Linux host operating system running VirtualBox?

I'd like to thank the folks at ask.wireshark.org for their assistance with my attempts to decode this frame (and others) as 802.3 raw Ethernet. What's that? It's basically a format that Novell used with IPX, where the frame is Dst MAC, Src MAC, length, data.

I wanted to see if I could tell Wireshark to decode the odd frames as 802.3 raw Ethernet, rather than IEEE 802.3 Ethernet with LLC headers.

Sake Blok helpfully suggested I change the pcap's link layer type to User0, and then tell Wireshark how to interpret the frames. I did it this way, per his direction:

$ editcap -T user0 excerpt.pcap excerpt-user0.pcap
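To confirm the conversion took, capinfos (which ships with Wireshark) should now report a USER0 encapsulation for the new file rather than Ethernet:

$ capinfos excerpt-user0.pcap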

Next I opened the trace in Wireshark and saw frame 4241 (here listed as frame 3) as shown below:


DLT 147 corresponds to the link layer type for User0. Wireshark doesn't know how to handle it. We fix that by right-clicking on the yellow field and selecting Protocol Preferences -> Open DLT User preferences:

Next I created an entry for User 0 (DLT=147) with Payload protocol "ip" and Header size "14" as shown:

After clicking OK, I returned to Wireshark. Here is how frame 4241 (again listed here as frame 3) appeared:


You can see Wireshark is now making sense of the IP header, but it doesn't know how to handle the TCP header which follows. I tried different values and options to see if I could get Wireshark to understand the TCP header too, but this went far enough for my purposes.
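Incidentally, the same mapping can be passed to Tshark on the command line through the user DLTs table. If I have the UAT record syntax right -- the fields are encapsulation, payload protocol, header size, header protocol, trailer size, and trailer protocol -- something like this should work:

$ tshark -o 'uat:user_dlts:"User 0 (DLT=147)","ip","14","","0",""' -r excerpt-user0.pcap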

The bottom line is that I believe there is some sort of packet capture problem, either with the software used or with the traffic presented to the software by the bridged NIC created by VirtualBox. As this is a lab environment and the traffic is 1% of the overall capture, I am not worried about the results.

I am fairly sure that the weird traffic is not on the wire. I tried capturing on the host OS sniffing NIC and did not see anything resembling this traffic.

Have you seen anything like this? Let me know in a comment here or on Twitter.

PS: I found the frame.number == X Wireshark display filter helpful, along with the frame.len > Y display filter, when researching this activity.
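For example, with hypothetical values plugged in:

$ tshark -r capture.pcap -Y "frame.number == 4241"
$ tshark -r capture.pcap -Y "frame.len > 64"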

Troubleshooting NSM Virtualization Problems with Linux and VirtualBox

I spent a chunk of the day troubleshooting a network security monitoring (NSM) problem. I thought I would share the problem and my investigation in the hopes that it might help others. The specifics are probably less important than the general approach.

It began with ja3. You may know ja3 as a set of Zeek scripts developed by the Salesforce engineering team to profile client and server TLS parameters.

I was reviewing Zeek logs captured by my Corelight appliance and by one of my lab sensors running Security Onion. I had coverage of the same endpoint in both sensors.

I noticed that the SO Zeek logs did not have ja3 hashes in the ssl.log entries. Both sensors did have ja3s hashes. My first thought was that SO was somehow misconfigured to not record ja3 hashes. I quickly dismissed that, because it made no sense. Besides, verifying that intuition would have required me to start troubleshooting near the top of the software stack.

I decided to start at the bottom, or close to the bottom. I had a sinking suspicion that, for some reason, Zeek was only seeing traffic sent from remote systems, and not traffic originating from my network. That would account for the creation of ja3s hashes, for traffic sent by remote systems, but not ja3 hashes, as Zeek was not seeing traffic sent by local clients.

I was running SO in VirtualBox 6.0.4 on Ubuntu 18.04. I started sniffing TCP network traffic on the SO monitoring interface using Tcpdump. As I feared, it didn't look right. I ran a new capture with filters for ICMP and a remote IP address. On another system I tried pinging the remote IP address. Sure enough, I only saw ICMP echo replies, and no ICMP echoes. Oddly, I also saw doubles and triples of some of the ICMP echo replies. That worried me, because unpredictable behavior like that could indicate some sort of software problem.
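A sketch of the kind of filter I mean, using the monitoring interface name that appears later in this post and a placeholder remote address:

$ sudo tcpdump -n -i enp0s8 'icmp and host 203.0.113.50'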

My next step was to "get under" the VM guest and determine if the VM host could see traffic properly. I ran Tcpdump on the Ubuntu 18.04 host on the monitoring interface and repeated my ICMP tests. It saw everything properly. That meant I did not need to bother checking the switch span port that was feeding traffic to the VirtualBox system.

It seemed I had a problem somewhere between the VM host and guest. On the same VM host I was also running an instance of RockNSM. I ran my ICMP tests on the RockNSM VM and, sadly, I got the same one-sided traffic as seen on SO.

Now I was worried. If the problem had only been present in SO, then I could fix SO. If the problem was present in both SO and RockNSM, then it had to be with VirtualBox -- and I might not be able to fix it.

I reviewed my configurations in VirtualBox, ensuring that the "Promiscuous Mode" under the Advanced options was set to "Allow All". At this point I worried that there was a bug in VirtualBox. I did some Google searches and reviewed some forum posts, but I did not see anyone reporting issues with sniffing traffic inside VMs. Still, my use case might have been weird enough to not have been reported.

I decided to try a different approach. I wondered if running VirtualBox with elevated privileges might make a difference. I did not want to take ownership of my user VMs, so I decided to install a new VM and run it with elevated privileges.

Let me stop here to note that I was breaking one of the rules of troubleshooting: introducing two new variables when I should have introduced only one. I should have built a new VM but run it with the same user privileges with which I was running the existing VMs.

I decided to install a minimal edition of Debian 9, with VirtualBox running via sudo. When I started the VM and sniffed traffic on the monitoring port, lo and behold, my ICMP tests revealed both sides of the traffic, as I had hoped. Unfortunately, from this I erroneously concluded that running VirtualBox with elevated privileges was the answer to my problems.

I took ownership of the SO VM in my elevated VirtualBox session, started it, and performed my ICMP tests. Womp womp. Still broken.

I realized I needed to separate the two variables that I had entangled, so I stopped VirtualBox, and changed ownership of the Debian 9 VM to my user account. I then ran VirtualBox with user privileges, started the Debian 9 VM, and ran my ICMP tests. Success again! Apparently elevated privileges had nothing to do with my problem.

By now I was glad I had not posted anything to any user forums describing my problem and asking for help. There was something about the monitoring interface configurations in both SO and RockNSM that made it impossible to see both sides of the traffic (and that produced the weird doubles and triples).

I started my SO VM again and looked at the script that configured the interfaces. I commented out all the entries below the management interface as shown below.

$ cat /etc/network/interfaces

# This configuration was created by the Security Onion setup script.
#
# The original network interface configuration file was backed up to:
# /etc/network/interfaces.bak.
#
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# loopback network interface
auto lo
iface lo inet loopback

# Management network interface
auto enp0s3
iface enp0s3 inet static
  address 192.168.40.76
  gateway 192.168.40.1
  netmask 255.255.255.0
  dns-nameservers 192.168.40.1
  dns-domain localdomain

#auto enp0s8
#iface enp0s8 inet manual
#  up ip link set $IFACE promisc on arp off up
#  down ip link set $IFACE promisc off down
#  post-up ethtool -G $IFACE rx 4096; for i in rx tx sg tso ufo gso gro lro; do ethtool -K $IFACE $i off; done
#  post-up echo 1 > /proc/sys/net/ipv6/conf/$IFACE/disable_ipv6

#auto enp0s9
#iface enp0s9 inet manual
#  up ip link set $IFACE promisc on arp off up
#  down ip link set $IFACE promisc off down
#  post-up ethtool -G $IFACE rx 4096; for i in rx tx sg tso ufo gso gro lro; do ethtool -K $IFACE $i off; done
#  post-up echo 1 > /proc/sys/net/ipv6/conf/$IFACE/disable_ipv6

I rebooted the system and brought the enp0s8 interface up manually using this command:

$ sudo ip link set enp0s8 promisc on arp off up

Fingers crossed, I ran my ICMP sniffing tests, and voila, I saw what I needed -- traffic in both directions, without doubles or triples no less.

So, there appears to be some sort of problem with the way SO and RockNSM set parameters for their monitoring interfaces, at least as far as they interact with VirtualBox 6.0.4 on Ubuntu 18.04. You can see in the network script that SO disables a bunch of NIC options. I imagine one or more of them is the culprit, but I didn't have time to work through them individually.
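If I revisit this, the approach would be to re-enable those offload features one at a time and re-run the ICMP tests after each change. Something like the following, where ethtool's lowercase -k shows the current state of every feature and -K changes one (here, generic receive offload):

$ ethtool -k enp0s8
$ sudo ethtool -K enp0s8 gro on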

I tried taking a look at the network script in RockNSM, but it runs CentOS, and I'll be darned if I can figure out where to look. I'm sure it's there somewhere, but I didn't have the time to find it.

The moral of the story is that I should have checked, immediately after installation, that both SO and RockNSM were seeing both sides of the traffic I expected them to see. I had taken that for granted in many previous deployments, but something broke recently and I don't know exactly what. My workaround will hopefully hold for now, but I need to take a closer look at the NIC options, because I may have introduced another fault.

A second moral is to be careful about changing two or more variables when troubleshooting. When you do that, you might fix a problem, but not know which change fixed it.

Thoughts on OSSEC Con 2019

Last week I attended my first OSSEC conference. I first blogged about OSSEC in 2007, and wrote other posts about it in the following years.

OSSEC is a host-based intrusion detection and log analysis system with correlation and active response features. It is cross-platform, such that I can run it on my Windows and Linux systems. The moving force behind the conference was a company local to me called Atomicorp.

In brief, I really enjoyed this one-day event. (I had planned to attend the workshop on the second day but my schedule did not cooperate.) The talks were almost uniformly excellent and informative. I even had a chance to talk jiu-jitsu with OSSEC creator Daniel Cid, who despite hurting his leg managed to travel across the country to deliver the keynote.

I'd like to share a few highlights from my notes.

First, I had been worried that OSSEC was in some ways dead. I saw that the Security Onion project had replaced OSSEC with a fork called Wazuh, which I learned is apparently pronounced "wazoo." To my delight, I learned OSSEC is decidedly not dead, and that Wazuh has been suffering stability problems. OSSEC has a lot of interesting development ahead of it, which you can track on its GitHub repo.

For example, the development roadmap includes eliminating Logstash from the pipeline used by many OSSEC users. OSSEC would feed directly into Elasticsearch. One speaker noted that Logstash has a 1.7 GB memory footprint, which astounded me.

On a related note, the OSSEC team is planning to create a new Web console, with a design goal to have it run in an "AWS t2.micro" instance. The team said that instance offers 2 GB of memory, which doesn't match what AWS says: the t2.micro offers 1 GB, and it is the free tier instance, so I suspect that is what they meant (the t2.small is the instance with 2 GB). Either way, I'm excited to see this later in 2019.

Second, I thought the presentation by security personnel from USA Today offered an interesting insight. One design goal they had for monitoring their Google Cloud Platform (GCP) was to not install OSSEC on every container or on Kubernetes worker nodes. Several times during the conference, speakers noted that the transient nature of cloud infrastructure is directly antithetical to standard OSSEC usage, whereby OSSEC is installed on servers with long uptime and years of service. Instead, USA Today used OSSEC to monitor HTTP logs from the GCP load balancer, logs from Google Kubernetes Engine, and monitored processes by watching output from successive kubectl invocations.
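They did not show the exact commands, but I took that to mean polling along these lines and diffing successive outputs (the invocation is my guess):

$ kubectl get pods --all-namespaces -o wide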

Third, a speaker from Red Hat brought my attention to an aspect of containers that I had not considered. Docker and containers have made software testing and deployment a lot easier for everyone. However, those who provide containers have effectively become Linux distribution maintainers. In other words, who is responsible when a security or configuration vulnerability in a Linux component is discovered? Will the container maintainers be responsive?

Another speaker emphasized the difference between "security of the cloud," offered by cloud providers, and "security in the cloud," which is supposed to be the customer's responsibility. This makes sense from a technical point of view, but I expect that in the long term this differentiation will no longer be tenable from a business or legal point of view.

Customers are not going to have the skills or interest to secure their software in the cloud, as they outsource ever more technical talent to the cloud providers and their infrastructure. I expect cloud providers to continue to develop, acquire, and offer more security services, and accelerate their competition on a "complete security environment."

I look forward to more OSSEC development and future conferences.

Thoughts on Cloud Security

Recently I've been reading about cloud security and security with respect to DevOps. I'll say more about the excellent book I'm reading, but I had a moment of déjà vu during one section.

The book described how cloud security is a big change from enterprise security because it relies less on IP-address-centric controls and more on users and groups. The book talked about creating security groups, and adding users to those groups in order to control their access and capabilities.

As I read that passage, it reminded me of a time long ago, in the late 1990s, when I was studying for the MCSE, then called the Microsoft Certified Systems Engineer. I read Tom Sheldon's Windows NT Security Handbook, published in 1996. It described the exact same security process of creating security groups and adding users. This was core to the new NT 4 role-based access control (RBAC) implementation.

Now, fast forward a few years, or all the way to today, and consider the security challenges facing the majority of legacy enterprises: securing Windows assets and the data they store and access. How could this wonderful security model, based on decades of experience (from the 1960s and 1970s no less), have failed to work in operational environments?

There are many reasons one could cite, but I think the following are at least worthy of mention.

The systems enforcing the security model are exposed to intruders.

Furthermore:

Intruders are generally able to gain code execution on systems participating in the security model.

Finally:

Intruders have access to the network traffic which partially contains elements of the security model.

From these weaknesses, a large portion of the security countermeasures of the last two decades has been derived, in the form of compensating controls and visibility requirements.

The question then becomes:

Does this change with the cloud?

In brief, I believe the answer is largely "yes," thankfully. Generally, the systems upon which the security model is being enforced are not able to access the enforcement mechanism, thanks to the wonders of virtualization.

Should an intruder find a way to escape from their restricted cloud platform and gain hypervisor or management network access, then they find themselves in a situation similar to the average Windows domain network.

This realization puts a heavy burden on the cloud infrastructure operators. The major players are likely able to acquire and apply the expertise and resources to make their infrastructure far more resilient and survivable than their enterprise counterparts.

The weakness will likely be their personnel.

Once the compute and network components are sufficiently robust from externally sourced compromise, then internal threats become the next most cost-effective and return-producing vectors for dedicated intruders.

Is there anything users can do as they hand their compute and data assets to cloud operators?

I suggest four moves.

First, small- to mid-sized cloud infrastructure users will likely have to piggyback or free-ride on the initiatives and influence of the largest cloud customers, who have the clout and hopefully the expertise to hold the cloud operators responsible for the security of everyone's data.

Second, lawmakers may also need to provide improved whistleblower protection for cloud employees who feel threatened by revealing material weaknesses they encounter while doing their jobs.

Third, government regulators will have to ensure no cloud provider assumes a monopoly, and no two providers assume a duopoly. We may end up with three major players and a smattering of smaller ones, as is the case with many mature industries.

Fourth, users should use every means at their disposal to select cloud operators not only on their compute features, but on their security and visibility features. The more logging and visibility exposed by the cloud provider, the better. I am excited by new features like the Azure network tap and hope to see equivalent features in other cloud infrastructure.

Remember that security has two main functions: planning/resistance, to try to stop bad things from happening, and detection/response, to handle the failures that inevitably happen. "Prevention eventually fails" is one of my long-time mantras. We don't want prevention to fail silently in the cloud. We need ways to know that failure is happening so that we can plan and implement new resistance mechanisms, and then validate their effectiveness via detection and response.

Update: I forgot to mention that the material above assumed that the cloud users and operators made no unintentional configuration mistakes. If users or operators introduce exposures or vulnerabilities, then those will be the weaknesses that intruders exploit. We've already seen a lot of this happening and it appears to be the most common problem. Procedures and tools which constantly assess cloud configurations for exposures and vulnerabilities due to misconfiguration or poor practices are a fifth move which all involved should make.

A corollary is that complexity can drive problems. When the cloud infrastructure offers too many knobs to turn, then it's likely the users and operators will believe they are taking one action when in reality they are implementing another.

Forcing the Adversary to Pursue Insider Theft

Jack Crook pointed me toward a story by Christopher Burgess about intellectual property theft by "Hongjin Tan, a 35-year-old Chinese national and U.S. legal permanent resident... [who] was arrested on December 20 and charged with theft of trade secrets. Tan is alleged to have stolen the trade secrets from his employer, a U.S. petroleum company," according to the criminal complaint filed by the US DoJ.

Tan's former employer and the FBI allege that Tan "downloaded restricted files to a personal thumb drive." I could not tell from the complaint if Tan downloaded the files at work or at home, but the thumb drive ended up at Tan's home. His employer asked Tan to bring it to their office, which Tan did. However, he had deleted all the files from the drive. Tan's employer recovered the files using commercially available forensic software.

This incident, by definition, involves an "insider threat." Tan was an employee who appears to have copied information that was outside the scope of his work responsibilities, resigned from his employer, and was planning to return to China to work for a competitor, having delivered his former employer's intellectual property.

When I started GE-CIRT in 2008 (officially "initial operating capability" on 1 January 2009), one of the strategies we pursued involved insider threats. I've written about insiders on this blog before but I couldn't find a description of the strategy we implemented via GE-CIRT.

We sought to make digital intrusions more expensive than physical intrusions.

In other words, we wanted to make it easier for the adversary to accomplish his mission using insiders. We wanted to make it more difficult for the adversary to accomplish his mission using our network.

In a cynical sense, this makes security someone else's problem. Suddenly the physical security team is dealing with the worst of the worst!

This is a win for everyone, however. Consider the many advantages the physical security team has over the digital security team.

The physical security team can work with human resources during the hiring process. HR can run background checks and identify suspicious job applicants prior to granting employment and access.

Employees are far more exposed than remote intruders. Employees, even under cover, expose their appearance, likely residence, and personalities to the company and its workers.

Employees can be subject to far more intensive monitoring than remote intruders. Employee endpoints can be instrumented. Employee workspaces are instrumented via access cards, cameras at entry and exit points, and other measures.

Employers can cooperate with law enforcement to investigate and prosecute employees. They can control and deter theft and other activities.

In brief, insider theft, like all "close access" activities, is incredibly risky for the adversary. It is a win for everyone when the adversary must resort to using insiders to accomplish their mission. Digital and physical security must cooperate to leverage these advantages, while collaborating with human resources, legal, information technology, and business lines to wring the maximum results from this advantage.

Fixing Virtualbox RDP Server with DetectionLab

Yesterday I posted about DetectionLab, but noted that I was having trouble with the RDP servers offered by VirtualBox. If you remember, DetectionLab builds four virtual machines:

root@LAPTOP-HT4TGVCP C:\Users\root>"c:\Program Files\Oracle\VirtualBox\VBoxManage" list runningvms
"logger" {3da9fffb-4b02-4e57-a592-dd2322f14245}
"dc.windomain.local" {ef32d493-845c-45dc-aff7-3a86d9c590cd}
"wef.windomain.local" {7cd008b7-c6e0-421d-9655-8f92ec98d9d7}
"win10.windomain.local" {acf413fb-6358-44df-ab9f-cc7767ed32bd}

I was having a problem with two of the VMs sharing the same port for the RDP server offered by VirtualBox. This meant I could not access one of them. (Below, port 5932 has the conflict.)

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>"c:\Program Files\Oracle\VirtualBox\VBoxManage" showvminfo logger | findstr /I vrde | findstr /I address
VRDE:                        enabled (Address 0.0.0.0, Ports 5955, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
VRDE property               : TCP/Address  = "0.0.0.0"

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>"c:\Program Files\Oracle\VirtualBox\VBoxManage" showvminfo dc.windomain.local | findstr /I vrde | findstr /I address
VRDE:                        enabled (Address 0.0.0.0, Ports 5932, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
VRDE property               : TCP/Address = "0.0.0.0"

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>"c:\Program Files\Oracle\VirtualBox\VBoxManage" showvminfo wef.windomain.local | findstr /I vrde | findstr /I address
VRDE:                        enabled (Address 0.0.0.0, Ports 5932, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
VRDE property               : TCP/Address = "0.0.0.0"

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>"c:\Program Files\Oracle\VirtualBox\VBoxManage" showvminfo win10.windomain.local | findstr /I vrde | findstr /I address
VRDE:                        enabled (Address 0.0.0.0, Ports 5981, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
VRDE property               : TCP/Address = "0.0.0.0"

To fix this, I explicitly added port values to the configuration in the Vagrantfile. Here is one example:

      vb.customize ["modifyvm", :id, "--vrde", "on"]
      vb.customize ["modifyvm", :id, "--vrdeaddress", "0.0.0.0"]
      vb.customize ["modifyvm", :id, "--vrdeport", "60101"]

After a 'vagrant reload', the RDP servers were now listening on new ports, as I hoped.

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>"c:\Program Files\Oracle\VirtualBox\VBoxManage" showvminfo logger | findstr /I vrde | findstr /I address
VRDE:                        enabled (Address 0.0.0.0, Ports 60101, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
VRDE property               : TCP/Address  = "0.0.0.0"

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>"c:\Program Files\Oracle\VirtualBox\VBoxManage" showvminfo dc.windomain.local | findstr /I vrde | findstr /I address
VRDE:                        enabled (Address 0.0.0.0, Ports 60102, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
VRDE property               : TCP/Address = "0.0.0.0"

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>"c:\Program Files\Oracle\VirtualBox\VBoxManage" showvminfo wef.windomain.local | findstr /I vrde | findstr /I address
VRDE:                        enabled (Address 0.0.0.0, Ports 60103, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
VRDE property               : TCP/Address = "0.0.0.0"

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>"c:\Program Files\Oracle\VirtualBox\VBoxManage" showvminfo win10.windomain.local | findstr /I vrde | findstr /I address
VRDE:                        enabled (Address 0.0.0.0, Ports 60104, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
VRDE property               : TCP/Address = "0.0.0.0"
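With each VRDE server on its own port, any VM's console is reachable with a standard RDP client. For example, from the Windows host, this should connect to the logger VM using the port assigned above:

C:\Users\root>mstsc /v:127.0.0.1:60101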

This is great, but I am still encountering a problem with avoiding port collisions when Vagrant remaps ports for services on the VMs.

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>vagrant status
Current machine states:

logger                    running (virtualbox)
dc                        running (virtualbox)
wef                       running (virtualbox)
win10                     running (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>vagrant port logger
The forwarded ports for the machine are listed below. Please note that
these values may differ from values configured in the Vagrantfile if the
provider supports automatic port collision detection and resolution.

    22 (guest) => 2222 (host)

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>vagrant port dc
The forwarded ports for the machine are listed below. Please note that
these values may differ from values configured in the Vagrantfile if the
provider supports automatic port collision detection and resolution.

  3389 (guest) => 3389 (host)
    22 (guest) => 2200 (host)
  5985 (guest) => 55985 (host)
  5986 (guest) => 55986 (host)

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>vagrant port wef
The forwarded ports for the machine are listed below. Please note that
these values may differ from values configured in the Vagrantfile if the
provider supports automatic port collision detection and resolution.

  3389 (guest) => 2201 (host)
    22 (guest) => 2202 (host)
  5985 (guest) => 2203 (host)
  5986 (guest) => 2204 (host)

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>vagrant port win10
The forwarded ports for the machine are listed below. Please note that
these values may differ from values configured in the Vagrantfile if the
provider supports automatic port collision detection and resolution.

  3389 (guest) => 2205 (host)
    22 (guest) => 2206 (host)
  5985 (guest) => 2207 (host)
  5986 (guest) => 2208 (host)

The dc entry mapping guest port 3389 to host port 3389 is the problem. Vagrant should not be forwarding guest port 3389 to host port 3389, because that host port is already in use by the RDP server on the Windows 10 host.

I tried telling Vagrant by hand in the Vagrantfile to map port 3389 elsewhere, but nothing worked. (I tried entries like the following.)

    config.vm.network :forwarded_port, guest: 3389, host: 5789

I also searched to see if there might be a configuration outside the Vagrantfile that I was missing. Here is what I found:

ds61@ds61:~/DetectionLab-master$ find . | xargs grep "3389" *
./Terraform/Method1/main.tf:    from_port   = 3389
./Terraform/Method1/main.tf:    to_port     = 3389
./Packer/vagrantfile-windows_2016.template:    config.vm.network :forwarded_port, guest: 3389, host: 3389, id: "rdp", auto_correct: true
./Packer/scripts/enable-rdp.bat:netsh advfirewall firewall add rule name="Open Port 3389" dir=in action=allow protocol=TCP localport=3389
./Packer/vagrantfile-windows_10.template:    config.vm.network :forwarded_port, guest: 3389, host: 3389, id: "rdp", auto_correct: true

I wonder if those Packer templates have anything to do with it, or if I am encountering a problem with Vagrant. I have seen many people report similar issues, so I don't know.
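One untested idea: the Packer templates above define the forwarded port with an id of "rdp", and Vagrant merges forwarded port definitions that share an id. If that is the issue, an override in my Vagrantfile would probably need to reuse that id, something like:

    config.vm.network :forwarded_port, guest: 3389, host: 5789, id: "rdp", auto_correct: true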

It's not a big deal, though. Now that I can directly access the virtual screen for each VM via the VirtualBox RDP server, I don't need to RDP to port 3389 on each Windows VM in order to interact with it.

If anyone has any ideas, though, I'm interested!

Trying DetectionLab

Many security professionals run personal labs. Trying to create an environment that includes fairly modern Windows systems can be a challenge. In the age of "infrastructure as code," there should be a simpler way to deploy systems in a repeatable, virtualized way -- right?

Enter DetectionLab, a project by Chris Long. Briefly, Chris built a project that uses Packer and Vagrant to create an instrumented lab environment. Chris explained the project in late 2017 in a Medium post, which I recommend reading.

I can't even begin to describe all the functionality packed into this project. So much of it was new to me, and this is a great way to learn about it. In this post, I would like to show how I got a version of DetectionLab running.

My build environment included a modern laptop with 16 GB RAM and Windows 10 Professional. I had already installed VirtualBox 6.0 with the appropriate VirtualBox Extension Pack. I had also enabled the native OpenSSH server and performed all DetectionLab installation functions over an OpenSSH session.

Install Chocolatey

My first step was to install Chocolatey, a package manager for Windows. I wanted it to install the Git client I would use to clone the DetectionLab repo. Commands I typed at each stage follow the prompts below.

root@LAPTOP-HT4TGVCP C:\Users\root>@"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"
Getting latest version of the Chocolatey package for download.
Getting Chocolatey from https://chocolatey.org/api/v2/package/chocolatey/0.10.11.
Downloading 7-Zip commandline tool prior to extraction.
Extracting C:\Users\root\AppData\Local\Temp\chocolatey\chocInstall\chocolatey.zip to C:\Users\root\AppData\Local\Temp\chocolatey\chocInstall...
Installing chocolatey on this machine
Creating ChocolateyInstall as an environment variable (targeting 'Machine')
  Setting ChocolateyInstall to 'C:\ProgramData\chocolatey'
WARNING: It's very likely you will need to close and reopen your shell
  before you can use choco.
Restricting write permissions to Administrators
We are setting up the Chocolatey package repository.
The packages themselves go to 'C:\ProgramData\chocolatey\lib'
  (i.e. C:\ProgramData\chocolatey\lib\yourPackageName).
A shim file for the command line goes to 'C:\ProgramData\chocolatey\bin'
  and points to an executable in 'C:\ProgramData\chocolatey\lib\yourPackageName'.

Creating Chocolatey folders if they do not already exist.

WARNING: You can safely ignore errors related to missing log files when
  upgrading from a version of Chocolatey less than 0.9.9.
  'Batch file could not be found' is also safe to ignore.
  'The system cannot find the file specified' - also safe.
chocolatey.nupkg file not installed in lib.
 Attempting to locate it from bootstrapper.
PATH environment variable does not have C:\ProgramData\chocolatey\bin in it. Adding...
WARNING: Not setting tab completion: Profile file does not exist at 'C:\Users\root\Documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1'.
Chocolatey (choco.exe) is now ready.
You can call choco from anywhere, command line or powershell by typing choco.
Run choco /? for a list of functions.
You may need to shut down and restart powershell and/or consoles
 first prior to using choco.
Ensuring chocolatey commands are on the path
Ensuring chocolatey.nupkg is in the lib folder

root@LAPTOP-HT4TGVCP C:\Users\root>choco
Chocolatey v0.10.11
Please run 'choco -?' or 'choco <command> -?' for help menu.

Install Git

With Chocolatey installed, I could install Git.

root@LAPTOP-HT4TGVCP C:\Users\root>choco install git -params '"/GitAndUnixToolsOnPath"'
Chocolatey v0.10.11
Installing the following packages:
git
By installing you accept licenses for the packages.
Progress: Downloading git.install 2.20.1... 100%
Progress: Downloading chocolatey-core.extension 1.3.3... 100%
Progress: Downloading git 2.20.1... 100%

chocolatey-core.extension v1.3.3 [Approved]
chocolatey-core.extension package files install completed. Performing other installation steps.
 Installed/updated chocolatey-core extensions.
 The install of chocolatey-core.extension was successful.
  Software installed to 'C:\ProgramData\chocolatey\extensions\chocolatey-core'

git.install v2.20.1 [Approved]
git.install package files install completed. Performing other installation steps.
The package git.install wants to run 'chocolateyInstall.ps1'.
Note: If you don't run this script, the installation will fail.
Note: To confirm automatically next time, use '-y' or consider:
choco feature enable -n allowGlobalConfirmation
Do you want to run the script?([Y]es/[N]o/[P]rint): y

@{Inno Setup CodeFile: Path Option=CmdTools; PSPath=Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\Git_is1; PSParentPath=Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall; PSChildName=Git_is1; PSDrive=HKLM; PSProvider=Microsoft.PowerShell.Core\Registry}
Using Git LFS
Installing 64-bit git.install...
git.install has been installed.
git.install installed to 'C:\Program Files\Git'
  git.install can be automatically uninstalled.
Environment Vars (like PATH) have changed. Close/reopen your shell to
 see the changes (or in powershell/cmd.exe just type `refreshenv`).
 The install of git.install was successful.
  Software installed to 'C:\Program Files\Git\'

git v2.20.1 [Approved]
git package files install completed. Performing other installation steps.
 The install of git was successful.
  Software install location not explicitly set, could be in package or
  default install location if installer.

Chocolatey installed 3/3 packages.
 See the log for details (C:\ProgramData\chocolatey\logs\chocolatey.log).

Clone DetectionLab

With Git installed, I can clone the DetectionLab repo from Github.

root@LAPTOP-HT4TGVCP C:\Users\root>mkdir git

root@LAPTOP-HT4TGVCP C:\Users\root>cd git

root@LAPTOP-HT4TGVCP C:\Users\root\git>mkdir detectionlab

root@LAPTOP-HT4TGVCP C:\Users\root\git>cd detectionlab

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab>git clone https://github.com/clong/DetectionLab.git
'git' is not recognized as an internal or external command,
operable program or batch file.

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab>refreshenv
Refreshing environment variables from registry for cmd.exe. Please wait...Finished..

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab>git clone https://github.com/clong/DetectionLab.git
Cloning into 'DetectionLab'...
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1163 (delta 0), reused 0 (delta 0), pack-reused 1162
Receiving objects: 100% (1163/1163), 11.81 MiB | 12.24 MiB/s, done.
Resolving deltas: 100% (568/568), done.

Install Vagrant

Before going any further, I needed to install Vagrant.

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab>cd ..\..\

root@LAPTOP-HT4TGVCP C:\Users\root>choco install vagrant
Chocolatey v0.10.11
Installing the following packages:
vagrant
By installing you accept licenses for the packages.
Progress: Downloading vagrant 2.2.3... 100%

vagrant v2.2.3 [Approved]
vagrant package files install completed. Performing other installation steps.
The package vagrant wants to run 'chocolateyinstall.ps1'.
Note: If you don't run this script, the installation will fail.
Note: To confirm automatically next time, use '-y' or consider:
choco feature enable -n allowGlobalConfirmation
Do you want to run the script?([Y]es/[N]o/[P]rint): y

Downloading vagrant 64 bit
  from 'https://releases.hashicorp.com/vagrant/2.2.3/vagrant_2.2.3_x86_64.msi'
Progress: 100% - Completed download of C:\Users\root\AppData\Local\Temp\chocolatey\vagrant\2.2.3\vagrant_2.2.3_x86_64.msi (229.22 MB).
Download of vagrant_2.2.3_x86_64.msi (229.22 MB) completed.
Hashes match.
Installing vagrant...
vagrant has been installed.
Repairing currently installed global plugins. This may take a few minutes...
Installed plugins successfully repaired!
  vagrant may be able to be automatically uninstalled.
Environment Vars (like PATH) have changed. Close/reopen your shell to
 see the changes (or in powershell/cmd.exe just type `refreshenv`).
 The install of vagrant was successful.
  Software installed as 'msi', install location is likely default.

Chocolatey installed 1/1 packages.
 See the log for details (C:\ProgramData\chocolatey\logs\chocolatey.log).

Packages requiring reboot:
 - vagrant (exit code 3010)

The recent package changes indicate a reboot is necessary.
 Please reboot at your earliest convenience.

root@LAPTOP-HT4TGVCP C:\Users\root>shutdown /r /t 0

Installing DetectionLab

Now we are finally at the point where we can install DetectionLab. Note that in my example, I downloaded boxes already built by Chris. I did not build my own, in order to save time. You can follow his instructions to build boxes yourself.

I saw an error regarding the win10 host, but that did not appear to be a real problem.

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab>powershell
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

PS C:\Users\root\git\detectionlab\DetectionLab> .\build.ps1 -ProviderName virtualbox -VagrantOnly
[preflight_checks] Running..
[preflight_checks] Checking if Vagrant is installed
[preflight_checks] Checking for pre-existing boxes..
[preflight_checks] Checking for vagrant instances..
[preflight_checks] Checking disk space..
[preflight_checks] Checking if vagrant-reload is installed..
The vagrant-reload plugin is required and not currently installed. This script will attempt to install it now.
Installing the 'vagrant-reload' plugin. This can take a few minutes...
Installed the plugin 'vagrant-reload (0.0.1)'!
[preflight_checks] Finished.
[download_boxes] Running..
[download_boxes] Downloading windows_10_virtualbox.box

[download_boxes] Downloading windows_2016_virtualbox.box
[download_boxes] Getting filehash for: windows_10_virtualbox.box
[download_boxes] Getting filehash for: windows_2016_virtualbox.box
[download_boxes] Checking Filehashes..
[download_boxes] Finished.
[main] Running vagrant_up_host for: logger
[vagrant_up_host] Running for logger
Attempting to bring up the logger host using Vagrant
[vagrant_up_host] Finished for logger. Got exit code: 0
[main] vagrant_up_host finished. Exitcode: 0
Good news! logger was built successfully!
[main] Finished for: logger
[main] Running vagrant_up_host for: dc
[vagrant_up_host] Running for dc
Attempting to bring up the dc host using Vagrant
[vagrant_up_host] Finished for dc. Got exit code: 0
[main] vagrant_up_host finished. Exitcode: 0
Good news! dc was built successfully!
[main] Finished for: dc
[main] Running vagrant_up_host for: wef
[vagrant_up_host] Running for wef
Attempting to bring up the wef host using Vagrant
[vagrant_up_host] Finished for wef. Got exit code: 0
[main] vagrant_up_host finished. Exitcode: 0
Good news! wef was built successfully!
[main] Finished for: wef
[main] Running vagrant_up_host for: win10
[vagrant_up_host] Running for win10
Attempting to bring up the win10 host using Vagrant
[vagrant_up_host] Finished for win10. Got exit code: 1
[main] vagrant_up_host finished. Exitcode: 1
WARNING: Something went wrong while attempting to build the win10 box.
Attempting to reload and reprovision the host...
[main] Running vagrant_reload_host for: win10
[vagrant_reload_host] Running for win10
[vagrant_reload_host] Finished for win10. Got exit code: 1
C:\Users\root\git\detectionlab\DetectionLab\build.ps1 : Failed to bring up win10 after a reload. Exiting
At line:1 char:1
+ .\build.ps1 -ProviderName virtualbox -VagrantOnly
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Write-Error], WriteErrorException
    + FullyQualifiedErrorId : Microsoft.PowerShell.Commands.WriteErrorException,build.ps1

[main] Running post_build_checks
[post_build_checks] Running Caldera Check.
[download] Running for https://192.168.38.105:8888, looking for
[download] Found at https://192.168.38.105:8888
[post_build_checks] Cladera Result: True
[post_build_checks] Running Splunk Check.
[download] Running for https://192.168.38.105:8000/en-US/account/login?return_to=%2Fen-US%2F, looking for This browser is not supported by Splunk
[download] Found This browser is not supported by Splunk at https://192.168.38.105:8000/en-US/account/login?return_to=%2Fen-US%2F
[post_build_checks] Splunk Result: True
[post_build_checks] Running Fleet Check.
[download] Running for https://192.168.38.105:8412, looking for Kolide Fleet
[download] Found Kolide Fleet at https://192.168.38.105:8412
[post_build_checks] Fleet Result: True
[post_build_checks] Running MS ATA Check.
[download] Running for https://192.168.38.103, looking for
[post_build_checks] ATA Result: True
[main] Finished post_build_checks

Checking the VMs

I used the Virtualbox command line program to check the status of the new VMs.

root@LAPTOP-HT4TGVCP c:\Program Files\Oracle\VirtualBox>VBoxManage list runningvms
"logger" {3da9fffb-4b02-4e57-a592-dd2322f14245}
"dc.windomain.local" {ef32d493-845c-45dc-aff7-3a86d9c590cd}
"wef.windomain.local" {7cd008b7-c6e0-421d-9655-8f92ec98d9d7}
"win10.windomain.local" {acf413fb-6358-44df-ab9f-cc7767ed32bd}

Interacting with Vagrant and the Logger Host

Next I decided to use Vagrant to check on the status of the boxes, and to interact with one if I could. I wanted to find the Bro and Suricata logs.

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab>cd Vagrant

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>vagrant status
Current machine states:

logger                    running (virtualbox)
dc                        running (virtualbox)
wef                       running (virtualbox)
win10                     running (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.


root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>vagrant ssh logger

Welcome to Ubuntu 16.04.5 LTS (GNU/Linux 4.4.0-131-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

29 packages can be updated.
24 updates are security updates.


Last login: Sun Jan 27 19:24:05 2019 from 10.0.2.2

root@logger:~# /opt/bro/bin/broctl status
Name         Type    Host             Status    Pid    Started
manager      manager localhost        running   9848   27 Jan 17:19:15
proxy        proxy   localhost        running   9893   27 Jan 17:19:16
worker-eth1-1 worker  localhost        running   9945   27 Jan 17:19:17
worker-eth1-2 worker  localhost        running   9948   27 Jan 17:19:17

vagrant@logger:~$ ls -al /opt/bro
total 32
drwxr-xr-x 8 root root 4096 Jan 27 17:19 .
drwxr-xr-x 5 root root 4096 Jan 27 17:19 ..
drwxr-xr-x 2 root root 4096 Jan 27 17:19 bin
drwxrwsr-x 2 root bro  4096 Jan 27 17:19 etc
drwxr-xr-x 3 root root 4096 Jan 27 17:19 lib
drwxrws--- 3 root bro  4096 Jan 27 18:00 logs
drwxr-xr-x 4 root root 4096 Jan 27 17:19 share
drwxrws--- 8 root bro  4096 Jan 27 17:19 spool

vagrant@logger:~$ ls -al /opt/bro/logs
ls: cannot open directory '/opt/bro/logs': Permission denied

vagrant@logger:~$ sudo bash

root@logger:~# ls -al /opt/bro/logs/
2019-01-27/ current/

root@logger:~# ls -al /opt/bro/logs/current/
total 3664
drwxr-sr-x 2 root bro    4096 Jan 27 19:20 .
drwxrws--- 8 root bro    4096 Jan 27 17:19 ..
-rw-r--r-- 1 root bro     475 Jan 27 19:19 capture_loss.log
-rw-r--r-- 1 root bro     127 Jan 27 17:19 .cmdline
-rw-r--r-- 1 root bro   83234 Jan 27 19:30 communication.log
-rw-r--r-- 1 root bro 1430714 Jan 27 19:30 conn.log
-rw-r--r-- 1 root bro    1340 Jan 27 19:00 dce_rpc.log
-rw-r--r-- 1 root bro  185114 Jan 27 19:28 dns.log
-rw-r--r-- 1 root bro     310 Jan 27 17:19 .env_vars
-rw-r--r-- 1 root bro  139387 Jan 27 19:30 files.log
-rw-r--r-- 1 root bro  544416 Jan 27 19:30 http.log
-rw-r--r-- 1 root bro     224 Jan 27 19:05 known_services.log
-rw-r--r-- 1 root bro     956 Jan 27 19:19 notice.log
-rw-r--r-- 1 root bro       5 Jan 27 17:19 .pid
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.capture_loss
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.communication
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.conn
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.conn-summary
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.dce_rpc
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.dns
-rw-r--r-- 1 root bro      18 Jan 27 18:00 .rotated.dpd
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.files
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.http
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.kerberos
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.known_certs
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.known_hosts
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.known_services
-rw-r--r-- 1 root bro      18 Jan 27 18:00 .rotated.loaded_scripts
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.notice
-rw-r--r-- 1 root bro      18 Jan 27 18:00 .rotated.packet_filter
-rw-r--r-- 1 root bro      18 Jan 27 18:00 .rotated.reporter
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.smb_files
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.smb_mapping
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.software
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.ssl
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.stats
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.weird
-rw-r--r-- 1 root bro      18 Jan 27 19:00 .rotated.x509
-rw-r--r-- 1 root bro    1311 Jan 27 19:24 smb_mapping.log
-rw-r--r-- 1 root bro   15767 Jan 27 19:30 ssl.log
-rw-r--r-- 1 root bro      58 Jan 27 17:19 .startup
-rw-r--r-- 1 root bro   11326 Jan 27 19:30 stats.log
-rwx------ 1 root bro      18 Jan 27 17:19 .status
-rw-r--r-- 1 root bro      80 Jan 27 19:00 stderr.log
-rw-r--r-- 1 root bro     188 Jan 27 17:19 stdout.log
-rw-r--r-- 1 root bro 1141860 Jan 27 19:30 weird.log
-rw-r--r-- 1 root bro    2799 Jan 27 19:20 x509.log

root@logger:~# cd /opt/bro/logs/

root@logger:/opt/bro/logs# ls
2019-01-27  current

root@logger:/opt/bro/logs# cd current

root@logger:/opt/bro/logs/current# ls

capture_loss.log   conn.log     dns.log    http.log            notice.log     smb_mapping.log  stats.log   stdout.log  x509.log
communication.log  dce_rpc.log  files.log  known_services.log  smb_files.log  ssl.log          stderr.log  weird.log

root@logger:/opt/bro/logs/current# jq  -c . dce_rpc.log  | head

{"ts":1548615615.836272,"uid":"CEmNr31qusp3G7GFg4","id.orig_h":"192.168.38.104","id.orig_p":49758,"id.resp_h":"192.168.38.102","id.resp_p":135,"named_pipe":
"135","endpoint":"epmapper","operation":"ept_map"}
{"ts":1548615615.83961,"uid":"CJO7xe4JUo43IjGG01","id.orig_h":"192.168.38.104","id.orig_p":49759,"id.resp_h":"192.168.38.102","id.resp_p":49667,"rtt":0.0003
57,"named_pipe":"49667","endpoint":"lsarpc","operation":"LsarLookupSids3"}
{"ts":1548615615.851544,"uid":"CJO7xe4JUo43IjGG01","id.orig_h":"192.168.38.104","id.orig_p":49759,"id.resp_h":"192.168.38.102","id.resp_p":49667,"rtt":0.000
596,"named_pipe":"49667","endpoint":"lsarpc","operation":"LsarLookupSids3"}
{"ts":1548615615.835628,"uid":"CgEcizh05xJ1ricP8","id.orig_h":"192.168.38.104","id.orig_p":49758,"id.resp_h":"192.168.38.102","id.resp_p":135,"named_pipe":"
135","endpoint":"epmapper","operation":"ept_map"}
{"ts":1548615615.839587,"uid":"CVV6WZ3vgpE673rl6a","id.orig_h":"192.168.38.104","id.orig_p":49759,"id.resp_h":"192.168.38.102","id.resp_p":49667,"rtt":0.000
382,"named_pipe":"49667","endpoint":"lsarpc","operation":"LsarLookupSids3"}
{"ts":1548615615.852193,"uid":"CVV6WZ3vgpE673rl6a","id.orig_h":"192.168.38.104","id.orig_p":49759,"id.resp_h":"192.168.38.102","id.resp_p":49667,"rtt":0.000
382,"named_pipe":"49667","endpoint":"lsarpc","operation":"LsarLookupSids3"}
{"ts":1548618003.200283,"uid":"CGYDww34wYz8eCYr96","id.orig_h":"192.168.38.103","id.orig_p":63295,"id.resp_h":"192.168.38.102","id.resp_p":135,"named_pipe":
"135","endpoint":"epmapper","operation":"ept_map"}
{"ts":1548618003.200403,"uid":"CrTZMz2nCXsiY5WOF8","id.orig_h":"192.168.38.103","id.orig_p":63295,"id.resp_h":"192.168.38.102","id.resp_p":135,"named_pipe":
"135","endpoint":"epmapper","operation":"ept_map"}

root@logger:~# head /var/log/suricata/fast.log

01/27/2019-17:19:08.133030  [**] [1:2013028:4] ET POLICY curl User-Agent Outbound [**] [Classification: Attempted Information Leak] [Priority: 2] {TCP} 10.0.2.15:51574 -> 195.135.221.134:80
01/27/2019-17:19:08.292747  [**] [1:2013504:5] ET POLICY GNU/Linux APT User-Agent Outbound likely related to package management [**] [Classification: Not Suspicious Traffic] [Priority: 3] {TCP} 10.0.2.15:55260 -> 99.84.178.103:80
01/27/2019-17:19:08.356618  [**] [1:2013504:5] ET POLICY GNU/Linux APT User-Agent Outbound likely related to package management [**] [Classification: Not Suspicious Traffic] [Priority: 3] {TCP} 10.0.2.15:55260 -> 99.84.178.103:80
01/27/2019-17:19:08.432477  [**] [1:2013504:5] ET POLICY GNU/Linux APT User-Agent Outbound likely related to package management [**] [Classification: Not Suspicious Traffic] [Priority: 3] {TCP} 10.0.2.15:46630 -> 91.189.95.83:80
01/27/2019-17:19:08.448249  [**] [1:2013504:5] ET POLICY GNU/Linux APT User-Agent Outbound likely related to package management [**] [Classification: Not Suspicious Traffic] [Priority: 3] {TCP} 10.0.2.15:53932 -> 91.189.88.161:80
...trimmed...
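
To summarize those alerts by signature, something like the following one-liner works (a sketch; the field splitting assumes the default fast.log format shown above):

$ awk -F' \[\*\*\] ' '{print $2}' /var/log/suricata/fast.log | sort | uniq -c | sort -rn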

Docker on the Logger Host

Chris is using Docker to provide some of the Logger host functions, e.g.:

root@logger:~# docker container ls
CONTAINER ID        IMAGE                    COMMAND                   CREATED             STATUS              PORTS                              NAMES
343c18f933d9        kolide/fleet:latest      "sh -c 'echo '\\n' | …"   3 hours ago         Up 3 hours          0.0.0.0:8412->8412/tcp             kolidequickstart_fleet_1
513cb0d61401        mysql:5.7                "docker-entrypoint.s…"    3 hours ago         Up 3 hours          3306/tcp, 33060/tcp                kolidequickstart_mysql_1
b0278855b130        mailhog/mailhog:latest   "MailHog"                 3 hours ago         Up 3 hours          1025/tcp, 0.0.0.0:8025->8025/tcp   kolidequickstart_mailhog_1
ddcd3e59dda2        redis:3.2.4              "docker-entrypoint.s…"    3 hours ago         Up 3 hours          6379/tcp                           kolidequickstart_redis_1
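
Standard Docker commands apply if you want to inspect any of these services. For example, using the container name from the listing above, one could tail the Fleet container's logs:

$ docker logs --tail 50 kolidequickstart_fleet_1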

Troubleshooting Localhost Bindings

One of the issues I encountered involved the IP addresses to which VMs bound their VirtualBox Remote Display Protocol servers. The default configuration bound them to localhost on my Windows laptop. That was fine if I was interacting with that laptop in person, but I was doing this work remotely.

I could RDP to the laptop, and then RDP from the laptop to the VMs. That worked, but logging into the Windows Server 2016 VMs was a slight hassle because they require a ctrl-alt-del sequence. In a nested RDP session you can send that by using the on-screen keyboard (osk.exe) to enter ctrl-alt-end on the remote laptop, but I wanted an easier solution.

Dustin Lee, who has done a lot of work customizing DetectionLab to include Security Onion (a future post, maybe?), suggested I modify the Vagrantfile with the two "--vrde" and "--vrdeaddress" customize lines shown below. This example is for the wef host in the Vagrantfile.

    cfg.vm.provider "virtualbox" do |vb, override|
      vb.gui = true
      vb.name = "wef.windomain.local"
      vb.default_nic_type = "82545EM"
      vb.customize ["modifyvm", :id, "--vrde", "on"]
      vb.customize ["modifyvm", :id, "--vrdeaddress", "0.0.0.0"]
      vb.customize ["modifyvm", :id, "--memory", 2048]
      vb.customize ["modifyvm", :id, "--cpus", 2]
      vb.customize ["modifyvm", :id, "--vram", "32"]
      vb.customize ["modifyvm", :id, "--clipboard", "bidirectional"]
      vb.customize ["setextradata", "global", "GUI/SuppressMessages", "all" ]
    end
Basically, add those two entries to each "virtualbox" provider block, to enable the VRDE server and bind it to 0.0.0.0 (all IP addresses, including the public IP, as I wanted).

Before I restarted the wef host, you can see below that the VRDE server was listening only on localhost, on port 5932.

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>"c:\Program Files\Oracle\VirtualBox\VBoxManage.exe" showvminfo wef.windomain.local | findstr /I vrde

VRDE:                        enabled (Address 127.0.0.1, Ports 5932, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
...trimmed...

After changing the Vagrantfile, I restarted the wef host using Vagrant.

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>vagrant reload wef
==> wef: Attempting graceful shutdown of VM...
==> wef: Clearing any previously set forwarded ports...
==> wef: Fixed port collision for 3389 => 3389. Now on port 2201.
==> wef: Fixed port collision for 22 => 2222. Now on port 2202.
==> wef: Fixed port collision for 5985 => 55985. Now on port 2203.
==> wef: Fixed port collision for 5986 => 55986. Now on port 2204.
==> wef: Clearing any previously set network interfaces...
==> wef: Preparing network interfaces based on configuration...
    wef: Adapter 1: nat
    wef: Adapter 2: hostonly
==> wef: Forwarding ports...
    wef: 3389 (guest) => 2201 (host) (adapter 1)
    wef: 22 (guest) => 2202 (host) (adapter 1)
    wef: 5985 (guest) => 2203 (host) (adapter 1)
    wef: 5986 (guest) => 2204 (host) (adapter 1)
==> wef: Running 'pre-boot' VM customizations...
==> wef: Booting VM...
==> wef: Waiting for machine to boot. This may take a few minutes...
    wef: WinRM address: 127.0.0.1:2203
    wef: WinRM username: vagrant
    wef: WinRM execution_time_limit: PT2H
    wef: WinRM transport: negotiate
==> wef: Machine booted and ready!
==> wef: Checking for guest additions in VM...
    wef: The guest additions on this VM do not match the installed version of
    wef: VirtualBox! In most cases this is fine, but in rare cases it can
    wef: prevent things such as shared folders from working properly. If you see
    wef: shared folder errors, please make sure the guest additions within the
    wef: virtual machine match the version of VirtualBox you have installed on
    wef: your host and reload your VM.
    wef:
    wef: Guest Additions Version: 5.2.16
    wef: VirtualBox Version: 6.0
==> wef: Setting hostname...
==> wef: Configuring and enabling network interfaces...
==> wef: Mounting shared folders...

    wef: /vagrant =>  

The entry above about port 2201 means I can log into the wef host as user vagrant / password vagrant, over RDP, directly from another computer.


After restarting the wef host, I checked to see on what IP address and port the VRDE server was listening:


root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>"c:\Program Files\Oracle\VirtualBox\VBoxManage.exe" showvminfo wef.windomain.local | findstr /I vrde
VRDE:                        enabled (Address 0.0.0.0, Ports 5932, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
...trimmed...

This should allow me to access the "screen" of the VM via port 5932 and the IP of the host laptop. Unfortunately, there was some sort of conflict: I saw that the domain controller had reserved the same port for its VRDE server.

root@LAPTOP-HT4TGVCP C:\Users\root>"c:\Program Files\Oracle\VirtualBox\VBoxManage.exe" showvminfo dc.windomain.local | findstr /I vrde

VRDE:                        enabled (Address 0.0.0.0, Ports 5932, MultiConn: off, ReuseSingleConn: off, Authentication type: null)
...trimmed...

The domain controller showed a similar issue: Vagrant forwarded its guest RDP port 3389 straight to port 3389 on the host, without resolving the collision with the host system's own RDP service.

root@LAPTOP-HT4TGVCP C:\Users\root\git\detectionlab\DetectionLab\Vagrant>vagrant reload dc
==> dc: Attempting graceful shutdown of VM...
==> dc: Clearing any previously set forwarded ports...
==> dc: Fixed port collision for 22 => 2222. Now on port 2200.
==> dc: Clearing any previously set network interfaces...
==> dc: Preparing network interfaces based on configuration...
    dc: Adapter 1: nat
    dc: Adapter 2: hostonly
==> dc: Forwarding ports...
    dc: 3389 (guest) => 3389 (host) (adapter 1)
    dc: 22 (guest) => 2200 (host) (adapter 1)
    dc: 5985 (guest) => 55985 (host) (adapter 1)
    dc: 5986 (guest) => 55986 (host) (adapter 1)

I haven't solved these problems yet. I wonder if they result from using pre-built VMs, which bundle the 5.x series of VirtualBox Guest Additions, while my VirtualBox host runs 6.0?
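
One untested idea, if I revisit this, is to pin a unique VRDE port per VM in each "virtualbox" provider block using VirtualBox's "--vrdeport" option. The port numbers below are arbitrary examples:

    cfg.vm.provider "virtualbox" do |vb, override|
      # Untested sketch: give each VM its own VRDE port to avoid collisions.
      vb.customize ["modifyvm", :id, "--vrde", "on"]
      vb.customize ["modifyvm", :id, "--vrdeaddress", "0.0.0.0"]
      vb.customize ["modifyvm", :id, "--vrdeport", "5933"] # e.g., 5933 for dc, 5934 for wef
    end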

Summary

These are minor issues, for the result of all this work is four systems offering Windows client and server features, plus instrumentation. That could be another topic for discussion. I'm also excited by the prospect of running all of this in the cloud. Furthermore, Dustin Lee has a fork of DetectionLab that replaces some of the instrumentation with Security Onion!

Happy 16th Birthday TaoSecurity Blog

Today, 8 January 2019, is TaoSecurity Blog's 16th birthday! This is also my 3,041st blog post.

I wrote my first post on 8 January 2003 while working as an incident response consultant for Foundstone.

Here are a few statistics on the blog. Blogger started providing statistics in May 2010, so these apply to roughly the past 9 years only.

As of today, since May 2010 the blog has nearly 9.4 million all time page views, up from 7.7 million a year ago.

Here are the most popular posts of the last 9 years, as of today:

[Screenshot: Blogger statistics listing the most popular posts.]
I'm blogging a bit more recently, with 22 posts in 2018 -- more than my total for 2016 and 2017 combined, but still not half as much as 2015, which saw 55 posts.

Twitter continues to play a role in the way I communicate. Last year @taosecurity had nearly 49,000 followers with fewer than 18,000 Tweets. Today I have nearly 53,000 followers with 19,000 Tweets.

My rule is generally this: if I start wondering how to fit an idea in 280 characters on Twitter, then a blog post is a better idea. If I start a Twitter "thread," then I really need to write a blog post!

I continue to blog about martial arts and related topics at Rejoining the Tao, which incidentally will be three years old later this month, and is currently 11 posts shy of 100. You can see that during my burnout period I shifted my writing and creativity outside of security.

Thank you to everyone who has been part of this blog's journey since 2003!

Notes on Self-Publishing a Book


In this post I would like to share a few thoughts on self-publishing a book, in case anyone is considering that option.

As I mentioned in my post on burnout, one of my goals was to publish a book on a subject other than cyber security. A friend from my Krav Maga school, Anna Wonsley, learned that I had published several books, and asked if we might collaborate on a book about stretching. The timing was right, so I agreed.

I published my first book with Pearson and Addison-Wesley in 2004, and my last with No Starch in 2013. 14 years is an eternity in the publishing world, and even in the last 5 years the economics and structure of book publishing have changed quite a bit.

To better understand the changes, I had dinner with one of the finest technical authors around, Michael W. Lucas. We met prior to my interest in this book, because I had wondered about publishing books on my own. MWL started in traditional publishing like me, but has since become a full-time author and independent publisher. He explained the pros and cons of going it alone, which I carefully considered.

By the end of 2017, Anna and I were ready to begin work on the book. I believe our first "commits" occurred in December 2017.

For this stretching book project, I knew my strengths included organization, project management, writing to express another person's message, editing, and access to a skilled lead photographer. I learned that my co-author's strengths included subject matter expertise, a willingness to be photographed for the book's many pictures, and friends who would also be willing to be photographed.

None of us was very familiar with the process of transforming a raw manuscript and photos into a finished product. When I had published with Pearson and No Starch, they took care of that process, as well as copy-editing.

Beyond turning manuscript and photos into a book, I also had to identify a publication platform. Early on we decided to self-publish using one of the many newer companies offering that service. We wanted a company that could get our book into Amazon, and possibly physical book stores as well. We did not want to try working with a traditional publisher, as we felt that we could manage most aspects of the publishing process ourselves, and augment with specialized help where needed.

After a lot of research we chose Blurb. One of the most attractive aspects of Blurb was their expert ecosystem. We decided that we would hire one of these experts to handle the interior layout process. We contacted Jennifer Linney, who happened to be local and had experience publishing books to Amazon. We met in person, discussed the project, and agreed to move forward together.

I designed the structure of the book. As a former Air Force officer, I was comfortable with the "rule of threes," and brought some recent writing experience from my abandoned PhD thesis.

I designed the book to have an introduction, the main content, and a conclusion. Within the main content, the book featured an introduction and physical assessment, three main sections, and a conclusion. The three main sections consisted of a fundamental stretching routine, an advanced stretching routine, and a performance enhancement section -- something with Indian clubs, or kettle bells, or another supplement to stretching.

Anna designed all of the stretching routines and provided the vast majority of the content. She decided to focus on three physical problem areas -- tight hips, shoulders/back, and hamstrings. We encouraged the reader to "reach three goals" -- open your hips, expand your shoulders, and touch your toes. Anna designed exercises that worked in a progression through the body, incorporating her expertise as a certified trainer and professional martial arts instructor.

Initially we tried a process whereby she would write section drafts, and I would edit them, all using Google Docs. This did not work as well as we had hoped, and we spent a lot of time stalled in virtual collaboration.

By the spring of 2018 we decided to try meeting in person on a regular basis. Anna would explain her desired content for a section, and we would take draft photographs using iPhones to serve as placeholders and to test the feasibility of real content. We made a lot more progress using these methods, although we stalled again mid-year due to schedule conflicts.

By October our text was ready enough to try taking book-ready photographs. We bought photography lights from Amazon and used my renovated basement game room as a studio. We took pictures over three sessions, with Anna and her friend Josh as subjects. I spent several days editing the photos to prepare for publication, then handed the bundled manuscript and photographs to Jennifer for a light copy-edit and layout during November.

Our goal was to have the book published before the end of the year, and we met that goal. We decided to offer two versions. The first is a "collector's edition" featuring all color photographs, available exclusively via Blurb as Reach Your Goal: Collector's Edition. The second will be available at Amazon in January, and will feature black and white photographs.

While we were able to set the price of the book directly via Blurb, we could basically only suggest a price to Ingram and hence to Amazon. Ingram is the distributor that feeds Amazon and physical book stores. I am curious to see how the book will appear in those retail locations, and how much it will cost readers. We tried to price it competitively with older stretching books of similar size. (Ours is 176 pages with over 200 photographs.)

Without revealing too much of the economic structure, I can say that it's much cheaper to sell directly from Blurb. Their cost structure allows us to price the full color edition competitively. However, one of our goals was to provide our book through Amazon, and to keep the price reasonable we had to sell the black and white edition outside of Blurb.

Overall I am very pleased with the writing process, and exceptionally happy with the book itself. The color edition is gorgeous and the black and white version is awesome too.

The only change I would have made to the writing process would have been to start the in-person collaboration from the beginning. Working together in person accelerated the transfer of ideas to paper and played to our individual strengths of Anna as subject matter expert and me as a writer.

In general, I would not recommend self-publishing if you are not a strong writer. If writing is not your forte, then I highly suggest you work with a traditional publisher, or contract with an editor. I have seen too many self-published books that read terribly. This usually happens when the author is a subject matter expert, but has trouble expressing ideas in written form.

The bottom line is that it's never been easier to make your dream of writing a book come true. There are options for everyone, and you can leverage them to create wonderful products that scale with demand and can really help your audience reach their goals!

If you want to start the new year with better flexibility and fitness, consider taking a look at our book on Blurb! When the Amazon edition is available I will update this post with a link.

Update: Here is the Amazon listing.

Cross-posted from Rejoining the Tao Blog.

Managing Burnout

This is not strictly an information security post, but the topic likely affects a decent proportion of my readership.

Within the last few years I experienced a profound professional "burnout." I've privately mentioned this to colleagues in the industry, and heard similar stories or requests for advice on how to handle burnout.

I want to share my story in the hopes that it helps others in the security scene, either by coping with existing burnout or preparing for a possible burnout.

How did burnout manifest for me? It began with FireEye's acquisition of Mandiant, almost exactly five years ago. 2013 was a big year for Mandiant, starting with the APT1 report in early 2013 and concluding with the acquisition in December.

The prospect of becoming part of a Silicon Valley software company initially seemed exciting, because we would presumably have greater resources to battle intruders. Soon, however, I found myself at odds with FireEye's culture and managerial habits, and I wondered what I was doing inside such a different company.

(It's important to note that the appointment of Kevin Mandia as CEO in June 2016 began a cultural and managerial shift. I give Kevin and his lieutenants credit for helping transform the company since then. Kevin's appointment was too late for me, but I applaud the work he has done over the last few years.)

Starting in late 2014 and progressing in 2015, I became less interested in security. I was aggravated every time I saw the same old topics arise in social or public media. I did not see the point of continuing to debate issues which were never solved. I was demoralized and frustrated.

At this time I was also working on my PhD with King's College London. I had added this stress myself, but I felt like I could manage it. I had earned two major and two minor degrees in four years as an Air Force Academy cadet. Surely I could write a thesis!

Late in 2015 I realized that I needed to balance the very cerebral art of information security with a more physical activity. I took a Krav Maga class the first week of January 2016. It was invigorating and I began a new blog, Rejoining the Tao, that month. I began to consider options outside of information security.

In early 2016 my wife began considering ways to rejoin the W-2 workforce, after having stayed home with our kids for 12 years. We discussed the possibility of me leaving my W-2 job and taking a primary role with the kids. By mid-2016 she had a new job and I was open to departing FireEye.

By late 2016 I also realized that I was not cut out to be a PhD candidate. Although I had written several books, I did not have the right mindset or attitude to continue writing my thesis. After two years I quit my PhD program. This was the first time I had quit anything significant in my life, and it was the right decision for me. (The Churchill "never, never, never give up" speech is fine advice when defending your nation's existence, but it's stupid advice if you're not happy with the path you're following.)

In March 2017 I posted Bejtlich Moves On, where I said I was leaving FireEye. I would offer security consulting in the short term, and would open a Krav Maga school in the long-term. This was my break with the security community and I was happy to make it. I blogged on security only five more times in 2017.

(Incidentally, one very public metric for my burnout experience can be seen in my blog output. In 2015 I posted 55 articles, but in 2016 I posted only 8, and slightly more, 12, in 2017. This is my 21st post of 2018.)

I basically took a year off from information security. I did some limited consulting, but Mrs B paid the bills, with some support from my book royalties and consulting. This break had a very positive effect on my mental health. I stayed aware of security developments through Twitter, but I refused to speak to reporters and did not entertain job offers.

During this period I decided that I did not want to open a Krav Maga school and quit my school's instructor development program. For the second time, I had quit something I had once considered very important.

I started a new project, though -- writing a book that had nothing to do with information security. I will post about it shortly, as I am finalizing the cover with the layout team this weekend!

By the spring of 2018 I was able to consider returning to security. In May I blogged that I was joining Splunk, but that lasted only two months. I realized I had walked into another cultural and managerial mismatch. Near the end of that period, Seth Hall from Corelight contacted me, and by July 20th I was working there. We kept it quiet until September. I have been very happy at Corelight, finally finding an environment that matches my temperament, values, and interests.

My advice to those of you who have made it this far:

If you're feeling burnout now, you're not alone. It happens. We work in a stressful industry that will take everything that you can give, and then try to take more. It's healthy and beneficial to push back. If you can, take a break, even if it means only a partial break.

Even if you can't take a break, consider integrating non-security activities into your lifestyle -- the more physical, the better. Security is a very cerebral activity, often performed in a sedentary manner. You have a body and taking care of it will make your mind happier too.

If you're not feeling burnout now, I recommend preparing for a possible burnout in the future. In addition to the advice in the previous paragraphs, take steps now to be able to completely step away from security for a defined period. Save a proportion of your income to pay your bills when you're not working in security. I recommend at least a month, but up to six months if you can manage it.

This is good financial advice anyway, in the event you were to lose your job. This is not an emergency fund, though -- this is a planned reprieve from burnout. We are blessed in security to make above-average salaries, so I suggest saving for retirement, saving for layoffs, and saving for burnout.

Finally, it's ok to talk to other people about this. This will likely be a private conversation. I don't see too many people saying "I'm burned out!" on Twitter or in a blog post. I only felt comfortable writing this post months after I returned to regular security work.

I'm very interested in hearing what others have to say on this topic. Replying to my Twitter announcement for the blog post is probably the easiest step. I moderate the comments here and might not get to them in a timely manner.

The Origin of the Quote “There Are Two Types of Companies”

While listening to a webcast this morning, I heard the speaker mention:

There are two types of companies: those who have been hacked, and those who don’t yet know they have been hacked.

He credited Cisco CEO John Chambers but didn't provide any source.

That didn't sound right to me. I could think of two possible antecedents, so I did some research. I confirmed my memory and would like to present what I found here.

John Chambers did indeed offer the previous quote, in a January 2015 post for the World Economic Forum titled What does the Internet of Everything mean for security? Unfortunately, neither Mr Chambers nor the person who likely wrote the article for him decided to credit the author of this quote.

Before providing proper credit for this quote, we need to decide what the quote actually says. As noted in this October 2015 article by Frank Johnson titled Are there really only “two kinds of enterprises”?, there are really (at least) two versions of this quote:

A popular meme in the information security industry is, “There are only two types of companies: those that know they’ve been compromised, and those that don’t know.”

And the second is like unto it: “There are only two kinds of companies: those that have been hacked, and those that will be.”

We see that the first is a version of what Mr Chambers said. Let's call that 2-KNOW. The second is different. Let's call that 2-BE.

The first version, 2-KNOW, can be easily traced and credited to Dmitri Alperovitch. He stated this proposition as part of the publicity around his Shady RAT report, written while he worked at McAfee. For example, this 3 August 2011 story by Ars Technica, Operation Shady RAT: five-year hack attack hit 14 countries, quotes Dmitri in the following:

So widespread are the attacks that Dmitri Alperovitch, McAfee Vice President of Threat Research, said that the only companies not at risk are those who have nothing worth taking, and that of the world's biggest firms, there are just two kinds: those that know they've been compromised, and those that still haven't realized they've been compromised.

Dmitri used slightly different language in this popular Vanity Fair article from September 2011, titled Enter the Cyber-Dragon:

Dmitri Alperovitch, who discovered Operation Shady rat, draws a stark lesson: “There are only two types of companies—those that know they’ve been compromised, and those that don’t know. If you have anything that may be valuable to a competitor, you will be targeted, and almost certainly compromised.”

No doubt former FBI Director Mueller read this report (and probably spoke with Dmitri). He delivered a speech at RSA on 1 March 2012 that introduced version 2-BE into the lexicon, plus a little more:

For it is no longer a question of “if,” but “when” and “how often.”

I am convinced that there are only two types of companies: those that have been hacked and those that will be. 

And even they are converging into one category: companies that have been hacked and will be hacked again.  

Here we see Mr Mueller morphing Dmitri's quote, 2-KNOW, into the second, 2-BE. He also introduced a third variant -- "companies that have been hacked and will be hacked again." Let's call this version 2-AGAIN.

The very beginning of Mr Mueller's quote is surely a play on Kevin Mandia's long-term commitment to the inevitability of compromise. However, as far as I could find, Kevin did not use the "two companies" language.

One article that mentions version 2-KNOW and Kevin is this December 2014 Ars Technica article titled “Unprecedented” cyberattack no excuse for Sony breach, pros say. However, the article is merely citing other statements by Kevin along with the aphorism of version 2-KNOW.

Finally, there's a fourth version introduced by Mr Mueller's successor, James Comey, as well! In a 6 October 2014 story, FBI Director: China Has Hacked Every Big US Company Mr Comey said:

Speaking to CBS' 60 Minutes, James Comey had the following to say on Chinese hackers: 

There are two kinds of big companies in the United States. There are those who've been hacked by the Chinese and those who don't know they've been hacked by the Chinese.

Let's call this last variant 2-CHINA.

To summarize, there are four versions of the "two companies" quote:

  • 2-KNOW, credited to Dmitri Alperovitch in 2011, says "There are only two types of companies—those that know they’ve been compromised, and those that don’t know."
  • 2-BE, credited to Robert Mueller in 2012, says "[T]here are only two types of companies: those that have been hacked and those that will be."
  • 2-AGAIN, credited to Robert Mueller in 2012, says "[There are only two types of companies:] companies that have been hacked and will be hacked again."
  • 2-CHINA, credited to James Comey in 2014, says "There are two kinds of big companies in the United States. There are those who've been hacked by the Chinese and those who don't know they've been hacked by the Chinese."
Now you know!


The Origin of the Term Indicators of Compromise (IOCs)

I am an historian. I practice digital security, but I earned a bachelor's of science degree in history from the United States Air Force Academy. (1)

Historians create products by analyzing artifacts, among which the most significant is the written word.

In my last post, I talked about IOCs, or indicators of compromise. Do you know the origin of the term? I thought I did, but I wanted to rely on my historian's methodology to invalidate or confirm my understanding.

I became aware of the term "indicator" as an element of indications and warning (I&W), when I attended Air Force Intelligence Officer's school in 1996-1997. I will return to this shortly, but I did not encounter the term "indicator" in a digital security context until I encountered the work of Kevin Mandia.

In August 2001, shortly after its publication, I read Incident Response: Investigating Computer Crime, by Kevin Mandia, Chris Prosise, and Matt Pepe (Osborne/McGraw-Hill). I was so impressed by this work that I managed to secure a job with their company, Foundstone, by April 2002. I joined the Foundstone incident response team, which was led by Kevin and consisted of Matt Pepe, Keith Jones, Julie Darmstadt, and me.

I Tweeted earlier today that Kevin invented the term "indicator" (in the IR context) in that 2001 edition, but a quick review of the hard copy in my library does not show its usage, at least not prominently. I believe we were using the term in the office but that it had not appeared in the 2001 book. Documentation would seem to confirm that, as Kevin was working on the second edition of the IR book (to which I contributed), and that version, published in 2003, features the term "indicator" in multiple locations.

In fact, the earliest printed use of the term "indicators of compromise" in a digital security context appears on page 280 of Incident Response & Computer Forensics, 2nd Edition.

[Excerpt: page 280 of Incident Response & Computer Forensics, 2nd Edition, using "indicators of compromise."]
From other uses of the term "indicators" in that IR book, you can observe that IOC wasn't a formal, independent concept at this point, in 2003. In the same excerpt above you see "indicators of attack" mentioned.

The first citation of the term "indicators" in the 2003 book shows it is meant as an investigative lead or tip:

[Excerpt: the first use of "indicators" in the 2003 book.]
Did I just give up my search at this point? Of course not.

If you do time-limited Google searches for "indicators of compromise," after weeding out patent filings that reference later work (from FireEye, in 2013), you might find this document, which concludes with this statement:

Indicators of compromise are from Lynn Fischer, Lynn, "Looking for the Unexpected," Security Awareness Bulletin, 3-96, 1996. Richmond, VA: DoD Security Institute.

Here the context is the compromise of a person with a security clearance.

In the same spirit, the earliest reference to "indicator" in a security-specific, detection-oriented context appears in the patent Method and system for reducing the rate of infection of a communications network by a software worm (6 Dec 2002). Stuart Staniford is the lead author; he was later chief scientist at FireEye, although he left before FireEye acquired Mandiant (and me).

While Kevin, et al were publishing the second edition of their IR book in 2003, I was writing my first book, The Tao of Network Security Monitoring. I began chapter two with a discussion of indicators, inspired by my Air Force intelligence officer training in I&W and Kevin's use of the term at Foundstone.

You can find chapter two in its entirety online. In the chapter I also used the term "indicators of compromise," in the spirit Kevin used it; but again, it was not yet a formal, independent term.

My book was published in 2004, followed by two more in rapid succession.

The term "indicators" didn't really make a splash until 2009, when Mike Cloppert published a series on threat intelligence and the cyber kill chain. The most impactful in my opinion was Security Intelligence: Attacking the Cyber Kill Chain. Mike wrote:


I remember very much enjoying these posts, but the Cyber Kill Chain was the aspect that had the biggest impact on the security community. Mike does not say "IOC" in the post. Where he does say "compromise," he's using it to describe a victimized computer.

The stage is now set for seeing indicators of compromise in a modern context. Drum roll, please!

The first documented appearance of the term indicators of compromise, or IOCs, in the modern context, occurs in basically two places simultaneously, with ultimate credit going to the same organization: Mandiant.

The first Mandiant M-Trends report, published on 25 Jan 2010, provides the following description of IOCs on page 9:

[Excerpt: M-Trends 2010, page 9, describing indicators of compromise.]
The next day, 26 Jan 2010, Matt Frazier published Combat the APT by Sharing Indicators of Compromise to the Mandiant blog. Matt wrote to introduce an XML-based instantiation of IOCs, which could be read and created using free Mandiant tools.

[Excerpt: Matt Frazier's example IOC, expressed in XML.]
Note how complicated Matt's IOC example is. It's not a file hash (alone), or a file name (alone), or an IP address, etc. It's a Boolean expression of many elements. You can read in the text that this original IOC definition rejects what some commonly consider "IOCs" to be. Matt wrote:

Historically, compromise data has been exchanged in CSV or PDFs laden with tables of "known bad" malware information - name, size, MD5 hash values and paragraphs of imprecise descriptions... (emphasis added)

On a related note, I looked for early citations of work on defining IOCs, and found a paper by Simson Garfinkel, well-respected forensic analyst. He gave credit to Matt Frazier and Mandiant, writing in 2011:

Frazier (2010) of MANDIANT developed Indicators of Compromise (IOCs), an XML-based language designed to express signatures of malware such as files with a particular MD5 hash value, file length, or the existence of particular registry entries. There is a free editor for manipulating the XML. MANDIANT has a tool that can use these IOCs to scan for malware and the so-called “Advanced Persistent Threat.”

Starting in 2010, the debate was initially about the format for IOCs, and how to produce and consume them. We can see in this written evidence from 2010, however, a definition of indicators of compromise and IOCs that contains all the elements that would be recognized in current usage.

tl;dr Mandiant invented the term indicators of compromise, or IOCs, in 2010, building off the term "indicator," introduced widely in a detection context by Kevin Mandia, no later than his 2003 incident response book.

(1) Yes, a BS, not a BA -- thank you USAFA for 14 mandatory STEM classes.

Even More on Threat Hunting

In response to my post More on Threat Hunting, Rob Lee asked:

[D]o you consider detection through ID’ing/“matching” TTPs not hunting?

To answer this question, we must begin by clarifying "TTPs." Most readers know TTPs to mean tactics, techniques and procedures, defined by David Bianco in his Pyramid of Pain post as:

How the adversary goes about accomplishing their mission, from reconnaissance all the way through data exfiltration and at every step in between.

In case you've forgotten David's pyramid, it looks like this.

[Image: David Bianco's Pyramid of Pain.]
It's important to recognize that the pyramid consists of indicators of compromise (IOCs). David uses the term "indicator" in his original post, but his follow-up post from his time at Sqrrl makes this clear:

There are a wide variety of IoCs ranging from basic file hashes to hacking Tactics, Techniques and Procedures (TTPs). Sqrrl Security Architect, David Bianco, uses a concept called the Pyramid of Pain to categorize IoCs. 

At this point it should be clear that I consider TTPs to be one form of IOC.

In The Practice of Network Security Monitoring, I included the following workflow:

[Figure: detection workflow from The Practice of Network Security Monitoring.]
You can see in the second column that I define hunting as "IOC-free analysis." On page 193 of the book I wrote:

Analysis is the process of identifying and validating normal, suspicious, and malicious activity. IOCs expedite this process. Formally, IOCs are manifestations of observable or discernible adversary actions. Informally, IOCs are ways to codify adversary activity so that technical systems can find intruders in digital evidence...

I refer to relying on IOCs to find intruders as IOC-centric analysis, or matching. Analysts match IOCs to evidence to identify suspicious or malicious activity, and then validate their findings.

Matching is not the only way to find intruders. More advanced NSM operations also pursue IOC-free analysis, or hunting. In the mid-2000s, the US Air Force popularized the term hunter-killer in the digital world. Security experts performed friendly force projection on their networks, examining data and sometimes occupying the systems themselves in order to find advanced threats. 

Today, NSM professionals like David Bianco and Aaron Wade promote network “hunting trips,” during which a senior investigator with a novel way to detect intruders guides junior analysts through data and systems looking for signs of the adversary. 

Upon validating the technique (and responding to any enemy actions), the hunters incorporate the new detection method into a CIRT’s IOC-centric operations. (emphasis added)

Let's consider Chris Sanders' blog post titled Threat Hunting for HTTP User Agents as an example of my definition of hunting. 

I will build a "hunting profile" via excerpts (in italics) from his post:

Assumption: "Attackers frequently use HTTP to facilitate malicious network communication."

Hypothesis: If I find an unusual user agent string in HTTP traffic, I may have discovered an attacker.

Question: “Did any system on my network communicate over HTTP using a suspicious or unknown user agent?”

Method: "This question can be answered with a simple aggregation wherein the user agent field in all HTTP traffic for a set time is analyzed. I’ve done this using Sqrrl Query Language here:

SELECT COUNT(*),user_agent FROM HTTPProxy GROUP BY user_agent ORDER BY COUNT(*) ASC LIMIT 20

This query selects the user_agent field from the HTTPProxy data source and groups and counts all unique entries for that field. The results are sorted by the count, with the least frequent occurrences at the top."

Results: Chris offers advice on how to interpret the various user agent strings produced by the query.
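
For readers without Sqrrl, a rough equivalent of Chris's aggregation, assuming Bro/Zeek http.log in JSON form (as in the DetectionLab examples earlier), might be:

$ jq -r '.user_agent' http.log | sort | uniq -c | sort -n | head -20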

This is the critical part: Chris did not say "look for this specific user agent." He offered the reader an assumption, a hypothesis, a question, and a method. It is up to the defender to investigate the results. This, for me, is true hunting.

If Chris had instead referred users to this list of malware user agents (for example) and said look for "Mazilla/4.0", then I consider that manual (human) matching. If I created a Snort or Suricata rule to look for that user agent, then I consider that automated (machine) matching.
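
For illustration only, automated matching for that user agent might look like the following Suricata rule (a sketch; the sid, message, and classtype are arbitrary):

alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"Suspicious HTTP user agent Mazilla/4.0"; flow:established,to_server; content:"Mazilla/4.0"; http_user_agent; classtype:trojan-activity; sid:9000001; rev:1;)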

This is where my threat hunting definition likely diverges from modern practice. Analyst Z sees the results of Chris' hunt and thinks "Chris found user agent XXXX to be malicious, so I should go look for it." Analyst Z queries his or her data and does or does not find evidence of user agent XXXX.

I do not consider analyst Z's actions to be hunting. I consider it matching. There is nothing wrong with this. In fact, one of the purposes of hunting is to provide new inputs to the matching process, so that future hunting trips can explore new assumptions, hypotheses, questions, and methods, and let the machines do the matching on IOCs already found to be suggestive of adversary activity. This is why I wrote in my 2013 book "Upon validating the technique (and responding to any enemy actions), the hunters incorporate the new detection method into a CIRT’s IOC-centric operations."

The term "hunting" is a victim of its own success, with emotional baggage. We defenders have finally found a way to make "blue team" work appealing to the wider security community. Vendors love this new way to market their products. "If you're not hunting, are you doing anything useful?" one might ask.

Compared to "I'm threat hunting!" (insert chest beating), the alternative, "I'm matching!" (womp womp), seems sad. 

Nevertheless, we must remember that threat hunting methodologies were invented to find adversary activity for which there were no IOCs. Hunting was IOC-free analysis because we didn't know what to look for. Once you know what to look for, you are matching. Both forms of detection require analysis to validate adversary activity, of course. Let's not forget that.

I'm also very thankful, however it's defined or packaged, that people are excited to search for adversary activity in their environment, whether via matching or hunting. It's a big step from the mindset of 10 years ago, which had a "prevention works" milieu.

tl;dr Because TTPs are a form of IOC, then detection via matching IOCs is a form of matching, and not hunting.

More on Threat Hunting

Earlier this week hellor00t asked via Twitter:

Where would you place your security researchers/hunt team?

I replied:

For me, "hunt" is just a form of detection. I don't see the need to build a "hunt" team. IR teams detect intruders using two major modes: matching and hunting. Junior people spend more time matching. Senior people spend more time hunting. Both can and should do both functions.

This inspired Rob Lee to blog a response, from which I extract his core argument:

[Hunting] really isn’t, to me, about detecting threats...

Hunting is a hypothesis-led approach to testing your environment for threats. The purpose, to me, is not in finding threats but in determining what gaps you have in your ability to detect and respond to them...

In short, hunting, to me, is a way to assess your security (people, process, and technology) against threats while extending your automation footprint to better be prepared in the future. Or simply stated, it’s incident response without the incident that’s done with a purpose and contributes something. 

As background for my answer, I recommend my March 2017 post The Origin of Threat Hunting, which cites my article "Become a Hunter," published in the July-August 2011 issue of Information Security Magazine. I wrote it in the spring of 2011, when I was director of incident response for GE-CIRT.

For the term "hunting," I give credit to briefers from the Air Force and NSA who, in the mid-2000s briefed "hunter-killer" missions to the Red Team/Blue Team Symposium at the Johns Hopkins University Applied Physics Lab in Laurel, MD.

As a comment to that post, Tony Sager, who ran NSA VAO at the time I was briefed at ReBl, described hunting thus:

[Hunting] was an active and sustained search for Attackers...

For us, "Hunt" meant a very planned and sustained search, taking advantage of the existing infrastructure of Red/Blue Teams and COMSEC Monitoring, as well as intelligence information to guide the search. 

For the practice of hunting, as I experienced it, I give credit to our GE-CIRT incident handlers -- David Bianco,  Ken Bradley, Tim Crothers, Tyler Hudak, Bamm Visscher, and Aaron Wade -- who took junior analysts on "hunting trips," starting in 2008-2009.

It is very clear, to me, that hunting has always been associated with detecting an adversary, not "determining what gaps you have in your ability to detect and respond to them," as characterized by Rob.

For me, Rob is describing the job of an enterprise visibility architect, which I described in a 2007 post:

[W]e are stuck with numerous platforms, operating systems, applications, and data (POAD) for which we have zero visibility. 

I suggest that enterprises consider hiring or assigning a new role -- Enterprise Visibility Architect. The role of the EVA is to identify visibility deficiencies in existing and future POAD and design solutions to instrument these resources.

A primary reason to hire an enterprise visibility architect is to build visibility in, which I described in several posts, including this one from 2009 titled Build Visibility In. As a proponent of the "monitor first" school, I will always agree that it is important to identify and address visibility gaps.

So where do we go from here?

Tony Sager, as one of my wise men, offers sage advice at the conclusion of his comment:

"Hunt" emerged as part of a unifying mission model for my Group in the Information Assurance Directorate at NSA (the defensive mission) in the mid-late 2000's. But it was also a way to unify the relationship between IA and the SIGINT mission - intelligence as the driver for Hunting. The marketplace, of course, has now brought its own meaning to the term, but I just wanted to share some history. 

In my younger days I might have expressed much more energy and emotion when encountering a different viewpoint. At this point in my career, I'm more comfortable with other points of view, so long as they do not result in harm, or a waste of my taxpayer dollars, or other clearly negative consequences. I also appreciate the kind words Rob offered toward my point of view.

tl;dr I believe the definition and practice of hunting has always been tied to adversaries, and that Rob describes the work of an enterprise visibility architect when he focuses on visibility gaps rather than adversary activity.

Update 1: If in the course of conducting a hunt you identify a visibility or resistance deficiency, that is indeed beneficial. The benefit, however, is derivative. You hunt to find adversaries. Identifying gaps is secondary although welcome.

The same would be true of hunting and discovering misconfigured systems, or previously unidentified assets, or unpatched software, or any of the other myriad facts on the ground that manifest when one applies Clausewitz's directed telescope towards their computing environment.

Cybersecurity and Class M Planets

I was considering another debate about appropriate cybersecurity measures and I had the following thought: not all networks are the same. Profound, right? This is so obvious, yet so obviously forgotten.

Too often when confronting a proposed defensive measure, an audience approaches the concept from their own preconceived notion of what assets need to be protected.

Some think about an information technology enterprise organization with endpoints, servers, and infrastructure. Others think about an industrial organization with manufacturing equipment. Others imagine an environment with no network at all, where constituents access cloud-hosted resources. Still others think in terms of being that cloud hosting environment itself.

Beyond those elements, we need to consider the number of assets, their geographic diversity, their relative value, and many other aspects that you can no doubt imagine.

This made me wonder if we need some sort of easy reference term to capture the essential nature of these sorts of diverse environments. I thought immediately of the term "class M planet," from Star Trek. From the Wikipedia entry:

[An] Earth-like planet, the Class M designation is similar to the real-world astronomical theory of life-supporting planets within the habitable zone... Class M planets are said to possess an atmosphere composed of nitrogen and oxygen as well as an abundance of liquid water necessary for carbon-based life to exist. Extensive plant and animal life often flourishes; often, a sentient race is also present. 

In contrast, consider a class Y planet:

Class Y planets are referred to as "demon" worlds, where surface conditions do not fall into any other recognized category. Such worlds are usually hostile and lethal to humanoid life. If life forms develop on these worlds they usually take on many bizarre forms, like living crystal or rock, liquid or gaseous physical states, or incorporeal, dimensional, or energy-based states. 

Given their work providing names for various offensive security activities in ATT&CK, I wonder if MITRE might consider creating a naming scheme to capture this idea? For example, a "class M" network might be an enterprise organization with endpoints, servers, and infrastructure, of a certain size. Or perhaps M1 might be "small," M2 "medium," and M3 "large," where each is associated with a user count.

Perhaps an environment with no network at all, where constituents access cloud-hosted resources, would be a class C network. (I'm not sure "network" is even the right term, if there is no "network" for which the organization is responsible.)

With such a scheme in place, we could begin a cybersecurity discussion by asking, "given a class M network, what defensive processes, people, or technology are appropriate," versus "given a class C network, what defensive processes, people, or technology are appropriate."

This is only an idea, and I'd be happy if something was already created to address this problem. Comments below are welcome (pending moderation to repel trolls and spammers.) Alternatively, reply to my announcement of this post via @taosecurity on Twitter.

Have Network, Need Network Security Monitoring

I have been associated with network security monitoring my entire cybersecurity career, so I am obviously biased towards network-centric security strategies and technologies. I also work for a network security monitoring company (Corelight), but I am not writing this post in any corporate capacity.

There is a tendency in many aspects of the security operations community to shy away from network-centric approaches. The rise of encryption and cloud platforms, the argument goes, makes methodologies like NSM less relevant. The natural response seems to be migration towards the endpoint, because it is still possible to deploy agents on general purpose computing devices in order to instrument and interdict on the endpoint itself.

It occurred to me this morning that this tendency ignores the fact that the trend in computing is toward closed computing devices. Mobile platforms, especially those running Apple's iOS, are not friendly to introducing third party code for the purpose of "security." In fact, one could argue that iOS is one of, if not the, most secure platforms, thanks to this architectural decision. (Timely and regular updates, a policed application store, and other choices are undoubtedly part of the security success of iOS, to be sure.)

How is the endpoint-centric security strategy going to work when security teams are no longer able to install third party endpoint agents? The answer is -- it will not. What will security teams be left with?

The answer is probably application logging, i.e., usage and activity reports from the software with which users interact. Most of this will likely be hosted in the cloud. Therefore, security teams responsible for protecting work-from-anywhere, remote-intensive users accessing cloud-hosted assets will have essentially only cloud-provided data to analyze and escalate.

It's possible that the endpoint providers themselves might assume a greater security role. In other words, Apple and other manufacturers provide security information directly to users. This could be like Chase asking if I really made a purchase. This model tends to break down when one is using a potentially compromised asset to ask the user if that asset is compromised.

In any case, this vision of the future ignores the fact that someone will still be providing network services. My contention is that if you are responsible for a network, you are responsible for monitoring it.

It is negligent to provide network services but ignore abuse of that service.

If you disagree and cite the "common carrier" exception, I would agree to a certain extent. However, one cannot easily fall back on that defense in an age where Facebook, Twitter, and other platforms are being told to police their infrastructure or face ever more government regulation.

At the end of the day, using modern Internet services means, by definition, using someone's network. Whoever is providing that network will need to instrument it, if only to avoid the liability associated with misuse. Therefore, anyone operating a network would do well to continue to deploy and operate network security monitoring capabilities.

We may be in a golden age of endpoint visibility, but closure of those platforms will end the endpoint's viability as a source of security logging. So long as there are networks, we will need network security monitoring.

Network Security Monitoring vs Supply Chain Backdoors

On October 4, 2018, Bloomberg published a story titled “The Big Hack: How China Used a Tiny Chip to Infiltrate U.S. Companies,” with a subtitle “The attack by Chinese spies reached almost 30 U.S. companies, including Amazon and Apple, by compromising America’s technology supply chain, according to extensive interviews with government and corporate sources.” From the article:

Since the implants were small, the amount of code they contained was small as well. But they were capable of doing two very important things: telling the device to communicate with one of several anonymous computers elsewhere on the internet that were loaded with more complex code; and preparing the device’s operating system to accept this new code. The illicit chips could do all this because they were connected to the baseboard management controller, a kind of superchip that administrators use to remotely log in to problematic servers, giving them access to the most sensitive code even on machines that have crashed or are turned off.

Companies mentioned in the story deny the details, so this post does not debate the merit of the Bloomberg reporters’ claims. Rather, I prefer to discuss how a computer incident response team (CIRT) and a chief information security officer (CISO) should handle such a possibility. What should be done when hardware-level attacks enabling remote access via the network are possible?

This is not a new question. I have addressed the architecture and practices needed to mitigate this attack model in previous writings. This scenario is a driving force behind my recommendation for network security monitoring (NSM) for any organization running a network, of any kind. This does not mean endpoint-centric security, or other security models, should be abandoned. Rather, my argument shows why NSM offers unique benefits when facing hardware supply chain attacks.

The problem is one of trust and detectability: one loses trust in the integrity of a computing platform when one suspects a compromised hardware environment. One way to validate whether a computing platform is trustworthy is to monitor outside of it, at places where the hardware cannot know it is being monitored, and cannot interfere with that monitoring. Software installed on the hardware is by definition untrustworthy because the hardware backdoor may have the capability to obscure or degrade the visibility and control provided by an endpoint agent.

Network security monitoring applied outside the hardware platform does not suffer this limitation, if certain safeguards are implemented. NSM suffers limitations unique to its deployment, of course, and they will be outlined shortly. By watching traffic to and from a suspected computing platform, CIRTs have a chance to identify suspicious and malicious activity, such as contact with remote command and control (C2) infrastructure. NSM data on this C2 activity can be collected and stored in many forms, such as any of the seven NSM data types: 1) full content; 2) extracted content; 3) session data; 4) transaction data; 5) statistical data; 6) metadata; and 7) alert data.

Session and transaction data would most likely have been the most useful for the case at hand. Once intelligence agencies identified the command and control infrastructure used by the alleged Chinese agents in this example, they could provide that information to the CIRT, which could then query historical NSM data for connectivity between enterprise assets and C2 servers. The results of those queries would help determine if and when an enterprise was victimized by compromised hardware.
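
As a hypothetical illustration of such a query, assume Bro connection logs stored compressed on disk and a simple text file of known C2 addresses; both the paths and the file name below are my own placeholders, not anything from an actual deployment:

# A minimal sketch: search historical Bro conn logs for contact with
# known C2 servers. /nsm/bro/logs is a placeholder log directory and
# c2-servers.txt holds one known C2 IP address per line.
zcat /nsm/bro/logs/*/conn.*.log.gz | grep -F -f c2-servers.txt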

The limitations of this approach are worth noting. First, if the intruders never activated their backdoors, then there would be no evidence of communications with C2 servers. Hardware inspection would be the main way to deal with this problem. Second, the intruders may leverage popular Internet services for their C2. Historical examples include command and control via Twitter, domain fronting via Google or other Web sites, and other covert channels. Depending on the nature of the communication, it would be difficult, though not impossible, to deal with this situation, mainly through careful analysis. Third, traditional network-centric monitoring would be challenging if the intruders employed an out-of-band C2 channel, such as a cellular or radio network. This has been seen in the wild but does not appear to be the case in this incident. Technical countermeasures, whereby rooms are swept for unauthorized signals, would have to be employed. Fourth, it’s possible, albeit unlikely, that NSM sensors tasked with watching for suspicious and malicious activity are themselves hosted on compromised hardware, making their reporting also untrustworthy.

The remedy for the last instance is easier than that for the previous three. Proper architecture and deployment can radically improve the trust one can place in NSM sensors. First, the sensors should not be able to connect to arbitrary systems on the Internet. The most security conscious administrators apply patches and modifications using direct access to trusted local sources, and do not allow access for any reason other than data retrieval and system maintenance. In other words, no one browses Web sites or checks their email from NSM sensors! Second, this moratorium on arbitrary connections should be enforced by firewalls outside the NSM sensors, and any connection attempts that violate the firewall policy should generate a high-priority alert. It is again theoretically possible for an extremely advanced intruder to circumvent these controls, but this approach increases the likelihood of an adversary tripping a wire at some point, revealing his or her presence.
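
To make the second safeguard concrete, here is a minimal sketch of such a policy in iptables syntax, as it might appear on a Linux filtering device upstream of the sensor; the addresses come from documentation ranges and are purely illustrative:

# Hypothetical egress policy for an NSM sensor (192.0.2.10) enforced
# on an upstream Linux firewall. 192.0.2.20 is the trusted local host
# used for data retrieval and maintenance.
iptables -A FORWARD -s 192.0.2.10 -d 192.0.2.20 -j ACCEPT
# Log any other connection attempt from the sensor, feeding the
# high-priority alert described above, then drop it.
iptables -A FORWARD -s 192.0.2.10 -j LOG --log-prefix "NSM-EGRESS-VIOLATION: "
iptables -A FORWARD -s 192.0.2.10 -j DROP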

The bottom line is that NSM must be a part of the detection and response strategy for any organization that runs a network. Collecting and analyzing the core NSM data types, in concert with host-based security, integration with third party intelligence, and infrastructure logging, provides the best chance for CIRTs to detect and respond to the sorts of adversaries who escalate their activities to the level of hardware hacking via the supply chain. Whether or not the Bloomberg story is true, the investment in NSM merits the peace of mind a CISO will enjoy when his or her CIRT is equipped with robust network visibility.

This post first appeared on the Corelight blog.

Firewalls and the Need for Speed

I was looking for resources on campus network design and found these slides (pdf) from a 2011 Network Startup Resource Center presentation. These two caught my attention:



This bothered me, so I Tweeted about it.

This started some discussion, and prompted me to see what NSRC suggests for architecture these days. You can find the latest, from April 2018, here. Here is the bottom line for their suggested architecture:






What do you think of this architecture?

My Tweet has attracted some attention from the high speed network researcher community, some of whom assume I must be a junior security apprentice who equates "firewall" with "security." Long-time blog readers will laugh at that, like I did. So what was my problem with the original recommendation, and what problems do I have (if any) with the 2018 version?

First, let's be clear that I have always differentiated between visibility and control. A firewall is a poor visibility tool, but it is a control tool. It controls inbound or outbound activity according to its ability to perform inline traffic inspection. This inline inspection comes at a cost, which is the major concern of those responding to my Tweet.

Notice how the presentation author thinks about firewalls. In the slides above, from the 2018 version, he says "firewalls don't protect users from getting viruses" because "clicked links while browsing" and "email attachments" are "both encrypted and firewalls won't help." Therefore, "since firewalls don't really protect users from viruses, let's focus on protecting critical server assets," because "some campuses can't develop the political backing to remove firewalls for the majority of the campus."

The author is arguing that firewalls are an inbound control mechanism, and they are ill-suited for the most prevalent threat vectors for users, in his opinion: "viruses," delivered via email attachment, or "clicked links."

Mail administrators can protect users from many malicious attachments. Desktop anti-virus can protect users from many malicious downloads delivered via "clicked links." If that is your worldview, of course firewalls are not important.

His argument for firewalls protecting servers is, implicitly, that servers may offer services that should not be exposed to the Internet. Rather than disabling those services, or limiting access via identity or local address restrictions, he says a firewall can provide that inbound control.

These arguments completely miss the point that firewalls are, in my opinion, more effective as an outbound control mechanism. For example, a firewall helps restrict an adversary's access to victims when those victims reach outbound to establish post-exploitation command and control. This relies on the firewall identifying the attempted C2 as malicious. To the extent intruders encrypt their C2 (and sites fail to inspect it) or use covert mechanisms (e.g., C2 over Twitter), firewalls will be less effective.

The previous argument assumes admins rely on the firewall to identify and block malicious outbound activity. Admins might alternatively identify the activity themselves, and direct the firewall to block outbound activity from designated compromised assets or to designated adversary infrastructure.
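
As a sketch of that second mode of operation, the analyst-directed blocks might look like the following in iptables syntax; both addresses are placeholders from documentation ranges, not real incident data:

# Block all outbound traffic from a designated compromised asset.
iptables -I FORWARD -s 192.0.2.55 -j DROP
# Block all internal hosts from reaching designated adversary
# command and control infrastructure.
iptables -I FORWARD -d 203.0.113.0/24 -j DROP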

As some Twitter responders said, it's possible to do some or all of this without using a stateful firewall. I'm aware of the cool tricks one can play with routing to control traffic. Ken Meyers and I wrote about some of these approaches in 2005 in my book Extrusion Detection. See chapter 5, "Layer 3 Network Access Control."
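
One example from that family, as a minimal sketch: a null route on a Linux router discards traffic to a malicious destination without any stateful device in the path (the address below is a placeholder):

# Null-route a designated adversary address; packets to 203.0.113.7
# are silently discarded by the routing table.
ip route add blackhole 203.0.113.7/32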

Implementing these non-firewall-based security choices requires a high degree of diligence, which in turn requires visibility. I did not see this emphasized in the NSRC presentation. For example:


These are fine goals, but I don't equate "manageability" with visibility or security. I don't think "problems and viruses" captures the magnitude of the threat to research networks.

The core of the reaction to my original Tweet is that I don't appreciate the need for speed in research networks. I understand that. However, I can't understand the requirement for "full bandwidth, un-filtered access to the Internet." That is a recipe for disaster.

On the other hand, if you define partner specific networks, and allow essentially site-to-site connectivity with exquisite network security monitoring methods and operations, then I do not have a problem with eliminating firewalls from the architecture. I do have a problem with unrestricted access to adversary infrastructure.

I understand that security doesn't exist to serve itself. Security exists to enable an organizational mission. Security must be a partner in network architecture design. It would be better to emphasize enhanced monitoring for the networks discussed above, and to think carefully about enabling speed without restrictions. The NSRC resources on the science DMZ merit consideration in this case.

Twenty Years of Network Security Monitoring: From the AFCERT to Corelight

I am really fired up to join Corelight. I’ve had to keep my involvement with the team a secret since officially starting on July 20th. Why was I so excited about this company? Let me step backwards to help explain my present situation, and forecast the future.

Twenty years ago this month I joined the Air Force Computer Emergency Response Team (AFCERT) at then-Kelly Air Force Base, located in hot but lovely San Antonio, Texas. I was a brand new captain who thought he knew about computers and hacking based on experiences from my teenage years and more recent information operations and traditional intelligence work within the Air Intelligence Agency. I was desperate to join any part of the then-five-year-old Air Force Information Warfare Center (AFIWC) because I sensed it was the most exciting unit on “Security Hill.”

I had misjudged my presumed level of “hacking” knowledge, but I was not mistaken about the exciting life of an AFCERT intrusion detector! I quickly learned the tenets of network security monitoring, enabled by the custom software watching and logging network traffic at every Air Force base. I soon heard there were three organizations that intruders knew to be wary of in the late 1990s: the Fort, i.e. the National Security Agency; the Air Force, thanks to our Automated Security Incident Measurement (ASIM) operation; and the University of California, Berkeley, because of a professor named Vern Paxson and his Bro network security monitoring software.

When I wrote my first book in 2003-2004, The Tao of Network Security Monitoring, I enlisted the help of Christopher Jay Manders to write about Bro 0.8. Bro had the reputation of being very powerful but difficult to stand up. In 2007 I decided to try installing Bro myself, thanks to the introduction of the “brolite” scripts shipped with Bro 1.2.1. That made Bro easier to use, but I didn’t do much analysis with it until I attended the 2009 Bro hands-on workshop. There I met Vern, Robin Sommer, Seth Hall, Christian Kreibich, and other Bro users and developers. I was lost for most of the class, saved only by my knowledge of standard Unix command line tools like sed, awk, and grep! I was able to integrate Bro traffic analysis and logs into my TCP/IP Weapons School 2.0 class, and subsequent versions, which I taught mainly to Black Hat students. By the time I wrote my last book, The Practice of Network Security Monitoring, in 2013, I was heavily relying on Bro logs to demonstrate many sorts of network activity, thanks to the high-fidelity nature of Bro data.

In July of this year, Seth Hall emailed to ask if I might be interested in keynoting the upcoming Bro users conference in Washington, D.C., on October 10-12. I was in a bad mood due to being unhappy with the job I had at that time, and I told him I was useless as a keynote speaker. I followed up with another message shortly after, explained my depressed mindset, and asked how he liked working at Corelight. That led to interviews with the Corelight team and a job offer. The opportunity to work with people who really understood the need for network security monitoring, and were writing the world’s most powerful software to generate NSM data, was so appealing! Now that I’m on the team, I can share how I view Corelight’s contribution to the security challenges we face.

For me, Corelight solves the problems I encountered all those years ago when I first looked at Bro. The Corelight embodiment of Bro is ready to go when you deploy it. It’s developed and maintained by the people who write the code. Furthermore, Bro is front and center, not buried behind someone else’s logo. Why buy this amazing capability from another company when you can work with those who actually conceptualize, develop, and publish the code?

It’s also not just Bro, but it’s Bro at ridiculous speeds, ingesting and making sense of complex network traffic. We regularly encounter open source Bro users who spend weeks or months struggling to get their open source deployments to run at the speeds they need, typically in the tens or hundreds of Gbps. Corelight’s offering is optimized at the hardware level to deliver the highest performance, and our team works with customers who want to push Bro to even greater levels.

Finally, working at Corelight gives me the chance to take NSM in many exciting new directions. For years we NSM practitioners have worried about challenges to network-centric approaches, such as encryption, cloud environments, and alert fatigue. At Corelight we are working on answers for all of these, beyond the usual approaches — SSL termination, cloud gateways, and SIEM/SOAR solutions. We will have more to say about this in the future, I’m happy to say!

What challenges do you hope Corelight can solve? Leave a comment or let me know via Twitter to @corelight_inc or @taosecurity.

Defining Counterintelligence

I've written about counterintelligence (CI) before, but I realized today that some of my writing, and the writing of others, may be confused as to exactly what CI means.

The authoritative place to find an American definition for CI is the United States National Counterintelligence and Security Center. I am more familiar with the old name of this organization, the Office of the National Counterintelligence Executive (ONCIX).

The 2016 National Counterintelligence Strategy cites Executive Order 12333 (as amended) for its definition of CI:

Counterintelligence – Information gathered and activities conducted to identify, deceive, exploit, disrupt, or protect against espionage, other intelligence activities, sabotage, or assassinations conducted for or on behalf of foreign powers, organizations, or persons, or their agents, or international terrorist organizations or activities. (emphasis added)

The strict interpretation of this definition is countering foreign nation state intelligence activities, such as those conducted by China's Ministry of State Security (MSS), the Foreign Intelligence Service of the Russian Federation (SVR RF), Iran's Ministry of Intelligence, or the military intelligence services of those countries and others.

In other words, counterintelligence is countering foreign intelligence. The focus is on the party doing the bad things, and less on what the bad thing is.

The definition, however, is loose enough to encompass others; "organizations," "persons," and "international terrorist organizations" are in scope, according to the definition. This is just about everyone, although criminals are explicitly not mentioned.

The definition is also slightly unbounded by moving beyond "espionage, or other intelligence activities," to include "sabotage, or assassinations." In those cases, the assumption is that foreign intelligence agencies and their proxies are the parties likely to be conducting sabotage or assassinations. In the course of their CI work, paying attention to foreign intelligence agents, the CI team may encounter plans for activities beyond collection.

The bottom line for this post is a cautionary message. It's not appropriate to call all intelligence activities "counterintelligence." It's more appropriate to call countering adversary intelligence activities counterintelligence.

You may use similar or the same approaches as counterintelligence agents when performing your cyber threat intelligence function. For example, you may recruit a source inside a carding forum, or you may plant your own source in a carding forum. This is similar to turning a foreign intelligence agent, or inserting your own agent in a foreign intelligence service. However, activities directed against a carding forum are not counterintelligence. Activities directed against a foreign intelligence service are counterintelligence.

The nature and target of your intelligence activities are what determine if it is counterintelligence, not necessarily the methods you use. Again, this is in keeping with the stricter definition, and not becoming a victim of scope creep.


Why Do SOCs Look Like This?

When you hear the word "SOC," or the phrase "security operations center," what image comes to mind? Do you think of analysts sitting at desks, all facing forward, towards giant screens? Why is this?

The following image is from the outstanding movie Apollo 13, a docudrama about the troubled 1970 mission to the moon.


It's a screen capture from the go for launch sequence. It shows mission control in Houston, Texas. If you'd like to see video of the actual center from 1970, check out This Is Mission Control.

Mission control looks remarkably like a SOC, doesn't it? When builders of computer security operations centers imagined what their "mission control" rooms would look like, perhaps they had Houston in mind?

Or perhaps they thought of the 1983 movie War Games?


Reality was far more boring, however:


I visited NORAD under Cheyenne Mountain in 1989, I believe, when visiting the Air Force Academy as a high school senior. I can confirm it did not look like the movie depiction!

Let's return to mission control. Look at the resources available to personnel manning the mission control room. The big screens depict two main forms of data: telemetry and video of the rocket. What about the individual screens, where people sit? They are largely customized. Each station presents data or buttons specific to the role of the person sitting there. Listen to Ed Harris' character calling out the stations: booster, retro, vital, etc. For example:


This is one of the key differences between mission control and any modern computerized operations center. In the 1960s and 1970s, workstations (literally, places where people worked) had to be customized. They lacked the technology to have generic workstations where customization was done via screen, keyboard, and mouse. They also lacked the ability to display video on demand, and relied on large television screens. Personnel with specific functions sat at specific locations, because that was literally the only way they could perform their jobs.

With the advent of modern computing, every workstation is instantly customizable. There is no need to specialize. Anyone can sit anywhere, assuming the computing environment allows one's workspace to follow one's logon. In fact, modern computing allows a user to sit in spaces outside of the office. A modern mission control could be distributed.

With that in mind, what does the current version of mission control look like? Here is a picture of the modern Johnson Space Center's mission control room.



It looks similar to the 1960s-1970s version, except it's dominated by screens, keyboards, and mice.

What strikes me about every image of a "SOC" that I've ever seen is that no one is looking at the big screens. They are almost always deployed for an audience. No one in an operational role looks at them.

There are exceptions. Check out the Arizona Department of Transportation operations center.


Their "big screen" is a composite of 24 smaller screens showing traffic and roadways. No one is looking at the screen, but that sort of display is perfect for the human eye.

It's a variant of Edward Tufte's "small multiple" idea. There is no text. The eye can discern if there is a lot of traffic, or little traffic, or an accident pretty easily. It's likely more for the benefit of an audience, but it works decently well.

Compare those screens to what one is likely to encounter in a cyber SOC. In addition to a "pew pew" map and a "spinning globe of doom," it will likely look like this, from R3 Cybersecurity:


The big screens are a waste of time. No one is standing near them. No one sitting at their workstations can read what the screens show. They are purely for an audience, who can't discern what they show either.

The bottom line for this post is that if you're going to build a "SOC," don't build it based on what you've seen in the movies, or in other industries, or what a consultancy recommends. Spend some time determining your SOC's purpose, and let the workflow drive the physical setting. You may determine you don't even need a "SOC," either physically or logically, based on maturing understandings of a SOC's mission. That's a topic for a future post!

Bejtlich on the APT1 Report: No Hack Back

Before reading the rest of this post, I suggest reading Mandiant/FireEye's statement Doing Our Part -- Without Hacking Back.

I would like to add my own color to this situation.

First, at no time when I worked for Mandiant or FireEye, or afterwards, was there ever a notion that we would hack into adversary systems. During my six-year tenure, we were publicly and privately a "no hack back" company. I never heard anyone talk about hack back operations. No one ever intimated we had imagery of APT1 actors taken with their own laptop cameras. No one even said that would be a good idea.

Second, I would never have testified or written, repeatedly, about our company's stance on not hacking back if I knew we secretly did otherwise. I have quit jobs because I had fundamental disagreements with company policy or practice. I worked for Mandiant from 2011 through the end of 2013, when FireEye acquired Mandiant, and stayed until last year (2017). I never considered quitting Mandiant or FireEye due to a disconnect between public statements and private conduct.

Third, I was personally involved with briefings to the press, in public and in private, concerning the APT1 report. I provided the voiceover for a five-minute YouTube video called APT1: Exposing One of China's Cyber Espionage Units. That video was one of the most sensitive, if not the most sensitive, aspects of releasing the report. We showed the world how we could intercept adversary communications and reconstruct them. There was internal debate about whether we should do that. We decided to cover the practice in the report, as Christopher Glyer Tweeted:


In none of these briefings to the press did we show pictures or video from adversary laptops. We did show the video that we published to YouTube.

Fourth, I privately contacted former Mandiant personnel with whom I worked during the time of the APT1 report creation and distribution. Their reactions to Mr Sanger's allegations ranged from "I've never heard of that" to "completely false." I asked former Mandiant colleagues, like myself, in case current Mandiant or FireEye employees had been told not to talk to outsiders about the case.

What do I think happened here? I agree with the theory that Mr Sanger misinterpreted the reconstructed RDP sessions for some sort of "camera access." I have no idea about the "bros" or "leather jackets" comments!

In the spirit of full disclosure, prior to publication, Mr Sanger tried to reach me to discuss his book via email. I was sick and told him I had to pass. Ellen Nakashima also contacted me; I believe she was doing research for the book. She asked a few questions about the origin of the term APT, which I answered. I do not have the book so I do not know if I am cited, or if my message was included.

The bottom line is that Mandiant and FireEye did not conduct any hack back for the APT1 report.

Update: Some of you wondered about Ellen's role. I confirmed last night that she was working on her own project.

Bejtlich Joining Splunk


Since posting Bejtlich Moves On I've been rebalancing work, family, and personal life. I invested in my martial arts interests, helped more with home duties, and consulted through TaoSecurity.

Today I'm pleased to announce that, effective Monday May 21st 2018, I'm joining the Splunk team. I will be Senior Director for Security and Intelligence Operations, reporting to our CISO, Joel Fulton. I will help build teams to perform detection and monitoring operations, digital forensics and incident response, and threat intelligence. I remain in the northern Virginia area and will align with the Splunk presence in Tyson's Corner.

I'm very excited by this opportunity for four reasons. First, the areas for which I will be responsible are my favorite aspects of security. Long-time blog readers know I'm happiest detecting and responding to intruders! Second, I already know several people at the company, one of whom began this journey by Tweeting about opportunities at Splunk! These colleagues are top notch, and I was similarly impressed by the people I met during my interviews in San Francisco and San Jose.

Third, I respect Splunk as a company. I first used the products over ten years ago, and when I tried them again recently they worked spectacularly, as I expected. Fourth, my new role allows me to be a leader in the areas I know well, like enterprise defense and digital operational art, while building understanding in areas I want to learn, like cloud technologies, DevOps, and security outside enterprise constraints.

I'll have more to say about my role and team soon. Right now I can share that this job focuses on defending the Splunk enterprise and its customers. I do not expect to spend a lot of time in sales cycles. I will likely host visitors in the Tyson's area from time to time. I do not plan to speak as much with the press as I did at Mandiant and FireEye. I'm pleased to return to operational defense, rather than advising on geopolitical strategy.

If this news interests you, please check our open job listings in information technology. As a company we continue to grow, and I'm thrilled to see what happens next!

Trying Splunk Cloud

I first used Splunk over ten years ago, but the first time I blogged about it was in 2008. I described how to install Splunk on Ubuntu 8.04. Today I decided to try the Splunk Cloud.

Splunk Cloud is the company's hosted Splunk offering, residing in Amazon Web Services (AWS). You can register for a 15 day free trial of Splunk Cloud that will index 5 GB per day.

If you would like to follow along, you will need a computer with a Web browser to interact with Splunk Cloud. (There may be ways to interact via API, but I do not cover that here.)

I will collect logs from a virtual machine running Debian 9, inside Oracle VirtualBox.

First I registered for the free Splunk Cloud trial online.

After I had a Splunk Cloud instance running, I consulted the documentation for Forward data to Splunk Cloud from Linux. I am running a "self-serviced" instance and not a "managed instance," i.e., I am the administrator in this situation.

I learned that I needed to install a software package called the Splunk Universal Forwarder on my Linux VM.

I downloaded a 64-bit Linux 2.6+ kernel .deb file to the /home/richard/Downloads directory on the Linux VM.

richard@debian:~$ cd Downloads/

richard@debian:~/Downloads$ ls

splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-amd64.deb

With elevated permissions I created a directory for the .deb, changed into the directory, and installed the .deb using dpkg.

richard@debian:~/Downloads$ sudo bash
[sudo] password for richard: 

root@debian:/home/richard/Downloads# mkdir /opt/splunkforwarder

root@debian:/home/richard/Downloads# mv splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-amd64.deb /opt/splunkforwarder/

root@debian:/home/richard/Downloads# cd /opt/splunkforwarder/

root@debian:/opt/splunkforwarder# ls

splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-amd64.deb

root@debian:/opt/splunkforwarder# dpkg -i splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-amd64.deb 

Selecting previously unselected package splunkforwarder.
(Reading database ... 141030 files and directories currently installed.)
Preparing to unpack splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-amd64.deb ...
Unpacking splunkforwarder (7.1.0) ...
Setting up splunkforwarder (7.1.0) ...
complete

root@debian:/opt/splunkforwarder# ls
bin        license-eula.txt
copyright.txt  openssl
etc        README-splunk.txt
ftr        share
include        splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-amd64.deb
lib        splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-x86_64-manifest

Next I changed into the bin directory, ran the splunk binary, and accepted the EULA.

root@debian:/opt/splunkforwarder# cd bin/

root@debian:/opt/splunkforwarder/bin# ls

btool   copyright.txt   openssl slim   splunkmon
btprobe   genRootCA.sh   pid_check.sh splunk   srm
bzip2   genSignedServerCert.sh  scripts splunkd
classify  genWebCert.sh   setSplunkEnv splunkdj

root@debian:/opt/splunkforwarder/bin# ./splunk start

SPLUNK SOFTWARE LICENSE AGREEMENT

THIS SPLUNK SOFTWARE LICENSE AGREEMENT ("AGREEMENT") GOVERNS THE LICENSING,
INSTALLATION AND USE OF SPLUNK SOFTWARE. BY DOWNLOADING AND/OR INSTALLING SPLUNK
SOFTWARE: (A) YOU ARE INDICATING THAT YOU HAVE READ AND UNDERSTAND THIS

...

Splunk Software License Agreement 04.24.2018

Do you agree with this license? [y/n]: y

Now I had to set an administrator password for this Universal Forwarder instance. I will refer to it as "mypassword" in the examples that follow although Splunk does not echo it to the screen below.

This appears to be your first time running this version of Splunk.

An Admin password must be set before installation proceeds.
Password must contain at least:
   * 8 total printable ASCII character(s).
Please enter a new password: 
Please confirm new password: 

Splunk> Map. Reduce. Recycle.

Checking prerequisites...
Checking mgmt port [8089]: open
Creating: /opt/splunkforwarder/var/lib/splunk
Creating: /opt/splunkforwarder/var/run/splunk
Creating: /opt/splunkforwarder/var/run/splunk/appserver/i18n
Creating: /opt/splunkforwarder/var/run/splunk/appserver/modules/static/css
Creating: /opt/splunkforwarder/var/run/splunk/upload
Creating: /opt/splunkforwarder/var/spool/splunk
Creating: /opt/splunkforwarder/var/spool/dirmoncache
Creating: /opt/splunkforwarder/var/lib/splunk/authDb
Creating: /opt/splunkforwarder/var/lib/splunk/hashDb
New certs have been generated in '/opt/splunkforwarder/etc/auth'.
Checking conf files for problems...
Done
Checking default conf files for edits...
Validating installed files against hashes from '/opt/splunkforwarder/splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-x86_64-manifest'
All installed files intact.
Done
All preliminary checks passed.

Starting splunk server daemon (splunkd)...  
Done

With that done, I had to return to the Splunk Cloud Web site, and click the link to "Download Universal Forwarder Credentials" to download a splunkclouduf.spl file. As noted in the documentation, splunkclouduf.spl is a "credentials file, which contains a custom certificate for your Splunk Cloud deployment. The universal forwarder credentials are different from the credentials that you use to log into Splunk Cloud."

After downloading the splunkclouduf.spl file, I installed it. Note I pass "admin" as the user and "mypassword" as the password here. After installing I restart the universal forwarder.

root@debian:/opt/splunkforwarder/bin# ./splunk install app /home/richard/Downloads/splunkclouduf.spl -auth admin:mypassword

App '/home/richard/Downloads/splunkclouduf.spl' installed 

root@debian:/opt/splunkforwarder/bin# ./splunk restart
Stopping splunkd...
Shutting down.  Please wait, as this may take a few minutes.
.......
Stopping splunk helpers...

Done.

Splunk> Map. Reduce. Recycle.

Checking prerequisites...
Checking mgmt port [8089]: open
Checking conf files for problems...
Done
Checking default conf files for edits...
Validating installed files against hashes from '/opt/splunkforwarder/splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-x86_64-manifest'
All installed files intact.
Done
All preliminary checks passed.

Starting splunk server daemon (splunkd)...  
Done

It's time to take the final steps to get data into Splunk Cloud. I need to point the universal forwarder at the forwarder management function of my Splunk Cloud deployment. Observe the input-prd-p-XXXX.cloud.splunk.com in the command. You obtain this (mine is masked with XXXX) from the URL for your Splunk Cloud deployment, e.g., https://prd-p-XXXX.cloud.splunk.com. Note that you have to add "input-" before the fully qualified domain name used by the Splunk Cloud instance.

root@debian:/opt/splunkforwarder/bin# ./splunk set deploy-poll input-prd-p-XXXX.cloud.splunk.com:8089

Your session is invalid.  Please login.
Splunk username: admin
Password: 
Configuration updated.

Once again I restart the universal forwarder. I'm not sure whether I could have saved all of these restarts for the end.

root@debian:/opt/splunkforwarder/bin# ./splunk restart
Stopping splunkd...
Shutting down.  Please wait, as this may take a few minutes.
.......
Stopping splunk helpers...

Done.

Splunk> Map. Reduce. Recycle.

Checking prerequisites...
Checking mgmt port [8089]: open
Checking conf files for problems...
Done
Checking default conf files for edits...
Validating installed files against hashes from '/opt/splunkforwarder/splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-x86_64-manifest'
All installed files intact.
Done
All preliminary checks passed.

Starting splunk server daemon (splunkd)...  
Done

Finally I need to tell the universal forwarder to watch some logs on this Linux system. I tell it to monitor the /var/log directory and restart one more time.

root@debian:/opt/splunkforwarder/bin# ./splunk add monitor /var/log
Your session is invalid.  Please login.
Splunk username: admin
Password: 
Added monitor of '/var/log'.

root@debian:/opt/splunkforwarder/bin# ./splunk restart

Stopping splunkd...
Shutting down.  Please wait, as this may take a few minutes.
...............
Stopping splunk helpers...

Done.

Splunk> Map. Reduce. Recycle.

Checking prerequisites...
Checking mgmt port [8089]: open
Checking conf files for problems...
Done
Checking default conf files for edits...
Validating installed files against hashes from '/opt/splunkforwarder/splunkforwarder-7.1.0-2e75b3406c5b-linux-2.6-x86_64-manifest'
All installed files intact.
Done
All preliminary checks passed.

Starting splunk server daemon (splunkd)...  
Done
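
Before returning to the Web interface, you can optionally verify the forwarder's inputs and forwarding targets from the CLI. Something like the following should work, though consider it a sketch, as output will vary by deployment:

./splunk list monitor -auth admin:mypassword
./splunk list forward-server -auth admin:mypassword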

At this point I return to the Splunk Cloud Web interface and click the "search" feature. I see Splunk is indexing some data.


I run a search for "host=debian" and find my logs.


Not too bad! Have you tried Splunk Cloud? What do you think? Leave me a comment below.

Update: I installed the Universal Forwarder on FreeBSD 11.1 using the method above (except with a FreeBSD .tgz) and everything seems to be working!

Importing Pcap into Security Onion

Within the last week, Doug Burks of Security Onion (SO) added a new script that revolutionizes the use case for his amazing open source network security monitoring platform.

I have always used SO in a live production mode, meaning I deploy a SO sensor sniffing a live network interface. As the multitude of SO components observe network traffic, they generate, store, and display various forms of NSM data for use by analysts.

The problem with this model is that it could not be used for processing stored network traffic. If one simply replayed the traffic from a .pcap file, the new traffic would be assigned contemporary timestamps by the various tools observing the traffic.
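
For example, replaying a stored trace with a tool like tcpreplay puts the packets on the wire at replay time, so the monitoring tools stamp them with the current time rather than the time of the original capture; the interface and file name below are placeholders:

# Replay a stored trace onto the sensor's sniffing interface. Every
# tool watching eth1 sees the packets arriving "now."
tcpreplay -i eth1 stored.pcap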

While all of the NSM tools in SO have the independent capability to read stored .pcap files, there was no unified way to integrate their output into the SO platform.

Therefore, for years, there has not been a way to import .pcap files into SO -- until last week!

Here is how I tested the new so-import-pcap script. First, I made sure I was running Security Onion Elastic Stack Release Candidate 2 (14.04.5.8 ISO) or later. Next I downloaded the script using wget from https://github.com/Security-Onion-Solutions/securityonion-elastic/blob/master/usr/sbin/so-import-pcap.
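
If you are following along, note that wget needs the raw file rather than the GitHub HTML view; I believe the equivalent raw URL takes this form:

wget https://raw.githubusercontent.com/Security-Onion-Solutions/securityonion-elastic/master/usr/sbin/so-import-pcap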

I continued as follows:

richard@so1:~$ sudo cp so-import-pcap /usr/sbin/

richard@so1:~$ sudo chmod 755 /usr/sbin/so-import-pcap

I tried running the script against two of the sample files packaged with SO, but ran into issues with both.

richard@so1:~$ sudo so-import-pcap /opt/samples/10k.pcap

so-import-pcap

Please wait while...
...creating temp pcap for processing.
mergecap: Error reading /opt/samples/10k.pcap: The file appears to be damaged or corrupt
(pcap: File has 263718464-byte packet, bigger than maximum of 262144)
Error while merging!

I checked the file with capinfos.

richard@so1:~$ capinfos /opt/samples/10k.pcap
capinfos: An error occurred after reading 17046 packets from "/opt/samples/10k.pcap": The file appears to be damaged or corrupt.
(pcap: File has 263718464-byte packet, bigger than maximum of 262144)

Capinfos confirmed the problem. Let's try another!

richard@so1:~$ sudo so-import-pcap /opt/samples/zeus-sample-1.pcap

so-import-pcap

Please wait while...
...creating temp pcap for processing.
mergecap: Error reading /opt/samples/zeus-sample-1.pcap: The file appears to be damaged or corrupt
(pcap: File has 1984391168-byte packet, bigger than maximum of 262144)
Error while merging!

Another bad file. Trying a third!

richard@so1:~$ sudo so-import-pcap /opt/samples/evidence03.pcap

so-import-pcap

Please wait while...
...creating temp pcap for processing.
...setting sguild debug to 2 and restarting sguild.
...configuring syslog-ng to pick up sguild logs.
...disabling syslog output in barnyard.
...configuring logstash to parse sguild logs (this may take a few minutes, but should only need to be done once)...done.
...stopping curator.
...disabling curator.
...stopping ossec_agent.
...disabling ossec_agent.
...stopping Bro sniffing process.
...disabling Bro sniffing process.
...stopping IDS sniffing process.
...disabling IDS sniffing process.
...stopping netsniff-ng.
...disabling netsniff-ng.
...adjusting CapMe to allow pcaps up to 50 years old.
...analyzing traffic with Snort.
...analyzing traffic with Bro.
...writing /nsm/sensor_data/so1-eth1/dailylogs/2009-12-28/snort.log.1261958400

Import complete!

You can use this hyperlink to view data in the time range of your import:
https://localhost/app/kibana#/dashboard/94b52620-342a-11e7-9d52-4f090484f59e?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:'2009-12-28T00:00:00.000Z',mode:absolute,to:'2009-12-29T00:00:00.000Z'))

or you can manually set your Time Range to be:
From: 2009-12-28    To: 2009-12-29


Incidentally here is the capinfos output for this trace.

richard@so1:~$ capinfos /opt/samples/evidence03.pcap
File name:           /opt/samples/evidence03.pcap
File type:           Wireshark/tcpdump/... - pcap
File encapsulation:  Ethernet
Packet size limit:   file hdr: 65535 bytes
Number of packets:   1778
File size:           1537 kB
Data size:           1508 kB
Capture duration:    171 seconds
Start time:          Mon Dec 28 04:08:01 2009
End time:            Mon Dec 28 04:10:52 2009
Data byte rate:      8814 bytes/s
Data bit rate:       70 kbps
Average packet size: 848.57 bytes
Average packet rate: 10 packets/sec
SHA1:                34e5369c8151cf11a48732fed82f690c79d2b253
RIPEMD160:           afb2a911b4b3e38bc2967a9129f0a11639ebe97f
MD5:                 f8a01fbe84ef960d7cbd793e0c52a6c9
Strict time order:   True

That worked! Now to see what I can find in the SO interface.

I accessed the Kibana application and changed the time frame to include the timestamps in the trace.


Here's another screenshot. Again I had to adjust for the proper time range.


Very cool! However, I did not find any IDS alerts. This made me wonder if there was a problem with alert processing. I decided to run the script on a new .pcap:

richard@so1:~$ sudo so-import-pcap /opt/samples/emerging-all.pcap

so-import-pcap

Please wait while...
...creating temp pcap for processing.
...analyzing traffic with Snort.
...analyzing traffic with Bro.
...writing /nsm/sensor_data/so1-eth1/dailylogs/2010-01-27/snort.log.1264550400

Import complete!

You can use this hyperlink to view data in the time range of your import:
https://localhost/app/kibana#/dashboard/94b52620-342a-11e7-9d52-4f090484f59e?_g=(refreshInterval:(display:Off,pause:!f,value:0),time:(from:'2010-01-27T00:00:00.000Z',mode:absolute,to:'2010-01-28T00:00:00.000Z'))

or you can manually set your Time Range to be:
From: 2010-01-27    To: 2010-01-28

When I searched the interface for NIDS alerts (after adjusting the time range), I found results:


The alerts show up in Sguil, too!



This is a wonderful development for the Security Onion community. Being able to import .pcap files and analyze them with the standard SO tools and processes, while preserving timestamps, makes SO a viable network forensics platform.

This thread in the mailing list is covering the new script.

I suggest running it on an evaluation system, probably in a virtual machine. I did all my testing in VirtualBox. Check it out!