I spent a chunk of the day troubleshooting a network security monitoring (NSM) problem. I thought I would share the problem and my investigation in the hopes that it might help others. The specifics are probably less important than the general approach.It began with ja3
. You may know ja3
as a set of Zeek scripts developed by the Salesforce engineering team to profile client and server TLS parameters.
I was reviewing Zeek logs captured by my Corelight appliance and by one of my lab sensors running Security Onion. I had coverage of the same endpoint in both sensors.
I noticed that the SO Zeek logs did not have ja3 hashes in the ssl.log entries. Both sensors did have ja3s hashes. My first thought was that SO was misconfigured somehow to not record ja3 hashes. I quickly dismissed that, because it made no sense. Besides, verifying that intution required me to start troubleshooting near the top of the software stack.
I decided to start at the bottom, or close to the bottom. I had a sinking suspicion that, for some reason, Zeek was only seeing traffic sent from remote systems, and not traffic originating from my network. That would account for the creation of ja3s hashes, for traffic sent by remote systems, but not ja3 hashes, as Zeek was not seeing traffic sent by local clients.
I was running SO in VirtualBox 6.0.4 on Ubuntu 18.04. I started sniffing TCP network traffic on the SO monitoring interface using Tcpdump. As I feared, it didn't look right. I ran a new capture with filters for ICMP and a remote IP address. On another system I tried pinging the remote IP address. Sure enough, I only saw ICMP echo replies, and no ICMP echoes. Oddly, I also saw doubles and triples of some of the ICMP echo replies. That worried me, because unpredictable behavior like that could indicate some sort of software problem
My next step was to "get under" the VM guest and determine if the VM host could see traffic properly. I ran Tcpdump on the Ubuntu 18.04 host on the monitoring interface and repeated my ICMP tests. It saw everything properly. That meant I did not need to bother checking the switch span port that was feeding traffic to the VirtualBox system.
It seemed I had a problem somewhere between the VM host and guest. On the same VM host I was also running an instance of RockNSM. I ran my ICMP tests on the RockNSM VM and, sadly, I got the same one-sided traffic as seen on SO.
Now I was worried. If the problem had only been present in SO, then I could fix SO. If the problem is present in both SO and RockNSM, then the problem had to be with VirtualBox -- and I might not be able to fix it.
I reviewed my configurations in VirtualBox, ensuring that the "Promiscuous Mode" under the Advanced options was set to "Allow All". At this point I worried that there was a bug in VirtualBox. I did some Google searches and reviewed some forum posts, but I did not see anyone reporting issues with sniffing traffic inside VMs. Still, my use case might have been weird enough to not have been reported.
I decided to try a different approach. I wondered if running VirtualBox with elevated privileges might make a difference. I did not want to take ownership of my user VMs, so I decided to install a new VM and run it with elevated privileges.
Let me stop here to note that I am breaking one of the rules of troubleshooting. I'm introducing two new variables, when I should have introduced only one.
I should have built a new VM but run it with the same user privileges with which I was running the existing VMs.
I decided to install a minimal edition of Ubuntu 9, with VirtualBox running via sudo. When I started the VM and sniffed traffic on the monitoring port, lo and behold, my ICMP tests revealed both sides of the traffic as I had hoped. Unfortunately, from this I erroneously concluded that running VirtualBox with elevated privileges was the answer to my problems.
I took ownership of the SO VM in my elevated VirtualBox session, started it, and performed my ICMP tests. Womp womp. Still broken.
I realized I needed to separate the two variables that I had entangled, so I stopped VirtualBox, and changed ownership of the Debian 9 VM to my user account. I then ran VirtualBox with user privileges, started the Debian 9 VM, and ran my ICMP tests. Success again! Apparently elevated privileges had nothing to do with my problem.
By now I was glad I had not posted anything to any user forums describing my problem and asking for help. There was something about the monitoring interface configurations in both SO and RockNSM that resulted in the inability to see both sides of traffic (and avoid weird doubles and triples).
I started my SO VM again and looked at the script that configured the interfaces. I commented out all the entries below the management interface as shown below.$ cat /etc/network/interfaces
# This configuration was created by the Security Onion setup script.## The original network interface configuration file was backed up to:# /etc/network/interfaces.bak.## This file describes the network interfaces available on your system# and how to activate them. For more information, see interfaces(5).
# loopback network interfaceauto loiface lo inet loopback
# Management network interfaceauto enp0s3iface enp0s3 inet static address 192.168.40.76 gateway 192.168.40.1 netmask 255.255.255.0 dns-nameservers 192.168.40.1 dns-domain localdomain
#auto enp0s8#iface enp0s8 inet manual# up ip link set $IFACE promisc on arp off up# down ip link set $IFACE promisc off down# post-up ethtool -G $IFACE rx 4096; for i in rx tx sg tso ufo gso gro lro; do ethtool -K $IFACE $i off; done# post-up echo 1 > /proc/sys/net/ipv6/conf/$IFACE/disable_ipv6
#auto enp0s9#iface enp0s9 inet manual# up ip link set $IFACE promisc on arp off up# down ip link set $IFACE promisc off down# post-up ethtool -G $IFACE rx 4096; for i in rx tx sg tso ufo gso gro lro; do ethtool -K $IFACE $i off; done# post-up echo 1 > /proc/sys/net/ipv6/conf/$IFACE/disable_ipv6
I rebooted the system and brought the enp0s8 interface up manually using this command:
$ sudo ip link set enp0s8 promisc on arp off up
Fingers crossed, I ran my ICMP sniffing tests, and voila, I saw what I needed -- traffic in both directions, without doubles or triples no less.
So, there appears to be some sort of problem with the way SO and RockNSM set parameters for their monitoring interfaces, at least as far as they interact with VirtualBox 6.0.4 on Ubuntu 18.04. You can see in the network script that SO disables a bunch of NIC options. I imagine one or more of them is the culprit, but I didn't have time to work through them individually.
I tried taking a look at the network script in RockNSM, but it runs CentOS, and I'll be darned if I can't figure out where to look. I'm sure it's there somewhere, but I didn't have the time to figure out where.
The moral of the story is that I should have immediately checked after installation that both SO and RockNSM were seeing both sides of the traffic I expected them to see. I had taken that for granted for many previous deployments, but something broke recently and I don't know exactly what. My workaround will hopefully hold for now, but I need to take a closer look at the NIC options because I may have introduced another fault.
A second moral is to be careful of changing two or more variables when troubleshooting. When you do that you might fix a problem, but not know what change fixed the issue.