Last Updated: 2021-06-17 14:40:22 UTC
by Daniel Wesemann (Version: 1)
The tooling to investigate a potentially malicious event on an Azure Cloud VM is still in its infancy. We have covered before (Forensicating Azure VMs) how we can create a snapshot of the OS disk of a running VM. Snapshotting and then killing off the infected VM is very straightforward, but it also tips off an intruder that they have been found out. Sometimes, it makes sense to first watch for a while, and learn more, for example about compromised accounts, lateral movement, or other involved hosts.
And to that end ... we would like to capture the network traffic to and from our affected VM. In an "old-fashioned" on-premises datacenter, we would in such a situation likely deploy an RSPAN port to mirror the traffic to a collector. A similar functionality was once in the making for Azure with "Virtual Network Tap" but this solution has been put on ice by Microsoft, and is not currently available.
Of course we could also log into the VM and start Wireshark or tcpdump ... but this wouldn't be exactly subtle, and if the machine indeed were compromised, doing so would for sure tip off the attacker.
So ... which options are there?
Option 1: Network Watcher Packet Capture
This delivers a full PCAP, and Microsoft amazingly even managed not to "improve" the PCAP format. Which means that the PCAPs collected with Network Watcher can in fact be opened and analyzed in Wireshark. I'm sure this was just an oversight on Microsoft's part, and they are currently busy re-implementing packet capture to produce a JSON or Excel file instead :)
Joking aside, the main downside of this Network Watcher PCAP is that the capture duration is limited to 18000 seconds, aka five hours. So you need to add some sort of logic to restart the capture every couple of hours (or have an operator do it manually), to make sure no data is lost. The second downside is that the PCAP is written to an Azure Blob Storage account, and the caching/timing of those writes is mightily unpredictable. If you expect that you can "watch live" what is going on in the VM, think again. The PCAPs show up with delays of half an hour or more.
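The restart logic mentioned above can be planned ahead of time. What follows is only a minimal sketch in Python, not a finished tool: it computes back-to-back capture windows that each stay within the 18000 second limit and overlap slightly, so that no traffic is missed while one capture stops and the next one starts. The function name `plan_captures` and the 60 second default overlap are assumptions made for illustration. Actually starting each capture (via the Portal, CLI or SDK) is deliberately left out.

```python
import math
from datetime import datetime, timedelta, timezone

# Capture duration limit mentioned above: 18000 seconds (five hours).
MAX_CAPTURE_SECONDS = 18000


def plan_captures(start, end, overlap_seconds=60, max_seconds=MAX_CAPTURE_SECONDS):
    """Return a list of (start_time, duration_seconds) windows covering start..end.

    Each window is at most max_seconds long, and each new window starts
    overlap_seconds before the previous one ends.
    """
    if not 0 <= overlap_seconds < max_seconds:
        raise ValueError("overlap_seconds must be >= 0 and smaller than max_seconds")

    windows = []
    current = start
    covered_until = start
    while covered_until < end:
        remaining = (end - current).total_seconds()
        # Round up so the last window never stops short of 'end'.
        duration = min(max_seconds, math.ceil(remaining))
        windows.append((current, duration))
        covered_until = current + timedelta(seconds=duration)
        # Start the next capture a little before this one ends.
        current = covered_until - timedelta(seconds=overlap_seconds)
    return windows


if __name__ == "__main__":
    begin = datetime(2021, 6, 17, 15, 0, 0, tzinfo=timezone.utc)
    for window_start, seconds in plan_captures(begin, begin + timedelta(hours=12)):
        print(window_start.isoformat(), seconds)
```

A scheduler or an operator could then kick off one capture per window. Given the unpredictable write delays to Blob Storage, a generous overlap seems like the safer choice.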
The main advantage is ... well, it is full PCAPs, and packets rule! Another advantage is that Network Watcher PCAP can be deployed onto a VM interface from within the Azure Portal, without touching the VM itself, and while the VM keeps running. An intruder who has an eye on the processes within the VM would notice though - while the watcher gets deployed via the virtualization layer, it does result in running "Network Watcher" processes that are visible from within the VM.
To enable Network Watcher for your VM, search for "Network Watcher" from the main Azure Portal search bar. Enable the Watcher for the Azure region(s) where you have resources of interest. Then open Network Watcher, and click on "Packet Capture" in the menu on the left.
Option 2: Flow Logs on the Network Security Group (NSG)
This is one of the older techniques. Flow logs don't provide full packet captures, but create a log of source/destination IPs and ports, timestamp, and bytes transferred in each direction. Better than nothing. The main disadvantage is that in this case, Microsoft did reinvent the format, and it is a puke pile of JSON where, to their shame, even Microsoft's own recommendation seems to be to import the logs into Elastic or Splunk, or - yikes - to evaluate them using PowerBI. In a pinch, though, the format can also be parsed with a bit of PowerShell. https://docs.microsoft.com/en-us/azure/network-watcher/network-watcher-nsg-flow-logging-overview explains the structure.
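To give an idea of what such a parser involves, here is a minimal sketch, in Python rather than PowerShell. It assumes the version 2 layout as described in the Microsoft documentation linked above: `records`, then `properties.flows` per rule, then `flows` per MAC address, then `flowTuples` as comma-separated strings. The field names in `FIELDS` are my own labels, not Microsoft's, and the sample record is made up for illustration.

```python
import json
from collections import Counter

# Labels for the comma-separated values in a version 2 flow tuple.
# These names are assumptions chosen for readability.
FIELDS = (
    "timestamp", "src_ip", "dst_ip", "src_port", "dst_port",
    "protocol", "direction", "decision", "state",
    "packets_src_to_dst", "bytes_src_to_dst",
    "packets_dst_to_src", "bytes_dst_to_src",
)
COUNTERS = FIELDS[9:]


def iter_flow_tuples(log):
    """Yield one dict per flow tuple found in a parsed flow log document."""
    for record in log.get("records", []):
        for rule_block in record.get("properties", {}).get("flows", []):
            for flow in rule_block.get("flows", []):
                for line in flow.get("flowTuples", []):
                    entry = dict(zip(FIELDS, line.split(",")))
                    # Counters are empty when a flow begins and absent in
                    # version 1 tuples, so treat both cases as zero.
                    for name in COUNTERS:
                        entry[name] = int(entry.get(name) or 0)
                    entry["rule"] = rule_block.get("rule")
                    entry["mac"] = flow.get("mac")
                    yield entry


def bytes_per_conversation(entries):
    """Sum bytes in both directions per (src_ip, dst_ip, dst_port)."""
    totals = Counter()
    for e in entries:
        key = (e["src_ip"], e["dst_ip"], e["dst_port"])
        totals[key] += e["bytes_src_to_dst"] + e["bytes_dst_to_src"]
    return totals


# Made-up sample in the documented structure, using documentation IP ranges.
SAMPLE = json.loads("""
{"records": [{"time": "2021-06-17T14:40:22Z", "properties": {"Version": 2,
  "flows": [{"rule": "DefaultRule_AllowInternetOutBound",
    "flows": [{"mac": "000D3A000000", "flowTuples": [
      "1623940822,10.0.0.4,203.0.113.7,44931,443,T,O,A,B,,,,",
      "1623940882,10.0.0.4,203.0.113.7,44931,443,T,O,A,E,25,3420,30,41200"
    ]}]}]}}]}
""")

if __name__ == "__main__":
    for conversation, total in bytes_per_conversation(iter_flow_tuples(SAMPLE)).items():
        print(conversation, total)
```

For real use, the same two functions would be fed with `json.load()` on each log file downloaded from the Storage Account. Sorting the totals by byte count is a quick way to spot the chattiest peers.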
To enable Flow Logs, open the NSG attached to the interface or subnet in question. The option "NSG Flow Logs" in the menu on the left, under "Monitoring", is where you enable the forwarding of flow logs to an Azure Storage Account.
In tomorrow's diary, we're going to look at a third option, using Azure Monitor Insights.