DFIR Redefined Part 3 - Deeper Functionality for Investigators with R series continued
In keeping with pending presentations for the Secure Iowa Conference and (ISC)2 Security Congress, I’m continuing the DFIR Redefined: Deeper Functionality for Investigators with R series (see Part 1 and Part 2). Incident responders and investigators, faced with an inundation of data and ever-evolving threat vectors, require skills enhancements and analytics optimization. DFIR Redefined is intended to explore such opportunities to create efficiencies and help the blue team cause. visNetwork represents another fine example of visualizing datasets in a manner that analysts can naturally gravitate towards.
Inspired by the STATWORX writeup on visNetwork, I immediately envisioned using it for malicious IP network activity.
Imagine the following scenario.
You’re the security lead for a midsize financial services firm operating six total sites. The network design is inadequate; while there are six unique sites, the topology is mesh-like. The intentional design serves two purposes, one positive and one deeply problematic. While collaboration and node cooperation are inherent, so too is the ease with which malware can propagate rapidly across the whole topology. You, as the security punching bag, have dealt with a number of malware incidents prior, but now you’re facing a real cluster. Emotet is in the house. Emotet, malware originally designed as a banking Trojan aimed at stealing financial data, has evolved to become a major threat. As of 2018, new versions of the Emotet Trojan include the ability to install other malware on infected machines, including other Trojans and ransomware. More succinctly, Emotet, per US-CERT NCAS alert TA18-201A, includes worm-like features that result in rapidly spreading network-wide infections, which are difficult to combat. This is exactly where you find yourself in your incident response, and you need to rapidly identify impacted nodes, contain, and mitigate.
You have data.
Your asset inventory is current, as it should be, and your network topology, albeit suboptimal, is well documented. You have logs. Via network flow aggregation you have good raw data regarding which nodes are communicating with each other, and to what extent (volume, frequency), referred to as width in the CSVs to be ingested. Raw data is nice, a must have, but here exists a golden opportunity for network visualization…of your network. You have the data required to compile a list of nodes and a list of edges to incorporate directly into a visNetwork visualization that should help you more rapidly identify command and control (C2) nodes, and others that are falling to the outbreak.
Again, thanks to Niklas Junker at STATWORX for the stimulus here. This is a complete and unadulterated reuse of his code and excellent writeup. The complete R script as well as the nodes and edges CSVs are posted on my GitHub for your own use and experimentation. A walkthrough in snippets follows:
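For orientation, here is a minimal sketch of the shape those two inputs could take. visNetwork expects a nodes data frame with an `id` column and an edges data frame with `from` and `to` columns; `label` and `width` are optional columns that drive the node text and edge thickness. The IP addresses and widths below are invented for illustration and are not the contents of the actual CSVs.

```r
# Hypothetical aggregated flow summary: one row per communicating pair.
# All addresses and widths are made up for illustration.
edges <- data.frame(
  from  = c("10.0.1.5", "10.0.1.5", "10.0.2.9", "10.0.3.7"),
  to    = c("10.0.2.9", "10.0.3.7", "10.0.3.7", "10.0.1.8"),
  width = c(8, 2, 5, 1),
  stringsAsFactors = FALSE
)

# Derive the node list from every address that appears in an edge.
ips <- sort(unique(c(edges$from, edges$to)))
nodes <- data.frame(id = ips, label = ips, stringsAsFactors = FALSE)

# Sanity check: every edge endpoint must exist in the node list.
stopifnot(all(c(edges$from, edges$to) %in% nodes$id))

# Write both files to a temporary directory in the assumed CSV layout.
nodes_path <- file.path(tempdir(), "nodes.csv")
edges_path <- file.path(tempdir(), "edges.csv")
write.csv(nodes, nodes_path, row.names = FALSE)
write.csv(edges, edges_path, row.names = FALSE)
```

Deriving the node list from the edge list guarantees that no edge points at a node the visualization does not know about, which is an easy mistake to make when the two files are maintained separately.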
# Remove all the objects from the workspace (clear the chaff), and set the working directory
rm(list = ls())
setwd("c:/coding/R/visNetwork")

# Load the required packages
library(dplyr)
library(visNetwork)
library(geomnet)
library(igraph)

# Data Preparation

# Load nodes data from CSV
nodeData <- read.csv("nodes.csv", header = TRUE)
nodes <- as.data.frame(nodeData)

# Load edges from CSV
edgeData <- read.csv("edges.csv", header = TRUE)
edges <- as.data.frame(edgeData)

# Create graph for Louvain Community Detection (LCD)
# https://arxiv.org/pdf/0803.0476.pdf
graph <- graph_from_data_frame(edges, directed = FALSE)

# Louvain Community Detection (LCD)
cluster <- cluster_louvain(graph)

# Map each node label to its community, then attach it to nodes as "group"
# (assumes nodes.csv has a "label" column matching the vertex names in edges.csv)
cluster_df <- data.frame(label = names(membership(cluster)),
                         group = as.integer(membership(cluster)),
                         stringsAsFactors = FALSE)
nodes$label <- as.character(nodes$label)
nodes <- left_join(nodes, cluster_df, by = "label")

# Visualize data with visNetwork
visNetwork(nodes, edges)
Figure 1: Initial visNetwork result for Emotet-impacted IPv4 network
When you render this for yourself you’ll note that you can drag nodes, which helps when one node is hiding another’s label. While that is dynamic in part, the real action ensues when you customize your network view with additional functions, as we’ll see in the next snippet.
Above all else, consider how the above-mentioned width drives specific behavior in the graph. The more a given node communicates with another, the wider the edge between them is drawn. This leads us to possible conclusions in the example. Referring to Figure 2, a zoomed view into Figure 1, it is reasonable to assume that three nodes in particular may be operating as C2 in the Emotet outbreak: 172.17.12.22, 172.17.12.30, and 192.168.22.46.
Figure 2: Probable C2 nodes
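The same reading of edge width can be done numerically. The sketch below sums the width of every edge touching each node and sorts the result, which is the weighted degree that igraph exposes as `strength()`. It uses base R only, and the edge list is invented for illustration; it is not the data behind the figures.

```r
# Hypothetical edge list; addresses and widths are invented for illustration.
edges <- data.frame(
  from  = c("10.0.1.5", "10.0.1.5", "10.0.2.9", "10.0.3.7", "10.0.1.5"),
  to    = c("10.0.2.9", "10.0.3.7", "10.0.3.7", "10.0.1.8", "10.0.1.8"),
  width = c(8, 2, 5, 1, 6),
  stringsAsFactors = FALSE
)

# Treat edges as undirected: each edge's width counts toward both endpoints.
per_endpoint <- data.frame(
  ip    = c(edges$from, edges$to),
  width = c(edges$width, edges$width),
  stringsAsFactors = FALSE
)

# Sum width per node and sort so the heaviest talkers come first.
totals <- aggregate(width ~ ip, data = per_endpoint, FUN = sum)
totals <- totals[order(-totals$width), ]
rownames(totals) <- NULL

print(head(totals, 3))
```

A high total is a lead rather than proof: a file server or domain controller will also rank near the top, so the ranking is best used to decide which nodes to examine first.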
With additional functionality as mentioned above, you can create even more dynamic views. Code follows:
As noted, use the likes of visNodes, visEdges, visOptions, visLayout or visIgraphLayout to enhance the visualization as seen in Figure 3.
Figure 3: Enhanced visNetwork result for Emotet-impacted IPv4 network
Most importantly, note that visOptions is used to highlight nodes, resulting in the ability to select by group. The logical groupings in this example represent each of the six financial services locations, and the Emotet-impacted nodes on their networks. The resulting Select by group provides highlighted focus on a particular site’s network. If you’re deploying incident responders in person, or implementing remote mitigation, such views create efficiencies and improve time-to-mitigate (TTM). A focus on group 4 (site 4) highlights two of the above-mentioned C2 nodes.
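A minimal sketch of how such a view could be assembled follows. It assumes a nodes data frame that already carries a `group` column and uses a small invented data set. The particular styling choices (shape, colors, seed) are illustrative assumptions, not necessarily the options used to produce Figure 3.

```r
library(visNetwork)

# Hypothetical nodes and edges; "group" stands in for the community (site)
# assigned by the clustering step. All values are invented.
nodes <- data.frame(
  id    = c("10.0.1.5", "10.0.1.8", "10.0.2.9", "10.0.3.7"),
  label = c("10.0.1.5", "10.0.1.8", "10.0.2.9", "10.0.3.7"),
  group = c(1, 1, 2, 2),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  from  = c("10.0.1.5", "10.0.1.5", "10.0.2.9"),
  to    = c("10.0.1.8", "10.0.2.9", "10.0.3.7"),
  width = c(6, 8, 5),
  stringsAsFactors = FALSE
)

net <- visNetwork(nodes, edges, width = "100%") %>%
  visNodes(shape = "dot", shadow = TRUE) %>%
  visEdges(color = list(color = "gray", highlight = "red")) %>%
  # highlightNearest dims everything except the clicked node's neighbors;
  # selectedBy adds the "Select by group" dropdown.
  visOptions(highlightNearest = TRUE, selectedBy = "group") %>%
  # A fixed seed keeps the layout reproducible between renders.
  visLayout(randomSeed = 11)

# Render in the RStudio viewer or a browser when run interactively.
if (interactive()) print(net)
```

Because `selectedBy` accepts any column name in the nodes data frame, the same dropdown could be pointed at a column other than `group`, for example a hypothetical `site` or `status` column, if the asset inventory provides one.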
To apply this practice, you’d need to devise nuanced flow reporting on node-to-node communications, inclusive of count over a given period. You could tailor by specific protocols and traffic types depending on the question you’re trying to answer in the data. More related experiments to come in Part 4 of the DFIR Redefined series.
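As a starting point for that kind of reporting, the sketch below filters raw flow records to one protocol and one time window, then counts flows per source/destination pair so the count can serve as the edge width. The column names (`time`, `src`, `dst`, `proto`) and every record are hypothetical; real flow exports will differ.

```r
# Hypothetical raw flow records; every value here is invented for illustration.
flows <- data.frame(
  time  = as.POSIXct(c("2018-09-01 09:00:00", "2018-09-01 09:05:00",
                       "2018-09-01 09:10:00", "2018-09-01 12:30:00",
                       "2018-09-01 09:20:00"), tz = "UTC"),
  src   = c("10.0.1.5", "10.0.1.5", "10.0.2.9", "10.0.1.5", "10.0.1.5"),
  dst   = c("10.0.2.9", "10.0.2.9", "10.0.3.7", "10.0.2.9", "10.0.3.7"),
  proto = c("tcp/445", "tcp/445", "tcp/445", "tcp/445", "udp/53"),
  stringsAsFactors = FALSE
)

# Keep only the protocol and time window relevant to the question at hand.
window_start <- as.POSIXct("2018-09-01 09:00:00", tz = "UTC")
window_end   <- as.POSIXct("2018-09-01 10:00:00", tz = "UTC")
in_scope <- flows[flows$proto == "tcp/445" &
                  flows$time >= window_start &
                  flows$time <  window_end, ]

# Count flows per source/destination pair; the count becomes the edge width.
in_scope$width <- 1
edges <- aggregate(width ~ src + dst, data = in_scope, FUN = sum)
names(edges)[names(edges) == "src"] <- "from"
names(edges)[names(edges) == "dst"] <- "to"

print(edges)
```

Changing the protocol filter or the window boundaries produces a different edge list from the same raw data, so one flow export can answer several questions.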