Preamble
While my official title at work is “Security Analyst”, my role is far more general. I am responsible for anything that can, in some way or another, be filed under "Security" or "Network".
To make my life easier, I spend what little free time I have trying to come up with ways to quickly identify “changes” in our environment. This is why I started playing with visuals.
A ‘cheap’ heat map
The example below summarizes a week of Nessus scan results that check for Antivirus compliance. This visual is perfect because it packs in a lot of information that can be interpreted in very little time. I only call it cheap because it is just a simple HTML table. It is trivial to generate, though, and has a small footprint. Elegant :)
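A minimal sketch of how such a table could be generated, in Python. The row labels, day columns and counts are made up for illustration, and the white-to-red scale is an assumption; the real page may colour its cells differently.

```python
from html import escape

def heat_colour(value, max_value):
    """Map 0..max_value onto a white-to-red background colour."""
    ratio = 0.0 if max_value <= 0 else min(max(value / max_value, 0.0), 1.0)
    fade = round(255 * (1.0 - ratio))  # green and blue fade out as the value grows
    return f"#ff{fade:02x}{fade:02x}"

def heat_table(rows, columns):
    """Build an HTML table from {row label: [one count per column]}."""
    peak = max((v for values in rows.values() for v in values), default=0)
    header = "".join(f"<th>{escape(c)}</th>" for c in columns)
    out = ["<table>", f"<tr><th></th>{header}</tr>"]
    for label, values in rows.items():
        cells = "".join(
            f'<td style="background:{heat_colour(v, peak)}">{v}</td>'
            for v in values
        )
        out.append(f"<tr><th>{escape(label)}</th>{cells}</tr>")
    out.append("</table>")
    return "\n".join(out)

# Hypothetical data: non-compliant hosts per group per day.
days = ["Mon", "Tue", "Wed"]
scan_results = {"Labs": [0, 4, 8], "Offices": [1, 0, 2]}
html_page = heat_table(scan_results, days)
```

Because the output is plain HTML, several of these tables can sit side by side on one page without any extra tooling.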
I am currently working on using this same technique for Blackhole DNS data, IDS event data and network session data. I envision a single page with very little text content and a bunch of 4”x4” tables representing each source. This will be a huge time saver.
Link graphs
These are generated using Afterglow and Graphviz. I have made a little program called glow that takes care of the queries and feeds the results to Afterglow. Afterglow creates a dot file which is then sent to Graphviz to generate the graph. The graph is then scaled by glow so that it can be easily viewed.
glow(query results) -> afterglow (dot file) -> graphviz (image) -> glow (scale image)
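The chain above can be sketched in Python roughly as follows. This is not glow itself: the function names are hypothetical, the location of afterglow.pl is assumed, and the `-c` (properties file) and `-t` (two-column input) options are used as I understand them from Afterglow's documented usage. The final scaling step is left out.

```python
import subprocess

def build_commands(properties_file, image_path, layout="neato", two_column=False):
    """Return the Afterglow and Graphviz command lines for one graph."""
    afterglow = ["perl", "afterglow.pl", "-c", properties_file]
    if two_column:
        afterglow.append("-t")  # input has only source,target columns
    graphviz = [layout, "-Tpng", "-o", image_path]
    return afterglow, graphviz

def render(csv_text, properties_file, image_path, layout="neato", two_column=False):
    """Feed CSV query results to Afterglow, then pipe the dot output to Graphviz."""
    afterglow, graphviz = build_commands(properties_file, image_path, layout, two_column)
    dot = subprocess.run(
        afterglow, input=csv_text, capture_output=True, text=True, check=True
    ).stdout
    subprocess.run(graphviz, input=dot, text=True, check=True)
```

Keeping the command construction separate from the execution makes it easy to log exactly what was run for each graph.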
Glow has 2 arrays: one contains base queries, the other contains filters. I have 2 base queries, one for my Sguil database and one for my Spam database. The filters are simply appended to the queries during execution. A filter is just something like this: WHERE signature LIKE '%SSH%'.
When I want to start keeping track of something, say a new signature or an IP that is acting weird, I just add it to the filter array and glow takes care of the rest.
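The two-array idea can be sketched like this in Python. The table and column names, the second filter, and the choice to key filters by data source are all assumptions made for illustration; only the SSH filter comes from the description above.

```python
# Hypothetical base queries; the real table and column names will differ.
BASE_QUERIES = {
    "sguil": "SELECT src_ip, signature, dst_ip FROM event",
    "spam": "SELECT sender, subject, recipient FROM blocked_mail",
}

# Each filter is a raw SQL clause appended to its base query.
FILTERS = {
    "sguil": [
        "WHERE signature LIKE '%SSH%'",
        "WHERE signature LIKE '%P2P%'",  # hypothetical second filter
    ],
    "spam": [],
}

def build_queries(base_queries, filters):
    """Pair every base query with each of its filters."""
    queries = []
    for source, base in base_queries.items():
        for clause in filters.get(source, []):
            queries.append((source, f"{base} {clause}"))
    return queries

for source, sql in build_queries(BASE_QUERIES, FILTERS):
    print(source, "->", sql)
```

Tracking something new is then a one-line change: append another clause to the relevant filter list. Since the clauses are raw SQL, they should only ever come from a trusted, hand-edited list.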
Lastly, there is a simple dynamic web page that keeps track of what has been generated. It isn't perfect, but it works for now:
Link graphs can take a while to get used to. After examining them over time, though, it is surprising how consistently the noise begins to turn into discernible patterns. Once this happens, it becomes very easy to perceive even the smallest changes. These changes prompt questions, which is exactly what I was looking for from the outset.
Spam Event Data Examples
This image shows blocked inbound email (yellow=message, red=sender, black=recipient):
If I break it down:
A single subject, single sender, many recipients. What was it?
A single subject, many senders, many recipients. Botnet?
A single recipient, many senders, many subjects. This user probably clicks on a lot of stuff and loves to enter contests :). This user is also likely at a higher risk. Who are they? What department? I try to look for clusters, especially outbound. Did an account get compromised? Is someone infected?
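The same three shapes can also be tallied straight from the data, which is handy for confirming what the graph seems to show. This is a rough sketch in Python that assumes each blocked message is available as a (sender, subject, recipient) tuple; the sample rows are invented.

```python
from collections import defaultdict

FIELDS = ("sender", "subject", "recipient")

def distinct_counts(events, key):
    """For each value of column `key`, count distinct values in the other columns."""
    k = FIELDS.index(key)
    seen = defaultdict(lambda: defaultdict(set))
    for event in events:
        for i, name in enumerate(FIELDS):
            if i != k:
                seen[event[k]][name].add(event[i])
    return {
        value: {name: len(items) for name, items in others.items()}
        for value, others in seen.items()
    }

# Invented sample data.
events = [
    ("a@example.com", "You won", "user1"),
    ("a@example.com", "You won", "user2"),
    ("b@example.com", "Invoice", "user1"),
]

by_subject = distinct_counts(events, "subject")      # one subject: how many senders, recipients?
by_recipient = distinct_counts(events, "recipient")  # one recipient: how many senders, subjects?
```

A subject with one sender and many recipients matches the first pattern, many senders and many recipients the second, and a recipient with many senders and subjects the third.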
More information about where I get the data from can be found here.
IDS Event Data Examples
The event data from Sguil is processed with a different properties file than the spam data. It uses a function that colourizes the nodes on a sliding scale based on the event count (light=low, dark=high). It also sizes each node based on event count. The sizing isn’t 1:1, just enough to make busy nodes stand out.
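The idea behind that function, sketched in Python rather than in the properties file itself. The yellow-to-red range matches the “All Events” graph; the logarithmic sizing and its constants are assumptions, chosen only to show one way of growing a node without scaling 1:1.

```python
import math

def node_colour(count, max_count):
    """Slide from yellow (#ffff00) for low counts to red (#ff0000) for high counts."""
    ratio = 0.0 if max_count <= 0 else min(max(count / max_count, 0.0), 1.0)
    green = round(255 * (1.0 - ratio))  # removing green turns yellow into red
    return f"#ff{green:02x}00"

def node_size(count, base=0.5, step=0.25):
    """Grow slowly with the event count: each tenfold increase adds `step`."""
    return base + step * math.log10(max(count, 1))
```

With these constants, a host with 1 event gets size 0.5 and a host with 100 events gets size 1.0, so busy hosts stand out without swamping the graph.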
The example below is a snippet from my “All Events” query (yellow=hosts, white (text)=signature). The colour scale runs from yellow to red.
That “mirrored mushroom” pattern normally isn’t present. Not like that anyway, and especially not when the students are out. The extra signatures above the upper portion are ET compromised hosts signatures; the others are SSH signatures. A perfect example of a distributed attack.
What is also interesting is that it looks like only half of the attacking hosts also belong to the "compromised hosts" group. Are my rules out of date?
P2P activity (TCP, not UDP) often has a pattern similar to that above but typically exhibits more hosts with more hits (red nodes). They are also invariably more densely packed.
Malware signatures, especially the toolbars, exhibit a somewhat consistent clustering behavior. The toolbar activity is the mess in the center. The more interesting malware signatures usually begin to appear along the edges.
This example shows inappropriate conduct (white=signature, square=destination, triangle=source). The sizing and colour make this visual easy to digest "at a glance". It gives a good indication of whether someone is expressing a little more interest than they should in a particular website, or perhaps even of an infected machine.
Summary
Afterglow has been a good introduction to link graphs, but it only scratches the surface of what can be done with Graphviz. Graphviz contains numerous features that can be leveraged to produce more complex and telling visuals.
The results are not always favorable. Different query constraints and different output types, i.e. Graphviz layout engines such as sfdp vs. neato, can produce dramatically different results. Further, run-time options can also significantly change what you end up with. This, of course, adds a little more complexity to the initial setup.
Ideally, for large graphs, some kind of logic needs to be applied to the data prior to generation to determine the optimal output type and run-time options. The interim solution is to use very specific filters with the queries. This, unfortunately, gives me a lot more to look at and is certainly not ideal.
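As a starting point, that pre-generation logic could be as simple as counting nodes before picking a layout. The sketch below is in Python; the threshold of 100 nodes and the overlap settings are assumptions that would need tuning against real graphs, not tested recommendations.

```python
def choose_layout(edges, small=100):
    """Pick a Graphviz layout command from the number of distinct nodes."""
    nodes = {node for edge in edges for node in edge}
    if len(nodes) <= small:
        # neato tends to cope well with smaller undirected graphs
        return ["neato", "-Goverlap=false"]
    # sfdp is aimed at large graphs
    return ["sfdp", "-Goverlap=prism"]

# Hypothetical edge lists: (host, signature) pairs from a query.
quiet_day = [("10.0.0.1", "SSH scan"), ("10.0.0.2", "SSH scan")]
busy_day = [(f"host{i}", "SSH scan") for i in range(200)]
```

Here `choose_layout(quiet_day)` selects neato and `choose_layout(busy_day)` selects sfdp. Edge density or the number of distinct signatures could be added as further inputs.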