NetWitness Hunting Guide

NetWitness Hunting Guide - Continued

Begin Analysis

The first query after this is meant to segment the dataset into smaller buckets for us to examine in more detail. These are our technical aspects and behavior queries; we are not yet looking for indicators of compromise. Rather, we are selecting traffic based on behaviors (such as Method usage) or technical aspects (such as short User-Agent strings or watchlist file extensions).

For example, when we are looking for covert bi-directional communications over well-known protocols, such as HTTP, we can go back to the HTTP Method analysis. HTTP POSTs are roughly 10% of HTTP GETs in most environments. Knowing this, we can effectively query for http post no get and remove roughly 90% of our traffic as currently uninteresting.

The Service Characteristics meta key is where we start with our behavioral and technical aspects queries such as the previous query for http post no get. Depending on the throughput and specific aspects of your dataset, you might have to dig a little deeper before finding something out of place or suspicious. The Service Characteristics meta keys enable an analyst to start carving out what appears to be human generated behavior and focus on the automated behaviors that malware exhibit. Previously, it was discussed that most modern browsers use more than 6 HTTP headers. Thus, drilling into HTTP sessions with six or fewer headers would eliminate these sessions, and leave us with more interesting ones.

From here, we go through the aspects of HTTP previously discussed and look for sessions that cannot be explained based on what we know to be normal behavior. If we see an HTTP POST without a referrer, how did it get there? If we come across an odd User-Agent, what is the application purporting to be and does it match that in behavior? What if we find an odd domain name that is similar to a trusted vendor but the domain is not registered by that vendor? Oftentimes with network-only visibility, it is difficult to answer these questions until you find the source of the traffic.

Deeper Analysis

When you discover something that seems potentially out of place, the real investigation begins. How long has the activity been occurring and on how many hosts? With a DHCP environment with short lease times, this can be difficult to discern and tedious if DHCP logs are available. For example, if an analyst were to find suspicious HTTP communications with foo.com, the analyst should take note of the destination IP and open a new Investigator window. Then they take the destination IP and query for ip.dst = 123.123.123.123 || ip.src=123.123.123.123 || alias.ip = 123.123.123.123. This gives you both sides of the possible initiators of communications; there might have been inbound webshell access from the C2 IP as well as a Trojan beaconing to it.

Note: The analyst could also add a Context Action for this drill-down activity to be performed more easily.

Example

In the following example, we examine a Rekaf beacon and look at some of the metadata it produces. The session as seen in the text view in NetWitness is as follows.

Figure 15. Rekaf Beacon

The following Session Characteristics were observed.

Figure 16. Rekaf Session Characteristics

This does not tell you much, other than the session originated from inside the environment and connected out to the internet and transmitted far more data than it received as payload, but still under 5 KB for both TX and RX. The Service Characteristics for this HTTP Trojan are more interesting and telling.

Figure 17. Rekaf Service Characteristics

This is more revealing. We can see there are POST Methods in this request but no GET Method and the session does not contain a Referrer header. So how does a person with a browser generate a POST without coming from a previous site? We can also see that it has a short User-Agent that appears to be the default User-Agent stored in the Windows Registry as well as having 6 HTTP Headers. Even more telling are the two cache directives and the absence of a cookie, so it is definitely not a browser or person generating this request, but rather some sort of machinery. The most glaring and obvious indicator is the binary payload in the body of the request. Not all of the characters are in the range for ASCII, so NetWitness renders unprintable characters as the ‘.’ character.

ESA, Event Stream Analysis, contains very similar logic to what an analyst would apply to hunting in HTTP. Consider the following figure for our Rekaf Trojan.

Figure 18. ESA C2 Detection

The ESA appliance queries the dataset regularly for outbound HTTP traffic and applies a machine learning algorithm that generates and weights certain scores. All HTTP communications are evaluated in this manner and have scores, with the most severe generating an incident in the incident manager queue. Not all the data for enrichment comes from the local NetWitness service; WHOIS information is gathered from Live and enriches the score with domain aging and expiring information. This type of system augments the analyst and performs analysis over multiple sessions and longer time frames than humanly possible. When we search NetWitness, we are looking for single sessions or some sort of beaconing pattern to stand out. ESA keeps connection statistics for every HTTP transaction in memory and looks for patterns that we not able to discern. Below is a list of the scored categories and descriptions.

ESA Evaluation Category:
Beacon Behavior
Description:
A high score indicates that the communications between this source IP and this domain are highly regular and therefore suspected Command and Control.

ESA Evaluation Category:
Domain Age
Description:
A high score indicates that this domain is relatively new based on the registration date found at the registrar.

ESA Evaluation Category:
Expiring Domain
Description:
A high score means that the likelihood the domain will expire soon is high.

ESA Evaluation Category:
Rare Domain
Description:
A high score indicates that relatively few source IPs have connected to this domain on this network in the last week.

ESA Evaluation Category:
No Referers
Description:
A high score indicates that a relatively low percentage of the IPs connecting to this domain have used referrers.

ESA Evaluation Category:
Rare User-Agent
Description:
A high score indicates that the domain has a high percentage of IPs using a rare user agent.

Here is a more detailed view of the scoring calculations used to generate the final score.

Figure 19. ESA C2 Detection Scoring Detail

SSL and TLS

The SSL and TLS parser is a bit of a misnomer; it should actually be called the X.509 certificate parser. After the ‘Change Cipher Spec’ message by the server, all the data is encrypted and can be considered random data (recent vulnerabilities aside). The only logic that applies to SSL/TLS are:

The bad_ssl rule that is simply looking for an alias.host entry of ‘localhost’ with the SSL Service
The hostname consecutive consonants logic looking for DGA [Domain Generation Algorithm].

Since the payload is nearly all cipher text, the Session Characteristics meta category along with SSL CA and SSL Subject are the only places to go digging into SSL/TLS. If a known SSL/TLS Trojan is being sought and the encryption and HMAC algorithm is known, an analyst can look at the Crypto Key meta category for a match.

There exists metadata for self-signed certificates; it is defined as the CA being the same as the SSL Subject. This happens legitimately in some cases (for example, Google, since they have been a CA for quite some time). These sessions, using the traffic flow metadata discussed in Hunting Pack can be examined for beaconing type behavior as depicted below.

Figure 20. Example SSL Beaconing

By opening the visualization for a specific hostname or destination IP address, we can begin to notice patterns. Most of our sessions happen every 5 minutes, as seen by the spikes. There are fewer sessions just before and after, possibly due to system load or clock drift, but the sessions definitely seem to have a mechanical cadence. SSL/TLS is used on Trojans infrequently. This is most likely because many proxies have options to check for the validity of a certificate before allowing the connection to be made. Whether or not that option is enabled is up to the organization in question. This point is moot if the Trojan uses a legitimate website with a legitimate certificate that has been compromised.

Service OTHER

The Service type OTHER oftentimes confuses the analyst. “Why didn’t protocol X get registered as service X” is probably the most commonly asked question. NetWitness is shipped with a default set of parsers that identify services and parse out a majority of the details about that protocol. The Live content management system contains hundreds of additional parsers to identify protocols and look for additional details about individual sessions. There is a tradeoff between CPU performance and the amount of logic that a parser can execute on a given session. We must also consider the perspective of the forensic analyst looking through the dataset.

Do we really need to parse Kerberos or the QQ messenger protocol as verbosely as HTTP? Is the protocol in question well defined and has a handshake or other marker that can determine the service type? Often for these OTHER sessions, there is a mix of forensically uninteresting sessions (such as Kerberos), or ill-defined and ever changing protocols such as QQ.

,>Parsing and identifying these protocols takes CPU time on the decoder, while using resources that are generally better suited to forensically interesting sessions, instead of being used to arbitrarily identify random protocols. Much of the OTHER traffic is also a mix of SYN timeouts, resumed sessions after the 32MB or 60 second limit has been reached (although with version 10.5 and newer this has been greatly reduced) or Peer-to-Peer traffic that is encrypted and has no magic or protocol identifier.>Session Lua Parser,>In section Identifying Traffic Flows, the metadata generated from the traffic_flow.lua parser can be of help. The analyst can use the first carve rule to first filter out the single sided sessions, zero payload and internal traffic from the dataset. The following Session Characteristics can be used to further classify traffic with higher granularity.,>MetadataDescription, , , , , , , payload > 0 and the TCP_SYN flag was seen, ,>The session_analysis.lua parser is extremely important to this search. Because NetWitness declares sessions to be over prematurely to allow for realistic memory size and transfer rates to be achieved by a single machine, the TCP flags become very relevant. For example, if I were to open a browser and request an ISO from ‘foo.com/bin.iso’, the HTTP request would look like the figure below.>, , , , application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8, like G, , , deflate, sdch, en;q=0.8, ,>Figure 21. Simple HTTP GET Request for bin.iso,>If this ISO was a regular ~4.7 GB DVD image, the first session would be service type 80, or HTTP and be 32MB in size. The next sessions would all be the same service type and be 32MB in size and have the same TCP source port as the session that initiated the request. Starting with version 10.5, however, these split sessions are tracked in a state table in the decoder and are logically linked together with the session.split meta key. We can see this logic in action in the long SSL connection below.>

,>Figure 22. State Tracking with Session Splitting,>The decoder determines request and response based on which IP/Port combination began the session by being captured in the assembler. If my local IP address is 192.168.1.10 and the remote server is 192.168.5.5, a query such as the one shown in State Tracking with Session Splitting would select all the continuation traffic, as well as the beginning of the connection.,>, , ,>Figure 23. Example IP Query,>However, by specifying the TCP SYN flag as additional metadata, we can ensure we get the beginning of the TCP session. This also eliminated the OTHER continuation traffic that otherwise might overwhelm the analyst.,>, , ,>Figure 24. Example IP Query with SYN TCP Flag Meta,>Some other possibly interesting cases of OTHER type traffic:,>

OTHER with certain file fingerprints (exe, archives, and so forth), although in many cases, these are simply backups.

OTHER on common protocol ports such as 80>Currently, there are two additional pieces of metadata that deserve explanation: binary indicator and binary handshake. These two are registered in Session Characteristics and look for specific characteristics of TCP connections. The first, binary indicator, is looking at the first 8 bytes of payload of a TCP session and compares each byte to the frame length. The logic then reads the first 16 bytes of a TCP session and compares each word [2 bytes] in Big Endian and Little Endian to the frame length. If any of these checks are true, the meta is added. This is looking for common Trojan characteristics that use custom protocols to control data flow. The Trojan known as Gh0st uses this method because the payload after the frame header is compressed with the zlib library and it needs to know the compressed sized of the payload as well as the uncompressed size of the payload.> ,>The binary handshake meta fires when a TCP session has 2 streams and each stream has a payload greater than 256 bytes. The bytes are looped through and checked to see if they are within the ASCII range or not. If the number of non-ASCII characters in the 512 bytes analyzed is greater than 310, the meta will be registered. This is another artifact of custom protocols and these two meta values combined with the first carve and service = 0 give the analyst more data points to analyze other sessions in NetWitness.,>Layer Four Indicators,>Some Trojan C2 protocols leave indicators at layer 4 or below. NetWitness parsers operate only on payload data; anything in layer 4 or below has to be understood by the Decoder process itself and is not as easy to modify as the parsers themselves. The Lua library nw.packet provides a method to interrogate the individual packets and header information. The well-known Trojan Poison Ivy uses Camellia ECB to encrypt the session between the server and the client. The handshake is an exchange of 0x100 or 256 bytes between the server and the client. The data in this handshake is based on the configured password for the Trojan. Attempts to create signatures based on common passwords can be simply circumvented by using a password that is not on the list. This has allowed the Poison Ivy Trojan to remain a staple of targeted intrusions despite its age and known properties.,>

,>Figure 25. Poison Ivy Handshake,>Live has the poison_ivy.lua parser looking for the 256 byte authentication exchange utilized by Poison Ivy. A full 256-byte exchange will alert into the Indicators of Compromise meta category, and a 256-byte beacon will alert into the same. When combined with the OTHER service, this can be a powerful indicator when searching for this type of Trojan and communications protocol.,>Additional Protocol Anomalies,>A lot of attention is given to the HTTP, SSL and OTHER services because that is generally where the sessions associated with Trojan C2 communicate with. Malicious activity is not always associated with a Trojan—it can be caused by misconfigurations or other types of behavior. The sub-sections below outline hunting for malicious activity in other services and protocols for FTP, RDP, SSH, DNS> ,>BACK NEXT,

NetWitness Hunting Guide - Page 4