Open Sourcing JA3

SSL/TLS Client Fingerprinting for Malware Detection

UPDATE: Please read the latest blog post on JA3 here:

A JA3 hash represents the fingerprint of an SSL/TLS client application as detected via a network sensor or device, such as Bro or Suricata. This allows for simple and effective detection of client applications such as Chrome running on OSX (JA3=94c485bca29d5392be53f2b8cf7f4304), the Dyre malware family running on Windows (JA3=b386946a5a44d1ddcc843bc75336dfce), or Metasploit’s Meterpreter running on Linux (JA3=5d65ea3fb1d4aa7d826733d2f2cbbb1d). JA3 allows us to detect these applications, malware families, and pen testing tools regardless of their destination, Command and Control (C2) IPs, or SSL certificates.

JA3 has been open sourced and is available here: https://github.com/salesforce/ja3

JA3 was created by:

John B. Althouse

Jeff Atkinson

Josh Atkins

How it works

TLS and its predecessor, SSL (I will refer to both as “SSL” for simplicity), are used to encrypt communication both by common applications, to keep your data secure, and by malware, so it can hide in the noise. To initiate an SSL session, a client sends an SSL Client Hello packet following the TCP 3-way handshake. This packet, and the way in which it is generated, depends on the packages and methods used when building the client application. The server, if accepting SSL connections, responds with an SSL Server Hello packet, thus continuing the cryptographic negotiation. Because SSL negotiations are transmitted in the clear, it’s possible to fingerprint and identify client applications using the details in the SSL Client Hello packet.

Using details within the SSL Client Hello packet to help identify a client application is not a new concept. Blog posts such as this date back to 2009. In 2015, Lee Brotherston published the most research on the topic and released FingerprinTLS, a stand-alone tool for the job which got a lot of people excited, myself included.

It was then that we set out to create an SSL fingerprinting solution that would work on existing tools, whether a network security monitoring tool or a load balancer. The fingerprints needed to be unique to the client application and agnostic to its destination. We also wanted them to be easy to create and share, meaning they needed to be easily consumed by others and their tools, which could be vastly different from our own. This means that simplicity is key: the simpler the process and output, the more likely it will work with existing technologies.

A common first thought is to just hash the entire packet. But this doesn’t work, because the packet contains a random string as well as certain SSL extensions that can hold destination-specific information, like the “Server_Name” extension, which holds the destination domain. After some trial and tribulation, we settled on a process that ticked all the boxes and is simple enough that it should be easy to add to existing tools.

JA3 gathers the decimal values of the bytes for the following fields: SSL Version, Accepted Ciphers, List of Extensions, Elliptic Curves, and Elliptic Curve Formats. It then concatenates those values together in order, using a “,” to delimit each field and a “-” to delimit each value in each field.

The field order is as follows:

SSLVersion,Ciphers,Extensions,EllipticCurves,EllipticCurvePointFormats

Example:

769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0

If there are no SSL Extensions in the Client Hello, the fields are left empty.

Example:

769,4-5-10-9-100-98-3-6-19-18-99,,,

These strings are then MD5 hashed to produce an easily consumable and shareable 32-character fingerprint. This is the JA3 SSL Client Fingerprint.

769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0 → ada70206e40642a3e4461f35503241d5

769,4-5-10-9-100-98-3-6-19-18-99,,, → de350869b8c85de67a350c8d186f11e6
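To make the steps above concrete, here is a minimal Python sketch of building a JA3 string and hashing it. It is not the code from the JA3 repository: the function names `ja3_string` and `ja3_hash` are made up for illustration, and the sketch assumes the five fields have already been parsed out of the Client Hello as lists of integers.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Join the five Client Hello fields into a JA3 string.

    Values within a field are joined with "-", fields are joined with ",".
    An empty list produces an empty field, as in the no-extensions example.
    """
    fields = [ciphers, extensions, curves, point_formats]
    joined = ["-".join(str(value) for value in field) for field in fields]
    return ",".join([str(version)] + joined)


def ja3_hash(ja3):
    """MD5 the JA3 string and return the 32-character hex digest."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# The two examples from the text, expressed as already-parsed fields.
with_extensions = ja3_string(
    769,
    [47, 53, 5, 10, 49161, 49162, 49171, 49172, 50, 56, 19, 4],
    [0, 10, 11],
    [23, 24, 25],
    [0],
)
no_extensions = ja3_string(769, [4, 5, 10, 9, 100, 98, 3, 6, 19, 18, 99], [], [], [])

for s in (with_extensions, no_extensions):
    print(s, "->", ja3_hash(s))
```

The printed digests can be compared against the two fingerprints quoted above. Note that the delimiter must be a plain ASCII hyphen; a typographic dash would produce a different hash.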

We also needed to introduce some code to account for Google’s GREASE (Generate Random Extensions And Sustain Extensibility) as described here. Google uses this as a mechanism to prevent extensibility failures in the TLS ecosystem. JA3 ignores these values completely to ensure that programs utilizing GREASE can still be identified with a single JA3 hash.
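As a rough sketch of what ignoring GREASE can look like, the snippet below drops the reserved GREASE code points from a list of values before it is used in a fingerprint. The sixteen values follow the 0x0A0A, 0x1A1A, ... 0xFAFA pattern reserved for GREASE; the helper name `strip_grease` is hypothetical and this is not the repository's implementation.

```python
# GREASE reserves 16 values of the form 0x?A?A: 0x0A0A, 0x1A1A, ..., 0xFAFA.
GREASE_VALUES = {0x0A0A + 0x1010 * i for i in range(16)}


def strip_grease(values):
    """Return the values with any GREASE code points removed, order preserved."""
    return [v for v in values if v not in GREASE_VALUES]


# A cipher list with a GREASE value at the front and another at the end.
ciphers = [0x0A0A, 47, 53, 0xFAFA]
print(strip_grease(ciphers))
```

Applying a filter like this to the cipher, extension, and elliptic curve lists means a client that picks a different GREASE value on each connection still yields one stable fingerprint.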

Utilizing JA3

This Bro script will add JA3 to every connection observed in ssl.log:

https://github.com/salesforce/ja3/tree/master/bro

This Python script will output JA3 details from pcaps:

https://github.com/salesforce/ja3/tree/master/python

JA3 support has also been added to Moloch and Trisul NSM as of this writing.

Example Analysis

JA3 is currently running on the internal abuse.ch Sandnet. If you’re not familiar with abuse.ch, I highly recommend you take a look at their malware trackers and blacklists. Within the Sandnet we find a malware sample which has had no internal signatures match and no VirusTotal hits.

Diving into this example we can see the malware made several SSL connections outbound, all with the same JA3 fingerprint.

Pivoting on the JA3 fingerprint we can see that it was found in 112 other malware samples with most of them matching the Dyre malware family. We can now say with confidence this sample is related to Dyre so we add it to the blacklist.

Traditionally one might turn the 228 associated IPs into low-confidence IOCs. But with JA3, we can simply add a single high-confidence IOC to the blacklist. That’s because JA3 allows us to detect malware based on how it communicates rather than what it communicates to.
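The pivot described here can be sketched in a few lines of Python. The records below are invented for illustration (the destination IPs come from documentation address ranges), and the field names `ja3` and `dst_ip` are assumptions rather than the schema of any particular tool.

```python
from collections import defaultdict


def ips_by_ja3(records):
    """Group destination IPs by the JA3 hash of the client that contacted them."""
    grouped = defaultdict(set)
    for record in records:
        grouped[record["ja3"]].add(record["dst_ip"])
    return grouped


# Hypothetical connection records; only the Dyre hash is taken from the text.
DYRE_JA3 = "b386946a5a44d1ddcc843bc75336dfce"
records = [
    {"ja3": DYRE_JA3, "dst_ip": "198.51.100.7"},
    {"ja3": DYRE_JA3, "dst_ip": "203.0.113.20"},
    {"ja3": DYRE_JA3, "dst_ip": "192.0.2.99"},
]

grouped = ips_by_ja3(records)
blacklist = {DYRE_JA3}

# One fingerprint on the blacklist covers every destination the client used.
for ja3, ips in grouped.items():
    if ja3 in blacklist:
        print(f"{ja3}: matched, {len(ips)} destination IPs covered by one IOC")
```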

Tor, as another example, could be detected by monitoring the thousands of exit nodes throughout the internet. But one can also detect the standard Tor client simply with JA3=e7d705a3286e19ea42f587b344ee6865.

Conclusion

JA3 is a much more effective way to detect malicious activity over SSL than IP or domain based IOCs. Since JA3 detects the client application, it doesn’t matter if malware uses DGA (Domain Generation Algorithms), different IPs for each C2 host, or even Twitter for C2. JA3 can detect the malware itself based on how it communicates rather than what it communicates to.

JA3 is also an excellent detection mechanism in locked-down environments where only a few specific applications are allowed to be installed. In these types of environments it is trivial to build a whitelist of expected applications and then alert on any other communication.
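A whitelist check of this kind might look like the following sketch. The whitelist contents and the helper name `unexpected_clients` are assumptions for illustration; a real deployment would populate the whitelist from fingerprints observed in its own environment.

```python
# Hypothetical whitelist: JA3 hash -> the application it is expected to be.
ALLOWED_JA3 = {
    "94c485bca29d5392be53f2b8cf7f4304": "Chrome on OSX",
}


def unexpected_clients(observed_hashes, allowed):
    """Return the observed JA3 hashes that are not on the whitelist, sorted."""
    return sorted(set(observed_hashes) - set(allowed))


# Hashes as they might be pulled from a sensor log: Chrome plus the Tor client.
observed = [
    "94c485bca29d5392be53f2b8cf7f4304",
    "e7d705a3286e19ea42f587b344ee6865",
    "94c485bca29d5392be53f2b8cf7f4304",
]

for ja3 in unexpected_clients(observed, ALLOWED_JA3):
    print(f"ALERT: unexpected SSL client, JA3={ja3}")
```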

Certainly, more analysis needs to be done with JA3, both on what it can detect and on what else it could be used for. We’ve open sourced JA3 and are looking forward to feedback from the community. You can find JA3 here: https://github.com/salesforce/ja3 and can contact me on Twitter @4A4133 or over email. Let me know what you find and if you have any feature requests.