Before companies like Microsoft and Apple release new software, the code is reviewed and tested to ensure it works as planned and to find any bugs.

Hackers and cybercrooks do the same. The last thing you want if you're a cyberthug is for your banking Trojan to crash a victim's system and be exposed. More importantly, you don't want your victim's antivirus engine to detect the malicious tool.

So how do you maintain your stealth? You submit your code to Google's VirusTotal site and let it do the testing for you.

It's long been suspected that hackers and nation-state spies are using Google's antivirus site to test their tools before unleashing them on victims. Now Brandon Dixon, an independent security researcher, has caught them in the act, tracking several high-profile hacking groups—including, surprisingly, two well-known nation-state teams—as they used VirusTotal to hone their code and develop their tradecraft.

"There's certainly irony" in their use of the site, Dixon says. "I wouldn't have expected a nation state to use a public system to do their testing."

VirusTotal is a free online service—launched in 2004 by Hispasec Sistemas in Spain and acquired by Google in 2012—that aggregates more than three dozen antivirus scanners made by Symantec, Kaspersky Lab, F-Secure and others. Researchers, and anyone else who finds a suspicious file on their system, can upload the file to the site to see if any of the scanners tag it malicious. But the site, meant to protect us from hackers, also inadvertently provides hackers the opportunity to tweak and test their code until it bypasses the site's suite of antivirus tools.

Dixon has been tracking submissions to the site for years and, using data associated with each uploaded file, has identified several distinct hackers or hacker teams as they've used VirusTotal to refine their code. He's even been able to identify some of their intended targets.

He can do this because every uploaded file leaves a trail of metadata available to subscribers of VirusTotal's professional-grade service. The data includes the file's name and a timestamp of when it was uploaded, as well as a hash derived from the uploader's IP address and the country from which the file was submitted based on the IP address. Though Google masks the IP address to make it difficult to derive from the hash, the hash still is helpful in identifying multiple submissions from the same address. And, strangely, some of the groups Dixon monitored used the same addresses repeatedly to submit their malicious code.
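The grouping step described above can be sketched in a few lines of Python. This is a minimal illustration, not Dixon's code: the record layout and the field names (`file_name`, `uploaded`, `submitter_hash`, `country`) are assumptions made for the example and are not VirusTotal's actual schema.

```python
from collections import defaultdict

# Hypothetical submission records; every field name here is an assumption.
submissions = [
    {"file_name": "update.exe", "uploaded": "2012-08-20T09:00:00",
     "submitter_hash": "a1f3", "country": "CN"},
    {"file_name": "update2.exe", "uploaded": "2012-08-20T11:30:00",
     "submitter_hash": "a1f3", "country": "CN"},
    {"file_name": "invoice.doc", "uploaded": "2012-08-21T08:15:00",
     "submitter_hash": "77c0", "country": "US"},
]

def group_by_submitter(records):
    """Bucket records by the hashed submitter address, oldest upload first."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["submitter_hash"]].append(rec)
    for recs in groups.values():
        # ISO 8601 timestamps in one format sort correctly as plain strings.
        recs.sort(key=lambda r: r["uploaded"])
    return dict(groups)

def repeat_submitters(records, min_files=2):
    """Keep only submitter hashes seen on at least `min_files` uploads."""
    return {h: recs for h, recs in group_by_submitter(records).items()
            if len(recs) >= min_files}

for submitter, recs in repeat_submitters(submissions).items():
    print(submitter, [r["file_name"] for r in recs])
```

The hash never has to be reversed into an IP address for this to work. It only has to be stable across uploads from the same address.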

Using an algorithm he created to parse the metadata, Dixon spotted patterns and clusters of files submitted by two well-known cyberespionage teams believed to be based in China, and a group that appears to be in Iran. Over weeks and months, Dixon watched as the attackers tweaked and developed their code and the number of scanners detecting it dropped. He could even in some cases predict when they might launch their attack and identify when some of the victims were hit—code that he saw submitted by some of the attackers for testing later showed up at VirusTotal again when a victim spotted it on a machine and submitted it for detection.
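To make the idea of a falling detection count concrete, here is a small Python sketch. It is not the algorithm Dixon wrote, which the article does not describe. The `detections` field, the sample data and the threshold of three engines are all assumptions chosen for illustration.

```python
def detection_trend(cluster):
    """Return (timestamp, detections) pairs in upload order."""
    ordered = sorted(cluster, key=lambda r: r["uploaded"])
    return [(r["uploaded"], r["detections"]) for r in ordered]

def is_evading(cluster, threshold=3):
    """Flag a cluster whose latest sample is detected by fewer engines
    than its first sample, and by no more than `threshold` engines."""
    trend = detection_trend(cluster)
    if len(trend) < 2:
        return False
    first, last = trend[0][1], trend[-1][1]
    return last < first and last <= threshold

# Hypothetical cluster: one submitter's uploads over a few weeks.
cluster = [
    {"uploaded": "2012-08-01T10:00:00", "detections": 20},
    {"uploaded": "2012-08-15T10:00:00", "detections": 9},
    {"uploaded": "2012-09-02T10:00:00", "detections": 2},
]
print(detection_trend(cluster))
print(is_evading(cluster))
```

A real analysis would need more than this single signal, since ordinary researchers also upload many variants of one malware family.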

Tracking the Infamous Comment Crew

One of the most prolific groups he tracked is part of the infamous Comment Crew, also known to security researchers as APT1. Believed to be a state-sponsored group tied to China's military, Comment Crew reportedly is responsible for stealing terabytes of data from Coca-Cola, RSA and more than 100 other companies and government agencies since 2006. More recently, the group has focused on critical infrastructure in the U.S., targeting companies like Telvent, which makes control system software used in parts of the U.S. electrical power grid, oil and gas pipelines and in water systems. The group Dixon tracked isn't the main Comment Crew outfit but a subgroup of it.

He also spotted and tracked a group known by security researchers as NetTraveler. Believed to be in China, NetTraveler has been hacking government, diplomatic and military victims for a decade, in addition to targeting the office of the Dalai Lama and supporters of Uyghur and Tibetan causes.

The groups Dixon observed, apparently ignorant of the fact that others could watch them, did little to conceal their activity. However, at one point the Comment Crew did begin using unique IP addresses for each submission, suggesting they suddenly got wise to the possibility that they were being watched.

Dixon got the idea to mine VirusTotal's metadata after hearing security researchers repeatedly express suspicions that hackers were using the site as a testing tool. Until now he's been reluctant to publicly discuss his work on the metadata, knowing it would prompt attackers to change their tactics and make it harder to profile them. But he says there is now enough historical data in the VirusTotal archive that other researchers can mine it to identify groups and activity he may have missed. This week he's releasing code he developed for analyzing the metadata so others can do their own research.

Dixon says it wasn't initially easy to spot groups of attackers in the data. "Finding them turned out to be a very difficult problem to solve," he says. "When I first looked at this data, I didn't know what I should be looking for. I didn't know what made an attacker until I found an attacker."

Brandon Dixon http://blog.9bplus.com/

Surreptitiously Watching Hackers Hone Their Attacks

The data provides a rare and fascinating look at the inner workings of the hacker teams and the learning curve they followed as they perfected their attacks. During the three months he observed the Comment Crew gang, for example, they altered every line of code in their malware's installation routine and added and deleted different functions. But in making some of the changes to the code, the hackers screwed up and disabled their Trojan at one point. They also introduced bugs and sabotaged other parts of their attack. All the while, Dixon watched as they experimented to get it right.

Between August and October 2012, when Dixon watched them, he mapped the Crew's operations as they modified various strings in their malicious files, renamed the files, moved components around, and removed the URLs for the command-and-control servers used to communicate with their attack code on infected machines. They also tested out a couple of packer tools—used to reduce the size of malware and encase it in a wrapper to make it harder for virus scanners to see and identify malicious code.

Some of their tactics worked; others did not. When they did work, the attackers often were able to reduce to just two or three the number of engines detecting their code. It generally took just minor tweaks to make their attack code invisible to scanners, underscoring how hard it can be for antivirus engines to keep pace with an attacker's shapeshifting code.

There was no definitive pattern to the kinds of changes that reduced the detection rate. Although every sample Dixon tracked was detected by at least one antivirus engine, those with low detection rates were often caught only by the more obscure engines that are not in popular use.

Though the Crew sometimes went to great lengths to alter parts of their attack, they curiously never changed other telltale strings—ones pertaining to the Trojan's communication with command servers, for example, remained untouched, allowing Dixon to help develop signatures to spot and halt the malicious activity on infected machines. The Crew also never changed an encryption key they used for a particular attack—derived from an MD5 hash of the string Hello@)!0. And most of the time, the Crew used just three IP addresses to make all of their submissions to VirusTotal before suddenly getting wise and switching to unique IP addresses. Given the number of mistakes the group made, he suspects those behind the code were inexperienced and unsupervised.
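For readers curious what a key built from an MD5 hash looks like, the Python sketch below hashes the string quoted above. The article says only that the key was "derived from" the hash, so using the raw 16-byte digest as the key is an assumption, as is treating the period after the string as sentence punctuation and not part of the string. The function name `md5_key` is hypothetical.

```python
import hashlib

def md5_key(passphrase: str) -> bytes:
    """Return the 16-byte MD5 digest of an ASCII passphrase.

    Assumption: the digest is used directly as key material. The
    malware's real derivation is not described in the article.
    """
    return hashlib.md5(passphrase.encode("ascii")).digest()

key = md5_key("Hello@)!0")
print(len(key), key.hex())
```

Because the input string never changed, the derived value never changed either, which is what makes a constant like this useful as a signature.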

Connecting Attacks to Victims

At times, Dixon could track files he saw uploaded to VirusTotal and connect them to victims. And sometimes he could track how much time passed between the end of testing and the launch of an attack. Most of the time, Comment Crew launched its attack within hours or days of testing. For example, on August 20, 2012, the group introduced a bug in its code that never got fixed. The sample, with the bug intact, showed up on a victim's machine within two days of being tested.
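One way to measure that gap is to match an attacker's last test upload of a file with the first upload of the identical file from someone else. The Python sketch below assumes each record carries a file hash (`sha256`), an upload time and a hashed submitter address. Those field names, the sample data and the function name are illustrative assumptions, not the method Dixon used.

```python
from datetime import datetime

def gaps_between_test_and_field(records, attacker_hashes):
    """Map each file hash to the time between the attacker's last test
    upload and the first later upload by any other submitter."""
    last_test = {}
    gaps = {}
    for rec in sorted(records, key=lambda r: r["uploaded"]):
        sha = rec["sha256"]
        when = datetime.fromisoformat(rec["uploaded"])
        if rec["submitter_hash"] in attacker_hashes:
            if sha not in gaps:
                last_test[sha] = when  # keep the most recent test upload
        elif sha in last_test and sha not in gaps:
            gaps[sha] = when - last_test[sha]
    return gaps

# Hypothetical data: one sample tested, then seen again two days later.
records = [
    {"sha256": "abc", "uploaded": "2012-08-20T09:00:00", "submitter_hash": "a1f3"},
    {"sha256": "abc", "uploaded": "2012-08-22T08:00:00", "submitter_hash": "9e02"},
    {"sha256": "def", "uploaded": "2012-08-21T12:00:00", "submitter_hash": "a1f3"},
]
for sha, gap in gaps_between_test_and_field(records, {"a1f3"}).items():
    print(sha, gap)
```

A match on an exact file hash is a strict test. It works in a case like the one above only because the sample reached the victim unchanged, bug included.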

Dixon tracked NetTraveler in much the same way that he tracked the Comment Crew. The Travelers showed up on VirusTotal in 2009 and appeared to gradually grow more prolific over time, more than doubling the number of files submitted each year. In 2009, the hackers submitted just 33 files to the site, but last year submitted 391 files. They've already submitted 386 this year.

They made it particularly easy to track their code in the wild because even the emails and attachments they used in their phishing campaigns got tested on VirusTotal. More surprising, they even uploaded files they'd stolen from victims' machines. Dixon found calendar documents and attachments taken from some of the group's Tibetan victims uploaded to VirusTotal. He thinks, ironically, that the hackers may have been testing the files to see if they were infected before opening them on their own machines.

The unknown hacker or group of hackers that Dixon tracked from Iran popped up on VirusTotal this past June. In just a month, the party uploaded about 1,000 weaponized documents to the site and showed considerable skill in evading detection. In some cases, they even took old exploits that had been circulating in the wild for two years and managed to tweak them enough to bypass all of the virus scanners.

Dixon also spotted what appeared to be members of the PlugX hacking group uploading files to the site. PlugX is a family of malware, believed to be from China, that started appearing in the wild last year and has evolved over time. The PlugX group has uploaded about 1,600 components to VirusTotal since April 2013, and tends to use a unique IP address each time.

Now that the activity of hacking groups on VirusTotal has been exposed, they'll no doubt continue to use the site but alter their ways to better avoid tracking. Dixon is fine with that. As long as security companies now have confirmation that some of the code uploaded to the site is pre-attack code, it gives them an opportunity to look for telltale signs and craft their signatures and other defense mechanisms before the code is released in the wild.