The previous post in this series analyzed an attack that used webshells. Such attacks show how difficult it is to secure an entire infrastructure against them. This part evaluates the available detection tools and offers prevention and mitigation recommendations.

Webshell detection tools

I evaluated the following projects focused on webshell detection: NeoPI, Shell Detector and LOKI.

These tools were tested against the files presented in part 1, with the addition of a few new ones:

byroe.jpg - webshell hidden in an image file

myluph.php - example of a PHP webshell

webshell.php - simple PHP webshell presented in part 1

vero.txt - PHP webshell containing both “clean” and obfuscated PHP code

myluphdecoded.php - decoded version of myluph.php

China Chopper - ASPX version (chinachopper.aspx) and PHP version (chinachopper.php)

c99madshell.php - popular C99 webshell

unknownPHP.php - shared by Bart in his blog post

The tests measured each tool’s detection accuracy on a mix of different webshells and hundreds of legitimate files from GitHub repositories and other public sources:

index.html files from various popular websites

ASPX files

PHP files

JavaScript files

NeoPI

I tested NeoPI first. According to the project’s GitHub page, NeoPI is a Python script that uses a variety of statistical methods to detect obfuscated and encrypted content. The output below shows the result of running the tool against the set of files described above:

[[ Total files scanned: 4323 ]]
[[ Total files ignored: 0 ]]
[[ Scan Time: 16.773207 seconds ]]

[[ Average IC for Search ]]
0.0762022597838

[[ Top 10 lowest IC files ]]
0.0153 ../webshell_db_short/myluph.php
0.0168 ../webshell_db_short/vero.txt
0.0202 ../webshell_db_short/unknownPHP.php
0.0248 ../webshell_db_short/phpcollection/2.php
0.0262 ../webshell_db_short/myluphdecoded.php
0.0268 ../webshell_db_short/phpcollection/wkv3.php
0.0270 ../webshell_db_short/china.aspx
0.0284 ../webshell_db_short/phpcollection/agenda.ics.php
0.0285 ../webshell_db_short/phpcollection/config.xml.php
0.0289 ../webshell_db_short/phpcollection/uploads.php

[[ Top 10 entropic files for a given search ]]
6.2409 ../webshell_db_short/phpcollection/phpmailer.lang-zh.php
6.2355 ../webshell_db_short/phpcollection/phpmailer.lang-zh_cn.php
6.1932 ../webshell_db_short/unknownPHP.php
6.1622 ../webshell_db_short/phpcollection/phpmailer.lang-ch.php
6.0307 ../webshell_db_short/vero.txt
6.0258 ../webshell_db_short/myluph.php
6.0151 ../webshell_db_short/phpcollection/phpmailer.lang-ko.php
5.9169 ../webshell_db_short/phpcollection/phpmailer.lang-ja.php
5.7736 ../webshell_db_short/phpcollection/1.php
5.7393 ../webshell_db_short/phpcollection/phpmailer.lang-vi.php

[[ Top 10 longest word files ]]
554750 ../webshell_db_short/phpcollection/wkv3.php
11999 ../webshell_db_short/phpcollection/full_dump.php
11999 ../webshell_db_short/phpcollection/contentobjects.php
1774 ../webshell_db_short/myluph.php
660 ../webshell_db_short/vero.txt
641 ../webshell_db_short/c99shell.php
547 ../webshell_db_short/phpcollection/EmailAddressValidator.php
356 ../webshell_db_short/phpcollection/priv.txt
197 ../webshell_db_short/phpcollection/emission.xml (2).php
197 ../webshell_db_short/phpcollection/emission.xml.php

[[ Top 10 signature match counts ]]
85 ../webshell_db_short/c99shell.php
35 ../webshell_db_short/phpcollection/run-tests.php
27 ../webshell_db_short/phpcollection/WikiComments.aspx
24 ../webshell_db_short/phpcollection/MemberSearch.aspx
22 ../webshell_db_short/phpcollection/CustomPageManagement.aspx
22 ../webshell_db_short/phpcollection/Comments.aspx
20 ../webshell_db_short/phpcollection/phpmailerTest.php
20 ../webshell_db_short/phpcollection/ManageTerms.aspx
20 ../webshell_db_short/phpcollection/TimestampIntegrationTest.php
17 ../webshell_db_short/byroe.jpg

[[ Top cumulative ranked files ]]
56 ../webshell_db_short/myluph.php
57 ../webshell_db_short/vero.txt
176 ../webshell_db_short/c99shell.php
219 ../webshell_db_short/phpcollection/wkv3.php
225 ../webshell_db_short/phpcollection/1.php
372 ../webshell_db_short/myluphdecoded.php
444 ../webshell_db_short/phpcollection/profile.php
525 ../webshell_db_short/phpcollection/WikiComments.aspx
570 ../webshell_db_short/phpcollection/uploadpostattachment.aspx
595 ../webshell_db_short/phpcollection/Fields.aspx
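Two of the statistics in the output above, entropy and index of coincidence (IC), are easy to reproduce by hand. The sketch below is not NeoPI's code and its normalisation may differ from what NeoPI does; it only assumes the textbook definitions, computed over raw bytes. The function names and sample data are made up for illustration.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for empty input, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def index_of_coincidence(data: bytes) -> float:
    """Probability that two bytes drawn without replacement are identical."""
    total = len(data)
    if total < 2:
        return 0.0
    counts = Counter(data)
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))


if __name__ == "__main__":
    # Hypothetical samples: readable source text versus evenly spread bytes,
    # which is roughly what encoded or encrypted payloads look like.
    readable = b"<?php echo 'hello world'; echo 'hello again'; ?>" * 4
    uniform = bytes(range(256))
    for name, sample in (("readable", readable), ("uniform", uniform)):
        print(name, shannon_entropy(sample), index_of_coincidence(sample))
```

Evenly distributed bytes push entropy up and IC down, which is why the obfuscated samples sit at the top of both the "lowest IC" and the "entropic files" lists.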

Pros:

detection ratio: 6 out of 9 webshell files

successful detection of clean and obfuscated code of the same webshell

the more complex the code structure, the better the results and detection ratio

various methodologies to detect webshells - signatures, index of coincidence (IC), ratio, entropy, longest keyword matching

Cons:

failed detection of simple one-line webshells (e.g. China Chopper)

false negatives and false positives in several categories, including the final ranking

manual triage and additional analysis of the highlighted files is required for some of the methodologies (e.g. entropy, keyword matching)

the signature database is outdated, as the project no longer appears to be maintained

webshells hidden inside another file format (byroe.jpg) are not detected when scanning a wide spectrum of files - NeoPI produces a massive number of false positives

I noticed that it would be really helpful to have a summary of the files detected by more than one heuristic. For instance, in my test byroe.jpg was visible in the top ten for signature matches, longest word and entropy, but not in the top cumulative ranked files.
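Such a cross-heuristic summary could look roughly like the sketch below. It is not part of NeoPI: it assumes the per-heuristic top lists have already been parsed into a dictionary, and the heuristic names and file names are hypothetical placeholders.

```python
from collections import defaultdict


def files_flagged_by_multiple(top_lists, min_hits=2):
    """Map each file to the heuristics that listed it, keeping files with >= min_hits."""
    hits = defaultdict(list)
    for heuristic, files in top_lists.items():
        for path in dict.fromkeys(files):  # drop duplicates, keep order
            hits[path].append(heuristic)
    return {path: names for path, names in hits.items() if len(names) >= min_hits}


if __name__ == "__main__":
    # Made-up top lists, one per heuristic (not the results from this post).
    top_lists = {
        "entropy": ["a.php", "img.jpg", "lang.php"],
        "longest_word": ["dump.php", "img.jpg"],
        "signature": ["shell.php", "img.jpg", "a.php"],
    }
    for path, names in files_flagged_by_multiple(top_lists).items():
        print(path, "flagged by", ", ".join(names))
```

A file that appears in several lists at a mediocre position in each can then surface even if its cumulative rank is unremarkable.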

Although NeoPI hasn’t been updated for the last 4 years, didn’t detect all types of webshells and generated a number of false negatives, it still achieved quite impressive detection rates on relatively new webshell samples. I can recommend adding NeoPI to your webshell analysis toolbox. InfoSec Institute has a nice write-up on NeoPI with some additional details.

Shell Detector

Shell Detector was the second tool I evaluated. I really liked how the results were presented in the console.

There is also a web version available here.

Pros:

detection ratio: 7 out of 9 webshell files (5 flagged as suspicious + 2 as webshells)

successful detection of clean and obfuscated code of the same webshell

final results presented in a clear graphical form

Cons:

131 false positives triggered by the presence of suspicious words

only signature-based detection

webshell signature database out of date

sluggish interface when the number of results is too high (web version)

signature database is stored in serialized PHP format (not scalable)

byroe.jpg was not detected by Shell Detector - it does not support JPG files

To sum up, even though the signature database appears to be out of date, the tool correctly flagged almost all of the webshells as malicious. It can provide powerful detection capability as long as the signature database is kept up to date.

LOKI

LOKI presents scan results in a terminal, coloring entries according to their severity. It also writes all matches to a single log file. The rules are written in YARA, an easy-to-use yet very powerful language for identifying and classifying malware, which appears to be a tool of choice in the security industry. According to the project’s website, the most effective rules were borrowed from the rule sets of its bigger brother, THOR APT Scanner. For me, the most interesting ones were those dedicated to webshell detection.

My first scan of the sample set with the default signature database showed a moderate detection ratio (5/9). With YARA’s growing popularity in the infosec world, it’s possible to build and maintain a powerful database to hunt malware, including webshells, and to research new obfuscation techniques and variants observed in the wild. With that in mind, I decided to improve on the initial results. I found a set of rules that almost perfectly matched my expectations. After a quick adjustment, the final score was close to ideal (8/9). The changes were really tiny, so I’ll describe them briefly:

Change the $php string to “<?” in a new rule based on misc_php_exploits

Add “system($_REQUEST” to misc_php_exploits and to the newly created rule from the point above

Remove two strings from the misc_shells rule - $s6 and $s8 (that one was even marked with a comment saying it could generate FPs, so it was easy ;)

After all of that, I got LOKI’s biggest advantage: the number of false positives was zero!

Pros:

detection ratio: 8 out of 9 webshell files

successful detection of clean and obfuscated code of the same webshell

final results provided in a clear log file

zero false positives (but that really depends on the YARA rule set you use)

easy to develop signatures as YARA rules

supports all extensions

Cons:

only signature-based detection for webshells

Summary

To sum up the results from all the tools: it is really hard to develop a single tool that flags webshells as suspicious with good accuracy. That is because a wide range of different functions, methods and encodings can be used to achieve the same effect. Attackers don’t need to use the base64_decode function to decode their base64 code. Instead, they can add their own custom function to do exactly that. They can use a string lookup array to avoid keyword-based detection, or call functions through names assembled at runtime with str_replace, and much more. Imperva did great research describing various techniques in their blog post.

The only webshell not detected by LOKI was unknownPHP.php, whose obfuscation technique is really advanced - thanks to Darryl from Kahu Security, you can follow the decoding process in a great post. As it’s not possible to detect it using general signature rules, NeoPI’s methods (entropy, index of coincidence) are an excellent solution for this kind of backdoor. Combined with LOKI, NeoPI seems to be a powerful weapon for detecting webshells.
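To make the str_replace trick concrete, here is a minimal sketch of a naive keyword scanner and two inert PHP snippets held as plain strings. The keyword list, the function name and both snippets are invented for this illustration; none of them comes from the tools or samples tested above.

```python
# Keywords a naive scanner might look for (illustrative, not a real rule set).
KEYWORDS = ("base64_decode", "system(", "eval(")


def keyword_hits(source: str) -> list:
    """Return the keywords that appear literally in the given source text."""
    return [keyword for keyword in KEYWORDS if keyword in source]


# A straightforward one-liner: the function names are present verbatim.
plain = "<?php system(base64_decode($_POST['c'])); ?>"

# The same behaviour, but the function names are assembled at runtime with
# str_replace and called through variables, so no keyword appears verbatim.
obfuscated = (
    "<?php $d = str_replace('#', '', 'ba#se64#_de#code'); "
    "$s = str_replace('#', '', 'sys#tem'); "
    "$s($d($_POST['c'])); ?>"
)

if __name__ == "__main__":
    print(keyword_hits(plain))       # ['base64_decode', 'system(']
    print(keyword_hits(obfuscated))  # []
```

Adding str_replace itself to the keyword list would catch this sample, but it would also flag a large amount of legitimate PHP code, which is the false-positive trade-off seen in the tests above.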

Prevention and mitigation

There are a few things that can be done to protect organizations against server compromise:

PATCH! - it sounds silly because it seems SO obvious, but last year showed that even a well-known vulnerability like Heartbleed doesn’t guarantee that administrators do their job. Two months after the public disclosure, there were still around 300k vulnerable servers

harden your web server - implement a least-privilege policy on the web server, limit script execution permissions in specific locations etc.

deploy a DMZ (demilitarized zone) - enable logging of allowed and blocked traffic, limit interaction between the DMZ and your production environment

deploy a reverse proxy with a WAF (Web Application Firewall) - restrict accessible URL paths to legitimate sources only, using for example the free ModSecurity or a commercial product; consider fuzzy hash matching

regularly test your environment - conduct virus signature checks (e.g. those used by the WAF), application fuzzing, code reviews and server network analysis

regularly test systems and applications - check the application’s security with pentests and vulnerability scans to establish areas of risk

versioning + backup - keep an offline “known good” backup of all critical servers, and enable change monitoring to have a clear history of changes on the servers

input validation - validate user input to limit local and remote file inclusion vulnerabilities

scan all files uploaded to the web server (if you accept file uploads from users) - as shown before, administrators cannot trust file extensions; an extension may be just a trick to hide malware

always follow social media discussions! ;)

When #ThreatHunting try and define a narrow scope of what you are looking for. I have a thing for webshells lately so… #DFIR 1/8 — Jack Crook (@jackcr) May 10, 2016

Look at processes that are spawned by the owner of the webserver process #DFIR 4/8 — Jack Crook (@jackcr) May 10, 2016

Look at POST requests with no referrer and a 200 response code #DFIR 5/8 — Jack Crook (@jackcr) May 10, 2016

Look for POST requests to new directory paths and filenames with a 200 response code #DFIR 6/8 — Jack Crook (@jackcr) May 10, 2016

The community also has its own ideas:

@jackcr baseline the web server/ app error logs. Focus on exceptions about previously not seen file names e.g -> https://t.co/gIOFcE6wgI — dfir_it (@dfir_it) May 10, 2016

@jackcr File: size, ext, owner, location, content. Request: UA, URI/params, internal 2 internal, interval/duration/size of requests — Glenn (@hiddenillusion) May 10, 2016

AV/HIDS scan of the web server…

Let me digress a little about the last recommendation. First of all, as you know, AV is not a fail-safe mechanism, so you cannot trust it fully. AV products do not protect against all attack vectors, and it is relatively easy to bypass them. Still, AV can at least block known malicious code (detected by signatures or heuristics) - not ideal, but still an advantage.

When you’ve got AV on your web server (or any other machine for that matter), you need to know that there are costs involved:

additional risk - AV adds code to your machine that could itself be vulnerable to different types of attacks such as RCE, local privilege escalation, sandbox escape, etc. Details can be found in Joxean Koret’s presentation or Google Project Zero posts (1, 2)

performance - every AV causes some performance loss; it is periodically measured and reported by the AV-Comparatives organization - the latest report can be found here
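Going back to the hunting ideas quoted earlier in this section, the "POST requests with no referrer and a 200 response code" check can be sketched in a few lines. This is only a starting point under stated assumptions: it expects access logs in the common Apache/nginx "combined" format, and the regular expression, function name and sample lines are all made up for illustration.

```python
import re

# Assumed "combined" log format:
# ip ident user [time] "METHOD path protocol" status size "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "[^"]*"'
)


def suspicious_posts(lines):
    """Return (ip, path) for POST requests answered with 200 and no referrer."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        if (
            match["method"] == "POST"
            and match["status"] == "200"
            and match["referrer"] in ("", "-")
        ):
            hits.append((match["ip"], match["path"]))
    return hits


if __name__ == "__main__":
    # Fabricated sample lines using documentation IP addresses.
    sample = [
        '192.0.2.10 - - [10/May/2016:10:00:00 +0000] "POST /uploads/img.php HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
        '192.0.2.11 - - [10/May/2016:10:00:05 +0000] "POST /login.php HTTP/1.1" 200 1024 "https://example.com/login.php" "Mozilla/5.0"',
        '192.0.2.12 - - [10/May/2016:10:00:09 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
        '192.0.2.13 - - [10/May/2016:10:00:12 +0000] "POST /missing.php HTTP/1.1" 404 0 "-" "curl/7.47"',
    ]
    for ip, path in suspicious_posts(sample):
        print(ip, path)
```

Legitimate API clients also send POST requests without a referrer, so the result is a list of candidates to triage, not a verdict. Comparing the flagged paths against a baseline of known application endpoints narrows it down further.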

Conclusion

The whole series was intended to familiarize you with how popular, diverse and, at the same time, dangerous attacks leveraging webshells are. As the second part of this series showed, the crooks’ aim was to target specific companies, and webshells were only a small part of a bigger plan. The variety, diversity and simplicity of webshells make defending against them a very difficult task. Even following all the recommendations in the “Prevention and mitigation” section does not guarantee that your application or environment is 100% safe, but it is important to build security in a comprehensive manner and to leave attackers as little room as possible to get through our “entanglements” ;) Keep fighting! Keep defending!