In 2011, this exploit kit won't work
Posted: 30 Dec 2010 10:30 AM

And some Web sites will be a lot safer! While reviewing incidents and deobfuscating a Web site today, I discovered an installation of a particular exploit kit that won't work after New Year's Eve.  The site I found caught my attention because the code simply looks like garbage.  As the saying goes, "One man's trash is another man's treasure."  So I started digging into the obfuscation of the code and found something that I thought would be topical considering today's date.  The code in this exploit kit will actually expire at midnight on New Year's Eve local time!  In this post, I'll cover how I came across this and show you how and why the exploit kit installations will expire.

 

Here is a screen shot of the code in the original state as I found it:

 

You may notice right off the bat that there is a Java exploit in there, but the focus of this post is in the obfuscated script that is meant to exploit the client browser.  If you have read any of my previous blog posts, you know that I tend to zero in on long streams of data in an obfuscated script.  More often than not, that is where the multiple exploits of an obfuscated script live, and also usually where the deobfuscation routine can be found.  So let's have a look at that part of the script and beautify it a bit so that we can see more clearly what is going on.

 

Here is a screen shot of the part we want to focus on:

 

The attack script is obfuscated in the var w declaration above.  We can see that the variable holds a long string of digits. We can also see that there is a masked String.fromCharCode a little further down in the script.  Right away, we might assume that this is a simple character code obfuscation, but with numbers like 1001, 1361, 426..., this assumption would be incorrect.  After studying the script for a bit, we can see how the deobfuscation works.  A loop uses String.fromCharCode to generate each character from the difference between a value in the latter half of w and the matching value in the first half of w.  However, this is neither the focus nor the funny part of this blog post.
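The decoding loop described above can be sketched in Python. This is only an illustration of the idea, assuming that w splits into a list of integers and that each character code is a value from the second half minus the matching value from the first half. The function names and the sample text are hypothetical and are not taken from the kit.

```python
def encode(text, masks):
    """Hide text as numbers: the masks first, then each mask plus a character code."""
    if len(masks) != len(text):
        raise ValueError("need one mask value per character")
    return list(masks) + [m + ord(ch) for m, ch in zip(masks, text)]


def decode(w):
    """Rebuild the text from second-half minus first-half differences."""
    half = len(w) // 2
    return "".join(chr(w[half + i] - w[i]) for i in range(half))


# Hypothetical sample: five mask values hiding a harmless five-letter string.
w = encode("hello", [1001, 1361, 426, 77, 940])
print(w)          # none of the values is a plain character code
print(decode(w))  # the subtraction recovers the text
```

None of the stored numbers is a usable character code on its own, which is why reading w as a plain character code list fails.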

 

When you look at the above script, you can see that the Date() object is used to get the current date from the client JavaScript engine.  Later, we see that the object iilq, which is where the date is assigned, uses the getFullYear() method and subtracts 1 from the current year.  Up until December 31, 2010 11:59:59 PM local time, this result will be 2009.  At this point, var qgy looks like e2009al, so we can see that this variable is meant to be a masked eval.  To be used as an eval, this variable has to be unmasked, which is done in the following statement of .replace("2009","v").  The whole execution of the attack script depends on this eval. So what happens in 2011?  The variable will remain a masked eval (which will look like e2010al) because the following .replace will find nothing to replace, which in turn renders the script benign!
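The year dependency is easy to reproduce outside the browser. The sketch below mirrors the logic in Python rather than JavaScript; the function name is hypothetical, and the year is passed in as a parameter instead of being read from the system clock so that both cases can be shown.

```python
def unmask(current_year):
    """Build 'e<year - 1>al', then swap the hard-coded '2009' for 'v'."""
    masked = "e" + str(current_year - 1) + "al"
    return masked.replace("2009", "v")


for year in (2010, 2011):
    name = unmask(year)
    status = "attack runs" if name == "eval" else "script is inert"
    print(year, name, status)
```

Only a client whose clock reports 2010 produces the string eval. Any other year leaves a meaningless identifier behind.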

 

Here is what the attack script will look like until December 31, 2010 11:59:59 PM:

 

In conclusion, we can only assume that this was an unintended mistake by the exploit kit writers and that it will probably be fixed.  I'm sure they were unaware of this mistake, as the obfuscation of their attacks is probably contracted out, or they use off-the-shelf software to obfuscate their kits.  Come New Year's Day, we will all be just a little safer out on the Web because of this!  Happy New Year all!!

Chris Astacio

Installation Protection Mechanisms of Phoenix Exploit's Kit
Posted: 27 Dec 2010 12:00 PM

As part of my research within Websense Security Labs, I collaborate with a group of researchers tasked with profiling exploit kits.  This helps us refine the analytics used in ACE, our Advanced Classification Engine.  In this post I want to cover the installation of Phoenix Exploit's Kit.  I'm not going to tell you how to install and use it, but I will cover some of the more interesting aspects of installation.  Specifically, I want to cover how the developers protect their code from being reverse engineered and how the developers have attempted to keep researchers from poking around in installed kits. 

 

To begin, let's have a look at the installer for the kit.  Like many exploit kits, this one is PHP-based, but unlike most kits, the installer is actually obfuscated.  This is probably an attempt by the developers to make it harder for security researchers to understand how to install the kits, especially if there is no 'readme.txt' file included in a kit.  Typically, exploit kits come with some sort of installation and/or revision documents in the form of a 'readme.txt' or 'notes.txt' file.  Without the readme file, it can be difficult to install a kit unless you reverse engineer the installation process.  Most of the time, reverse engineering a kit's installation is pretty easy because the PHP code is not obfuscated.

 

Here is a look at the obfuscated code in the PHP installer:

 

 

Looking at this code, we can see that it's a Base64-encoded, ZLIB-compressed stream of data.  The PHP script uses an 'eval' statement with the 'gzuncompress' and 'base64_decode' functions to decode the stream of data.  To get the clear text code, we can use a simple substitution trick along with the PHP CLI so that we can then analyze the installer's code. To do this, we simply need to replace the 'eval' with a 'print' and run the install.php script on the command line.
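The same two decoding steps can also be done without running the PHP at all. The Python sketch below assumes the encoded string has been copied out of the installer by hand. Python's zlib.decompress reads the same zlib format that PHP's gzuncompress expects. The function name and the stand-in payload are hypothetical.

```python
import base64
import zlib


def unpack(encoded):
    """Undo the Base64 encoding first, then the zlib compression."""
    return zlib.decompress(base64.b64decode(encoded)).decode("utf-8")


# Harmless stand-in payload, packed the same way for demonstration only.
sample = base64.b64encode(zlib.compress(b"<?php echo 'hello'; ?>"))
print(unpack(sample))
```

Decoding offline like this means the installer's code is only ever printed and never executed.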

 

 

Here is a snippet of the deobfuscated install.php script:

 

 

Looking at this code, if you're like me, you might think that the interesting thing about it is the variable declarations with long base64 encoded streams.  It actually turns out that each one of those variables is holding obfuscated PHP code for the page for which the variable is named.  For example, the '$config' variable holds the base64 encoded 'config.php' file and the '$activate' variable holds the 'activate.php' code, which we will get to in a bit.  This is where things get interesting, as far as protection mechanisms go.  The PHP code for each of these scripts is held in a variable because the page names actually get randomized for each installation!  This helps to prevent security researchers from easily finding and possibly viewing statistics about the site hosting a Phoenix Exploit's Kit.  Prior to the version being analyzed here, Phoenix came with standard page names, so once the exploit page was found, it was easy to find the statistics page and try to break in to view stats from that particular installation.

 

Here is what the install looks like when it's visited from the browser:

 

 

As you can see, when viewing the installer from the browser, there is really nothing special about it.  You get to choose the language of the installation instructions, either English or Russian. On the next page you have a form to fill out for various resources.  I'm not going to show you this form because it contains sensitive information.  However, I will show you the result after filling out the form so that you can see the randomized page names and what has to be done to activate the kit.

 

This is a look at my current working directory before the install of Phoenix Exploit's Kit:

 

 

Here is the same directory after the completion of the install script:

 

 

As you can see, the install script contains just about everything needed to install the kit.  It extracts the necessary scripts and randomizes the file names, which hides the purpose of each file.  If you have a look at the code in each file, you can begin to figure out what it is for.  The thing to notice here is that each installation creates unique names for each of the pages.  Again, what this means is that a researcher can't find statistics for an installed kit after finding the page serving up the exploits.  Rather, for any given kit installed in the wild, it's anybody's guess as to the names used for statistics and other pages used by that kit!

 

Regarding the installation we've been examining, at this point the kit isn't at all usable because it doesn't yet contain the exploits.  To obtain the exploits, the purchaser of the kit must contact the developer to activate their kit.  The "installation success" page explains this: "To activate this installed copy of Phoenix Exploits Kit please send the following activation string to the author."

 

Here is a screen shot of the installation success page:

 

In summary, we can see that the developers of Phoenix Exploit's Kit are working on protecting not only their exploit code from being recognized, but also their installations.  This makes it difficult for researchers to further dissect and understand how the kit works, especially if a researcher comes across just the install script.  By randomizing the page names used in each installation, the developers also make things more difficult for anyone who wants to study and report on the statistics from individual installations of Phoenix.

Chris Astacio

Crypto-Analysis in Shellcode Detection
Posted: 03 Jun 2010 03:32 AM

Exploits are probably the biggest computer threats nowadays. They seek out the vulnerabilities of an application to launch their malicious activities. Cyber criminals target popular applications such as Shockwave Flash and Adobe Acrobat PDF to keep the chances high that a user's computer is vulnerable. In this post we will examine a Flash exploit using a very simple crypto-analysis technique we call X-ray.

 

Crypto-analysis of malicious code is not a new technology or invention. It has been used in fighting MS-DOS viruses since the '90s. This article provides an in-depth, detailed discussion on this subject, explaining how it works and how it can be used for malicious content detection in shell code.

 

First we need to understand the X-ray technique and how it works, and then we can see how it helps us to analyze and detect malicious content in shell code. X-ray is basically a differential crypto-analysis method, which is a very easy way to attack simply encrypted data. The assumption is that when a simple block encryption algorithm is used with the same key on every block, the difference between consecutive ciphertext blocks is the same as the difference between the corresponding plaintext blocks.

 

One very good way to explain this is to encrypt a picture and then try decrypting it. Take a look at this picture:

 

 

The picture does not tell us much, except that we can see that it is encrypted. It looks random enough, even though we can spot some repetition. In fact, the algorithm used is a very simple stream cipher with some avalanche effect. The result is a picture that suggests very little about itself. However, when we generate the difference between consecutive bytes, we get this:

 

 

Ah-ha! Now we see that this is the logo of our secret weapon against Internet threats.  :-)  (See the original graphic below)

 

 

Now, no wonder it is called X-ray!  We may not see the 'skin', but we clearly see the 'bones'. The resulting picture is far from the original one, but is good enough to see what it was. Nice, but how does it work?

 

To understand, we need to get into the math behind cryptography. Take a look at this very simple block ciphering algorithm. We have a message of:

 

Where M is the plaintext message of length n and m is a block of the message (typically a character).

 

In order to get the ciphered message of:

 

Where C is the ciphertext (encrypted) message of length n and c is a block of the message (typically a character),

 

we need to apply an encryption to each one of the message blocks using the same key:

 

Where E is the encryption algorithm using the key k.

 

When the encryption algorithm E is a simple XOR using the same key k on each block, then

 

 

the (above) formula gives us the encrypted stream. Usually we see this simple method in shell code with byte size blocks. In other words, each one of the characters of clear text is simply XORed with a constant (see the pseudo code). The reason this kind of encryption is so popular is that it is easy to understand and it is also easy to obfuscate the data and the code sections enough to avoid detection by a simple string detection engine.
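A minimal Python sketch of this constant-key XOR scheme follows. The key value and the clear text are arbitrary placeholders chosen for illustration and are not taken from any sample.

```python
def xor_cipher(data, key):
    """XOR every byte with the same one-byte key; the same call also decrypts."""
    return bytes(b ^ key for b in data)


KEY = 0x5A  # arbitrary one-byte key chosen for this illustration
plain = b"http://example.test/"  # placeholder clear text
cipher = xor_cipher(plain, KEY)

print(cipher)                   # no readable "http" left for a string scanner
print(xor_cipher(cipher, KEY))  # applying the same XOR restores the clear text
```

Because XOR is its own inverse, the shell code needs only one tiny loop to decrypt itself, which is part of the appeal of this scheme.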

 

 

Now we can note that a simple differential analysis will easily decipher this kind of encryption:

 

 

Why? Because:

 

 

Because XOR is associative and commutative, we can remove the brackets and reorganize the equation to:

 

 

We know that:

 

 

Therefore:
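The full chain of reasoning, from the definitions to this conclusion, can be written out in one place. The notation below is a reconstruction from the surrounding text: the block indices and the symbols $\Delta m_i$ and $\Delta c_i$ for the difference between consecutive blocks are assumptions chosen to match the definitions of M, C, E and k given earlier.

```latex
\begin{align*}
M &= m_1 m_2 \dots m_n, \qquad C = c_1 c_2 \dots c_n \\
c_i &= E_k(m_i) = m_i \oplus k \\
\Delta c_i &= c_i \oplus c_{i+1} \\
           &= (m_i \oplus k) \oplus (m_{i+1} \oplus k) \\
           &= m_i \oplus m_{i+1} \oplus k \oplus k \\
           &= m_i \oplus m_{i+1} \oplus 0 \qquad (\text{since } k \oplus k = 0) \\
           &= \Delta m_i
\end{align*}
```

The key k drops out completely, so the differences of the ciphertext can be compared directly with the differences of any known plaintext.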

 

 

As we can see, simple block ciphering does not provide strong encryption. And because simple block ciphering is widely used in exploits, we can easily break those by deciphering known text or binary content in them. To put the theory into practice, let's take a look at this simple decryption loop taken out of shell code used in an SWF exploit:

(MD5 of the sample: 32398CBF94CA9B42E0B415BB93B00CFE)

 

As we can see, the code uses byte size blocks and a simple XOR cipher with a constant 0x3D. Inside the code we can also see a pattern that starts with several 0x3D bytes followed by the text "UIIM":

 

 

We might suspect that this is an encrypted URL starting with "http://". Now that we know the algorithm and the encryption key, it is easy to double check whether our suspicion is correct. The question is, how do we find this string without knowing the key?
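With the key known, the double check takes only a couple of lines. This Python sketch uses just the key 0x3D and the text "UIIM" from the sample described above. The second line goes the other way and shows what a full "http://" prefix would look like after encryption with that key.

```python
KEY = 0x3D  # constant taken from the decryption loop described above

decoded = bytes(b ^ KEY for b in b"UIIM")
print(decoded)

encoded = bytes(b ^ KEY for b in b"http://")
print(encoded.hex())
```

The first line prints b'http', which confirms the suspicion. It also explains the run of 0x3D bytes in front of the text: a zero byte XORed with 0x3D is 0x3D, so those bytes are most likely encrypted zero padding.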

 

Do you remember the differential attack? All we need to do is to take a known text, which is "http://", and create a stream of differences:

 

 

 

 

Similarly, we create a difference on the encrypted stream:

 

 

 

 

And then, if we can find ΔM in ΔC, we have what we are looking for. Obviously, the longer the known text, the lower the chance of a false detection.
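The search for ΔM in ΔC can be sketched in Python as follows. The sketch assumes byte size blocks and XOR as the difference operation, as in the derivation above. The function names, the sample URL and the padding bytes are hypothetical.

```python
def xor_diff(data):
    """XOR each byte with its successor; a constant XOR key cancels out."""
    return bytes(a ^ b for a, b in zip(data, data[1:]))


def find_known_text(cipher, known):
    """Return the offset of the encrypted known text in cipher, or -1."""
    return xor_diff(cipher).find(xor_diff(known))


# Placeholder shell code data: zero padding, a URL, a terminator.
plain = b"\x00\x00\x00http://example.test/a.exe\x00"
cipher = bytes(b ^ 0x3D for b in plain)  # the search itself never uses the key

print(find_known_text(cipher, b"http://"))  # prints 3
```

Note that a known text of n bytes yields only n - 1 differences, which is one more reason to prefer longer known texts.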

 

The next step is to determine the key and decipher the entire URL. This is very simple: we just XOR the first detected block with the first block of the known text:
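In Python, the key recovery might look like the sketch below. The byte string is an illustrative stand-in built from the 0x3D padding and the "UIIM" text described earlier, completed on the assumption that the known text really is "http://". The offset is assumed to come from a differential search that has already matched.

```python
known = b"http://"

# Illustrative bytes: three 0x3D padding bytes, then "UIIM" and three more bytes.
cipher = bytes.fromhex("3d3d3d5549494d071212")
offset = 3  # position where the differential search matched (assumed)

# First detected block XOR first block of the known text gives the key.
key = cipher[offset] ^ known[0]
print(hex(key))

# With the key in hand, everything from the match onward can be deciphered.
print(bytes(b ^ key for b in cipher[offset:]))
```

Here the recovered key is 0x3d and the deciphered bytes read http://, matching the constant seen in the decryption loop.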

 

 

Knowing all of this, we can now write a simple analysis tool that can find and extract 'interesting text' from a binary file. As an example, here is the output of a small Perl hack I wrote earlier (see the script attached below):

 

 

The good thing about this technique is that we have to generate the differential set from the known text set only once in the lifetime of the application. Also, we need to generate the differences of the scanned shell code only once to check all the known text from our dictionary. Our dictionary therefore can be huge, including not only known and unknown URL patterns, but binary sequences that can identify each type of shell code we already know.
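A dictionary scan along these lines could be sketched as follows. This is not the Perl script mentioned above, only a Python illustration of the same idea. The dictionary entries, the function names and the sample data are all hypothetical.

```python
def xor_diff(data):
    """XOR each byte with its successor; a constant XOR key cancels out."""
    return bytes(a ^ b for a, b in zip(data, data[1:]))


# Differences of the known texts are computed once, up front.
DICTIONARY = {text: xor_diff(text) for text in (b"http://", b"https://", b"cmd.exe")}


def scan(blob):
    """Return (known text, offset, key) for every dictionary hit in blob."""
    blob_diff = xor_diff(blob)  # computed once per scanned sample
    hits = []
    for text, text_diff in DICTIONARY.items():
        start = blob_diff.find(text_diff)
        while start != -1:
            hits.append((text, start, blob[start] ^ text[0]))
            start = blob_diff.find(text_diff, start + 1)
    return hits


# Placeholder data encrypted with a one-byte key the scanner is never told.
sample = bytes(b ^ 0x3D for b in b"\x00\x00http://example.test/x\x00cmd.exe\x00")
for hit in scan(sample):
    print(hit)
```

Each hit reports the key alongside the offset, so the surrounding text can be deciphered immediately. Growing the dictionary adds only substring searches, not new passes over the sample.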

 

So far so good. However, life would be too easy if all of our work were finished now, yes? Many times, we see the shell code in a compressed or otherwise obfuscated format. For example, a Flash file could be compressed, or in a PDF file each stream can be compressed and encoded in different ways, and can then contain obfuscated JavaScript code that holds the shell code. The detection or analytics engine therefore first needs to do all the necessary transformations and de-obfuscations in order to be able to analyze the shell code. Maybe this is one of the reasons we see simple encryption most of the time.

 

Although in reality we see mostly simple block ciphers in exploits, there are many examples in viruses and trojans of much more sophisticated encryption. These use a variety of block and stream ciphers with encryption keys of different lengths, even applying more than one algorithm on top of each other to harden the encryption. Breaking such ciphertext requires more complex methods. However, due to constantly increasing computing power, it is even possible to attack the DES algorithm. The good news is that even when stronger encryption has been applied, we have better techniques to detect malicious content.

 

©2013 Websense, Inc. All Rights Reserved.