Friday, October 9, 2015

Blindly decrypting after ransomware attack (full solution)

Intro

Recently I was asked to help a new friend after some ransomware encrypted all his important files.

And so I went to his office, a warehouse, and found that the main computer had almost all files encrypted.

Sadly that computer was used to store all the information regarding transactions, stocks, prices, official documents since 5 years ago and, having no recent backup, years of documents were apparently lost.

File extensions, all: db, js, bak, bmp, bon, cdx, chm, cif, css, cur, dat, dbc, dbf, dct, dcx, doc, docx, dim, err, exe, fig, fpt, fpw, frt, frx, fxp, gif, htm, ico, idx, ini, jpg, key, lbt, lbx, lfa, log, mnt, mnu, mnx, ocx, out, pdf, rar, rtf, sax, scx, sdf, sem, spx, stp, tbk, ter, tmp, txt, url, xls, xlsx, xml, zip

e.g.
athena.cfg after encryption became athena.cfg!______CRYPTN1@GMAIL.COM.crypt

He contacted the criminals at the provided email: CRYPTN1@GMAIL.COM, they asked for 2000$ ransom.

My friend had to call the police and officials, notify them that the electronic evidence of stocks and transactions is gone.

First look

When I first talked to him I asked him to power down the computer and leave it that way.

So I got there, powered up, used my lucky Hirens bootable memory stick, started looking for traces of virus, encryption complexity and deleted files. Important files were stored on D: drive, for which shadow volume copy was disabled.

Some other "friends with great IT knowledge" already installed various software, antivirus and most if not all decryptors made by Kaspersky. By doing so they just ruined any fast solution of finding the virus name or recovering some possible useful deleted files like temporary stored encryption key.

First look at encrypted files revealed frequency and entropy, and it didn't looked well.

A basic comparison of a text file shows that ASCII low case values 97-122 and 32 (space) are the most frequent in the original one and the same values fill the full range 0-255 after encryption.
Original vs. encrypted text file
A frequency analysis (occurrences here) shows an almost uniform distribution of values after encryption. The better the algorithm, the better the randomness and closer to natural noise.

File entropy, 8 is maximum:
  • 7.95 the encrypted 
  • 4.73 the original, plain text
As I was just starting my vacation I copied a few samples original and encrypted and told him to try bargain with those thieves for at most 500$.

Second look


Back from vacation (one week) my friend had a different problem, the criminals raised the ransom up to 5000 euro after figuring out how important his files are.

So I started to look at encrypted vs. decrypted, blindly having no access to virus code or any information regarding it.

Header

Encrypted files header has 1200 bytes and it's a footer actually, each file shares the same, haven't figured out what it contains.

This is typically good news some ransomware use a different key for each file, see CryptoWall.

Cipher type

Two options here:
  • stream cipher, each byte is encrypted at a time (previous or next bytes are unaffected)
  • block cipher, blocks of data (chunks) like 8 or 16 or 32 in length are encrypted at a time (multiple rounds of substitution-permutation applied to the whole block so one byte different  leads to a whole different block). e.g. S-BOX

Cipher mode of operation

This one deals with the encryptor state, if it changes based on input data or not, also called chaining. See for example ECB or CBC, AES.

Figuring out

By looking at two pairs of encrypted - original files and comparing.


If it was a block cipher, then each block of let's say 8 bytes would have different values so there would be no match like above at position 2 or 6-9 or 11.

If it did change state based on input then there would be no match at file positions 102 or 104, 105.

A closer look reveals that letter p (byte 112) at position 12 is encrypted into 205.
While the same at position 14 is encrypted into 220.

So it's a stream cipher, not based on input just file position.

Decrypting

Decryption by plain-text attack looked feasible so I started harvesting the internet for pairs of original identical copies of the encrypted files. Drivers, default wallpapers, software, txt and html documents, doc, anything that can be used.

The Map

For a map of encrypted byte vs original I wrote more C# code to build a table, which links the two, up to 500 bytes in file length (rows in this case). The code implemented does statistics to filter in case the original partially matches the encrypted one.

Map as table, region
After processing most available data, an overview of the map looked just like below, green cells are known values the rest unknown. Most known values are low case letters, null, space and enter due to the text nature of samples.
Map overview
Not enough to decrypt any file.

Guessing

I started looking for patterns, at first for vertical ones did some xor, bit comparison, various non-linear curve fitting, signal analysis still no luck.

After a while I figured out I wasn't looking the right way, so switched to horizontal rows and rules started to popped out:
  • Symmetry: if original 127 is encrypted to 234, then 234 is encrypted to 127, almost doubled the values.
  • For an even byte the next odd byte is equal to even -1 or +1. e.g. 79 is followed by 78 or 80. The whole row has the same sign -1 or +1. So for each row after finding the sign from an exiting pair of consecutive even-odd numbers, the missing pairs can be computed.
Rules after guessing
These rules and some additional binary .jar files filled about 80% of the map.
Still not enough to decrypt a random file especially binary, and longer than the 500 bytes map.

Map overview after rules
The Key

At the end of the third day of work, late at night, while looking at the numbers I just saw that the rows were unique and repeating. I've started counting and there were only 256 unique rows, so full decryption of any file became possible.

The next day, I simply mixed the identical rows, filled the missing values, and so the map was completely filled, decoded small files to check the algorithm and built a 256x256 matrix to hold the key. In the mosaic picture bellow you can see the symmetry and distribution, colours are based on values.
Key overview

Key small region

The Sequence

The requested files to recover had at most 220 MBytes, which is way more that the map length (500 bytes).

Starting with the largest file I had both encrypted and original, an IntelHD Graphics driver (230MB),
I wrote a new algorithm, which simply finds which row from the key matrix is used at which file position. The output was stored as a 230MB file.

I've analysed the sequence with various tools like diehard,  RaBiGeTe_MT, signal processing, but it looks pure pseudo-random, pure 8.00 file entropy, further analysis was not required anyway.

Final

By combining the sequence and the key table I was finally able after 3 and half days to decrypt all the information from my friend's computer. I wrote the software nicely, multi threaded, and gave a copy to him both source and binary.

The encryption algorithm in C# pseudo-code looks apparently like this:

 byte[][] key = new byte[256][256]; //unique key for all files  
 const int Seed = 1234; //unique initial state for random number generator  
 InitKey();  
 Random rnd = new Random(seed);  
 for(long i = 0; i < file.Length; i++)  
 {  
    int keyRowIndex = rnd.Next(0, 255);  //the random sequence
    byte[] keyRow = keyTable[keyRowIndex];  //the row for that file position
    byte original = file.ReadByte();  //byte to encrypt
    byte encrypted = keyRow[original];  //encrypted
    file.ReplaceCurrentByteWith(encrypted);  //imaginary function
 }  

Conclusion

Ransomware is ugly, because some people want to steal money, they attack and block access to personal and private data. It can be a business or holiday pics or some unique memories stored as files of the beloved ones. And if not paying, files are forever lost.

Paying will never guarantee file recovery, moreover some viruses are badly written and damage files during encryption.

If you encounter such issue, simply stop, turn off your computer, and let some people who know what they are doing, do their job.

First step is to make at least one hdd image and work with that.

Second, whomever wants to help you, for free or not, should know at least what an hdd image is how to make it and how to use it, what ransomware, encryption, byte, are.

All work should be done with that image, not your hdd, don't borrow it to anyone he might drop it in a puddle.

Don't trust what's written on most forums and websites, it's just copy paste, most authors have no idea what they are writing about.

CryptoWall 3.0, you'll find plenty of information, the recovery suggested solutions are these:
  • volume shadow copy 
  • deleted file recovery 
Most likely none will actually work because:

  • volumes shadow copy is typically inactive or deleted by the virus and 
  • deleted files recovery is not possible because files are overwritten.

Sadly all those forums, websites providing tutorials how to recover are just scam and make money from ads and expensive file recovery tools, pointless in this case.

Funny in way, CryptoWall 3.0 deletes itself after it finished encrypting, so all the suggested measures on all those websites, to remove it, are the same quite pointless. And you will reinstall your OS anyway after recovery and add some free antivirus.

First priority is to recover what is important for you, stored files, not computer functionality.

SO don't delete the virus after it encrypted your files, it's your last priority, find it's name and look for an existing solution. This analysis was done blindly because of that.

Trust websites like ww.bleepingcomputer.com and Kaspersky forum they are doing a great job helping others for free.

Have fun