The annual FireEye Flare-On challenge is something I look forward to with great excitement and impatience. I strongly suggest that everyone spend some time on past challenges, as they are an excellent source of knowledge and a chance to develop skills. Of all the Flare-On editions, I particularly recall Flare-On 5 in 2018, which is considered the most difficult one, so difficult that the organizers mentioned that future challenges would be easier. One of its challenges was leet_editr (challenge number 5), which became my nemesis, as I was not able to complete it and thus move forward.
The challenge itself is exciting; the authors implemented an obfuscation technique based on Vectored Exception Handling (VEH). The binary has 5 encrypted function code blocks and one encrypted data block. The page allocated for each encrypted block has the PAGE_NOACCESS protection set. When the instruction flow reaches those parts, or encrypted data is accessed, an exception is generated. The exception handler decrypts the instructions on the fly and resumes execution, after which the instructions are encrypted again. The same holds for the data block. It should be noted that the decryption / encryption does not happen once but multiple times, in different sections of the block. Each section is described by a map structure.
The data block is actually a VBScript loaded by Internet Explorer. Moreover, the binary uses COM objects, specifically the IScriptControl interface, to manipulate the execution of the embedded VBScripts.
As stated before, there are 6 encrypted blocks (5 code and 1 data). Each block has attributes that are described by a structure. The structure has a field that specifies the decryption routine type, as the encryption algorithm is not the same in all cases.
There are 4 decryption routines:
- Incrementing XOR
- A hybrid algorithm containing XOR loops and RC4
In this post we will analyze only the last algorithm. Many times I found myself stuck due to coding mistakes while trying to reimplement the reversed functions. This leads to confusion, wasted time and stress. What if I could call the necessary function blocks directly to get the decrypted data? Sure, this could be done in a debugger, but the interruptions caused by the VEH handling made debugging difficult. I had always wanted to try the FireEye Speakeasy emulator, and testing it while facing my nemesis once again was the best occasion.
Speakeasy (link here) is an emulation platform based on the Unicorn emulator. It provides numerous wrappers over Unicorn and implements significant parts of Windows user mode and kernel mode. Kernel mode emulation is certainly something unusual. Threads, processes, network and file operations have been implemented as well. Various Windows DLL functions are implemented and described in Python files, so they can easily be extended.
Speakeasy is governed by a detailed configuration file. Finally, the Speakeasy authors have implemented basic Windows structures such as the PEB and TEB, as well as memory management operations. Speakeasy is extremely extensible and the code is very clean and easy to understand.
In the following listing we can see the decompiled code of the decryption routine. We will not go into much detail, as the goal is merely to get an overview.
As stated before, the decryption routine handles four decryption types; the code snippet shows the implementation of the last one (hybrid decryption), which combines multiple XOR loops with RC4.
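For readers who want the RC4 half of the hybrid routine in front of them, here is a minimal sketch of textbook RC4 in Python. It is not the challenge's code: the function names are my own, and the hybrid routine's XOR loops and key handling are deliberately left out because their exact form is not described here.

```python
def rc4_init(key):
    """Key-scheduling algorithm (KSA): build the 256-entry SBox from the key."""
    sbox = list(range(256))
    j = 0
    for i in range(256):
        j = (j + sbox[i] + key[i % len(key)]) & 0xFF
        sbox[i], sbox[j] = sbox[j], sbox[i]
    return sbox


def rc4_crypt(sbox, data):
    """Keystream generation (PRGA): XOR every data byte with the keystream.

    Works on a copy of the SBox so the caller's state is left untouched.
    """
    state = list(sbox)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + state[i]) & 0xFF
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) & 0xFF])
    return bytes(out)


if __name__ == "__main__":
    # Widely published RC4 test vector: key "Key", plaintext "Plaintext".
    print(rc4_crypt(rc4_init(b"Key"), b"Plaintext").hex())  # bbf316e8d940af0ad3
```

RC4 is symmetric, so running `rc4_crypt` a second time with a freshly initialized SBox restores the original bytes.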
The entry point of this function block is 0x4013cf. The first argument is stored in the esi register, which holds the pointer to the attribute structure of the encrypted block. Specifically, offset +0x1c holds the encryption keys for each part of the decryption.
The structure of the encrypted block has a member that points to an array used as a map. This map defines the offset and the number of bytes for each operation. For example, if code is executed in one of the encrypted function blocks, we need to know where the instructions start and how many bytes must be decrypted. This is important because the decryption takes place on subsequences of the data. The ebx register holds the offset into that map. The following image describes the map structure: each entry is a pair of double words, first the offset of the encrypted data that has to be decrypted and then its length.
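The layout described above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample values are made up, and I assume the double words are stored little-endian, as is usual for x86.

```python
import struct


def parse_block_map(raw, count):
    """Return `count` (offset, length) tuples from consecutive DWORD pairs."""
    entries = []
    for index in range(count):
        # Each entry is 8 bytes: DWORD offset followed by DWORD length.
        offset, length = struct.unpack_from("<II", raw, index * 8)
        entries.append((offset, length))
    return entries


# Made-up sample data: three sections of a hypothetical encrypted block.
sample = struct.pack("<6I", 0x00, 0x10, 0x10, 0x24, 0x34, 0x08)

if __name__ == "__main__":
    for offset, length in parse_block_map(sample, 3):
        print(f"offset=0x{offset:x} length=0x{length:x}")
```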
The edi register stores the pointer to the encrypted data that will be decrypted. One can see that edi is offset by ebx, as described previously.
The last parameter is the “SBox”, a stack-based 260-byte array used for the initialization of the RC4 SBox. The parameter is pushed onto the stack as an argument.
Having described all the necessary components, we can now build our script. First, let's create a class that inherits from the Speakeasy class and contains the necessary components, such as the logger and the code callbacks.
The logger argument initializes the logging output and verbosity level. The debug argument specifies whether Speakeasy will output debug information. Additionally, we have implemented a function that prints the DWORD values at specific offsets. Finally, the code hook callback function is defined (it will be explained later).
Speakeasy is initialized by instantiating our Emulator class. Then the binary has to be loaded. Speakeasy can load various types of binaries, such as executables, DLLs, kernel drivers and shellcode. Next in line is the code hook specification. Speakeasy can define callback functions (hooks) for a variety of events, such as memory accesses, API calls and interrupts. Each hook has a callback function that handles the event. The callback function also receives a context (ctx) parameter, which can carry additional information.
The context class Ctx has only a getter and a setter method. It is used to store the pointer to the SBox and to return that address when required. In this way the Ctx class allows previously set pointers to be passed to the various callbacks.
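A context holder of this kind can be as small as the following sketch. The method and attribute names are hypothetical, chosen only to illustrate the getter/setter idea.

```python
class Ctx:
    """Tiny holder for a pointer that is shared between hook callbacks."""

    def __init__(self):
        self._ptr = None  # nothing stored until a callback sets it

    def set_ptr(self, ptr):
        """Remember the address of the SBox buffer."""
        self._ptr = ptr

    def get_ptr(self):
        """Return the stored address, or None if it was never set."""
        return self._ptr


if __name__ == "__main__":
    ctx = Ctx()
    ctx.set_ptr(0x500000)  # made-up address, for illustration only
    print(hex(ctx.get_ptr()))
```

Because the same object is handed to every callback, a pointer stored by one hook is visible to the hooks that fire later.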
Next we have the callback function for our code hook. The callback function is part of the Emulator class and handles three code addresses:
- 0x401479, which is the call to the SBox initialization function (see screen below). As stated before, the decryption function allocates a byte array on the stack in order to store the SBox.
- 0x40148A, which is the call to the RC4 function.
- 0x4013b1, which is the return address of the function.
Given that the function is not executed from its entry point, the stack space for its local variables is never allocated. Therefore we need to allocate this space ourselves and pass a pointer to it as the function parameter. The bufferToAlloc pointer, which holds the SBox array, is placed at the top of the stack. A careful reader might note that the buffer is never freed, which is true; please excuse my laziness.
The same stack manipulation has to be done for the RC4 function, which requires the SBox buffer pointer to be at the top of the stack.
Once the function exits, we write the decrypted data to our buffer.
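The three cases can be summarized in the following sketch. To keep it runnable without Speakeasy installed, it uses a stand-in emulator class (FakeEmu) of my own; its method names are chosen to resemble the ones an emulator object would offer, but they, the callback signature, and the dict used as context (in place of the Ctx class) are all assumptions made for illustration.

```python
import struct

SBOX_INIT_CALL = 0x401479  # call to the SBox initialization function
RC4_CALL = 0x40148A        # call to the RC4 function
FUNC_RETURN = 0x4013B1     # return address of the decryption function
SBOX_SIZE = 260            # size of the stack-based SBox array


class FakeEmu:
    """Stand-in emulator: memory keyed by start address plus two registers."""

    def __init__(self):
        self.mem = {}
        self.regs = {"esp": 0x1000, "edi": 0x2000}
        self.next_alloc = 0x500000
        self.stopped = False

    def mem_alloc(self, size):
        addr, self.next_alloc = self.next_alloc, self.next_alloc + size
        self.mem[addr] = bytes(size)
        return addr

    def mem_write(self, addr, data):
        self.mem[addr] = bytes(data)

    def mem_read(self, addr, size):
        return self.mem[addr][:size]

    def reg_read(self, name):
        return self.regs[name]

    def stop(self):
        self.stopped = True


def code_hook(emu, addr, size, ctx):
    """Dispatch on the three addresses of interest (ctx is a plain dict)."""
    if addr == SBOX_INIT_CALL:
        # Allocate the SBox buffer and place its pointer at the top of the stack.
        ctx["sbox"] = emu.mem_alloc(SBOX_SIZE)
        emu.mem_write(emu.reg_read("esp"), struct.pack("<I", ctx["sbox"]))
    elif addr == RC4_CALL:
        # The RC4 function expects the same pointer at the top of the stack.
        emu.mem_write(emu.reg_read("esp"), struct.pack("<I", ctx["sbox"]))
    elif addr == FUNC_RETURN:
        # The function has finished: copy out the decrypted bytes and stop.
        ctx["decrypted"] = emu.mem_read(ctx["data_ptr"], ctx["data_len"])
        emu.stop()
    return True
```

In a real script the same branches would live in the Emulator class's callback and operate on the actual emulator instance; only the dispatch logic carries over from this sketch.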
Finally, we will describe the entry point of our emulation script, where the initial registers are set. As discussed previously, the encrypted blocks are based on an offset map, which is stored at 0x40c398. The number of encrypted parts in the block is stored at 0x40cd00+0x10. So if we loop over the map, we can easily find the start of each part and its length. The block that needs to be decrypted is at 0x404790.
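The loop over the map could look roughly like the sketch below. It takes any `mem_read(addr, size)` callable, so it can be tried against a fake memory image; the helper names are hypothetical, and I assume that the map offsets are relative to the start of the block at 0x404790 and that values are little-endian DWORDs.

```python
import struct

MAP_ADDR = 0x40C398      # offset map (pairs of DWORDs)
BLOCK_STRUCT = 0x40CD00  # attribute structure; the part count is at +0x10
BLOCK_DATA = 0x404790    # encrypted block to be decrypted


def read_dword(mem_read, addr):
    """Read one little-endian DWORD through the supplied reader."""
    return struct.unpack("<I", mem_read(addr, 4))[0]


def iter_sections(mem_read):
    """Yield (absolute start address, length) for every part in the map."""
    count = read_dword(mem_read, BLOCK_STRUCT + 0x10)
    for index in range(count):
        entry = MAP_ADDR + index * 8
        offset = read_dword(mem_read, entry)
        length = read_dword(mem_read, entry + 4)
        yield BLOCK_DATA + offset, length


def make_reader(image):
    """Build a mem_read callable over a {base address: bytes} dictionary."""
    def mem_read(addr, size):
        for base, blob in image.items():
            if base <= addr and addr + size <= base + len(blob):
                return blob[addr - base:addr - base + size]
        raise ValueError(f"unmapped read at 0x{addr:x}")
    return mem_read


if __name__ == "__main__":
    # Made-up memory image with two parts, for illustration only.
    image = {
        BLOCK_STRUCT: bytes(0x10) + struct.pack("<I", 2),
        MAP_ADDR: struct.pack("<4I", 0x00, 0x20, 0x20, 0x08),
    }
    for start, length in iter_sections(make_reader(image)):
        print(f"start=0x{start:x} length=0x{length:x}")
```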
The Speakeasy emulator is very versatile and can assist an analyst with complex functions when live analysis is cumbersome. In my opinion, it can significantly reduce the effort of decrypting / decoding obfuscated strings or encrypted blocks.
Even though Speakeasy is not extensively documented, the code repository is very well written and one can quickly understand how the code should be used. Moreover, it is very extensible, given that the clean structure of the repository allows small parts of the code to be modified without breaking the whole framework.