| September 22, 2020

Taidoor - a truly persistent threat

Blog Author

Karlo Zanki, Reverse Engineer at ReversingLabs. Read More...

When malware lasts longer than your washing machine

Introduction

Government-supported actors usually conduct long-lasting activities in cyberspace, and to simplify such continuous processes, they often develop malicious tools with the intention of using them for a long period of time. Like any other malicious tool, they need to be stealthy, and when they get detected, some modifications are necessary to become undetectable again. Sometimes it can be as simple as changing the compromised infrastructure, while at other times creating a new version of the tool is required. When organizations put a lot of effort into creating a tool like that, they probably don't plan to use it in massive campaigns. It is expected to be used in smaller, targeted attacks; therefore, researchers won’t have too many samples for analysis at their disposal.

An example of such a tool is Taidoor RAT (remote access trojan), dating all the way back to 2008, whose new version was recently discovered and presented in a technical report released by the US government institutions. Taidoor is described as a Remote Access Trojan developed and used by cyber actors supported by the Chinese government. The new version of the RAT consists of two parts - a loader in a DLL form, and a main RAT module that comes as RC4-encrypted binary data. The loader first decrypts the encrypted main RAT module, and then executes its exported Start function. The report provides two samples for both the loader and the main encrypted RAT module. These samples come with only two C2 domains and one C2 IP.

In this blog, you will learn how our Titanium Platform can help you find more malware samples to extend the IOC lists and power up your defenses.

Taidoor

As already mentioned, the Taidoor RAT consists of two parts, a loader and the main RAT module. A few pivoting attempts on the loader samples didn’t produce any results, so we focused on the samples of the main RAT module. This makes sense, as they contain the malware configuration including C2 domains and IPs.

Two samples of the main RAT available from the threat report are encrypted with the RC4 algorithm, and aren’t suitable for pivoting in that form. The first step is to decrypt them and get them to their normal DLL format suitable for metadata extraction. Publicly available tools like CyberChef can be used to decrypt various encryption algorithms, including RC4. Once decrypted, the DLL is processed with Titanium Platform, and its metadata is extracted. The first thing to do is look at the files grouped into the same buckets based on the RHA1, our functional file similarity algorithm, for each of the samples.

Similar files grouped by RHA1 algorithm

The RHA1 algorithm reveals ten more samples dating back to 2018 and 2019. However, there are additional options for pivoting. Looking at the samples' exports shows a specific combination of functions and original file name.

Samples exports

Running a Titanium Platform cloud search query with this specific combination finds the samples already discovered using the RHA1 file similarity algorithm, and also 3 additional samples.

Pivoting on exports and original name

Analyzing these three samples shows that they are using the same AES key as described in the threat report. The significant difference is that they have another layer of encryption used to hide the part of the code responsible for configuration decryption, including the AES key and S-Box initialization.

Loading of AES decryption key in previously decrypted layer of code

Pivoting on these three samples didn’t return any new results. However, there is something uncommon in this exported file name - “mm.dll”. It is probably a shorthand for something.

What if there were some variations in the functionality of these samples? There is a chance that this mm prefix could have something appended to it. To check this assumption, a search query is constructed using the wildcard character to match samples that start with the mm prefix and at the same time contain the aforementioned exported functions Start and Service.

Pivoting on exports and original name using wildcard character

Ordering these search results by their size - and ignoring clean software and previously discovered samples - leaves just a few hashes we need to look at. These hashes are split in two groups based on their size. Looking at their exports reveals two new original file names - mm_tcp_dll.dll and mm_tcp_svchost.dll.

Samples exports

Analyzing these samples in disassembler confirms these are indeed Taidoor samples, since they perform the same AES decryption and use the same AES key. The 466 KB samples also have a layer of encryption used to hide the part of the code responsible for configuration decryption, as already described in the previous case.

Loading of AES decryption key in 134KB samples

Loading of AES decryption key in previously decrypted layer of code in 466KB samples

They also include the same strings present in the Taidoor samples mentioned in the referenced threat report.

Some of the strings found in Taidoor samples

One interesting thing that catches the eye are the pdb paths. Samples of the 134 KB size contain the pdb path “C:\Users\john\Desktop\KD17.6_20170628\Release\mm_tcp_svchost.pdb” and have the compilation timestamp set to 2017-07-04T09:20:21. Samples of the 466 KB size have their compilation timestamp set to 2016-04-06T08:44:22 and don’t include any pdb paths, but looking at the contained strings reveals one very similar to the mentioned pdb path - “c:\users\develop\desktop\kd15.1_aes_20160321\mm_tcp_dll\svchost.cpp”. We can assume that in the “KDVV.v_YYYYMMDD” part of the string, the VV.v represents the malware version, and that YYYYMMDD represents the creation date. Unfortunately, pivoting variations on these pdb paths didn’t result in any other samples besides the ones previously found.

There are still two discovered original file names left to pivot on. Pivoting on mm_tcp_svchost.dll doesn’t result in anything new. On the other hand, pivoting on mm_tcp_dll.dll reveals around 70 more samples. Looking at them in detail indicates they are older versions of Taidoor dating all the way back to 2011. These samples have configuration encrypted with the RC4 algorithm, and not with AES like the version from the referenced threat report. We won't describe these samples here because they are a bit outdated and are more similar to the older versions of Taidoor already covered with technical reports from FireEye and TrendMicro.

But if there is a sample named mm_tcp_dll.dll, is it possible that there is a sample using the http protocol for communication? We can guess that it would be named mm_http_dll.dll if it follows the naming convention. A search query indeed finds 17 samples with this name, and analyzing them in disassembler shows they also have the configuration encrypted with the same “0xA1 0xA2” RC4 key. But as illustrated in the following image, these samples are also 8 years old, and are not in the focus of interest of this blog post.

Pivoting on original name

Another very interesting sample exists, originally named mm_udt_dll.dll, suggesting usage of the UDT protocol for the C2 communication. Since it is also quite old, it won’t be examined in detail, but it uses the same RC4 key for configuration encryption, and it is likely that the only difference is the network protocol used for the communication.

Pivoting on original name

All collected samples have their configuration placed in the .data section, and use the same AES key “2B 7E 15 16 28 AE D2 A6 AB F7 15 88 09 CF 4F 3C” IV: “00”. Configurations were extracted from all the samples and subsequently decrypted with the provided key. The last thing to do was to collect decrypted C2 domains and IPs, and create an IOC list provided at the end of this blog.

Development history

The samples were grouped based on the described findings, and one representative sample from each group was put into a table. The table was then sorted based on the compilation timestamp.

SHA1	Compile timestamp	First seen timestamp	Size	File Type	Original name
f1a1ea963ae8aca3a4623912c405cc97df510c07	2015-12-22 T07:15:44	12/23/2015 4:40	155.5KB	PE/Dll	mm.dll
859e0f0ccbcafd25b0877a0c6df0c94cd84d2433	2016-04-06 T08:44:22	7/18/2018 1:03	466KB	PE/Dll	mm_tcp_dll.dll
4118cc4ee6e22bca1933b0033cfe07924293b6bb	2017-07-04 T09:20:21	6/6/2019 12:18	134KB	PE/Dll	mm_tcp_svchost.dll
de7b0889fce6e38ac4f902e2399c9a794f8f00df	2018-04-25 T01:36:11	8/3/2018 13:31	154.5KB	PE/Dll	mm.dll
22c55ded3486614728eaa29a7526d760ac496b20	2018-09-21 T01:51:04	11/20/2018 13:19	179.5KB	PE+/Dll	mm.dll

Two oldest groups of samples have an extra layer of encryption used for hiding the part of the code responsible for AES decryption. At some point in 2017, the malware developers decided to simplify the code and move that extra protection layer, probably to another PE artifact (the loader component). The fact that the same AES key has been used for more than 4 years is a lucky coincidence which simplifies the IOC extraction for researchers and makes correlating samples easier. Older versions of Taidoor mentioned in this blog use a different encryption algorithm, but have a highly similar set of exported functions, and also very similar high-level program logic.

Conclusion

Like any other software product, malware families require a lot of maintenance and improvement to achieve long-term operability. Even though such continuous upgrading helps malware avoid detection mechanisms, it also results in related malware versions. These related versions must have some similarities, and this is an opportunity for security researchers to establish correlations between various malware campaigns and quickly find more threat samples.

As part of this research, and based on the analysis provided in the referenced threat report, 23 related samples were found and 40 new C2 IPs and domains extracted from their configurations.

Titanium Platform helps researchers get a more detailed insight into malware samples. Various search queries can be constructed from the extracted metadata, and can be used to find related threats. With Titanium Platform you can also establish time relationships between the samples, and together with the differences in malware functionality, this can enable you a better understanding of malware development history.