Silas


[Long Ge Submission] Unidbg Hook Collection

This article summarizes how to Hook and Call with Unidbg. Some of the Hook code is presented side by side with its Frida equivalent, so that readers who know Frida but not Unidbg can get started quickly. Samples can be downloaded from Baidu Cloud.

Link: https://pan.baidu.com/s/1ZRPtQrx4QAPEQhrpq6gbgg
Extraction code: 6666

More tutorials on Unidbg usage and algorithm restoration can be found on the platform.


I. Basic Knowledge#

1. Obtaining the Base Address of SO#

I. Frida Obtaining Base Address#

var baseAddr = Module.findBaseAddress("libnative-lib.so");

II. Unidbg Obtaining Base Address#

// Load so into virtual memory
DalvikModule dm = vm.loadLibrary("libnative-lib.so", true);
// The loaded so corresponds to a module
module = dm.getModule();
// Print the base address of libnative-lib.so in Unidbg virtual memory
System.out.println("baseAddr:"+module.base);

When multiple SOs are loaded:

// Get the handle of a specific SO
Module yourModule = emulator.getMemory().findModule("yourModuleName");
// Print its base address
System.out.println("baseAddr:"+yourModule.base);

If only one SO is actively loaded, its base address is always 0x40000000. This fixed value can be used to detect Unidbg; it can be changed in com/github/unidbg/memory/Memory.java:

public interface Memory extends IO, Loader, StackMemory {

    long STACK_BASE = 0xc0000000L;
    int STACK_SIZE_OF_PAGE = 256; // 1024k

    // Modify the starting address of memory mapping
    long MMAP_BASE = 0x40000000L;

    UnidbgPointer allocateStack(int size);
    UnidbgPointer pointer(long address);
    void setStackPoint(long sp);

2. Obtaining Function Address#

I. Frida Obtaining Exported Function Address#

Module.findExportByName("libc.so", "strcmp")

II. Unidbg Obtaining Exported Function Address#

// Load so into virtual memory
DalvikModule dm = vm.loadLibrary("libnative-lib.so", true);
// The loaded libnative-lib.so corresponds to a module
module = dm.getModule();
long address = module.findSymbolByName("funcName").getAddress();

III. Frida Obtaining Non-exported Function Address#

var soAddr = Module.findBaseAddress("libnative-lib.so");
var FuncAddr = soAddr.add(0x1768 + 1); // +1 because the target is a Thumb function

IV. Unidbg Obtaining Non-exported Function Address#

// Load so into virtual memory
DalvikModule dm = vm.loadLibrary("libnative-lib.so", true);
// The loaded so corresponds to a module
module = dm.getModule();
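// Offset of the function, as shown in IDA
long offset = 0x1768;
// Real address = baseAddr + offset (add 1 for a Thumb function when the address is used as a call target, as in the Frida version)
long address = module.base + offset;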
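
The address arithmetic used in the snippets above can be sketched in plain Java. This is only an illustration and not Unidbg API: the class AddressMath and its method names are hypothetical, and the base 0x40000000 and offset 0x1768 are simply the values used in this section. Bit 0 of a call target marks a Thumb entry point, which is why the Frida version adds 1.

```java
// Hypothetical helper, not part of Unidbg: plain address arithmetic only.
final class AddressMath {
    /** Absolute address of an instruction: module base plus the offset seen in IDA. */
    static long absolute(long base, long offset) {
        return base + offset;
    }

    /** Call or hook target for a function: set bit 0 when the function is Thumb code. */
    static long callTarget(long base, long offset, boolean thumb) {
        long address = absolute(base, offset);
        return thumb ? (address | 1L) : address;
    }

    /** Recover the IDA offset from a runtime address (clears the Thumb bit first). */
    static long toOffset(long base, long address) {
        return (address & ~1L) - base;
    }

    public static void main(String[] args) {
        long base = 0x40000000L; // base address used in this article's examples
        System.out.println(Long.toHexString(callTarget(base, 0x1768L, true)));
        System.out.println(Long.toHexString(toOffset(base, 0x40001769L)));
    }
}
```

Converting a runtime address back to an offset with toOffset is handy when a log line shows an absolute address and you want to look it up in IDA.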

3. Unidbg Hook Overview#

Unidbg supports two main types of Hook on Android:

  • Built-in third-party Hook frameworks in Unidbg, including xHook/Whale/HookZz
  • Unicorn Hook and the Console Debugger encapsulated by Unidbg based on it

The first category is third-party Hook frameworks supported and built into Unidbg, such as Dobby (formerly HookZz)/Whale as inline Hook frameworks, and xHook as a PLT Hook framework. Some may wonder if Unidbg can support Frida? My personal view is that it is unrealistic at this stage; Frida is much more complex than Dobby or xHook, and Unidbg currently cannot run it. Besides, Dobby + Whale + xHook is definitely sufficient, and there is no need for Frida.

The second category is when the underlying engine of Unidbg is chosen to be Unicorn (the default engine), which has built-in Hook functionality. Unicorn provides various levels and granularities of Hook, such as memory Hook/instruction/basic block Hook/exception Hook, etc., which are very powerful and easy to use. Moreover, Unidbg has encapsulated a more user-friendly Console Debugger based on it.

How to choose a Hook solution? It depends on the purpose of using Unidbg. If the project is for simulated execution, it is recommended to use the Console Debugger for quick analysis and troubleshooting and, once the code runs, to switch to a third-party Hook framework for persistent hooks. Why? The answer starts with the execution engines that Unidbg supports. The earliest and default engine is Unicorn; as the name suggests, Unidbg is closely tied to it. Unidbg later added several other engines, and since every extra engine adds complexity, each one exists to solve a specific pain point:

The hypervisor engine can simulate execution on devices equipped with Apple Silicon chips;

The KVM engine can simulate execution on Raspberry Pi;

The Dynarmic engine is for faster simulated execution;

Unicorn is the most powerful and complete simulated execution engine, but it is much slower than Dynarmic. In the same scenario, Dynarmic can be several times or even dozens of times faster than Unicorn. If the purpose of using Unidbg is simulated execution in a production environment, where speed matters most, then Dynarmic combined with unidbg-boot-server, a high-concurrency server, is the perfect choice. In practice, first get the simulated execution code running under the Unicorn engine, then switch to Dynarmic, and once it runs without issues, deploy it to production. Because the Console Debugger and Unicorn Hook only work under the Unicorn engine, any hook that must survive the switch to Dynarmic has to use a third-party framework.

Using Dynarmic Engine

private static AndroidEmulator createARMEmulator() {
    return AndroidEmulatorBuilder.for32Bit()
            // Switch to Dynarmic engine
            .addBackendFactory(new DynarmicFactory(true))
            .build();
}

Unicorn Default Engine

private static AndroidEmulator createARMEmulator() {
    return AndroidEmulatorBuilder.for32Bit()
            .build();
}

The second scenario for using Unidbg is assisting algorithm restoration, that is, simulated execution is just a prelude to algorithm restoration. After the simulated execution is confirmed to be correct, Unidbg is used to assist in algorithm restoration. In this case, it is naturally recommended to use the Unicorn engine, and both categories of Hook solutions can be used. Which category to choose? I tend to use the second category throughout, that is, the Unicorn Hook-based solution.

I personally believe there are three advantages:

  • HookZz or xHook and other solutions can be detected based on their Hook implementation principles, but Unicorn native Hook is not easily detectable.
  • Unicorn Hook is not limited, while other solutions have significant limitations. For example, inline Hook solutions cannot Hook short functions or two adjacent addresses; PLT Hook cannot Hook Sub_xxx sub-functions.
  • When using both third-party inline Hook frameworks and native Hook solutions, bugs may arise. In fact, even using certain Hook functionalities of Unicorn alone may have bugs. Therefore, using native Hook uniformly will have fewer bugs and less trouble.

In summary:

I. For the purpose of simulated execution#

Use third-party Hook solutions; HookZz has better support under arm32, and Dobby has better support under arm64. If HookZz/Dobby fails, use xHook when the target is an exported function; otherwise, use Whale.

II. For the purpose of algorithm restoration#

Use Console Debugger and Unicorn Hook, and it is recommended not to prioritize third-party Hook solutions.

4. Basic Code of This Article#

This is the code for the simulated execution demo:

package com.tutorial;

import com.github.unidbg.AndroidEmulator;
import com.github.unidbg.Emulator;
import com.github.unidbg.Module;
import com.github.unidbg.arm.HookStatus;
import com.github.unidbg.arm.backend.Backend;
import com.github.unidbg.arm.backend.CodeHook;
import com.github.unidbg.arm.context.RegisterContext;
import com.github.unidbg.debugger.BreakPointCallback;
import com.github.unidbg.hook.HookContext;
import com.github.unidbg.hook.ReplaceCallback;
import com.github.unidbg.hook.hookzz.*;
import com.github.unidbg.hook.whale.IWhale;
import com.github.unidbg.hook.whale.Whale;
import com.github.unidbg.hook.xhook.IxHook;
import com.github.unidbg.linux.android.AndroidEmulatorBuilder;
import com.github.unidbg.linux.android.AndroidResolver;
import com.github.unidbg.linux.android.XHookImpl;
import com.github.unidbg.linux.android.dvm.DalvikModule;
import com.github.unidbg.linux.android.dvm.DvmClass;
import com.github.unidbg.linux.android.dvm.DvmObject;
import com.github.unidbg.linux.android.dvm.VM;
import com.github.unidbg.memory.Memory;
import com.github.unidbg.memory.MemoryBlock;
import com.github.unidbg.utils.Inspector;
import com.sun.jna.Pointer;
import unicorn.ArmConst;
import unicorn.Unicorn;

import java.io.File;
import java.nio.charset.StandardCharsets;

public class hookInUnidbg {
    private final AndroidEmulator emulator;
    private final VM vm;
    private final Module module;

    hookInUnidbg() {

        // Create emulator instance
        emulator = AndroidEmulatorBuilder.for32Bit().build();

        // Memory operation interface of the emulator
        final Memory memory = emulator.getMemory();
        // Set system library resolver
        memory.setLibraryResolver(new AndroidResolver(23));
        // Create Android virtual machine
        vm = emulator.createDalvikVM(new File("unidbg-android/src/test/resources/tutorial/hookinunidbg.apk"));

//        emulator.attach().addBreakPoint(0x40000000+0xa80);

        // Load so into virtual memory
        DalvikModule dm = vm.loadLibrary("hookinunidbg", true);
        // The loaded libhookinunidbg.so corresponds to a module
        module = dm.getModule();

        // Execute JNI_OnLoad (if any)
        dm.callJNI_OnLoad(emulator);
    }

    public void call(){
        DvmClass dvmClass = vm.resolveClass("com/example/hookinunidbg/MainActivity");
        String methodSign = "call()V";
        DvmObject<?> dvmObject = dvmClass.newObject(null);

        // call()V returns void, so use the void variant
        dvmObject.callJniMethod(emulator, methodSign);

    }


    public static void main(String[] args) {
        hookInUnidbg mydemo = new hookInUnidbg();
        mydemo.call();
    }


}

There are some log outputs during runtime, which are normal logic.

II. Hook Functions#

The demo hookInUnidbg calls several functions; this section focuses on base64_encode, which runs during the demo.

unsigned int
base64_encode(const unsigned char *in, unsigned int inlen, char *out);

Parameter explanations are as follows:

const unsigned char *in: The address of the original string to encode.
unsigned int inlen: The length of the original string.
char *out: The address of a buffer that receives the encoded content.
Return value: Normally the actual length of the encoded string.

The task of this section is to print the input before Base64 encoding and the encoded output.

1. Frida#

// Frida Version
function main(){
    // get base address of target so;
    var base_addr = Module.findBaseAddress("libhookinunidbg.so");

    if (base_addr){
        var func_addr = Module.findExportByName("libhookinunidbg.so", "base64_encode");
        console.log("hook base64_encode function")
        Interceptor.attach(func_addr,{
            // Print input parameters
            onEnter: function (args) {
                console.log("\n input:")
                this.buffer = args[2];
                var length = args[1];
                console.log(hexdump(args[0],{length: length.toUInt32()}))
                console.log("\n")
            },
            // Print the encoded output buffer
            onLeave: function () {
                console.log(" output:")
                console.log(this.buffer.readCString());
            }
        })
    }


}

setImmediate(main);

2. Console Debugger#

Console Debugger is an interactive debugger for quick probing and quick verification.

// debug
emulator.attach().addBreakPoint(module.findSymbolByName("base64_encode").getAddress());

It is necessary to reiterate and emphasize several concepts:

  • The breakpoint is triggered when running to the corresponding address, similar to GDB debugging or IDA debugging, the timing is before the target instruction is executed.
  • The breakpoint does not have various concepts of functions and needs to be understood from the perspective of ARM assembly instructions.
  • Console Debugger is used to assist algorithm analysis, quickly analyze and confirm the functionality of a certain function. It can only be used under the Unicorn engine.

To supplement the second point:

According to the ARM ATPCS calling convention (since superseded by AAPCS), when a function has four or fewer parameters, they are passed in R0~R3 (i.e., R0-R3 hold parameters 1-4). If there are more than four, the remaining parameters are passed on the stack pointed to by sp. A 32-bit return value is passed back in R0.

Taking the target function as an example, before the function call, the caller puts the three parameters sequentially into R0-R2.
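
The rule above can be written down as a small lookup. This is a minimal sketch for word-sized (32-bit integer or pointer) arguments only; 64-bit and floating-point arguments follow additional rules that are not modeled here. The class Arm32Args is a hypothetical name used for illustration, and the stack offsets are relative to sp at function entry.

```java
// Hypothetical helper illustrating the ARM32 argument-passing rule for word-sized arguments.
final class Arm32Args {
    /** Where the caller places the argument with the given 0-based index. */
    static String location(int index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be >= 0");
        }
        if (index < 4) {
            return "R" + index;              // first four arguments: R0-R3
        }
        int stackOffset = (index - 4) * 4;   // remaining arguments: 4-byte stack slots
        return "[SP+" + stackOffset + "]";
    }

    public static void main(String[] args) {
        // base64_encode(in, inlen, out) takes three arguments
        String[] names = {"in", "inlen", "out"};
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + " -> " + location(i));
        }
    }
}
```

For base64_encode this maps in, inlen, and out to R0, R1, and R2, which is what the breakpoint at the function entry shows.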

2.png

Immediate values can be read directly from the registers; for example, the second parameter here is 5. If you suspect a value is a pointer, as with parameters 1 and 3, use the m-series commands in interactive debugging to view the memory it points to. mrX is the equivalent of hexdump in Frida. Taking r0 as an example, you can view the memory it points to with either mr0 or m0x400022e0.

Unidbg has some differences in data display compared to Frida Hexdump, reflected in two aspects:

  • In Frida's hexdump, the address column on the left starts from the actual address of the data, while Unidbg's offset column starts from 0.
  • Unidbg provides the md5 value of the printed data block, which is convenient for comparing whether the contents of two data blocks are consistent, and Unidbg displays the Hex String of the data, making it easier to search in large logs.
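
To compare a block seen in a Unidbg log with data captured elsewhere (for example from a Frida hexdump), the same two fingerprints can be computed offline. The sketch below uses only the JDK; the class BlockSummary is a hypothetical name, and it is an illustration rather than Unidbg's own implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: computes the hex string and MD5 of a data block.
final class BlockSummary {
    /** Lowercase hex string of the block, the form that is easy to search for in logs. */
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** MD5 of the block as lowercase hex, for checking whether two blocks are identical. */
    static String md5(byte[] data) {
        try {
            return hex(MessageDigest.getInstance("MD5").digest(data));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] block = "lilac".getBytes(StandardCharsets.US_ASCII);
        System.out.println("md5=" + md5(block) + ", hex=" + hex(block));
    }
}
```

If two dumps of different sizes need comparing, trim them to the same length first, since the MD5 covers every byte that was printed.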

Console Debugger supports many debugging and analysis commands, all displayed as follows:

c: continue
n: step over
bt: back trace

st hex: search stack
shw hex: search writable heap
shr hex: search readable heap
shx hex: search executable heap

nb: break at next block
s|si: step into
s[decimal]: execute specified amount instruction
s(blx): execute until BLX mnemonic, low performance

m(op) [size]: show memory, default size is 0x70, size may hex or decimal
mr0-mr7, mfp, mip, msp [size]: show memory of specified register
m(address) [size]: show memory of specified address, address must start with 0x

wr0-wr7, wfp, wip, wsp <value>: write specified register
wb(address), ws(address), wi(address) <value>: write (byte, short, integer) memory of specified address, address must start with 0x
wx(address) <hex>: write bytes to memory at specified address, address must start with 0x

b(address): add temporarily breakpoint, address must start with 0x, can be module offset
b: add breakpoint of register PC
r: remove breakpoint of register PC
blr: add temporarily breakpoint of register LR

p (assembly): patch assembly at PC address
where: show java stack trace

trace [begin end]: Set trace instructions
traceRead [begin end]: Set trace memory read
traceWrite [begin end]: Set trace memory write
vm: view loaded modules
vbs: view breakpoints
d|dis: show disassemble
d(0x): show disassemble at specify address
stop: stop emulation
run [arg]: run test
cc size: convert asm from 0x400008a0 - 0x400008a0 + size bytes to c function

In Frida code, using console.log(hexdump(args[0],{length: args[1].toUInt32()})) represents printing the memory block pointed to by parameter 1, with parameter 2 as the length. Unidbg can also handle length similarly.

mr0 5

>-----------------------------------------------------------------------------<
[23:41:37 891]r0=RX@0x400022e0[libhookinunidbg.so]0x22e0, md5=f5704182e75d12316f5b729e89a499df, hex=6c696c6163
size: 5
0000: 6C 69 6C 61 63                                     lilac
^-----------------------------------------------------------------------------^

Currently, Console Debugger does not support syntax like mr0 r1.

Thus, we have achieved the functionality equivalent to Frida's onEnter. Next, we need the timing point for onLeave, which is the moment the function finishes executing. In ARM assembly, the LR register holds the return address at function entry, so when execution reaches the address stored in LR, the function has returned. Since a breakpoint is triggered before the target address is executed, a breakpoint hit at LR means the target function has just finished, and this is the same timing as Frida's onLeave. In Console Debugger interactive debugging, the blr command sets a temporary breakpoint at LR that triggers only once.

The overall logic is as follows:

  • Set a breakpoint at the address of the target function
  • Run to the breakpoint, enter Console Debugger interactive debugging
  • Use the mxx series to view parameters
  • Use blr to set a breakpoint at the function return
  • Use c to continue running the program, stopping at the return value
  • Check the buffer at this time

It should be noted that in onLeave, mr2 is unreliable. R2 only represents parameter 3 at the program entry, and during function operations, R2 is used as a general-purpose register for storage and calculations, and it no longer points to the address of the buffer. In Frida, we save the value of args[2], which is R2, in this.buffer in OnEnter, and then retrieve it in OnLeave to print. In Console Debugger interactive debugging, the method is simpler—just scroll up a bit to see what the original value of r2 was, which is found to be 0x401d2000, and then use m0x401d2000.

In this way, we have achieved the equivalent functionality of Frida. It sounds a bit troublesome, but once you are familiar with it, you will agree with my view—Console Debugger is the best, fastest, and most stable debugging tool. In addition, Console Debugger can also perform persistent Hook, as shown in the code below.

public void HookByConsoleDebugger(){
    emulator.attach().addBreakPoint(module.findSymbolByName("base64_encode").getAddress(), new BreakPointCallback() {
        @Override
        public boolean onHit(Emulator<?> emulator, long address) {
            RegisterContext context = emulator.getContext();
            Pointer input = context.getPointerArg(0);
            int length = context.getIntArg(1);
            Pointer buffer = context.getPointerArg(2);

            Inspector.inspect(input.getByteArray(0, length), "base64 input");
            // OnLeave
            emulator.attach().addBreakPoint(context.getLRPointer().peer, new BreakPointCallback() {
                @Override
                public boolean onHit(Emulator<?> emulator, long address) {
                    String result = buffer.getString(0);
                    System.out.println("base64 result:"+result);
                    return true;
                }
            });
            return true;
        }
    });
}

When onHit returns true, execution continues without entering the interactive console; when it returns false, the debugger stops and waits for input. Returning true matters when the function is called several hundred times: we do not want to stop at every call and keep pressing "c" to continue.

3. Third-party Hook Frameworks#

The following target functions are called before JNIOnLoad.

I. xHook#

public void HookByXhook(){
    IxHook xHook = XHookImpl.getInstance(emulator);
    xHook.register("libhookinunidbg.so", "base64_encode", new ReplaceCallback() {
        @Override
        public HookStatus onCall(Emulator<?> emulator, HookContext context, long originFunction) {
            Pointer input = context.getPointerArg(0);
            int length = context.getIntArg(1);
            Pointer buffer = context.getPointerArg(2);
            Inspector.inspect(input.getByteArray(0, length), "base64 input");
            context.push(buffer);
            return HookStatus.RET(emulator, originFunction);
        }
        @Override
        public void postCall(Emulator<?> emulator, HookContext context) {
            Pointer buffer = context.pop();
            System.out.println("base64 result:"+buffer.getString(0));
        }
    }, true);
    // Activate it
    xHook.refresh();
}

xHook is an open-source Android PLT hook framework by iQIYI, which is stable and easy to use, but it cannot Hook Sub_xxx sub-functions.

II. HookZz#

public void HookByHookZz(){
    IHookZz hookZz = HookZz.getInstance(emulator); // Load HookZz, supports inline hook
    hookZz.enable_arm_arm64_b_branch(); // Test enable_arm_arm64_b_branch, optional
    hookZz.wrap(module.findSymbolByName("base64_encode"), new WrapCallback<HookZzArm32RegisterContext>() {
        @Override
        public void preCall(Emulator<?> emulator, HookZzArm32RegisterContext context, HookEntryInfo info) {
            Pointer input = context.getPointerArg(0);
            int length = context.getIntArg(1);
            Pointer buffer = context.getPointerArg(2);
            Inspector.inspect(input.getByteArray(0, length), "base64 input");
            context.push(buffer);
        }
        @Override
        public void postCall(Emulator<?> emulator, HookZzArm32RegisterContext context, HookEntryInfo info) {
            Pointer buffer = context.pop();
            System.out.println("base64 result:"+buffer.getString(0));
        }
    });
    hookZz.disable_arm_arm64_b_branch();
}

HookZz can also hook a single instruction, similar to a breakpoint on one line of assembly, but given the other Hook options available in Unidbg this seems to have little use and is not recommended.

IHookZz hookZz = HookZz.getInstance(emulator);
hookZz.instrument(module.base + 0x978 + 1, new InstrumentCallback<RegisterContext>() {
    @Override
    public void dbiCall(Emulator<?> emulator, RegisterContext ctx, HookEntryInfo info) {
        System.out.println(ctx.getIntArg(0));
    }
});

HookZz is now called Dobby, and in Unidbg, HookZz and Dobby are two independent Hook libraries because the author believes HookZz has better support on arm32 and Dobby has better support on arm64. HookZz is an inline hook solution, so it can Hook Sub_xxx, but the downside is that short functions may have bugs due to the limitations of inline Hook principles.

III. Whale#

public void HookByWhale(){
    IWhale whale = Whale.getInstance(emulator);
    whale.inlineHookFunction(module.findSymbolByName("base64_encode"), new ReplaceCallback() {
        Pointer buffer;
        @Override
        public HookStatus onCall(Emulator<?> emulator, long originFunction) {
            RegisterContext context = emulator.getContext();
            Pointer input = context.getPointerArg(0);
            int length = context.getIntArg(1);
            buffer = context.getPointerArg(2);
            Inspector.inspect(input.getByteArray(0, length), "base64 input");
            return HookStatus.RET(emulator, originFunction);
        }

        @Override
        public void postCall(Emulator<?> emulator, HookContext context) {
            System.out.println("base64 result:"+buffer.getString(0));
        }
    }, true);
}

Whale is a cross-platform Hook framework, and it is also an inline Hook solution for Android Native Hook, but I do not know much about the specific situation.

4. Unicorn Hook#

If you want to perform concentrated, intensive, and flexible debugging on a certain function, Unicorn CodeHook is a good choice. For example, I may want to see the value of r1 at the first instruction of the target function, r2 at the second instruction, and r3 at the third; CodeHook handles this kind of requirement well.

hook_add_new's first parameter is the Hook callback, and here we choose CodeHook, the second parameter is the starting address, the third parameter is the ending address, and the fourth parameter is generally null. This means that within the execution range from the starting address to the ending address, we can process each instruction before its execution.

Find the code range of the target function in IDA; in this sample it starts at offset 0x97C and spans 0x17A bytes.

public void HookByUnicorn(){
    long start = module.base+0x97C;
    long end = module.base+0x97C+0x17A;
    emulator.getBackend().hook_add_new(new CodeHook() {
        @Override
        public void hook(Backend backend, long address, int size, Object user) {
            RegisterContext registerContext = emulator.getContext();
            if(address == module.base + 0x97C){
                int r0 = registerContext.getIntByReg(ArmConst.UC_ARM_REG_R0);
                System.out.println("r0 at 0x97C:"+Integer.toHexString(r0));
            }
            if(address == module.base + 0x97C + 2){
                int r2 = registerContext.getIntByReg(ArmConst.UC_ARM_REG_R2);
                System.out.println("r2 at 0x97C+2:"+Integer.toHexString(r2));
            }
            if(address == module.base + 0x97C + 4){
                int r4 = registerContext.getIntByReg(ArmConst.UC_ARM_REG_R4);
                System.out.println("r4 at 0x97C+4:"+Integer.toHexString(r4));
            }
        }

        @Override
        public void onAttach(Unicorn.UnHook unHook) {

        }

        @Override
        public void detach() {

        }
    }, start, end, null);
}

III. Replace Parameters and Return Values#

1. Replace Parameters#

Requirement: If the input parameter is lilac, change it to hello world, and the corresponding input length must also be changed. The correct result is aGVsbG8gd29ybGQ=.
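
Before hooking, it helps to compute the expected values offline so that the hooked run can be checked against them. The sketch below uses the JDK's java.util.Base64 and assumes the target's base64_encode uses the standard alphabet with padding, which the expected result stated above suggests. The class ExpectedBase64 is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: computes the values the hooked call should produce.
final class ExpectedBase64 {
    /** Standard Base64 (with padding) of the UTF-8 bytes of the input. */
    static String encode(String input) {
        return Base64.getEncoder().encodeToString(input.getBytes(StandardCharsets.UTF_8));
    }

    /** Byte length that must be written to the second parameter (inlen). */
    static int byteLength(String input) {
        return input.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String fakeInput = "hello world";
        System.out.println("inlen: " + byteLength(fakeInput));
        System.out.println("expected: " + encode(fakeInput));
    }
}
```

Using the byte length rather than String.length() keeps the second parameter correct if the replacement text ever contains non-ASCII characters.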

I. Frida#

// Frida Version
function main(){
    // get base address of target so;
    var base_addr = Module.findBaseAddress("libhookinunidbg.so");

    if (base_addr){
        var func_addr = Module.findExportByName("libhookinunidbg.so", "base64_encode");
        console.log("hook base64_encode function")
        var fakeinput = "hello world"
        var fakeinputPtr = Memory.allocUtf8String(fakeinput);
        Interceptor.attach(func_addr,{
            onEnter: function (args) {
                args[0] = fakeinputPtr;
                args[1] = ptr(fakeinput.length);
                this.buffer = args[2];
            },
            // Print the encoded output buffer
            onLeave: function () {
                console.log(" output:")
                console.log(this.buffer.readCString());
            }
        })
    }


}

setImmediate(main);

II. Console Debugger#

How can we achieve this with the Console Debugger, the tool for quick probing and quick verification?

  1. Set a breakpoint, run the code, and enter the debugger.
emulator.attach().addBreakPoint(module.findSymbolByName("base64_encode").getAddress());
  2. Modify parameters 1 and 2 through commands.
wx0x40002403 68656c6c6f20776f726c64

>-----------------------------------------------------------------------------<
[14:06:46 165]RX@0x40002403[libhookinunidbg.so]0x2403, md5=5eb63bbbe01eeed093cb22bb8f5acdc3, hex=68656c6c6f20776f726c64
size: 11
0000: 68 65 6C 6C 6F 20 77 6F 72 6C 64                   hello world
^-----------------------------------------------------------------------------^
wr1 11
>>> r1=0xb

Console Debugger supports the following write operations:

wr0-wr7, wfp, wip, wsp <value>: write specified register
wb(address), ws(address), wi(address) <value>: write (byte, short, integer) memory of specified address, address must start with 0x
wx(address) <hex>: write bytes to memory at specified address, address must start with 0x
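
The two commands typed in the session above can be generated instead of hand-encoding the hex. This is a small sketch in plain Java; the class DebuggerCommands and its methods are hypothetical helpers, not part of Unidbg, and the address 0x40002403 is simply the one from the session above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds Console Debugger write commands as text.
final class DebuggerCommands {
    /** Builds "wx0x<address> <hex bytes>" for writing the UTF-8 bytes of text. */
    static String writeBytes(long address, String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return "wx0x" + Long.toHexString(address) + " " + hex;
    }

    /** Builds "wr<n> <value>" for writing one of the registers r0-r7. */
    static String writeRegister(int register, long value) {
        if (register < 0 || register > 7) {
            throw new IllegalArgumentException("only r0-r7 are supported here");
        }
        return "wr" + register + " " + value;
    }

    public static void main(String[] args) {
        String fakeInput = "hello world";
        System.out.println(writeBytes(0x40002403L, fakeInput));
        System.out.println(writeRegister(1, fakeInput.getBytes(StandardCharsets.UTF_8).length));
    }
}
```

Note that wx overwrites memory in place, so a replacement longer than the original string also overwrites whatever follows it.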

But typing these commands by hand is not very convenient; a persistent Hook in code is more comfortable.

public void ReplaceArgByConsoleDebugger(){
    emulator.attach().addBreakPoint(module.findSymbolByName("base64_encode").getAddress(), new BreakPointCallback() {
        @Override
        public boolean onHit(Emulator<?> emulator, long address) {
            RegisterContext context = emulator.getContext();
            String fakeInput = "hello world";
            int length = fakeInput.length();
            // Modify r1 value to new length
            emulator.getBackend().reg_write(ArmConst.UC_ARM_REG_R1, length);
            MemoryBlock fakeInputBlock = emulator.getMemory().malloc(length, true);
            fakeInputBlock.getPointer().write(fakeInput.getBytes(StandardCharsets.UTF_8));
            // Point r0 at the newly allocated string
            emulator.getBackend().reg_write(ArmConst.UC_ARM_REG_R0, fakeInputBlock.getPointer().peer);

            Pointer buffer = context.getPointerArg(2);
            // OnLeave
            emulator.attach().addBreakPoint(context.getLRPointer().peer, new BreakPointCallback() {
                @Override
                public boolean onHit(Emulator<?> emulator, long address) {
                    String result = buffer.getString(0);
                    System.out.println("base64 result:"+result);
                    return true;
                }
            });
            return true;
        }
    });
}

III. Third-party Hook Frameworks#

The approach is the same; only the framework API changes.

  1. xHook
public void ReplaceArgByXhook(){
    IxHook xHook = XHookImpl.getInstance(emulator);
    xHook.register("libhookinunidbg.so", "base64_encode", new ReplaceCallback() {
        @Override
        public HookStatus onCall(Emulator<?> emulator, HookContext context, long originFunction) {
            String fakeInput = "hello world";
            int length = fakeInput.length();
            // Modify r1 value to new length
            emulator.getBackend().reg_write(ArmConst.UC_ARM_REG_R1, length);
            MemoryBlock fakeInputBlock = emulator.getMemory().malloc(length, true);
            fakeInputBlock.getPointer().write(fakeInput.getBytes(StandardCharsets.UTF_8));
            // Modify r0 to point to the new string's new pointer
            emulator.getBackend().reg_write(ArmConst.UC_ARM_REG_R0, fakeInputBlock.getPointer().peer);

            Pointer buffer = context.getPointerArg(2);
            context.push(buffer);
            return HookStatus.RET(emulator, originFunction);
        }
        @Override
        public void postCall(Emulator<?> emulator, HookContext context) {
            Pointer buffer = context.pop();
            System.out.println("base64 result:"+buffer.getString(0));
        }
    }, true);
    // Activate it
    xHook.refresh();
}
  2. HookZz
public void ReplaceArgByHookZz(){
    IHookZz hookZz = HookZz.getInstance(emulator); // Load HookZz, supports inline hook
    hookZz.enable_arm_arm64_b_branch(); // Test enable_arm_arm64_b_branch, optional
    hookZz.wrap(module.findSymbolByName("base64_encode"), new WrapCallback<HookZzArm32RegisterContext>() {
        @Override
        public void preCall(Emulator<?> emulator, HookZzArm32RegisterContext context, HookEntryInfo info) {
            Pointer input = context.getPointerArg(0);
            String fakeInput = "hello world";
            input.setString(0, fakeInput);
            context.setR1(fakeInput.length());

            Pointer buffer = context.getPointerArg(2);
            context.push(buffer);
        }
        @Override
        public void postCall(Emulator<?> emulator, HookZzArm32RegisterContext context, HookEntryInfo info) {
            Pointer buffer = context.pop();
            System.out.println("base64 result:"+buffer.getString(0));
        }
    });
    hookZz.disable_arm_arm64_b_branch();
}

Because it can use HookZzArm32RegisterContext, the code is relatively simpler.

2. Modifying Return Values#

The logic for modifying return values is similar to replacing parameters, but it leads into Section IV, so it is explained in detail.

In the demo, there is a verifyApkSign function that always returns 1, leading to APK verification failure. Therefore, the goal is to make it return 0.

extern "C"
JNIEXPORT void JNICALL
Java_com_example_hookinunidbg_MainActivity_call(JNIEnv *env, jobject thiz) {
    int verifyret = verifyApkSign();
    if(verifyret == 1){
        LOGE("APK sign verify failed!");
    } else{
        LOGE("APK sign verify success!");
    }
    testBase64();
}

extern "C" int verifyApkSign(){
    LOGE("verify apk sign");
    return 1;
};

I. Frida#

// Frida Version
function main(){
    // get base address of target so;
    var base_addr = Module.findBaseAddress("libhookinunidbg.so");

    if (base_addr){
        var func_addr = Module.findExportByName("libhookinunidbg.so", "verifyApkSign");
        console.log("hook verifyApkSign function")
        Interceptor.attach(func_addr,{
            onEnter: function (args) {

            },
            // Replace the return value
            onLeave: function (retval) {
                retval.replace(0);
            }
        })
    }


}

setImmediate(main);

II. Console Debugger#

public void ReplaceRetByConsoleDebugger(){
    emulator.attach().addBreakPoint(module.findSymbolByName("verifyApkSign").getAddress(), new BreakPointCallback() {
        @Override
        public boolean onHit(Emulator<?> emulator, long address) {
            RegisterContext context = emulator.getContext();
            // OnLeave
            emulator.attach().addBreakPoint(context.getLRPointer().peer, new BreakPointCallback() {
                @Override
                public boolean onHit(Emulator<?> emulator, long address) {
                    emulator.getBackend().reg_write(ArmConst.UC_ARM_REG_R0, 0);
                    return true;
                }
            });
            return true;
        }
    });
}

Our Hook has taken effect, but the log inside verifyApkSign is still printed, because the original function body still runs. In some cases we want to change the function's execution behavior altogether, rather than just print information or replace input parameters and return values. That is, we need to replace the original function with our own.

IV. Replacing Functions#

1. Frida#

const verifyApkSignPtr = Module.findExportByName("libhookinunidbg.so", "verifyApkSign");
Interceptor.replace(verifyApkSignPtr, new NativeCallback(() => {
    console.log("replace verifyApkSign Function")
    return 0;
}, 'int', []));

2. Third-party Hook Frameworks#

Here is the HookZz version.

public void ReplaceFuncByHookZz(){
    HookZz hook = HookZz.getInstance(emulator);
    hook.replace(module.findSymbolByName("verifyApkSign").getAddress(), new ReplaceCallback() {
        @Override
        public HookStatus onCall(Emulator<?> emulator, HookContext context, long originFunction) {
            emulator.getBackend().reg_write(Unicorn.UC_ARM_REG_R0,0);
            return HookStatus.RET(emulator,context.getLR());
        }
    });
}

The HookZz version is clear and easy to understand. We have done two things:

  • Assign 0 to R0.
  • Assign LR to PC, which means the function returns without executing any of its own instructions. Since R0 holds 0, the return value is 0.

3. Console Debugger#

public void ReplaceFuncByConsoleDebugger(){
    emulator.attach().addBreakPoint(module.findSymbolByName("verifyApkSign").getAddress(), new BreakPointCallback() {
        @Override
        public boolean onHit(Emulator<?> emulator, long address) {
            System.out.println("Replacing function verifyApkSign");
            RegisterContext registerContext = emulator.getContext();
            emulator.getBackend().reg_write(ArmConst.UC_ARM_REG_PC, registerContext.getLRPointer().peer);
            emulator.getBackend().reg_write(ArmConst.UC_ARM_REG_R0, 0);
            return true;
        }
    });
}

Very clear and easy to understand.

V. Calling Functions#

When analyzing specific algorithms, it is often necessary to actively call them for more flexible and detailed analysis.

1. Frida#

2. Unidbg#

VI. Patch and Memory Retrieval#

1. Patch#

Patching means directly modifying a program's instructions. There are essentially two forms:

  • Patch the binary file
  • Patch in memory

Patching has many applications and is sometimes more practical than Hooking, so it deserves an introduction. Patching the binary file is familiar to most people, and KeyPatch in IDA makes it a friendly experience. Here, however, we focus on memory patching.

image

The signature verification function is called at 0x8CA. In Sections III and IV we handled it by replacing the return value or the function itself, but modifying the four-byte instruction at 0x8CA is also a good approach.

image

Note that this article only discusses arm32, and of its instruction sets only the most common one, thumb2. The arm and arm64 cases can be tested independently.

I. Frida#

  1. Method One
var str_name_so = "libhookinunidbg.so";    // The name of the so to hook
var n_addr_func_offset = 0x8CA;         // Offset of the instruction to patch within the so; no thumb +1 here because we write bytes rather than hook

var n_addr_so = Module.findBaseAddress(str_name_so);
var n_addr_assemble = n_addr_so.add(n_addr_func_offset);

Memory.protect(n_addr_assemble, 4, 'rwx'); // Modify memory attributes to make the program segment writable
n_addr_assemble.writeByteArray([0x00, 0x20, 0x00, 0xBF]);

But this is not best practice. Unlike Unidbg, Frida operates on a real Android system, which raises two issues:

  • Other threads may be reading or executing the target memory while it is being modified, which can cause a conflict.
  • The instruction cache flushing mechanism of arm.

Frida therefore provides a safer and more reliable set of APIs for modifying bytes in memory.

  2. Method Two
var str_name_so = "libhookinunidbg.so";    // The name of the so to hook
var n_addr_func_offset = 0x8CA;         // Offset of the instruction to patch within the so; no thumb +1 here because we write bytes rather than hook

var n_addr_so = Module.findBaseAddress(str_name_so);
var n_addr_assemble = n_addr_so.add(n_addr_func_offset);

// safely modify bytes at address
Memory.patchCode(n_addr_assemble, 4, function () {
    // Get a patch object in thumb mode
    var cw = new ThumbWriter(n_addr_assemble);
    // Little-endian
    // 00 20
    cw.putInstruction(0x2000)
    // 00 BF
    cw.putInstruction(0xBF00);
    cw.flush(); // Memory flush
    console.log(hexdump(n_addr_assemble))
});
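
The two halfwords passed to putInstruction are ordinary 16-bit Thumb encodings. The sketch below shows how they are formed, so that a different register or immediate (for example returning 1 instead of 0) can be derived. It is plain Java for illustration only; the class and method names are hypothetical and not part of Frida or Unidbg.

```java
// Illustrative helper: builds the 16-bit Thumb encodings used in the patch above.
final class ThumbEncoding {
    // NOP, 16-bit Thumb encoding.
    static final int NOP = 0xBF00;

    // MOVS Rd, #imm8 (16-bit Thumb encoding): 0010 0ddd iiii iiii
    static int movsImm8(int rd, int imm8) {
        if (rd < 0 || rd > 7) {
            throw new IllegalArgumentException("rd must be r0..r7");
        }
        if (imm8 < 0 || imm8 > 0xFF) {
            throw new IllegalArgumentException("imm8 must fit in 8 bits");
        }
        return 0x2000 | (rd << 8) | imm8;
    }

    public static void main(String[] args) {
        System.out.printf("movs r0, #0 -> 0x%04X%n", movsImm8(0, 0));
        System.out.printf("movs r0, #1 -> 0x%04X%n", movsImm8(0, 1));
        System.out.printf("nop         -> 0x%04X%n", NOP);
    }
}
```

Each halfword is stored in memory low byte first, which is why 0x2000 appears as 00 20 and 0xBF00 as 00 BF.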

II. Unidbg#

Unidbg can modify memory by passing either machine code or assembly instructions.

  1. Method One
public void Patch1(){
    // 00 20 00 bf
    int patchCode = 0xBF002000; // movs r0,0; nop (stored little-endian as 00 20 00 bf)
    emulator.getMemory().pointer(module.base + 0x8CA).setInt(0,patchCode);
}
  2. Method Two
public void Patch2(){
    byte[] patchCode = {0x00, 0x20, 0x00, (byte) 0xBF};
    emulator.getBackend().mem_write(module.base + 0x8CA, patchCode);
}
  3. Method Three
public void Patch3(){
    try (Keystone keystone = new Keystone(KeystoneArchitecture.Arm, KeystoneMode.ArmThumb)) {
        KeystoneEncoded encoded = keystone.assemble("movs r0,0;nop");
        byte[] patchCode = encoded.getMachineCode();
        emulator.getMemory().pointer(module.base + 0x8CA).write(0, patchCode, 0, patchCode.length);
    }
}
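
Methods One and Two are meant to write the same four bytes. A quick way to convince yourself is to serialize the int from Patch1 in little-endian order and compare it with the array from Patch2. This is a stdlib-only sketch with hypothetical names; it assumes the pointer's setInt stores the value little-endian, as the 00 20 00 bf comment in Patch1 indicates.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class LittleEndianPatch {
    // Serializes a 32-bit value low byte first, the order used by little-endian ARM.
    static byte[] toLittleEndian(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array();
    }

    // Formats bytes as lowercase hex separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] fromInt = toLittleEndian(0xBF002000);       // the int used in Patch1
        byte[] literal = {0x00, 0x20, 0x00, (byte) 0xBF};  // the array used in Patch2
        System.out.println(hex(fromInt));                  // prints "00 20 00 bf"
        System.out.println(Arrays.equals(fromInt, literal)); // prints "true"
    }
}
```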

2. Memory Retrieval#

Suppose the SO comes in many variants. For example, you need to analyze multiple versions of a certain SO and patch the signature verification or some other assembly in each; the address differs between versions, but the function's characteristic bytes stay the same. Memory retrieval plus dynamic patching is a good approach here, because it adapts well to different versions.

What to search for depends on the need: it may be the first ten bytes of the function, the bytes just before or after the target address, or something else.

image

I. Frida#

function searchAndPatch() {
    var module = Process.findModuleByName("libhookinunidbg.so");
    var pattern = "80 b5 6f 46 84 b0 03 90 02 91"
    var matches = Memory.scanSync(module.base, module.size, pattern);
    console.log(matches.length)
    if (matches.length !== 0)
    {
        var n_addr_assemble = matches[0].address.add(10);
        // safely modify bytes at address
        Memory.patchCode(n_addr_assemble, 4, function () {
            // Get a patch object in thumb mode
            var cw = new ThumbWriter(n_addr_assemble);
            // Little-endian
            // 00 20
            cw.putInstruction(0x2000)
            // 00 BF
            cw.putInstruction(0xBF00);
            cw.flush(); // Memory flush
            console.log(hexdump(n_addr_assemble))
        });
    }
}

setImmediate(searchAndPatch);

II. Unidbg#

public void SearchAndPatch(){
    byte[] patterns = {(byte) 0x80, (byte) 0xb5,0x6f,0x46, (byte) 0x84, (byte) 0xb0,0x03, (byte) 0x90,0x02, (byte) 0x91};
    Collection<Pointer> pointers = searchMemory(module.base, module.base+module.size, patterns);
    if(pointers.size() > 0){
        try (Keystone keystone = new Keystone(KeystoneArchitecture.Arm, KeystoneMode.ArmThumb)) {
            KeystoneEncoded encoded = keystone.assemble("movs r0,0;nop");
            byte[] patchCode = encoded.getMachineCode();
            ((ArrayList<Pointer>) pointers).get(0).write(10, patchCode, 0, patchCode.length);
        }
    }

}

private Collection<Pointer> searchMemory(long start, long end, byte[] data) {
    List<Pointer> pointers = new ArrayList<>();
    for (long i = start, m = end - data.length; i <= m; i++) {
        byte[] oneByte = emulator.getBackend().mem_read(i, 1);
        if (data[0] != oneByte[0]) {
            continue;
        }

        if (Arrays.equals(data, emulator.getBackend().mem_read(i, data.length))) {
            pointers.add(UnidbgPointer.pointer(emulator, i));
            i += (data.length - 1);
        }
    }
    return pointers;
}

It is worth mentioning that the content of this section can also be implemented using LIEF to patch binary files.
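
The search-then-patch idea is independent of the emulator and can be tried on a plain byte array first, for instance on the bytes of an SO read from disk. The following is a minimal sketch under that assumption; the class and method names are hypothetical, and the 32-byte image in main is made-up test data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PatternPatch {
    // Returns every offset in image at which pattern occurs.
    static List<Integer> search(byte[] image, byte[] pattern) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + pattern.length <= image.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(image, i, i + pattern.length), pattern)) {
                hits.add(i);
            }
        }
        return hits;
    }

    // Overwrites image at (first hit + delta) with patch.
    // Returns false when nothing matched or the patch would not fit.
    static boolean patchAfterFirstHit(byte[] image, byte[] pattern, int delta, byte[] patch) {
        List<Integer> hits = search(image, pattern);
        if (hits.isEmpty()) {
            return false;
        }
        int target = hits.get(0) + delta;
        if (target < 0 || target + patch.length > image.length) {
            return false;
        }
        System.arraycopy(patch, 0, image, target, patch.length);
        return true;
    }

    public static void main(String[] args) {
        byte[] pattern = {(byte) 0x80, (byte) 0xb5, 0x6f, 0x46, (byte) 0x84,
                (byte) 0xb0, 0x03, (byte) 0x90, 0x02, (byte) 0x91};
        byte[] image = new byte[32];                 // made-up stand-in for the SO bytes
        System.arraycopy(pattern, 0, image, 6, pattern.length);
        byte[] patch = {0x00, 0x20, 0x00, (byte) 0xBF}; // movs r0,0; nop
        boolean ok = patchAfterFirstHit(image, pattern, pattern.length, patch);
        System.out.println("patched: " + ok);
    }
}
```

Returning false instead of patching blindly matters when the pattern is missing in some version: a silent no-op is easier to diagnose than a write at the wrong address.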

VII. Hook Timing Issues#

So far, the Hook code has been placed after the SO is loaded and before JNI_OnLoad executes, which is equivalent to the timing of the following Frida code.

var android_dlopen_ext = Module.findExportByName(null, "android_dlopen_ext");
if (android_dlopen_ext != null) {
    Interceptor.attach(android_dlopen_ext, {
        onEnter: function (args) {
            this.hook = false;
            var soName = args[0].readCString();
            if (soName.indexOf("libhookinunidbg.so") !== -1) {
                this.hook = true;
            }
        },
        onLeave: function (retval) {
            if (this.hook) {
                this.hook = false;
                // your code
            }
        }
    });
}

But if there is code logic in the .init and .init_array sections (init → init_array → JNI_OnLoad), this Hook timing is too late. In that case, the Hook has to be moved earlier, to before init executes.

In Frida, achieving this requires working on the linker, and the usual approach is to hook the call_function or call_constructors functions in the linker. In Unidbg, there are several methods.

Take our demo hookInUnidbg as an example. Its init section contains the following logic, which compares two strings.

// After compilation, in the .init section [name cannot be changed]
extern "C" void _init(void) {
    char str1[15];
    char str2[15];
    int ret;

    strcpy(str1, "abcdef");
    strcpy(str2, "ABCDEF");

    ret = strcmp(str1, str2);

    if(ret < 0)
    {
        LOGI("str1 is less than str2");
    }
    else if(ret > 0)
    {
        LOGI("str1 is greater than str2");
    }
    else
    {
        LOGI("str1 is equal to str2");
    }

}

Currently it prints "str1 is greater than str2", and our Hook target is to make it print "str1 is less than str2".

1. Preloading libc#

Preloading libc and then hooking the strcmp function to modify its return value to -1 is one method. Below is the complete code, providing both Console Debugger and HookZz versions.

package com.tutorial;

import com.github.unidbg.AndroidEmulator;
import com.github.unidbg.Emulator;
import com.github.unidbg.Module;
import com.github.unidbg.arm.context.RegisterContext;
import com.github.unidbg.debugger.BreakPointCallback;
import com.github.unidbg.hook.hookzz.*;
import com.github.unidbg.linux.android.AndroidEmulatorBuilder;
import com.github.unidbg.linux.android.AndroidResolver;
import com.github.unidbg.linux.android.dvm.*;
import com.github.unidbg.memory.Memory;
import unicorn.ArmConst;

import java.io.File;

public class hookInUnidbg {
    private final AndroidEmulator emulator;
    private final VM vm;
    private final Module module;
    private final Module moduleLibc;

    hookInUnidbg() {

        // Create emulator instance
        emulator = AndroidEmulatorBuilder.for32Bit().build();

        // Memory operation interface of the emulator
        final Memory memory = emulator.getMemory();

        // Set system library resolver
        memory.setLibraryResolver(new AndroidResolver(23));
        // Create Android virtual machine
        vm = emulator.createDalvikVM(new File("unidbg-android/src/test/resources/tutorial/hookinunidbg.apk"));

        // First load libc.so
        DalvikModule dmLibc = vm.loadLibrary(new File("unidbg-android/src/main/resources/android/sdk23/lib/libc.so"), true);
        moduleLibc = dmLibc.getModule();

        // Hook
        hookStrcmpByUnicorn();
        // or
        // hookStrcmpByHookZz();

        // Load so into virtual memory
        DalvikModule dm = vm.loadLibrary("hookinunidbg", true);
        // The loaded libhookinunidbg.so corresponds to a module
        module = dm.getModule();

        // Execute JNI_OnLoad (if any)
        dm.callJNI_OnLoad(emulator);
    }

    public void call(){
        DvmClass dvmClass = vm.resolveClass("com/example/hookinunidbg/MainActivity");
        String methodSign = "call()V";
        DvmObject<?> dvmObject = dvmClass.newObject(null);
        // call()V returns void, so use callJniMethod rather than callJniMethodObject
        dvmObject.callJniMethod(emulator, methodSign);

    }


    public static void main(String[] args) {
        hookInUnidbg mydemo = new hookInUnidbg();
        mydemo.call();
    }

    public void hookStrcmpByUnicorn(){
        emulator.attach().addBreakPoint(moduleLibc.findSymbolByName("strcmp").getAddress(), new BreakPointCallback() {
            @Override
            public boolean onHit(Emulator<?> emulator, long address) {
                RegisterContext registerContext = emulator.getContext();
                String arg1 = registerContext.getPointerArg(0).getString(0);

                emulator.attach().addBreakPoint(registerContext.getLRPointer().peer, new BreakPointCallback() {
                    @Override
                    public boolean onHit(Emulator<?> emulator, long address) {
                        if(arg1.equals("abcdef")){
                            emulator.getBackend().reg_write(ArmConst.UC_ARM_REG_R0, -1);
                        }
                        return true;
                    }
                });
                return true;
            }
        });
    }

    public void hookStrcmpByHookZz(){
        IHookZz hookZz = HookZz.getInstance(emulator); // Load HookZz, supports inline hook
        hookZz.enable_arm_arm64_b_branch(); // Test enable_arm_arm64_b_branch, optional
        hookZz.wrap(moduleLibc.findSymbolByName("strcmp"), new WrapCallback<HookZzArm32RegisterContext>() {
            String arg1;
            @Override
            public void preCall(Emulator<?> emulator, HookZzArm32RegisterContext ctx, HookEntryInfo info) {
                arg1 = ctx.getPointerArg(0).getString(0);
            }
            @Override
            public void postCall(Emulator<?> emulator, HookZzArm32RegisterContext ctx, HookEntryInfo info) {
                if(arg1.equals("abcdef")){
                    ctx.setR0(-1);
                }
            }
        });
        hookZz.disable_arm_arm64_b_branch();
    }
}

But this does not work if the target to hook is not in libc, for example if you want to set a breakpoint at 0x978.

image

2. Setting a Breakpoint at a Fixed Address#

This is the most common and convenient method, but it can only be used under the Unicorn engine.

The first user SO loaded by vm.loadLibrary has a base address of 0x40000000, so you can check the function offset in IDA and hook that address using Console Debugger.

package com.tutorial;

import com.github.unidbg.AndroidEmulator;
import com.github.unidbg.Emulator;
import com.github.unidbg.Module;
import com.github.unidbg.arm.context.RegisterContext;
import com.github.unidbg.debugger.BreakPointCallback;
import com.github.unidbg.hook.hookzz.*;
import com.github.unidbg.linux.android.AndroidEmulatorBuilder;
import com.github.unidbg.linux.android.AndroidResolver;
import com.github.unidbg.linux.android.dvm.*;
import com.github.unidbg.memory.Memory;
import java.io.File;

public class hookInUnidbg {
    private final AndroidEmulator emulator;
    private final VM vm;
    private final Module module;
    private Module moduleLibc;

    hookInUnidbg() {

        // Create emulator instance
        emulator = AndroidEmulatorBuilder.for32Bit().build();

        // Memory operation interface of the emulator
        final Memory memory = emulator.getMemory();
        // Set system library resolver
        memory.setLibraryResolver(new AndroidResolver(23));
        // Create Android virtual machine
        vm = emulator.createDalvikVM(new File("unidbg-android/src/test/resources/tutorial/hookinunidbg.apk"));

        emulator.attach().addBreakPoint(0x40000000 + 0x978);

        // Load so into virtual memory
        DalvikModule dm = vm.loadLibrary("hookinunidbg", true);
        // The loaded libhookinunidbg.so corresponds to a module
        module = dm.getModule();

        // Execute JNI_OnLoad (if any)
        dm.callJNI_OnLoad(emulator);
    }

    public void call(){
        DvmClass dvmClass = vm.resolveClass("com/example/hookinunidbg/MainActivity");
        String methodSign = "call()V";
        DvmObject<?> dvmObject = dvmClass.newObject(null);
        // call()V returns void, so use callJniMethod rather than callJniMethodObject
        dvmObject.callJniMethod(emulator, methodSign);

    }


    public static void main(String[] args) {
        hookInUnidbg mydemo = new hookInUnidbg();
        mydemo.call();
    }
}

image

If multiple user SOs are loaded, you can run the code once to confirm the base address of the target SO (there is no address randomization in Unidbg, and the target function's address is fixed every time). Then, hook that address before loadLibrary to ensure that the Hook is not missed.

3. Using Unidbg's Module Listener#

Implement your own module listener.

package com.tutorial;

import com.github.unidbg.Emulator;
import com.github.unidbg.Module;
import com.github.unidbg.ModuleListener;
import com.github.unidbg.arm.context.RegisterContext;
import com.github.unidbg.hook.hookzz.HookEntryInfo;
import com.github.unidbg.hook.hookzz.HookZz;
import com.github.unidbg.hook.hookzz.InstrumentCallback;

public class MyModuleListener implements ModuleListener {
    private HookZz hook;

    @Override
    public void onLoaded(Emulator<?> emulator, Module module) {
        // Preload Hook framework
        if(module.name.equals("libc.so")){
             hook = HookZz.getInstance(emulator);
        }

        // Hook in the target function
        if(module.name.equals("libhookinunidbg.so")){
            hook.instrument(module.base + 0x978 + 1, new InstrumentCallback<RegisterContext>() {
                @Override
                public void dbiCall(Emulator<?> emulator, RegisterContext ctx, HookEntryInfo info) {
                    System.out.println(ctx.getIntArg(0));
                }
            });
        }
    }
}

Bind it using memory.addModuleListener.

package com.tutorial;

import com.github.unidbg.AndroidEmulator;
import com.github.unidbg.Module;
import com.github.unidbg.linux.android.AndroidEmulatorBuilder;
import com.github.unidbg.linux.android.AndroidResolver;
import com.github.unidbg.linux.android.dvm.*;
import com.github.unidbg.memory.Memory;
import java.io.File;

public class hookInUnidbg{
    private final AndroidEmulator emulator;
    private final VM vm;

    hookInUnidbg() {

        // Create emulator instance
        emulator = AndroidEmulatorBuilder.for32Bit().build();

        // Memory operation interface of the emulator
        final Memory memory = emulator.getMemory();

        // Add module loading listener
        memory.addModuleListener(new MyModuleListener());

        // Set system library resolver
        memory.setLibraryResolver(new AndroidResolver(23));
        // Create Android virtual machine
        vm = emulator.createDalvikVM(new File("unidbg-android/src/test/resources/tutorial/hookinunidbg.apk"));

        // Load so into virtual memory
        DalvikModule dm = vm.loadLibrary("hookinunidbg", true);
        // The loaded libhookinunidbg.so corresponds to a module
        Module module = dm.getModule();

        // Execute JNI_OnLoad (if any)
        dm.callJNI_OnLoad(emulator);
    }

    public void call(){
        DvmClass dvmClass = vm.resolveClass("com/example/hookinunidbg/MainActivity");
        String methodSign = "call()V";
        DvmObject<?> dvmObject = dvmClass.newObject(null);
        // call()V returns void, so use callJniMethod rather than callJniMethodObject
        dvmObject.callJniMethod(emulator, methodSign);

    }


    public static void main(String[] args) {
        hookInUnidbg mydemo = new hookInUnidbg();
        mydemo.call();
    }

}

Each method has its own use cases; choose whichever fits. In addition, you can modify the Unidbg source code to add your own logic before the callInitFunction function.

VIII. Conditional Breakpoints#

In algorithm analysis, conditional breakpoints reduce noise. Take strcmp as an example: any module in the process may call it.

1. Limited to a Certain SO#

I. Frida#

Interceptor.attach(
    Module.findExportByName("libc.so", "strcmp"), {
        onEnter: function(args) {
            var moduleName = Process.getModuleByAddress(this.returnAddress).name;
            console.log("strcmp arg1:"+args[0].readCString())
            // You can filter prints based on moduleName
            console.log("call from :"+moduleName)
        },
        onLeave: function(ret) {
        }
    }
);

II. Unidbg#

public void hookstrcmp(){
    long address = module.findSymbolByName("strcmp").getAddress();
    emulator.attach().addBreakPoint(address, new BreakPointCallback() {
        @Override
        public boolean onHit(Emulator<?> emulator, long address) {
            RegisterContext registerContext = emulator.getContext();
            String arg1 = registerContext.getPointerArg(0).getString(0);
            String moduleName = emulator.getMemory().findModuleByAddress(registerContext.getLRPointer().peer).name;
            if(moduleName.equals("libhookinunidbg.so")){
                System.out.println("strcmp arg1:"+arg1);
            }
            return true;
        }
    });
}
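
The filter above boils down to asking which module's address range contains LR. The sketch below shows that lookup in plain Java with a sorted map. In Unidbg itself findModuleByAddress already does this for you; the table here, its sizes, and the libc base are made-up values for illustration, and the class name is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class ModuleTable {
    private static final class Mapped {
        final String name;
        final long size;

        Mapped(String name, long size) {
            this.name = name;
            this.size = size;
        }
    }

    // Modules keyed by base address, so floorEntry finds the closest base at or below an address.
    private final TreeMap<Long, Mapped> byBase = new TreeMap<>();

    void add(String name, long base, long size) {
        byBase.put(base, new Mapped(name, size));
    }

    // Returns the name of the module whose range contains address, or null if none does.
    String findByAddress(long address) {
        Map.Entry<Long, Mapped> entry = byBase.floorEntry(address);
        if (entry == null) {
            return null;
        }
        long base = entry.getKey();
        Mapped m = entry.getValue();
        return address < base + m.size ? m.name : null;
    }

    public static void main(String[] args) {
        ModuleTable table = new ModuleTable();
        table.add("libhookinunidbg.so", 0x40000000L, 0x3000L); // size is made up
        table.add("libc.so", 0x40100000L, 0x80000L);           // base and size are made up
        long lr = 0x40000A05L;                                  // example return address
        String caller = table.findByAddress(lr);
        if ("libhookinunidbg.so".equals(caller)) {
            System.out.println("strcmp called from " + caller);
        }
    }
}
```

Note the null case: an LR outside every known module should be handled explicitly before comparing names.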

2. Limited to a Certain Function#

For example, a certain function is widely used in the SO, but you only want to analyze its usage inside function a.

I. Frida#

var show = false;
Interceptor.attach(
    Module.findExportByName("libc.so", "strcmp"), {
        onEnter: function(args) {
            if(show){
                console.log("strcmp arg1:"+args[0].readCString())
            }
        },
        onLeave: function(ret) {

        }
    }
);

Interceptor.attach(
    Module.findExportByName("libhookinunidbg.so", "targetfunction"),{
        onEnter: function(args) {
            show = true;
        },
        onLeave: function(ret) {
            show = false;
        }
    }
)

II. Unidbg#

// Declare a field in the class beforehand: public boolean show = false;

public void hookstrcmp(){
    emulator.attach().addBreakPoint(module.findSymbolByName("targetfunction").getAddress(), new BreakPointCallback() {
        @Override
        public boolean onHit(Emulator<?> emulator, long address) {
            RegisterContext registerContext = emulator.getContext();

            show = true;
            emulator.attach().addBreakPoint(registerContext.getLRPointer().peer, new BreakPointCallback() {
                @Override
                public boolean onHit(Emulator<?> emulator, long address) {
                    show = false;
                    return true;
                }
            });
            return true;
        }
    });

    emulator.attach().addBreakPoint(module.findSymbolByName("strcmp").getAddress(), new BreakPointCallback() {
        @Override
        public boolean onHit(Emulator<?> emulator, long address) {
            RegisterContext registerContext = emulator.getContext();
            String arg1 = registerContext.getPointerArg(0).getString(0);
            if(show){
                System.out.println("strcmp arg1:"+arg1);
            }
            return true;
        }
    });
}
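
A boolean flag is enough when targetfunction is only entered once at a time. If it can be re-entered (for instance through recursion), the first return would clear the flag too early, and a depth counter is one way around that. The following is a minimal sketch in plain Java with hypothetical names; enter() and leave() stand for the entry breakpoint and the LR breakpoint callbacks.

```java
// Tracks whether execution is currently inside the function of interest.
final class CallScope {
    private int depth;

    // Call from the breakpoint at the function's entry.
    void enter() {
        depth++;
    }

    // Call from the breakpoint at the return address.
    void leave() {
        if (depth > 0) {
            depth--;
        }
    }

    // True while at least one invocation is still running.
    boolean active() {
        return depth > 0;
    }

    public static void main(String[] args) {
        CallScope scope = new CallScope();
        scope.enter();                       // outer call
        scope.enter();                       // nested call
        scope.leave();                       // nested call returns
        System.out.println(scope.active());  // still inside the outer call
        scope.leave();                       // outer call returns
        System.out.println(scope.active());
    }
}
```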

3. Limited to a Certain Place#

image

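For example, in the image above we only care about the strcmp call at 0xA00. One method is to hook the strcmp function and print the output only when the lr register equals module.base + 0xA00 + 4 + 1, that is, the address right after the four-byte call instruction, plus 1 for thumb mode.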
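
The LR value can be computed rather than added up by hand. This is a minimal sketch, assuming the module base is 0x40000000 as discussed earlier and that the call at 0xA00 is a four-byte thumb instruction; the class and method names are illustrative and not Unidbg API.

```java
final class ReturnAddress {
    // LR after a call instruction located at callOffset inside a module loaded at base.
    // In thumb mode the lowest bit of the return address is set.
    static long expectedLr(long base, long callOffset, int instructionSize, boolean thumb) {
        long next = base + callOffset + instructionSize;
        return thumb ? (next | 1L) : next;
    }

    public static void main(String[] args) {
        long lr = expectedLr(0x40000000L, 0xA00, 4, true);
        System.out.printf("expected LR: 0x%X%n", lr); // prints "expected LR: 0x40000A05"
    }
}
```

Inside the strcmp hook, comparing the current LR against this value decides whether to print.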

Another method is to use Console Debugger, which is also very convenient.

emulator.attach().addBreakPoint(module, 0xA00);
emulator.attach().addBreakPoint(module, 0xA04);

Master these techniques and apply them flexibly. In practice, scenarios such as "print the output of function B only after Hook A has taken effect" are very common; without such filtering, every function prints hundreds of lines and the log becomes unreadable.

IX. System Call Interception—Taking Time as an Example#

System call interception here does not mean hooking system calls, as frida-syscall-interceptor does. All system calls are implemented by Unidbg itself and their logs can be turned on directly, so there is no need to hook them. System call interception in Unidbg means replacing Unidbg's implementation of a system call.

There are two questions to explain:

  • Why modify system calls?

    Some system calls in Unidbg are not implemented or are implemented poorly, and sometimes we want to pin their output, as with the system call for getting the time. Both needs require us to fix or modify Unidbg's implementation of system calls.

  • Why not directly modify the Unidbg source code?

    1. It is inflexible.
    2. Our own implementation or modification may be imperfect, and changing the Unidbg source directly pollutes the running environment and affects other projects.

When analyzing algorithms, output that keeps changing for the same input interferes with the analysis. One major cause is that timestamps take part in the calculation. In Frida, controlling this kind of interference usually means hooking the libc functions that obtain the time, such as gettimeofday.

1. Frida#

Hook time

var time = Module.findExportByName(null, "time");
if (time != null) {
    Interceptor.attach(time, {
        onEnter: function (args) {

        },
        onLeave: function (retval) {
            // time returns a timestamp in seconds; change the return value to 100
            retval.replace(100);
        }
    })
}

Hook gettimeofday

function hook_gettimeofday() {
    var addr_gettimeofday = Module.findExportByName(null, "gettimeofday");
    var gettimeofday = new NativeFunction(addr_gettimeofday, "int", ["pointer", "pointer"]);
    
    Interceptor.replace(addr_gettimeofday, new NativeCallback(function (ptr_tz, ptr_tzp) {

        var result = gettimeofday(ptr_tz, ptr_tzp);
        if (result == 0) {
            console.log("hook gettimeofday:", ptr_tz, ptr_tzp, result);
            var t = new Int32Array(ArrayBuffer.wrap(ptr_tz, 8));
            t[0] = 0xAAAA;
            t[1] = 0xBBBB;
            console.log(hexdump(ptr_tz));
        }
        return result;
    }, "int", ["pointer", "pointer"]));
}

However, this is hard to do thoroughly in Frida, because four library functions can obtain timestamps: time, gettimeofday, clock_gettime, and clock.

2. Unidbg#

In Unidbg, you can pin the time more conveniently and more thoroughly than in Frida. The time and gettimeofday library functions are based on the gettimeofday system call, while clock_gettime and clock are based on the clock_gettime system call. Therefore, once we pin the timestamps returned by the gettimeofday and clock_gettime system calls in Unidbg, the problem is solved in one place.

First, implement a system call handler for the time-related calls. To pin the time, change System.currentTimeMillis() and System.nanoTime() in it to constants; the listing below still shows the original calls so the places to change are visible.

package com.tutorial;

import com.github.unidbg.Emulator;
import com.github.unidbg.linux.ARM32SyscallHandler;
import com.github.unidbg.memory.SvcMemory;
import com.github.unidbg.pointer.UnidbgPointer;
import com.github.unidbg.unix.struct.TimeVal32;
import com.github.unidbg.unix.struct.TimeZone;
import com.sun.jna.Pointer;
import unicorn.ArmConst;

import java.util.Calendar;

public class TimeSyscallHandler extends ARM32SyscallHandler {
    public TimeSyscallHandler(SvcMemory svcMemory) {
        super(svcMemory);
    }

    @Override
    protected boolean handleUnknownSyscall(Emulator emulator, int NR) {
        switch (NR) {
            case 78:
                // gettimeofday
                mygettimeofday(emulator);
                return true;
            case 263:
                // clock_gettime
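                // write the syscall result to R0, as mygettimeofday does
                emulator.getBackend().reg_write(ArmConst.UC_ARM_REG_R0, myclock_gettime(emulator));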
                return true;

        }

        return super.handleUnknownSyscall(emulator, NR);
    }


    private void mygettimeofday(Emulator<?> emulator) {
        Pointer tv = UnidbgPointer.register(emulator, ArmConst.UC_ARM_REG_R0);
        Pointer tz = UnidbgPointer.register(emulator, ArmConst.UC_ARM_REG_R1);
        emulator.getBackend().reg_write(ArmConst.UC_ARM_REG_R0, mygettimeofday(tv, tz));
    }

    private int mygettimeofday(Pointer tv, Pointer tz) {
        long currentTimeMillis = System.currentTimeMillis(); // replace with a constant to pin the time

        long tv_sec = currentTimeMillis / 1000;
        long tv_usec = (currentTimeMillis % 1000) * 1000;

        TimeVal32 timeVal = new TimeVal32(tv);
        timeVal.tv_sec = (int) tv_sec;
        timeVal.tv_usec = (int) tv_usec;
        timeVal.pack();

        if (tz != null) {
            Calendar calendar = Calendar.getInstance();
            int tz_minuteswest = -(calendar.get(Calendar.ZONE_OFFSET) + calendar.get(Calendar.DST_OFFSET)) / (60 * 1000);
            TimeZone timeZone = new TimeZone(tz);
            timeZone.tz_minuteswest = tz_minuteswest;
            timeZone.tz_dsttime = 0;
            timeZone.pack();
        }
        return 0;
    }

    private static final int CLOCK_REALTIME = 0;
    private static final int CLOCK_MONOTONIC = 1;
    private static final int CLOCK_THREAD_CPUTIME_ID = 3;
    private static final int CLOCK_MONOTONIC_RAW = 4;
    private static final int CLOCK_MONOTONIC_COARSE = 6;
    private static final int CLOCK_BOOTTIME = 7;
    private final long nanoTime = System.nanoTime();

    private int myclock_gettime(Emulator<?> emulator) {
        int clk_id = emulator.getBackend().reg_read(ArmConst.UC_ARM_REG_R0).intValue();
        Pointer tp = UnidbgPointer.register(emulator, ArmConst.UC_ARM_REG_R1);
        // Replace System.currentTimeMillis() and System.nanoTime() with constants to pin the time
        long offset = clk_id == CLOCK_REALTIME ? System.currentTimeMillis() * 1000000L : System.nanoTime() - nanoTime;
        long tv_sec = offset / 1000000000L;
        long tv_nsec = offset % 1000000000L;

        switch (clk_id) {
            case CLOCK_REALTIME:
            case CLOCK_MONOTONIC:
            case CLOCK_MONOTONIC_RAW:
            case CLOCK_MONOTONIC_COARSE:
            case CLOCK_BOOTTIME:
                tp.setInt(0, (int) tv_sec);
                tp.setInt(4, (int) tv_nsec);
                return 0;
            case CLOCK_THREAD_CPUTIME_ID:
                tp.setInt(0, 0);
                tp.setInt(4, 1);
                return 0;
        }
        throw new UnsupportedOperationException("clk_id=" + clk_id);
    }
}
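
Once the clock reads are replaced with constants, what remains is unit arithmetic. The sketch below isolates it so that pinned values can be checked outside the emulator. FIXED_MILLIS and the class name are hypothetical; the splitting mirrors mygettimeofday and myclock_gettime above.

```java
final class FixedClock {
    // Hypothetical pinned wall-clock time in milliseconds since the epoch.
    static final long FIXED_MILLIS = 1_600_000_000_000L;

    // Splits milliseconds into {tv_sec, tv_usec}, as the gettimeofday handler does.
    static int[] timeval(long millis) {
        return new int[]{(int) (millis / 1000), (int) ((millis % 1000) * 1000)};
    }

    // Splits nanoseconds into {tv_sec, tv_nsec}, as the clock_gettime handler does.
    static int[] timespec(long nanos) {
        return new int[]{(int) (nanos / 1_000_000_000L), (int) (nanos % 1_000_000_000L)};
    }

    public static void main(String[] args) {
        int[] tv = timeval(FIXED_MILLIS);
        int[] ts = timespec(FIXED_MILLIS * 1_000_000L); // CLOCK_REALTIME path
        System.out.println("tv_sec=" + tv[0] + " tv_usec=" + tv[1]);
        System.out.println("tv_sec=" + ts[0] + " tv_nsec=" + ts[1]);
    }
}
```

With both system calls fed from the same constant, time, gettimeofday and clock_gettime(CLOCK_REALTIME) agree with each other, which keeps derived values stable across runs.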

To use it, change how the emulator is created. Originally, the emulator is created with this line:

// Create emulator instance
emulator = AndroidEmulatorBuilder.for32Bit().build();

Modify it as follows:

// Create emulator instance
AndroidEmulatorBuilder builder = new AndroidEmulatorBuilder(false) {
    public AndroidEmulator build() {
        return new AndroidARMEmulator(processName, rootDir,
                backendFactories) {
            @Override
            protected UnixSyscallHandler<AndroidFileIO>
            createSyscallHandler(SvcMemory svcMemory) {
                return new TimeSyscallHandler(svcMemory);
            }
        };
    }
};

emulator = builder.build();

X. Hook Detection#

There are numerous ways to implement Anti-Unidbg detection, but in practice almost no samples actively do so, for two main reasons:

  • Several major weaknesses of Unidbg itself remain unresolved; for example, multi-threading and the signal mechanism are still not implemented.
  • Unidbg is still not widely known or used.

Therefore, this section focuses on Hook detection.

1. Detecting Third-party Hook Frameworks#

Each Hook framework can be detected based on how it implements its hooks.

I. Inline Hook#

Take inline Hook detection as an example. An inline Hook has to overwrite the first few bytes at the Hook point, jump to its own code to run its logic, and then jump back. Detection can therefore start a dedicated thread that loops over key functions and checks two things:

  • Whether the first few bytes of the function have been tampered with.
  • Whether the function body is intact and unmodified, usually with a crc32 check. Why not md5 or another hash function? Because crc32 is extremely fast, has little performance impact, and has an acceptable collision rate.
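
The two checks above can be sketched in Java. This is only an illustration under stated assumptions, not the method used by any particular protection: real detection runs in native code and reads the function bytes directly from memory, whereas here the bytes are plain arrays with made-up content. The class name, PROLOGUE_LEN, and the helper names are all hypothetical.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

public class InlineHookCheckSketch {
    /** Hypothetical number of leading bytes an inline hook would overwrite. */
    static final int PROLOGUE_LEN = 8;

    /** CRC32 over the whole function body. */
    static long crc32(byte[] code) {
        CRC32 crc = new CRC32();
        crc.update(code, 0, code.length);
        return crc.getValue();
    }

    /** Check 1: have the first few bytes of the function changed? */
    static boolean prologueTampered(byte[] original, byte[] current) {
        int n = Math.min(PROLOGUE_LEN, Math.min(original.length, current.length));
        return !Arrays.equals(Arrays.copyOf(original, n), Arrays.copyOf(current, n));
    }

    /** Check 2: does the body still match the checksum recorded at startup? */
    static boolean bodyTampered(long baselineCrc, byte[] current) {
        return crc32(current) != baselineCrc;
    }

    public static void main(String[] args) {
        // Made-up bytes standing in for a small function body.
        byte[] original = {0x10, 0x40, 0x2d, (byte) 0xe9, 0x00, 0x00, (byte) 0xa0,
                (byte) 0xe1, 0x10, (byte) 0x80, (byte) 0xbd, (byte) 0xe8};
        long baseline = crc32(original);

        // Simulate a hook that overwrites the first four bytes with a jump.
        byte[] hooked = original.clone();
        hooked[0] = 0x04;
        hooked[1] = (byte) 0xf0;
        hooked[2] = 0x1f;
        hooked[3] = (byte) 0xe5;

        System.out.println("prologue tampered: " + prologueTampered(original, hooked));
        System.out.println("body tampered: " + bodyTampered(baseline, hooked));
    }
}
```

In a detection thread, both checks would be repeated periodically against a baseline captured before any hook could be installed.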

Related project: check_fish_inline_hook

II. Got Hook#

Related project: SliverBullet5563/CheckGotHook: Detecting got hook (using xhook for testing)

2. Detecting Unicorn Based Hook#

A Unicorn-based Hook seems undetectable because it does not modify the target's code, but Unicorn itself can be detected. The Anti-Unidbg series on the platform describes one such method. Its premise is that on a real Android device memory accesses must be aligned to four bytes, so reading or writing at SP+1 through inline assembly crashes the App, whereas the same access runs without any problem under Unidbg. Since we do not want the App to actually crash, the code installs its own signal handler: on a real device the faulting access raises a signal, and the handler receives it and does some processing. Unidbg raises no exception, so the handler is never reached, and that difference reveals the emulator.

In addition, setting breakpoints or tracing instructions under Unicorn inevitably makes a function run far longer than normal, so anti-debugging strategies based on running time also work.
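
The timing idea can be sketched as follows. This is a hedged Java illustration only: a real check would live in native code and take its timestamps there, and the class name, the helper exceedsBudget, and the budget values are hypothetical and would need tuning per function and device.

```java
public class TimingCheckSketch {
    /**
     * Runs the task and reports whether it took longer than the given budget.
     * A function that is being traced or single-stepped would be expected to
     * exceed a budget measured under normal conditions.
     */
    static boolean exceedsBudget(Runnable task, long budgetNanos) {
        long start = System.nanoTime();
        task.run();
        long elapsed = System.nanoTime() - start;
        return elapsed > budgetNanos;
    }

    public static void main(String[] args) {
        // A trivial task measured against a generous one-second budget.
        boolean normal = exceedsBudget(() -> { }, 1_000_000_000L);

        // Simulate the slowdown of tracing with a 50 ms sleep and a 5 ms budget.
        boolean traced = exceedsBudget(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 5_000_000L);

        System.out.println("normal run flagged: " + normal);
        System.out.println("slow run flagged: " + traced);
    }
}
```

A budget that is too tight produces false positives on slow devices, so such checks usually leave a wide margin.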

XI. Unidbg Trace Four-piece Set#

Frida has many trace solutions: JNItrace for tracing JNI functions, ZenTrace and r0tracer for tracing Java calls, and the official multi-purpose trace tool frida-trace. For system calls there is strace on Linux and the Frida-based frida-syscall-interceptor.

In Unidbg, most of the above tracing can be achieved simply by adjusting the log level. The tracing discussed here focuses instead on giving users finer control over the code execution flow.

1. Instruction Tracing#

Instruction tracing includes two parts:

  • Record the execution of each instruction, printing address, machine code, assembly, and other information.
  • Print the register values related to each instruction.

Unidbg encapsulates instruction tracing based on Unicorn CodeHook, and the method and effect are as follows:

/**
 * trace instruction
 * note: low performance
 */