Polymorphic-Code

Mar 01, 2025

12 minutes

Polymorphic Code

What is Polymorphic Code?

Polymorphic code means creating different code without changing the original algorithm of the target code. That is the simple explanation. It looks like different code, but it is still the same.

How does it work?

That is the main subject of this post, explained my way :))

Why?

The purpose of polymorphic code is obfuscation. Once you have it, hash-based detection stops being a concern, because every generated file has a different hash, and each analyst may believe they are analyzing a different binary.

I already covered “changing code” in my self_modifying_code article. This time we do it a different way. The program takes a binary as input and creates another binary from it. Every time the program runs, it creates another file (new hash). That is the idea. Nothing changes at runtime: each execution produces the same program with a different hash. All changes are made in the “.text” section.
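To make the "new hash" point concrete, here is a minimal sketch showing that changing a single byte of a code buffer is enough to change its hash. FNV-1a is used purely as a stand-in (an assumption on my side; real scanners typically rely on MD5 or SHA-256), and the names `fnv1a64` and `swapChangesHash` are hypothetical helpers, not part of the original tool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// FNV-1a (64-bit), used only as a stand-in for a real file hash.
std::uint64_t fnv1a64(const std::vector<std::uint8_t>& bytes) {
    std::uint64_t hash = 14695981039346656037ULL; // FNV offset basis
    for (std::uint8_t b : bytes) {
        hash ^= b;
        hash *= 1099511628211ULL; // FNV prime
    }
    return hash;
}

// Returns true when replacing the byte at `index` changes the hash.
// `code` is taken by value, so the caller's buffer is left untouched.
bool swapChangesHash(std::vector<std::uint8_t> code, std::size_t index,
                     std::uint8_t replacement) {
    assert(index < code.size());
    const std::uint64_t before = fnv1a64(code);
    code[index] = replacement;
    return fnv1a64(code) != before;
}
```

Any byte-level rewrite of “.text” therefore yields a file that no longer matches a stored hash, even though the program behaves the same.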

Steps

Create a struct for the PE header fields we need
Parse the input PE file and read the header members that matter to us
Disassemble the target PE file (I used a library for this: “Zydis”)
Find instructions that can be changed without breaking the original file (the most important part)
Randomly change those instructions
Create a new file with the original file’s headers and all sections except “.text”; the modified code section is written there instead.

(If you are not familiar with the PE file structure, you can read my first article, PE file.)

Warm Up

I started with PE parsing, because we need the “.text” section to disassemble.

struct sender {
	int* textBaseAdress; // beginning of the code section
	int size; // size of the code section
	int* entryPoint; // pointer to the entry point inside the file buffer; used in the "broken disassemble" case
	int entryOffset; // raw file offset of the entry point. The entry point is an RVA: the code section starts at 0x1000 (BaseOfCode) in memory but at 0x400 (PointerToRawData) in the raw file
	int rawTextoffset; // usually 0x400, but we read it from the header
	int architecture; // to be added later
};

sender PeParser(char* getFileName)
{

	std::string filename(getFileName);


	// Read access is enough here, we never write to the input file
	HANDLE loadFile = CreateFileA(filename.c_str(), GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

	// check handle
	if (loadFile == INVALID_HANDLE_VALUE)
	{
		printf("[!] Failed to get a handle to the file - Error Code (%lu)\n", GetLastError());
		printf(" filename -> %s\n", filename.c_str()); // %s needs a C string, not a std::string
		exit(1); // nothing to close, the handle is invalid
	}

	DWORD nNumberOfBytesToRead = GetFileSize(loadFile, NULL);
	LPVOID lpBuffer = HeapAlloc(GetProcessHeap(), 0, nNumberOfBytesToRead);

	DWORD lpNumberOfBytesRead = { 0 };
	if (!ReadFile(loadFile, lpBuffer, nNumberOfBytesToRead, &lpNumberOfBytesRead, NULL))
	{
		printf("[!] Failed to read the file - Error Code (%d)\n", GetLastError());
		CloseHandle(loadFile);
		exit(1);
	}
	
	// Now we parse the file we just read
    // e_lfanew points to the NT headers
	int* peHeader = (int*)lpBuffer + ((int)((PIMAGE_DOS_HEADER)lpBuffer)->e_lfanew) / 4; 
	if (((PIMAGE_NT_HEADERS32)peHeader)->OptionalHeader.Magic != IMAGE_NT_OPTIONAL_HDR32_MAGIC) { 
		cout << "Sorry, we only handle x86 here\n" << endl; // only for x86 arch
		//senderVar.architecture == 0x64; // to be added
		exit(1);
	}


	PIMAGE_SECTION_HEADER imageSectionHeader = { 0 };
    // Adding the size of the NT headers to the PE header pointer brings us to the first section header
	imageSectionHeader = (PIMAGE_SECTION_HEADER)(peHeader + sizeof(IMAGE_NT_HEADERS32) / 4);
    // ".text" is assumed to be the first section, so the first raw address belongs to ".text"
	DWORD textBegin = imageSectionHeader->PointerToRawData;

	unsigned char* TextSection = (unsigned char*)((int*)lpBuffer + textBegin / 4);

	sender senderVar;
    // collect the values we need later
    senderVar.textBaseAdress = (int*)TextSection; // beginning of the code section
	senderVar.size = imageSectionHeader->SizeOfRawData; // size of the code section
	senderVar.entryOffset = (((PIMAGE_NT_HEADERS32)peHeader)->OptionalHeader.AddressOfEntryPoint - ((PIMAGE_NT_HEADERS32)peHeader)->OptionalHeader.BaseOfCode + imageSectionHeader->PointerToRawData); // the entry point is an RVA: the code section starts at 0x1000 (BaseOfCode) in memory but at 0x400 (PointerToRawData) in the raw file
	// byte-wise arithmetic: the entry point is not guaranteed to be 4-byte aligned, so dividing the offset by 4 could point up to 3 bytes too early
	senderVar.entryPoint = (int*)((unsigned char*)lpBuffer + senderVar.entryOffset); // used in the "broken disassemble" case
	senderVar.rawTextoffset = textBegin; // usually 0x400, but we read it from the header

	CloseHandle(loadFile);

	return senderVar;
}

The code above is the first part of this article, so please read the comment lines first.
We only take the .text section and its size here, but some files cannot be disassembled correctly that way. When that happens, the program disassembles from the entry point instead.
Now we have the entry point and the code section information. Next step ->
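The parser above assumes that “.text” is the first section. As a hedged alternative, here is a portable sketch that walks the section table and looks the section up by name, then converts an RVA to a raw file offset. It reads the fields by their documented offsets instead of using the Windows structs; `SectionInfo`, `findSection` and `rvaToFileOffset` are names I made up for illustration, and the conversion uses the section's VirtualAddress where the code above uses BaseOfCode (the two usually match for “.text”).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct SectionInfo {
    std::uint32_t virtualAddress = 0;   // RVA where the section is mapped
    std::uint32_t sizeOfRawData = 0;    // size of the section in the file
    std::uint32_t pointerToRawData = 0; // file offset of the section
};

// Little-endian read of 2 or 4 bytes, with no alignment assumptions.
static std::uint32_t readLE(const std::vector<std::uint8_t>& f, std::size_t off, int bytes) {
    std::uint32_t v = 0;
    for (int i = 0; i < bytes; ++i)
        v |= static_cast<std::uint32_t>(f[off + i]) << (8 * i);
    return v;
}

// Finds a section by name (at most 8 characters). Returns false if absent.
bool findSection(const std::vector<std::uint8_t>& file, const char* name, SectionInfo& out) {
    assert(name != nullptr && std::strlen(name) <= 8);
    if (file.size() < 0x40 || file[0] != 'M' || file[1] != 'Z') return false;
    const std::size_t pe = readLE(file, 0x3C, 4);                 // e_lfanew
    if (pe + 24 > file.size() || std::memcmp(&file[pe], "PE\0\0", 4) != 0) return false;
    const std::size_t count = readLE(file, pe + 6, 2);            // NumberOfSections
    const std::size_t table = pe + 24 + readLE(file, pe + 20, 2); // skip the optional header
    char padded[8] = {};                                          // names are NUL-padded to 8 bytes
    std::memcpy(padded, name, std::strlen(name));
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t hdr = table + i * 40;                   // each section header is 40 bytes
        if (hdr + 40 > file.size()) return false;
        if (std::memcmp(&file[hdr], padded, 8) == 0) {
            out.virtualAddress = readLE(file, hdr + 12, 4);
            out.sizeOfRawData = readLE(file, hdr + 16, 4);
            out.pointerToRawData = readLE(file, hdr + 20, 4);
            return true;
        }
    }
    return false;
}

// Converts an RVA inside the section to an offset in the raw file.
std::uint32_t rvaToFileOffset(std::uint32_t rva, const SectionInfo& s) {
    assert(rva >= s.virtualAddress);
    return rva - s.virtualAddress + s.pointerToRawData;
}
```

With the typical values mentioned in the comments (section at RVA 0x1000, raw data at 0x400), an entry point at RVA 0x1234 lands at file offset 0x634.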

Getting Hot

Now that we have the text section and its size, we pass them to the disassembler.

ZydisDecoderInit(&decoder, ZYDIS_MACHINE_MODE_LEGACY_32, ZYDIS_STACK_WIDTH_32);

For now I only handle the x86 architecture, so the decoder is set up for 32-bit.

DisassembleBuffer(&decoder, &data[0], size); // zydis library

Don’t forget that the values here are hexadecimal!

The plain disassembly is fine, but I like playing with hex, so let’s also print the raw bytes of each instruction.

        for (ZyanUSize i = 0; i < instruction.length; ++i) 
            printf("%02X ", data[i]); // Print opcodes

When you add this to the disassemble function, it prints the hex bytes of every instruction, step by step.

This is what we need to catch suitable patterns: we can check instructions by their opcode bytes. Next step ->
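If you prefer to collect the bytes in a string (for logging or comparing against a pattern) instead of printing them directly, a small helper like the one below does the job. This is only a sketch: `hexBytes` is a hypothetical name, and it takes a plain pointer and length so it does not depend on any Zydis type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats `length` bytes as upper-case hex pairs separated by single spaces.
std::string hexBytes(const std::uint8_t* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    std::string out;
    char pair[4];
    for (std::size_t i = 0; i < length; ++i) {
        std::snprintf(pair, sizeof(pair), "%02X", static_cast<unsigned>(data[i]));
        if (i != 0) out += ' ';
        out += pair;
    }
    return out;
}
```

Calling it with the current instruction pointer and the decoded instruction length gives the same text as the printf loop above, minus the trailing space.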

Melting

The previous part was just preparation.
Here is the thing: we are going to change the “.text” section, but we cannot change the original code flow or algorithm.
So, how can we do it?

First, we cannot change the file size. This is important!
We have to find equivalent instructions of the same length.
What does that mean? Take xor eax, eax and sub eax, eax as an example.

Both of them set the eax register to 0. That seems suitable and really good, right?

            // xor and sub
            if ((data[0] == 0x33 || data[0] == 0x29) && instruction.length == 2) { // xor and sub 

                if (data[1] == 0xC0 || // xor eax, eax / sub eax, eax
                    data[1] == 0xDB || // xor ebx, ebx / sub ebx, ebx
                    data[1] == 0xC9 || // xor ecx, ecx / sub ecx, ecx
                    data[1] == 0xD2 || // xor edx, edx / sub edx, edx
                    data[1] == 0xF6 || // xor esi, esi / sub esi, esi
                    data[1] == 0xFF || // xor edi, edi / sub edi, edi
                    data[1] == 0xE4 || // xor esp, esp / sub esp, esp
                    data[1] == 0xED) { // xor ebp, ebp / sub ebp, ebp

                    data[0] = (data[0] == 0x33) ? 0x29 : 0x33; // Toggle between xor and sub
                    counter++;
                    //printf("ANALYZE! Technic 1: Transformed 0x%X 0x%X -> 0x%X 0x%X\n", data[0] == 0x33 ? 0x29 : 0x33, data[1], data[0], data[1]);
                }

                //data[0] = (data[0] == 0x33) ? 0x29 : 0x33; // if it is 0x33 it will make it 0x29 else 0x33. Basically change it
                //counter++;
                //printf("ANALYZE! Technic 1\n");

                goto outside;
            }
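The eight ModRM constants in the snippet above are exactly the bytes where the mod field is 3 (register to register) and both register fields name the same register. So the same pattern match can be sketched by decoding the byte instead of listing values. `isSameRegisterModRM` and `toggleXorSub` are hypothetical names, and this only restates the matching logic; it says nothing about whether the swap is safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when the ModRM byte encodes "reg, reg" with the same register twice.
bool isSameRegisterModRM(std::uint8_t modrm) {
    const std::uint8_t mod = modrm >> 6;       // 3 means register-direct operand
    const std::uint8_t reg = (modrm >> 3) & 7; // register operand
    const std::uint8_t rm = modrm & 7;         // r/m operand (a register when mod == 3)
    return mod == 3 && reg == rm;
}

// Toggles xor r32, r32 (0x33) <-> sub r32, r32 (0x29) in place.
// Returns true if the instruction was patched.
bool toggleXorSub(std::uint8_t* insn, std::size_t length) {
    assert(insn != nullptr);
    if (length != 2) return false;
    if (insn[0] != 0x33 && insn[0] != 0x29) return false;
    if (!isSameRegisterModRM(insn[1])) return false;
    insn[0] = (insn[0] == 0x33) ? 0x29 : 0x33;
    return true;
}
```

Applying it twice restores the original bytes, which is the behaviour of the toggle in the snippet above.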

beeeeppppp 🚨

This can change the flow. I didn’t hear you ask why, but I will explain it anyway :))

When these instructions zero eax, they also write the status flags, and not in exactly the same way (xor leaves AF undefined, for example, while sub sets it from the result). Which flag it is matters less than the consequence: if a later decision depends on those flags, the code flow can change.
Most of the time we never ask ourselves “should I use xor or sub to zero a register, and then make a decision based on the flags that instruction leaves behind?”, because the compiler decides that for us.
I hope the seriousness of the situation is clearer now.
Don’t be upset, there are still options 😉

            // mov reg, 0
            if ((data[0] >= 0xB8 && data[0] <= 0xBF) && instruction.length == 5 && data[1] == 0x00 && data[2] == 0x00 && data[3] == 0x00 && data[4] == 0x00) {

                // Register opcode mapping from `mov reg, 0` to `xor reg, reg`
                const uint8_t mov_to_xor_map[8] = { 0xC0, 0xC9, 0xD2, 0xDB, 0xE4, 0xED, 0xF6, 0xFF };

                data[2] = mov_to_xor_map[data[0] - 0xB8]; // Get correct register encoding
                //printf("ANALYZE! Technic 2 -> %x, %x\n", mov_to_xor_map[data[0] - 0xB8], data[0]);

                data[0] = 0x9C; // push flags, because original mov instruction didn't affect flags
                data[1] = 0x31; // Convert to XOR
                data[3] = 0x9D; // pop flags
                data[4] = 0x90; // Fill remaining bytes with NOP
                counter++;
                goto outside;
            }

You probably have not come across this pattern, but it exists. Maybe it comes from an older compiler, maybe from the optimization settings, but I saw it in my tests. Either way, it makes our algorithm more colorful :)
The mov instruction does not affect any flags, but xor does. That is why we push the flags first and pop them back afterwards. Even after all that, one byte is left over, and we fill it with a nop.
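Here is the same rewrite as a standalone function, as a sketch. Instead of a lookup table it builds the ModRM byte from the register number (0xC0 | reg << 3 | reg), which gives the same eight values. One deliberate difference, and it is my assumption rather than the article's behaviour: it refuses `mov esp, 0`, because after `xor esp, esp` the following popfd would read from address 0. `rewriteMovZero` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rewrites "mov r32, 0" (B8+r 00 00 00 00) in place as
// "pushfd; xor r32, r32; popfd; nop". Returns true if patched.
bool rewriteMovZero(std::uint8_t* insn, std::size_t length) {
    assert(insn != nullptr);
    if (length != 5 || insn[0] < 0xB8 || insn[0] > 0xBF) return false;
    for (std::size_t i = 1; i < 5; ++i)
        if (insn[i] != 0x00) return false;          // immediate must be exactly 0
    const std::uint8_t reg = static_cast<std::uint8_t>(insn[0] - 0xB8);
    if (reg == 4) return false;                     // esp: skipped (assumption, see above)
    insn[0] = 0x9C;                                 // pushfd: save flags
    insn[1] = 0x31;                                 // xor r/m32, r32
    insn[2] = static_cast<std::uint8_t>(0xC0 | (reg << 3) | reg); // same register twice
    insn[3] = 0x9D;                                 // popfd: restore flags
    insn[4] = 0x90;                                 // nop pads the fifth byte
    return true;
}
```

For `mov ebx, 0` (BB 00 00 00 00) this produces 9C 31 DB 9D 90, the same bytes the table-based version writes.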

// mov reg, reg (32-bit)
if (data[0] == 0x8B && instruction.length == 2) {

    if (data[1] >= 0xC1 && data[1] <= 0xC7) { // mov eax, reg
        data[0] = data[1] - 0x70; // Convert mov to push
        data[1] = 0x58; // pop eax
        //printf("ANALYZE! Technic 3 -> %x, %x\n", data[0], data[1]);
        counter++;
    }

    else if (data[1] >= 0xC8 && data[1] <= 0xCF) { // mov ecx, reg
        data[0] = data[1] - 0x78; // Convert mov to push
        data[1] = 0x59; // pop ecx
        counter++;
        //printf("Tested! Technic 3 \n");
    }
    else if (data[1] >= 0xD0 && data[1] <= 0xD7) { // mov edx, reg
        data[0] = data[1] - 0x80; // Convert mov to push
        data[1] = 0x5A; // pop edx
        counter++;
        //printf("Tested! Technic 3 \n");
    }
    else if (data[1] >= 0xD8 && data[1] <= 0xDF) { // mov ebx, reg
        data[0] = data[1] - 0x88; // Convert mov to push
        data[1] = 0x5B; // pop ebx
        counter++;
        //printf("Tested! Technic 3 \n");
    }
    else if (data[1] >= 0xE8 && data[1] <= 0xEF) { // mov ebp, reg
        data[0] = data[1] - 0x98; // Convert mov to push
        data[1] = 0x5D; // pop ebp
        counter++;
        //printf("ebp! Technic 3 \n");
    }
    else if (data[1] >= 0xF0 && data[1] <= 0xF7) { // mov esi, reg
        data[0] = data[1] - 0xA0; // Convert mov to push
        data[1] = 0x5E; // pop esi
        counter++;
        //printf("esi! Technic 3 \n");
    }
    else if (data[1] >= 0xF8 && data[1] <= 0xFF) { // mov edi, reg
        data[0] = data[1] - 0xA8; // Convert mov to push
        data[1] = 0x5F; // pop edi
        counter++;
        //printf("edi! Technic 3 \n");
    }
    // If no match, skip modification

    goto outside;
}

Here is the best one for this situation.
It leaves the stack pointer where it was and changes no flags, so everything is perfect, and on top of that this pattern is used very often.
You could also handle “esp” as the destination. I left it out because I don’t want to mess with esp.
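All the branches above follow one rule: the push opcode is 0x50 plus the source register, and the pop opcode is 0x58 plus the destination register. A compact sketch of that rule, decoding the ModRM byte directly, could look like this. `rewriteMovRegReg` is a hypothetical name; like the article's code it skips esp as the destination, but unlike the first branch above it does not treat `mov eax, eax` specially.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rewrites "mov dst, src" (8B /r, register form) in place as
// "push src; pop dst". Returns true if patched.
bool rewriteMovRegReg(std::uint8_t* insn, std::size_t length) {
    assert(insn != nullptr);
    if (length != 2 || insn[0] != 0x8B) return false;
    const std::uint8_t modrm = insn[1];
    if ((modrm >> 6) != 3) return false;           // only register-to-register moves
    const std::uint8_t dst = (modrm >> 3) & 7;     // reg field: destination for opcode 8B
    const std::uint8_t src = modrm & 7;            // r/m field: source
    if (dst == 4) return false;                    // mov esp, reg is left alone
    insn[0] = static_cast<std::uint8_t>(0x50 + src); // push src
    insn[1] = static_cast<std::uint8_t>(0x58 + dst); // pop dst
    return true;
}
```

For example, `mov eax, ecx` (8B C1) becomes 51 58, which matches `data[1] - 0x70` followed by `pop eax` in the code above.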

            if (data[0] == 0x83 && data[2] == 0x01 && data[1] >= 0xC0 && data[1] <= 0xC7) { // add reg, 0x01
                data[1] = data[1] - 0x80;
                data[0] = 0x90;
                data[2] = 0x90;
                counter++;
                printf("NOT TESTED\n");

                goto outside;
            }

This one does the same job too. The only difference is the Carry Flag (inc leaves it untouched, add sets it). I did not remove this case because it was never matched in my tests. If you manage to catch this pattern, test it, and don’t forget to tell me the result :)
Why did I show examples that are not going to be used?
Because I tried them, so you have to see them!! :)) Just kidding. They also add variety to your output wherever they are suitable.

The opcodes are done. Now we write them to the file. Next step ->
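For completeness, here is the add-to-inc rewrite as a standalone sketch with an explicit length check. `rewriteAddOne` is a hypothetical name, and just like the snippet above it is an untested idea as far as real binaries go: the result differs from the original only in the Carry Flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rewrites "add r32, 1" (83 C0+r 01) in place as "nop; inc r32; nop".
// Returns true if patched. Note: inc does not update the carry flag.
bool rewriteAddOne(std::uint8_t* insn, std::size_t length) {
    assert(insn != nullptr);
    if (length != 3 || insn[0] != 0x83 || insn[2] != 0x01) return false;
    if (insn[1] < 0xC0 || insn[1] > 0xC7) return false; // ModRM for "add reg, imm8"
    const std::uint8_t reg = insn[1] & 7;
    insn[0] = 0x90;                                     // nop
    insn[1] = static_cast<std::uint8_t>(0x40 + reg);    // inc r32
    insn[2] = 0x90;                                     // nop
    return true;
}
```

`add eax, 1` (83 C0 01) becomes 90 40 90, the same bytes as `data[1] - 0x80` surrounded by two nops.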

Cooling Down

Here is just a quick reminder.

We only changed a few bytes of the “.text” section, so everything else goes back into the file unchanged.

How do we write the file with the changes? Here it is ↓

void writeToFile(int* textModifiedBuffer,int size, char* getFileName) {

	std::string filename(getFileName);
	// Read access is enough here, the output goes to a separate file
	HANDLE loadFile = CreateFileA(filename.c_str(), GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

	// check handle
	if (loadFile == INVALID_HANDLE_VALUE)
	{
		printf("[!] Failed to get a handle to the file - Error Code (%lu)\n", GetLastError());
		exit(1); // nothing to close, the handle is invalid
	}

	DWORD nNumberOfBytesToRead = GetFileSize(loadFile, NULL);
	LPVOID lpBuffer = HeapAlloc(GetProcessHeap(), 0, nNumberOfBytesToRead);

	DWORD lpNumberOfBytesRead = { 0 };
	if (!ReadFile(loadFile, lpBuffer, nNumberOfBytesToRead, &lpNumberOfBytesRead, NULL))
	{
		printf("[!] Failed to read the file - Error Code (%d)\n", GetLastError());
		CloseHandle(loadFile);
		exit(1);
	}


	// GAME ZONE

	int* peHeader = (int*)lpBuffer + ((int)((PIMAGE_DOS_HEADER)lpBuffer)->e_lfanew) / 4;

	PIMAGE_SECTION_HEADER imageSectionHeader = { 0 };

	imageSectionHeader = (PIMAGE_SECTION_HEADER)(peHeader + sizeof(IMAGE_NT_HEADERS32) / 4);


	DWORD textBegin = imageSectionHeader->PointerToRawData;
	
	unsigned char* finalArray = (unsigned char*)lpBuffer; // original file pointer
	unsigned char* bytePtr = (unsigned char*)textModifiedBuffer; // it hold changed text section

	for (int i = textBegin,j = 0; i < size + textBegin; i++,j++) { // start at the raw offset of the .text section
		finalArray[i] = bytePtr[j]; // overwrite only the .text section inside the original file buffer
	}
	}

	// GAME ZONE END

	char Createfilename[50] = "output.exe";
	FILE* file = fopen("output.exe", "r");
	int counter = 1;

	// if the file already exists, try the next numbered name
	while (file != NULL) {
		fclose(file);
		sprintf(Createfilename, "output(%d).exe", counter++);
		file = fopen(Createfilename, "r");
	}

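	file = fopen(Createfilename, "wb");
	if (file == NULL) {
		perror("Could not open the output file (it is probably open elsewhere)");
		exit(1);
	}

	
	fwrite(lpBuffer, 1, lpNumberOfBytesRead, file);

	fclose(file);
	CloseHandle(loadFile); // release the input file handle
	HeapFree(GetProcessHeap(), 0, lpBuffer); // release the read buffer
	std::cout << "File saved as: " << Createfilename << std::endl;
}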
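The copy loop in the function above trusts that the modified section fits inside the file. A hedged, portable sketch of the same two ideas (overwrite only the section's raw range, then pick the first free output name) with a bounds check could look like this. `patchSection` and `nextFreeName` are hypothetical names, and the existence check is passed in as a callback so the logic does not depend on any particular file API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Overwrites file[rawOffset ...] with the patched section bytes.
// Returns false (and changes nothing) if the range does not fit in the file.
bool patchSection(std::vector<std::uint8_t>& file, std::size_t rawOffset,
                  const std::vector<std::uint8_t>& patched) {
    if (rawOffset > file.size() || patched.size() > file.size() - rawOffset)
        return false;
    std::copy(patched.begin(), patched.end(),
              file.begin() + static_cast<std::ptrdiff_t>(rawOffset));
    return true;
}

// Returns "output.exe", or the first "output(N).exe" that `exists` rejects.
std::string nextFreeName(const std::function<bool(const std::string&)>& exists) {
    assert(exists); // the callback must be set
    std::string name = "output.exe";
    for (int counter = 1; exists(name); ++counter)
        name = "output(" + std::to_string(counter) + ").exe";
    return name;
}
```

In the real tool the callback would wrap whatever existence test you prefer, for example the `fopen(name, "r")` probe used above.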

We already passed the code section to the disassemble function, and we take the same buffer as input when writing back. But first we check one more condition.
I mentioned the entry point earlier. It is not the same thing as the beginning of the code section.
Why did I use it?
For the case where the section was not disassembled correctly. How do we detect that? Easy: if the counter of patched bytes is zero or very low, the disassembly did not go as it should.
Yes, the disassembly may be correct and the counter may still be zero, I know, but does it matter? If it is zero, it will be zero again when we retry from the entry point. So we just go on.

// ------

    int changedBytes = DisassembleBuffer(&decoder, &data[0], size);// execute function first time
    if (!changedBytes || size / 0x5000 * 25 < changedBytes || changedBytes < 20) {

        printf("Couldn't write anything or an error occurred:'(\nProgram will try it with entrypoint\n");

        exeInfo = PeParser(fileName);

        for (int i = 0; i < size; i++) {
            data[i] = (uint8_t)getChar[i]; // Write original file again if something written before here
        }

        int size2 = 0x1000;

        if (size - exeInfo.entryOffset > 0x10000) 
            size2 = (size - exeInfo.entryOffset) % 0x10000 + 0x3000; // it will be better, just testing
        

        uint8_t* data2 = (uint8_t*)malloc(size2);

        data1 = exeInfo.entryPoint;

        getChar = (char*)data1;

        for (int i = 0; i < size2; i++) 
            data2[i] = (uint8_t)getChar[i];
        
        if (!DisassembleBuffer(&decoder, &data2[0], size2)) // execute function second time
            printf("Write Failed Again!\n");
        
        int entryHex = (int)exeInfo.entryOffset - exeInfo.rawTextoffset;
        //printf("entryHex -> %x\n", entryHex);
        for (int i = entryHex; i < entryHex +size2; i++) 
            data[i] = (uint8_t)data2[i- entryHex];

        

    }

    writeToFile(data,size, fileName);

It is done!

As you can see, 6 files were generated. All of them have a hash different from the original “cmd.exe”, and all of them work fine. The bigger the file, as with “cmd.exe”, the more different files it can generate.
The maximum number of different files is approximately n^replaceable_instructions (taking n as the number of interchangeable forms per instruction), which is why code size is important.

In the image above, the count shown is not the number of replaceable_instructions. The number of replaceable_instructions is approximately double that count, because I change each instruction with a probability of 1/2.
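The counting argument can be sketched in a few lines. This assumes every replaceable instruction has the same number of interchangeable forms (my reading of n in the formula above) and that each site is picked independently with probability 1/2, as described. `maxVariants` and `pickSites` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Upper bound on distinct outputs when each of `sites` replaceable
// instructions has `encodings` interchangeable forms.
// Saturates at UINT64_MAX instead of overflowing.
std::uint64_t maxVariants(std::uint64_t encodings, unsigned sites) {
    assert(encodings >= 1);
    std::uint64_t total = 1;
    for (unsigned i = 0; i < sites; ++i) {
        if (total > UINT64_MAX / encodings) return UINT64_MAX;
        total *= encodings;
    }
    return total;
}

// Decides for each site, independently with probability 1/2,
// whether it gets patched in this run.
std::vector<bool> pickSites(std::size_t sites, std::mt19937& rng) {
    std::bernoulli_distribution coin(0.5);
    std::vector<bool> chosen(sites);
    for (std::size_t i = 0; i < sites; ++i)
        chosen[i] = coin(rng);
    return chosen;
}
```

With two forms per instruction, 64 replaceable instructions are already enough to exceed what a 64-bit counter can hold, and on average about half of the sites end up patched in a single run.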

END

I did not come across examples like this while researching polymorphic code. Most of them work on the source code.
These examples are not enough to make perfect polymorphic code, but they are a good start. Changing the decision logic without breaking the flow would be the best part, but we do not have that yet.

Full code

Here is the GitHub link for my code and binary.

~ Existence is a Fact, Living is an Art. - Frédéric Lenoir
