Messages - Polynomial

Pages: [1]

Projects and Discussion / Re: Simple demonstration of inline ASM efficiency

« on: April 06, 2011, 02:16:24 AM »

As far as I can tell, the primary reason for this being slow in C is that you used a math operation (d[i] -= 4) that performs software integer bounds checks. Disassemble the two resulting binaries to see how it worked. Also, did you compile with gcc -O3 on both?

gh0st - I'd say asm is more powerful than C, simply because you can do things like sysenter/sysexit. However, since you can inline asm into C it's probably best to use C for the most part.

.NET Framework / Re: MD5 - File and string

« on: January 27, 2011, 01:48:40 PM »

You should use UTF8, not ASCII. The reason ASCII was useful in ye olden days was that its 128 characters fit perfectly into 7 bits. The problem was that old legacy 8-bit hardware did annoying things to characters that used the 8th bit too, since this was treated as the sign bit (i.e. 0xFF is -1). These days you're much better off using UTF8 since modern processors don't screw this kind of thing up, and it provides compatibility when someone puts an ß (U+00DF) or something in their string.
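To make the byte-level difference concrete, here's a minimal C sketch. The helper names utf8_encode_small and as_signed_byte are made up for illustration, and the encoder only handles code points below U+0800. It shows that plain ASCII characters pass through UTF8 unchanged, that ß (code point 0xDF) becomes the two bytes 0xC3 0x9F, and what the old "sign bit" reading of 0xFF looks like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode a code point below U+0800 as UTF-8.
 * Returns the number of bytes written to out (1 or 2), or 0 if the
 * code point is outside the range this sketch handles. */
size_t utf8_encode_small(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {                 /* 7-bit ASCII: stored unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                /* two bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    return 0;
}

/* The "sign bit" problem: the same byte read as a signed 8-bit value. */
int as_signed_byte(unsigned char b)
{
    return b < 0x80 ? (int)b : (int)b - 256;
}
```

Anything that fits in 7 bits is byte-for-byte identical in both encodings, so switching to UTF8 costs nothing for plain English strings.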

Just as another quick idea for those reading this, you can actually replace MD5CryptoServiceProvider with any of the standard hash functions in the .NET framework. For example, the whole SHA family will work here - just replace MD5CryptoServiceProvider with SHA1Managed, SHA256Managed, SHA384Managed or SHA512Managed and it'll work perfectly.

Operating System / Stuff you (probably) didn't know about Windows

« on: January 24, 2011, 02:34:20 AM »

Introduction
I'm starting this topic in hope of slowly populating it with some cool Windows Internals info and tricks that aren't widely known. Every so often I'll come in and post a new update with a new cool bit of functionality I've reverse engineered or discovered.

Required knowledge
A minimum of intermediate C and general systems programming should be enough to get you by in here. I aim to explain things reasonably thoroughly and simply so hopefully you shouldn't have too tough a time deciphering my mad rants. If you have any questions or ideas, please reply to the thread. Keep it on-topic though!

Warranty
I offer no warranty for this information. It's all pulled out of late night debugging sessions and my own intuition, so there may well be mistakes and confusion. I may have also used tarot cards and a ouija board as technical references at one point. Don't use this stuff in your KMDs. Don't quote this stuff to sound clever in front of your driver developer friends. In fact, don't use this stuff at all. It's brain food designed to help you understand how Windows works inside, nothing more. If you use anything I write in here and your computer halts and catches fire, a wild badger cockblocks you at Mardi Gras, or nazi aliens rape your uncle then it is entirely your own fault for not listening to me when I said you should never use this stuff. Mkay!? Mkay. Now to the serious stuff.

Process and Thread IDs - What are they and how are they allocated?
This question had been bugging me for a month or two until a couple of days ago when I tackled the problem head on and found out the answer. I had sat looking at my task manager and noticed that whilst new process IDs were usually larger than old ones, the amount they increased by was non-uniform and sometimes they made backward leaps. So, what internal mechanism is used to allocate new pIDs and tIDs in Windows, and what do process IDs really mean in the context of the kernel?

First I tried asking a few people in the know about this kind of thing. My first port of call was Dark_Byte, the creator of Cheat Engine. I talk to him online on a regular basis and he's pretty much an encyclopedia of low level programming. Unfortunately the problem stumped him, too, since he'd never really looked into it. He also posed some questions of his own and made the point that even after a process exits you can query its exit status via its process ID, so the IDs must remain cached somewhere and there must be some protocol as to when they can be disposed of or re-used. During our conversation I also suggested that process and thread IDs may actually come from the same pool, hence the seemingly non-uniform process IDs.
I then asked Mark Russinovich (the Microsoft Sysinternals guy who wrote the Windows Internals book) who explained that pIDs and tIDs are simply indices in certain object tables in the kernel, but couldn't go into the details. Damn the limitations of working for Microsoft! On a side note, it was REALLY cool to actually get a response from him, since he's practically my idol.

Since asking people didn't really give me a decent answer, I had to go my own route. After reading the leaked NT4 kernel source code (you should be able to find this on ThePirateBay), doing a little reversing on Win7 and a few hours playing with my results in WinDbg I can confidently say that I know the flow of how pIDs and tIDs are created.

The kernel needs to be able to generate a sequence of process and thread IDs that are unique across the whole system. To efficiently and safely do this, the kernel creates a pool of IDs that can be used for both processes and threads. This pool is exported in the kernel as a HANDLE_TABLE object called PspCidTable. During Phase0 startup of the system, the PspInitPhase0 function is called. This function creates a HANDLE_TABLE object using ExCreateHandleTable, which automatically populates the table with 65536 entries. Each entry is a 16-bit unsigned integer (at least it is on a 32-bit OS) stored inside a list item object that is part of a doubly linked list. Both process and thread IDs come from the PspCidTable pool.
When a new ID is needed, one can be obtained using ExCreateHandle, which removes a handle from the PspCidTable pool and makes it active. The logic for disposal of these IDs is somewhat obscure, and seems to work rather like a garbage collector. When no handles in object handle tables reference the process ID, all references to it are disposed of and it can be dropped or re-used.
If the PspCidTable reaches a critically low level (i.e. there are very few remaining usable pIDs) it automatically expands to fill with unused IDs that were disposed of previously.
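As a rough illustration of the shared pool idea, here's a toy model in C. It is not the kernel's HANDLE_TABLE: the names IdPool, pool_init, pool_alloc and pool_free are invented, the pool is tiny, and handing a freed ID straight back out (last in, first out) is an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8   /* tiny on purpose; the post describes 65536 entries */

/* Toy model only: one free list shared by "processes" and "threads". */
typedef struct {
    unsigned short free_ids[POOL_SIZE]; /* stack of currently unused IDs */
    size_t free_count;
} IdPool;

void pool_init(IdPool *p)
{
    assert(p != NULL);
    /* Fill the stack so that the lowest ID is handed out first. */
    for (size_t i = 0; i < POOL_SIZE; i++)
        p->free_ids[i] = (unsigned short)(POOL_SIZE - i);
    p->free_count = POOL_SIZE;
}

/* Take an ID out of the pool; returns 0 when the pool is exhausted. */
unsigned short pool_alloc(IdPool *p)
{
    assert(p != NULL);
    if (p->free_count == 0)
        return 0;
    return p->free_ids[--p->free_count];
}

/* Return an ID to the pool so that a later allocation can reuse it. */
void pool_free(IdPool *p, unsigned short id)
{
    assert(p != NULL && p->free_count < POOL_SIZE);
    p->free_ids[p->free_count++] = id;
}
```

If a process takes ID 1, its first thread takes 2 and the next process takes 3, the process IDs alone go 1, 3, which is the non-uniform increase seen in task manager. Free ID 2 and the next allocation gets 2 again, which would look like a backward leap.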

And that's pretty much it for now. If I have any updates to this, I will post them at a later date. I'll also soon be posting more cool facts about Windows Internals that aren't really covered anywhere else.

Assembly - Embedded / [Win32 x86] Performing a detour

« on: January 20, 2011, 04:00:44 AM »

A detour is a way of causing an API call to go through a different set of code to filter or modify the call in some way. Let's say for the sake of example that you want to detour kernel32.dll!ReadProcessMemory to act like it failed every time.

Here's an example call to ReadProcessMemory:

Code: [Select]

; assume pID of target is in eax
push eax           ; dwProcessId = eax
push 0             ; bInheritHandle = false
push 38h           ; dwDesiredAccess = PROCESS_VM_READ | PROCESS_VM_WRITE | PROCESS_VM_OPERATION
call kernel32!OpenProcess
push 0h            ; lpNumberOfBytesRead = NULL
push 100h          ; nSize = 256 bytes (100h)
push 2010A0h       ; lpBuffer = 0x002010A0 [address of buffer where we store result]
push 12345h        ; lpBaseAddress = 0x00012345 [base address we want to read]
push eax           ; hProcess = eax
call kernel32!ReadProcessMemory

Notice that the prototypes specified on MSDN for the functions mentioned here have their parameters in the opposite order to the code above. This is because in the stdcall calling convention (the standard for Windows) parameters are pushed onto the stack in reverse order.
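To see why the reversed pushes line up with the MSDN prototype, here's a small C simulation of the stack for the ReadProcessMemory call in the listing above. SimStack, sim_push, sim_peek and sim_call_rpm are hypothetical names, and the handle and return address values passed in are placeholders, not real ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_SLOTS 16

/* Simulated x86 stack: grows downward, one slot per DWORD. */
typedef struct {
    uint32_t slot[STACK_SLOTS];
    size_t esp;                 /* index of the top-of-stack slot */
} SimStack;

void sim_init(SimStack *s)
{
    assert(s != NULL);
    s->esp = STACK_SLOTS;       /* empty stack */
}

void sim_push(SimStack *s, uint32_t value)
{
    assert(s != NULL && s->esp > 0);
    s->slot[--s->esp] = value;
}

/* Read the DWORD at [esp + 4*index]. */
uint32_t sim_peek(const SimStack *s, size_t index)
{
    assert(s != NULL && s->esp + index < STACK_SLOTS);
    return s->slot[s->esp + index];
}

/* Push the five ReadProcessMemory arguments right to left, then the
 * return address that the call instruction would push. */
void sim_call_rpm(SimStack *s, uint32_t hProcess, uint32_t return_addr)
{
    sim_push(s, 0);             /* lpNumberOfBytesRead */
    sim_push(s, 0x100);         /* nSize */
    sim_push(s, 0x2010A0);      /* lpBuffer */
    sim_push(s, 0x12345);       /* lpBaseAddress */
    sim_push(s, hProcess);      /* hProcess */
    sim_push(s, return_addr);   /* pushed by call */
}
```

On entry the callee finds the return address at [esp], hProcess at [esp+4], lpBaseAddress at [esp+8] and so on, so in memory the parameters sit in the same order as the prototype lists them.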

There are two ways to create the detour code. One is to utilize the empty padding at the end of the .code section (usually filled with zeros) in order to patch the binary executable on disk. Another method is to create a runtime patch by injecting a DLL or by remotely allocating a block of executable memory in the process, writing the code there, then calling CreateRemoteThread to perform the reroute. I'm going to show you the former method, in which we patch the executable to fake a failure.

To find the area of code we want to patch, simply load up the program in OllyDbg and scroll to the bottom of the .code section in the disassembler. You'll see a bunch of add instructions (they're actually just a representation of lots of zeros!) which can be overwritten. Scroll to where these instructions start, so you can maximize the amount of space you have to play with. These instructions are never executed normally, so we don't have to take any additional safety measures.
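If you'd rather locate that padding programmatically than scroll for it, something like the following C sketch would do. It assumes you have already read the raw bytes of the code section into a buffer; find_trailing_cave is a made-up helper, not part of OllyDbg or any Windows API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given the raw bytes of a code section, return the
 * offset where the trailing run of zero bytes starts. *cave_len receives
 * the length of that run (0 if the section does not end in zeros). */
size_t find_trailing_cave(const unsigned char *section, size_t size,
                          size_t *cave_len)
{
    assert(section != NULL && cave_len != NULL);
    size_t start = size;
    /* Walk backwards from the end while the bytes are still zero. */
    while (start > 0 && section[start - 1] == 0x00)
        start--;
    *cave_len = size - start;
    return start;
}
```

Each pair of zero bytes (00 00) disassembles as add byte ptr [eax], al, which is why the debugger shows a wall of add instructions. The last real instruction may itself end in zero bytes (an immediate operand of 0, for instance), so it's worth leaving a few bytes of margin after the offset this returns.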

In the case of patching ReadProcessMemory, we look at the parameter list and notice that it takes five parameters on the stack. The call will also have pushed the return address on top of those, so we need to pop that off safely. Under stdcall the only registers we can freely clobber are eax, ecx and edx; all other registers are expected to remain untouched (unless we start using pushad/popad). We only need eax here, and it carries the return value anyway. What we need to do is pop off the return address and store it, then remove the parameters, push the return address back onto the stack, set eax to zero (ReadProcessMemory returns a BOOL, so zero means failure) and return. In this case we will also use SetLastError to fake an error code.

Code: [Select]

pop eax            ; store return address in eax
add esp, 14h       ; remove the next 5 DWORDs from the stack (5x4 = 20d = 14h)
push eax           ; push return address back onto stack
push 5h            ; dwErrCode = ERROR_ACCESS_DENIED
call kernel32.SetLastError
xor eax, eax       ; eax = 0 (FALSE) to fake the failure
ret                ; return to original code
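To sanity-check the stack handling in that stub, here's a small C model of just the stack effects. Stack, stack_init, push, pop and run_stub are invented names, and the SetLastError call is left out because a stdcall callee removes its own argument, leaving esp where it was.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 16

/* Simulated x86 stack: grows downward, one slot per DWORD. */
typedef struct {
    uint32_t slot[SLOTS];
    size_t esp;                 /* index of the top-of-stack slot */
} Stack;

void stack_init(Stack *s)
{
    assert(s != NULL);
    s->esp = SLOTS;
}

void push(Stack *s, uint32_t v)
{
    assert(s != NULL && s->esp > 0);
    s->slot[--s->esp] = v;
}

uint32_t pop(Stack *s)
{
    assert(s != NULL && s->esp < SLOTS);
    return s->slot[s->esp++];
}

/* Mirrors the stub's stack operations; returns the address ret jumps to. */
uint32_t run_stub(Stack *s)
{
    uint32_t eax = pop(s);      /* pop eax: save the return address */
    assert(s->esp + 5 <= SLOTS);
    s->esp += 5;                /* add esp, 14h: drop five DWORD arguments */
    push(s, eax);               /* push eax: put the return address back */
    return pop(s);              /* ret: pop it and jump there */
}
```

After the stub runs, the caller's stack is exactly as it was before it pushed the five arguments, which is what stdcall requires of the callee.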

In OllyDbg you can save this patch to the file, then modify the calls in the process to jump to your routine instead of the actual ReadProcessMemory API.
