- This approach was used in ATL/WTL (Active Template Library, Windows Template Library) in the early 2000s. It was a bad idea, because it requires generating executable code at run time, which conflicts with NX-bit memory protection.
Windows actually had a workaround in its NX-bit implementation, in which the fault handler recognized the byte patterns of these trampolines: https://web.archive.org/web/20090123222148/http://support.mi...
- This somewhat reminds me of the old MakeProcInstance mechanism in Win16, which was quickly rendered obsolete by someone who made an important realisation: https://www.geary.com/fixds.html
Another seemingly underutilised feature closely related to {Get,Set}WindowLong is cbClsExtra/cbWndExtra, which lets you allocate additional data associated with a window and store whatever you want there. The indices passed to the GetWindowLong/SetWindowLong functions are quite revealing of how this mechanism works:
https://learn.microsoft.com/en-us/windows/win32/api/winuser/...
- > This is more work than going through GWLP_USERDATA
Indeed, other than as a party trick, why build an executable trampoline at runtime when you can store and retrieve the context, or a pointer to the context, with SetWindowLong() / GetWindowLong() [1]?
Slightly related: in my view Win32 windows are a faithful implementation of the Actor Model. The window proc of a window is mutable: it represents the current behavior and can be changed in response to any received message. While I haven't personally seen this used in Win32 programs, it is a powerful feature, as it allows for implementing interaction state machines in a very natural way (the same way that Miro Samek promotes in his book).
[1] https://learn.microsoft.com/en-us/windows/win32/api/winuser/...
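For what it's worth, the "mutable behavior" idea can be sketched in portable C without any Win32 at all. This is only an illustration: `struct actor`, `send`, the `MSG_*` values and the two behaviors are made-up names, and the `behavior` pointer merely stands in for the window proc. In a real Win32 program the swap would presumably go through SetWindowLongPtr() with GWLP_WNDPROC instead of a plain assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message set for a tiny drag interaction. */
enum msg { MSG_BUTTON_DOWN, MSG_BUTTON_UP, MSG_MOVE };

struct actor;
typedef void (*behavior_fn)(struct actor *self, enum msg m);

struct actor {
    behavior_fn behavior;  /* current behavior, stands in for the window proc */
    int drag_moves;        /* number of moves seen while dragging */
};

static void idle(struct actor *self, enum msg m);
static void dragging(struct actor *self, enum msg m);

/* Idle state: only a button press is interesting. */
static void idle(struct actor *self, enum msg m)
{
    if (m == MSG_BUTTON_DOWN)
        self->behavior = dragging;  /* become "dragging" */
}

/* Dragging state: count moves until the button is released. */
static void dragging(struct actor *self, enum msg m)
{
    if (m == MSG_MOVE)
        self->drag_moves++;
    else if (m == MSG_BUTTON_UP)
        self->behavior = idle;      /* become "idle" again */
}

static void actor_init(struct actor *a)
{
    a->behavior = idle;
    a->drag_moves = 0;
}

/* Deliver one message to whatever the current behavior is. */
static void send(struct actor *a, enum msg m)
{
    assert(a->behavior != NULL);
    a->behavior(a, m);
}
```

Each state is just a function, and a transition is an assignment to `behavior`, so there is no central switch over a state variable. Sending MOVE while idle does nothing; after BUTTON_DOWN the same message is counted.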
- Hah! I usually allocate trampolines at runtime, as the article suggests, but reserving R/W space for them within the application's memory space is a cute trick.
Probably not useful for most of my use cases (I'm usually injecting a payload, so I'd still have the pointer-distance issue between the executable and my payload), but it's still potentially handy. Will have to keep that around!
- > Taking this idea further, I’d like to generate these new functions on demand at run time akin to a JIT compiler
This is cool, but isn’t runtime code generation pretty frowned upon nowadays?
- Or I don't know, just use C++ lambdas instead?
- I hate to say it (and I know a lot of C apologists will downvote it), but C has no native closures. All you have is a function pointer, and you need to manually add a "context" pointer to make it a closure in the strict (textbook) sense. That's because C has no concept of "data ownership", only automatic memory (on the stack or in registers) and manual memory (malloc/sbrk'd blocks), whereas a closure (again, by the textbook definition) requires access to the data of the caller/"parent"/upper layer [^1].
And that's why I generally don't consider C to have closures: getting them requires a JIT/dynamic code generation approach, as this article has actually done (using shadow stacks). There is also a hack in GNU C that introduces nested (local) functions, but it is not in ISO C, and obviously won't be in the next decade or so.
[^1]: https://en.wikipedia.org/wiki/Closure_(computer_programming)
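To make the "function pointer plus manual context pointer" pattern concrete, here is a minimal sketch in plain ISO C. All of the names (`closure_int`, `adder_env`, `make_adder`, `call_closure`, `sum_mapped`) are invented for illustration. Note that the caller has to own the environment and keep it alive, which is exactly the ownership problem described above.

```c
#include <assert.h>
#include <stddef.h>

/* A closure by hand: code pointer plus a pointer to the captured data. */
typedef struct {
    int (*fn)(void *ctx, int arg);
    void *ctx;
} closure_int;

/* Captured environment for an "add n" closure. */
typedef struct {
    int n;
} adder_env;

static int adder_call(void *ctx, int arg)
{
    const adder_env *env = ctx;
    return env->n + arg;
}

/* The caller provides storage for env and must keep it alive
   for as long as the returned closure is in use. */
static closure_int make_adder(adder_env *env, int n)
{
    closure_int c = { adder_call, env };
    env->n = n;
    return c;
}

static int call_closure(closure_int c, int arg)
{
    assert(c.fn != NULL);
    return c.fn(c.ctx, arg);
}

/* Example consumer: applies the closure to each element and sums the results. */
static int sum_mapped(const int *xs, size_t len, closure_int c)
{
    int total = 0;
    for (size_t i = 0; i < len; i++)
        total += call_closure(c, xs[i]);
    return total;
}
```

With `adder_env env; closure_int add5 = make_adder(&env, 5);`, calling `call_closure(add5, 10)` yields 15. The catch is that every API taking a callback has to be designed to carry the extra `ctx` argument, which is the gap that trampolines and runtime code generation try to paper over.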