Stack unwinding in SMM


mick21@...
 

Hello everyone,

I'm trying to perform some form of stack unwinding in SMM, but so far without success. I tried LLVM's __builtin_return_address(), but it only returns a return address for level 0. Looking at the assembly, some function epilogues start with "lea 0x8(%rbp),%rsp" followed by a number of pop instructions, while others start with "add $0x20,%rsp" followed by a number of pop instructions. This suggests that unwinding the stack manually is cumbersome, as there are many cases to cover.

I was wondering whether there is a better way to retrieve a call stack from arbitrary places in an SMM driver. I'm not sure where to start. From what I found online, I would likely need unwind tables (which seem to be enabled in EDK II: https://github.com/tianocore/edk2/commit/77b738b36f5f19b33a77ff74ce822c9c5a050876), but I don't know how to use them in the EDK II project or whether they are even present in my .efi file. The sections available in my .efi file are .text, .rdata, .data, and .reloc.

Kind regards,

Mick


Andrew Fish
 

On Jul 1, 2021, at 4:08 AM, mick21@live.nl wrote:

> Hello everyone,
>
> I'm trying to perform some form of stack unwinding in SMM, but so far without success. I tried LLVM's __builtin_return_address(), but it only returns a return address for level 0. Looking at the assembly, some function epilogues start with "lea 0x8(%rbp),%rsp" followed by a number of pop instructions, while others start with "add $0x20,%rsp" followed by a number of pop instructions. This suggests that unwinding the stack manually is cumbersome, as there are many cases to cover.

Mick,

It might be possible to turn on frame pointers via: -fno-omit-frame-pointer

The Xcode flavor of clang defaults to emitting the frame pointer so __builtin_return_address() works as expected for N levels, and you can manually walk the stack in code.

The other option is to use the debugger to walk the unwind and debug info to get the stack frames.

FYI for Xcode the pattern is:

pushq %rbp
movq %rsp, %rbp

popq %rbp
retq

So %rbp is a frame pointer, and the return address is on the stack from the call.
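
With that prologue, each frame stores the caller's %rbp at 0(%rbp) and the return address at 8(%rbp), so the frames form a linked list that code can follow. A minimal sketch of such a walk, assuming every function on the call chain uses exactly this prologue (which, per the follow-up in this thread, is not what the Win64-style frames in the edk2 build look like). The Frame type and the WalkFrames name are hypothetical, and no check against the actual SMM stack bounds is made:

```c
#include <stddef.h>
#include <stdint.h>

/* Layout found at %rbp after "pushq %rbp; movq %rsp, %rbp".
 * Assumption: every function in the chain uses this prologue. */
typedef struct Frame {
    struct Frame *Prev;   /* saved %rbp of the caller */
    void         *Return; /* return address pushed by the call */
} Frame;

/* Hypothetical helper: stores up to Max return addresses in Out,
 * innermost first, and returns how many were stored. */
static size_t WalkFrames(const Frame *Fp, void **Out, size_t Max)
{
    size_t Count = 0;

    while (Fp != NULL && Count < Max) {
        /* A misaligned frame pointer means the chain is not what we assumed. */
        if (((uintptr_t)Fp % sizeof(void *)) != 0) {
            break;
        }
        Out[Count++] = Fp->Return;
        /* The stack grows down, so a caller's frame must sit at a higher
         * address. This also stops on a NULL link and on cycles. */
        if ((uintptr_t)Fp->Prev <= (uintptr_t)Fp) {
            break;
        }
        Fp = Fp->Prev;
    }
    return Count;
}
```

In a real build the starting point would be something like WalkFrames(__builtin_frame_address(0), Addrs, 4), which is only meaningful if the code was compiled so that %rbp really is set up as shown above.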

> I was wondering whether there is a better way to retrieve a call stack from arbitrary places in an SMM driver. I'm not sure where to start. From what I found online, I would likely need unwind tables (which seem to be enabled in EDK II: https://github.com/tianocore/edk2/commit/77b738b36f5f19b33a77ff74ce822c9c5a050876), but I don't know how to use them in the EDK II project or whether they are even present in my .efi file. The sections available in my .efi file are .text, .rdata, .data, and .reloc.

Caveat emptor, I’m speculating here...

I’m not sure the unwind tables are part of the ABI; they are likely treated more like debugging info. I did find this [1]. In the old days the MSFT PDB format was a moving target and you needed a DLL to parse it; it might have become more stable over the years.
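
On the question of whether the tables are present in a given .efi file: in an x64 PE32+ image, the function table described in [1] is located through the Exception Directory (data directory index 3) of the optional header, and is usually backed by a .pdata section. A minimal sketch of that check, assuming the raw file contents are already in a buffer; ExceptionDirectorySize and the read helpers are hypothetical names, and the header offsets are the standard PE32+ ones:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian reads that do not depend on host alignment. */
static uint16_t ReadU16(const uint8_t *P)
{
    return (uint16_t)(P[0] | (P[1] << 8));
}

static uint32_t ReadU32(const uint8_t *P)
{
    return (uint32_t)P[0] | ((uint32_t)P[1] << 8) |
           ((uint32_t)P[2] << 16) | ((uint32_t)P[3] << 24);
}

/* Returns the size in bytes of the Exception Directory, or 0 if the
 * buffer is not a PE32+ image or the directory is absent or empty. */
static uint32_t ExceptionDirectorySize(const uint8_t *Image, size_t Length)
{
    uint32_t       PeOff;
    const uint8_t *Opt;

    if (Length < 0x40 || Image[0] != 'M' || Image[1] != 'Z') {
        return 0;
    }
    PeOff = ReadU32(Image + 0x3C); /* e_lfanew */
    /* Signature (4) + COFF header (20) + optional header up to the end
     * of data directory 3 (112 + 4 * 8). */
    if (PeOff > Length || Length - PeOff < 4 + 20 + 112 + 4 * 8) {
        return 0;
    }
    if (memcmp(Image + PeOff, "PE\0\0", 4) != 0) {
        return 0;
    }
    Opt = Image + PeOff + 24;
    if (ReadU16(Opt) != 0x20B) {       /* PE32+ magic */
        return 0;
    }
    if (ReadU32(Opt + 108) <= 3) {     /* NumberOfRvaAndSizes */
        return 0;
    }
    return ReadU32(Opt + 112 + 3 * 8 + 4); /* Size field of directory 3 */
}
```

A result of 0 for an image that only has .text, .rdata, .data, and .reloc would suggest the tables were not emitted or were stripped during the build.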

I think the unwind tables are there to support partial debugging (walking a stack frame), but the unwind info is processed by something in the runtime or the debugger. I don’t know of any edk2 code that unwinds. I did notice this [2], so you may want to look at that?
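
For a sense of what that runtime processing starts with: [1] describes the function table as an array of entries sorted by start address, each holding a function's start RVA, end RVA, and the RVA of its unwind info. Finding the entry that covers a given instruction address is then a binary search. A minimal sketch, assuming the table has already been located in the loaded image; the type and function names here are hypothetical, and interpreting the unwind codes themselves is not shown:

```c
#include <stddef.h>
#include <stdint.h>

/* One function table entry as laid out in [1]; all fields are RVAs. */
typedef struct {
    uint32_t BeginAddress; /* first byte of the function */
    uint32_t EndAddress;   /* one past the last byte of the function */
    uint32_t UnwindData;   /* location of the function's unwind info */
} RuntimeFunction;

/* Hypothetical helper: returns the entry whose [Begin, End) range
 * contains Rva, or NULL if no entry covers it. Assumes the table is
 * sorted by BeginAddress with non-overlapping ranges. */
static const RuntimeFunction *FindRuntimeFunction(
    const RuntimeFunction *Table, size_t Count, uint32_t Rva)
{
    size_t Lo = 0;
    size_t Hi = Count;

    while (Lo < Hi) {
        size_t Mid = Lo + (Hi - Lo) / 2;

        if (Rva < Table[Mid].BeginAddress) {
            Hi = Mid;
        } else if (Rva >= Table[Mid].EndAddress) {
            Lo = Mid + 1;
        } else {
            return &Table[Mid];
        }
    }
    return NULL;
}
```

The Rva argument would be a return address minus the image base. A NULL result means no entry covers that address, for example because it lies outside the image or in a gap between functions.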

[1] https://docs.microsoft.com/en-us/cpp/build/exception-handling-x64?view=msvc-160
[2] https://www.nongnu.org/libunwind/

Thanks,

Andrew Fish





mick21@...
 

Hi Andrew,

Thank you for your reply :). I should have been more specific: I have instrumented an SMM driver and want to retrieve stack traces at arbitrary points during execution from within the driver itself, so a debugger is not an option. I'm currently building edk2 on Arch with LLVM 9.0.0, but I could not get __builtin_return_address(N) to return addresses for N > 0. It's interesting that it works with Xcode; I thought edk2 forced the compiler to emit function prologues and epilogues as described in [1], and that this was causing __builtin_return_address() to fail.

> I’m not sure the unwind tables are part of the ABI, they are likely treated more like debugging info.

This is useful to know, I had assumed this was the case.

I'm unfamiliar with how Xcode works, but if __builtin_return_address() works for you, then I should be able to get the same result by changing some command-line flags, unless something else causes the difference? What I would like to do is retrieve the first four return addresses.

I already tried supplying "-fno-omit-frame-pointer", but the resulting frame pointer is set up as described in [2], so I would still have to find out how many local variables each function uses (see the picture there: the frame pointer of function B sits below the return address, separated from it by a variable number of local variables and saved nonvolatile registers). Correct me if I'm wrong, but if I cannot use __builtin_return_address(), then I will need a manual implementation, assuming unwind tables are not an option.

Sincerely,

Mick

[1] https://docs.microsoft.com/en-us/cpp/build/prolog-and-epilog?view=msvc-160
[2] https://docs.microsoft.com/en-us/cpp/build/stack-usage?view=msvc-160