Yes it is very strange.
On Mar 18, 2021, at 7:17 PM, Yao, Jiewen <jiewen.yao@...> wrote:
CR0.WP is set. But the page table protection may be turned OFF/ON again if the CPU driver needs to update it to protect an EFI image. Maybe it is a bug somewhere.
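As a quick cross-check of the statement above, here is a minimal sketch that decodes a few CR0 flags from the CR0 value in the register dump quoted further down in this thread (0000000080010033). The bit positions (PE = bit 0, WP = bit 16, PG = bit 31) follow the x86 architecture definition; the helper name cr0_flags is made up for illustration and is not part of edk2.

```python
# CR0 value copied from the register dump in this thread.
CR0 = 0x80010033

# Architectural CR0 bit positions (subset only).
CR0_BITS = {"PE": 0, "WP": 16, "PG": 31}


def cr0_flags(cr0):
    """Return a dict mapping each flag name to True/False for the given CR0."""
    return {name: bool((cr0 >> bit) & 1) for name, bit in CR0_BITS.items()}


if __name__ == "__main__":
    for name, value in cr0_flags(CR0).items():
        print(f"CR0.{name} = {int(value)}")
```

This only reflects the state at the time of the exception; it says nothing about whether WP was toggled earlier.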
I just read the final debug log to see the final result.
RSVD in the exception data is weird. I think we need to confirm that first.
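For reference, a minimal sketch that decodes the ExceptionData value from the dump (0000000000000009) into the individual page-fault error code bits. The bit layout is the architectural x86 #PF error code; the function name decode_pf_error_code is hypothetical.

```python
# Architectural x86 page-fault error code bits, named as in the edk2 dump line.
PF_BITS = {
    "P": 0,     # 1 = fault on a present page (protection violation)
    "W": 1,     # 1 = access was a write
    "U": 2,     # 1 = access came from user mode
    "R": 3,     # 1 = reserved bit set in a paging-structure entry (RSVD)
    "I": 4,     # 1 = instruction fetch
    "PK": 5,    # 1 = protection-key violation
    "SS": 6,    # 1 = shadow-stack access
    "SGX": 15,  # 1 = SGX-specific access-control violation
}


def decode_pf_error_code(code):
    """Return a dict of flag name -> 0/1 for a #PF error code."""
    return {name: (code >> bit) & 1 for name, bit in PF_BITS.items()}


if __name__ == "__main__":
    decoded = decode_pf_error_code(0x9)
    print(" ".join(f"{name}:{value}" for name, value in decoded.items()))
```

For 0x9 only P and R (RSVD) are set, which matches the "I:0 R:1 U:0 W:0 P:1" line in the log.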
From: Andrew Fish <afish@...>
Sent: Friday, March 19, 2021 9:44 AM
To: Yao, Jiewen <jiewen.yao@...>
Cc: firstname.lastname@example.org; annie.li@...; Laszlo Ersek
<lersek@...>; Wang, Jian J <jian.j.wang@...>; Aaron Young
Subject: Re: [edk2-discuss] Windows guest fails to boot into recovery mode due
to commit 5267926
Stupid question? Is there a reason the page tables are not write protected and
the write to the page table would fault?
On Mar 18, 2021, at 5:37 PM, Yao, Jiewen <jiewen.yao@...> wrote:
Hi Arie
I added some of my thoughts in the Bugzilla. -
If you can dump paging structure info for further analysis, we can help to
From: email@example.com <firstname.lastname@example.org> On Behalf Of annie
Sent: Friday, March 19, 2021 3:27 AM
To: Laszlo Ersek <lersek@...>; email@example.com
Cc: Wang, Jian J <jian.j.wang@...>; Andrew Fish <afish@...>;
Aaron Young <aaron.young@...>
Subject: Re: [edk2-discuss] Windows guest fails to boot into recovery mode due
to commit 5267926
In my previous email, the exception was reproduced with a pretty old code base,
from where I started bisecting the commits. This time I reproduced this issue
with the code of branch 'stable/202011' of upstream. All the log I am
is from this code base(75ab038). Since the overall size of all log is pretty big,
attach all the data you required in to this
I dump the register by qmp-regdump, and the result(regdump) is uploaded
this bug. If this log doesn't suffice, can you please suggest the way you prefer?
The objdump is uploaded, as well as the details of page fault exception,
check the details there.
From: Laszlo Ersek [mailto:lersek@...]
Sent: Thursday, March 18, 2021 9:23 AM
To: Annie Li <annie.li@...>; firstname.lastname@example.org
Cc: jian.j.wang@...; Andrew Fish <afish@...>
Subject: Re: Windows guest fails to boot into recovery mode due to commit 5267926
On 03/18/21 02:48, Annie Li wrote:
I ran into a Windows boot failure (a page fault exception),
and narrowed it down to the following patch,
MdeModulePkg/DxeIpl: support more NX related PCDs
This issue always happens after QMP is terminated by <ctrl-C> twice, see
should boot into recovery mode in this round, and this is due to the previous
1. Boot Windows VM up, and <ctrl-C> to exit the QMP
2. Repeat 1
3. Boot Windows VM, and this page fault issue happens. (Note: Windows
consecutive boot failure, see
During the above 3 Windows boot procedures, the values of the following
variables are always the same:
PcdSetNxForStack 0
PcdDxeNxMemoryProtectionPolicy 0
PcdImageProtectionPolicy 2
However, Windows guest fails to boot up into recovery mode in the 3rd
due to the patch above(5267926). I modified the return value to
(PcdSetNxForStack)" in function "IsEnableNonExecNeeded" in
MdeModulePkg/Core/DxeIplPeim/X64/VirtualMemory.c, this page fault issue is
gone with this change. The patch (5267926) is for fixing bug
sjSaIZXk_W_MusXNUlQBxGqsJBTLSxdsog$ , where the comments show
PcdImageProtectionPolicy needs also to enable NXE. But this does cause the
page fault exception in this scenario, any suggestion?
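To make the difference between the two behaviours concrete, here is a simplified model in Python. It is an assumption based only on the description in this message (the patch lets PcdDxeNxMemoryProtectionPolicy and PcdImageProtectionPolicy also enable NXE, while my local change looks at PcdSetNxForStack only). It is not the actual edk2 source, it ignores any CPU capability check, and both function names are invented.

```python
def nx_needed_with_patch(set_nx_for_stack, dxe_nx_policy, image_protection_policy):
    """Model of the patched behaviour: any of the three PCDs enables NXE."""
    return bool(set_nx_for_stack) or dxe_nx_policy != 0 or image_protection_policy != 0


def nx_needed_stack_only(set_nx_for_stack):
    """Model of the local modification: only PcdSetNxForStack is considered."""
    return bool(set_nx_for_stack)


if __name__ == "__main__":
    # PCD values reported for all three boots in this thread.
    pcds = {"set_nx_for_stack": 0, "dxe_nx_policy": 0, "image_protection_policy": 2}
    print("with patch :", nx_needed_with_patch(**pcds))
    print("stack only :", nx_needed_stack_only(pcds["set_nx_for_stack"]))
```

With the PCD values above, the modelled patched check asks for NXE and the stack-only check does not, which is the only configuration difference between the failing and the working build in my test.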
The page fault exception is pasted here,
!!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000009 I:0 R:1 U:0 W:0 P:1 PK:0 SS:0 SGX:0
RIP - 000000003E0A7C75, CS - 0000000000000038, RFLAGS -
0000000000010202 RAX - 8000000000000003, RCX - 0000000000000001,
- 0000000001040001 RBX - 0000000000000001, RSP - 00000000001A6AA0,
RBP - 0000000001040001 RSI - 000000003F2E2010, RDI -
R8 - 0000000000000000, R9 - 000000003E0AEC90, R10 -
R11 - 00000000001A6E90, R12 - 0000000000000000, R13 -
R14 - 00000000001A6B28, R15 - 00000000001AB000
DS - 0000000000000030, ES - 0000000000000030, FS -
GS - 0000000000000030, SS - 00000000000000300000000000000000
CR0 - 0000000080010033, CR2 - 000000003F2E2010, CR3 -
CR4 - 0000000000040668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 -
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 -
0000000000000400 GDTR - 000000003F1EE698 0000000000000047, LDTR -
IDTR - 000000003ECCA018 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 00000000001A6700
!!!! Find image based on IP(0x3E0A7C75) /builddir/build/BUILD/edk2-
(ImageBase=000000003E0A5000, EntryPoint=000000003E0A86E8) !!!!
In addition to what Andrew said, I suggest the following:
(1) Please rebuild OVMF *locally*, using the same edk2 tree, and the same
toolchain, and the same "build" flags.
(2) Reproduce the issue, capture the register dump.
(3) Run the following command:
objdump -f -S
The point of this exercise is to reproduce the issue with such an OVMF build for
which you have a matching "TerminalDxe.debug" file. Once you do that, you can
run "objdump" on the ".debug" file, and get a disassembly of the TerminalDxe
driver, interleaved with the C language source code.
Then, we can do two things:
- we can verify whether (EntryPoint - ImageBase), from the register dump,
matches the (relative) "start address" that "objdump -f" reports,
- we can take the crash offset (RIP - ImageBase), from the register dump, and
use that offset into the "objdump -S" disassembly, to narrow down what the
terminal driver may have been doing to trigger the crash.
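The two subtractions above can be sketched as follows, using the ImageBase, EntryPoint and RIP values from the register dump in this thread. The helper name rel_offset is made up for illustration; the results are only meaningful if the ".debug" file matches the firmware binary that crashed.

```python
# Values copied from the register dump in this thread.
IMAGE_BASE = 0x3E0A5000
ENTRY_POINT = 0x3E0A86E8
RIP = 0x3E0A7C75


def rel_offset(address, image_base):
    """Return the offset of an absolute address relative to the image base."""
    if address < image_base:
        raise ValueError("address lies below the image base")
    return address - image_base


if __name__ == "__main__":
    # Compare this against the "start address" reported by "objdump -f".
    print(f"EntryPoint - ImageBase = 0x{rel_offset(ENTRY_POINT, IMAGE_BASE):X}")
    # Look this offset up in the "objdump -S" disassembly.
    print(f"RIP - ImageBase        = 0x{rel_offset(RIP, IMAGE_BASE):X}")
```

If the disassembly uses a non-zero link base, the offsets have to be added to that base before searching the listing.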
It's not necessarily the terminal driver's fault that we encounter a crash, but
knowing what TerminalDxe was up to might shed light on the actual reason.
It's of course also possible that TerminalDxe *is* at fault. We'll see.
If possible, please post:
- your precise edk2 version (if you have local patches, it would be best to
reproduce with an upstream-only tree),
- your full firmware log (feel free to compress it),
- the register dump from serial,
- the objdump (disassembly) output (feel free to compress it).