[Bug 3974] New: enabling edk2 DEBUG_VERBOSE will cause pagefault and linux kernel panic


bugzilla-daemon@...
 

https://bugzilla.tianocore.org/show_bug.cgi?id=3974

Bug ID: 3974
Summary: enabling edk2 DEBUG_VERBOSE will cause pagefault and
linux kernel panic
Product: EDK2
Version: Current
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: major
Priority: Lowest
Component: Code
Assignee: unassigned@...
Reporter: yuanyu@...
CC: edk2+bugs+int+994+563148131503455288@groups.io

Summary:
If DEBUG_VERBOSE is turned on, booting Linux ends in a kernel panic.

How to repro:
1) Enable DEBUG_VERBOSE, e.g.,
gEfiMdePkgTokenSpaceGuid.PcdDebugPrintErrorLevel|0x8040004F
2) Boot a Linux guest under QEMU:
`qemu-system-x86_64 -bios Build/OvmfX64/DEBUG_GCC5/FV/OVMF.fd -hda debian-11.img --enable-kvm -m 4G -nographic -debugcon file:debug.log -global isa-debugcon.iobase=0x402`
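
For reference, the mask in step 1 can be broken down into its individual debug levels. The following is a minimal sketch, not EDK2 code: the bit values are the ones I believe MdePkg/Include/Library/DebugLib.h defines, only the levels relevant to this mask are listed, and the helper name print_debug_levels is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Subset of the DEBUG_* error-level bits (assumed to match DebugLib.h). */
struct debug_level { uint32_t bit; const char *name; };

static const struct debug_level kLevels[] = {
  { 0x00000001u, "DEBUG_INIT"    },
  { 0x00000002u, "DEBUG_WARN"    },
  { 0x00000004u, "DEBUG_LOAD"    },
  { 0x00000008u, "DEBUG_FS"      },
  { 0x00000040u, "DEBUG_INFO"    },
  { 0x00400000u, "DEBUG_VERBOSE" },
  { 0x80000000u, "DEBUG_ERROR"   },
};

/* Hypothetical helper: prints every listed level that is set in mask and
   returns the bits of mask that the table above does not cover. */
static uint32_t print_debug_levels(uint32_t mask)
{
  uint32_t remaining = mask;
  for (size_t i = 0; i < sizeof kLevels / sizeof kLevels[0]; i++) {
    if (mask & kLevels[i].bit) {
      printf("0x%08x %s\n", (unsigned)kLevels[i].bit, kLevels[i].name);
      remaining &= ~kLevels[i].bit;
    }
  }
  return remaining;
}
```

Calling print_debug_levels(0x8040004F) lists all seven levels above and returns 0, i.e. the mask is exactly those seven bits, with 0x00400000 being the DEBUG_VERBOSE bit.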

Note that only kernels that include this commit will panic:
https://github.com/torvalds/linux/commit/08cfb38f3ef49cfd1bba11a00401451606477d80
(committed on Nov 30, 2018)

Analysis:

After the recent TDX changes, DEBUG ends up calling CcProbe:

DEBUG
→ DebugPrint()
→ DebugVPrint()
→ DebugPrintMarker()
→ IoWriteFifo8()
→ IsTdxGuest()
→ CcProbe()

CcProbe (OvmfPkg/Library/CcProbeLib/CcProbeLib.c) reads PcdOvmfWorkAreaBase,
which is 0x80b000. That region is NOT reported as EfiRuntimeServicesData
(OvmfPkg/Library/PlatformInitLib/MemDetect.c), so the kernel does not map it
in the page tables it uses for runtime service calls.

When the kernel calls SetVariable after its paging is set up, the DEBUG logging
in that path eventually calls CcProbe, which reads 0x80b000. The read triggers
a page fault and the kernel panics.
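
The mechanism can be illustrated with a toy model. This is only a sketch under stated assumptions: the region table is invented (only the address 0x80b000 comes from this report; its size, its boot-services type, and the two runtime regions are hypothetical), and access_ok is a made-up helper that mimics "everything is mapped during boot, only runtime regions are mapped afterwards".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum region_type {
  BOOT_SERVICES_DATA,
  RUNTIME_SERVICES_CODE,
  RUNTIME_SERVICES_DATA
};

struct region { uint64_t base; uint64_t size; enum region_type type; };

/* Hypothetical memory map. Only 0x80b000 is taken from the report. */
static const struct region kMap[] = {
  { 0x0080b000u, 0x001000u, BOOT_SERVICES_DATA    }, /* work area (assumed type/size) */
  { 0x7f000000u, 0x100000u, RUNTIME_SERVICES_CODE }, /* made-up */
  { 0x7f100000u, 0x100000u, RUNTIME_SERVICES_DATA }, /* made-up */
};

static bool is_runtime(enum region_type t)
{
  return t == RUNTIME_SERVICES_CODE || t == RUNTIME_SERVICES_DATA;
}

/* Model: at boot time every listed region is accessible; at OS runtime
   only runtime-services regions are. Returns false for a would-be fault. */
static bool access_ok(uint64_t addr, bool at_runtime)
{
  for (size_t i = 0; i < sizeof kMap / sizeof kMap[0]; i++) {
    if (addr >= kMap[i].base && addr - kMap[i].base < kMap[i].size) {
      return !at_runtime || is_runtime(kMap[i].type);
    }
  }
  return false; /* not described by the map at all */
}
```

In this model access_ok(0x80b000, false) succeeds while access_ok(0x80b000, true) fails, which corresponds to the read working during boot and faulting once the kernel invokes a runtime service.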

Why does it take DEBUG_VERBOSE to trigger?
Any logging attempt after ExitBootServices (or anything else that uses CcProbe)
will trigger it. It just so happens that there is a DEBUG_VERBOSE message in
ValidateSetVariable():
DEBUG ((DEBUG_VERBOSE, "%a - Variable (%g:%s) returning %r.\n", __FUNCTION__,
VendorGuid, VariableName, ReturnStatus));
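
To make the dependency on the mask concrete, here is a small model of the level filter. It is a sketch, not the DebugLib implementation: model_debug_print and model_cc_probe are hypothetical stand-ins, and 0x8000004F is used only as an assumed example of a mask without the verbose bit.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEBUG_INFO    0x00000040u
#define DEBUG_VERBOSE 0x00400000u

/* Counts how often the modeled output path (and thus the probe) is reached. */
static unsigned probe_calls;

/* Stand-in for the CcProbe() call at the end of the chain in this report. */
static void model_cc_probe(void)
{
  probe_calls++;
}

/* Model of the filter: a message whose level bit is not set in the mask
   is dropped before any output is attempted, so the probe never runs. */
static bool model_debug_print(uint32_t pcd_mask, uint32_t message_level)
{
  if ((pcd_mask & message_level) == 0) {
    return false;
  }
  model_cc_probe();
  return true;
}
```

With the assumed mask 0x8000004F a DEBUG_VERBOSE message is filtered out and the probe is never reached; with 0x8040004F the same message passes the filter and reaches the probe.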


Symptom when the kernel panics:

[ 1.055498] BUG: unable to handle page fault for address: 000000000080b000
[ 1.057791] #PF: supervisor read access in kernel mode
[ 1.057791] #PF: error_code(0x0000) - not-present page
[ 1.057791] PGD 100191063 P4D 100191063 PUD 1001ad063 PMD 1001ae063 PTE 0
[ 1.057791] Oops: 0000 [#1] SMP PTI
[ 1.057791] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.0-13-cloud-amd64 #1 Debian 5.10.106-1
[ 1.057791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 1.057791] RIP: 0010:0xfffffffeffac6596
[ 1.057791] Code: d6 00 00 01 3c e9 0f 94 05 f7 d5 00 00 8a 05 f1 d5 00 00 48 83 c4 48 c3 8a 05 e6 d5 00 00 c3 41 54 49 89 cc 56 53 48 83 ec 30 <80> 3c 25 00 b0 80 00 02 75 63 4c 89 c3 49 8d 34 10 48 39 f3 74 64
[ 1.057791] RSP: 0000:ffffffff96a03ad0 EFLAGS: 00010282
[ 1.057791] RAX: 0000000000000060 RBX: 0000000000000000 RCX: 0000000000000402
[ 1.057791] RDX: 0000000000000060 RSI: ffffffff96a03c60 RDI: 0000000000000001
[ 1.057791] RBP: ffffffff96a03eb0 R08: ffffffff96a03b60 R09: 0000000000000000
[ 1.057791] R10: fffffffeffacf95e R11: 0000000000000002 R12: 0000000000000402
[ 1.057791] R13: ffffffff96a03c90 R14: 0000000000000007 R15: 0000000000000000
[ 1.057791] FS: 0000000000000000(0000) GS:ffff9173bbc00000(0000) knlGS:0000000000000000
[ 1.057791] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.057791] CR2: 000000000080b000 CR3: 0000000100192000 CR4: 00000000000006b0
[ 1.057791] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1.057791] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1.057791] Call Trace:
[ 1.057791] ? __alloc_pages_nodemask+0x161/0x310
[ 1.057791] ? __flush_tlb_all+0x18/0x30
[ 1.057791] ? kernel_map_pages_in_pgd+0xba/0xda
[ 1.057791] ? efi_update_mappings+0x66/0x91
[ 1.057791] ? __efi_call+0x25/0x30
[ 1.057791] ? load_new_mm_cr3+0x40/0xd0
[ 1.057791] ? switch_mm_irqs_off+0x193/0x3a0
[ 1.057791] ? virt_efi_set_variable_nonblocking+0xa2/0x100
[ 1.057791] ? efi_delete_dummy_variable+0x54/0x70
[ 1.057791] ? efi_enter_virtual_mode+0x409/0x424
[ 1.057791] ? start_kernel+0x4f1/0x585
[ 1.057791] ? secondary_startup_64_no_verify+0xb0/0xbb
[ 1.057791] Modules linked in:
[ 1.057791] CR2: 000000000080b000
[ 1.057791] ---[ end trace 0ba8e5e2f5add16d ]---
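
The log is consistent with the analysis: the bytes at the faulting instruction (marked <80> in the Code: line) decode to a byte compare against absolute address 0x80b000, and error code 0x0000 describes a supervisor read of a not-present page. The sketch below checks both. The decoder is hypothetical and deliberately handles only this one encoding (opcode 80 /7, ModRM 3c, SIB 25, i.e. CMP byte [disp32], imm8); the error-code bit positions are the architectural x86 ones.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes from the Code: line, starting at the faulting instruction. */
static const uint8_t kFaultInsn[] = {
  0x80, 0x3c, 0x25, 0x00, 0xb0, 0x80, 0x00, 0x02
};

struct cmp_abs8 { uint32_t address; uint8_t immediate; };

/* Decodes only "80 3c 25 <disp32> <imm8>" = CMP BYTE PTR [disp32], imm8. */
static bool decode_cmp_abs8(const uint8_t *b, size_t n, struct cmp_abs8 *out)
{
  if (n < 8 || b[0] != 0x80 || b[1] != 0x3c || b[2] != 0x25) {
    return false;
  }
  /* disp32 is stored little-endian. */
  out->address = (uint32_t)b[3] | ((uint32_t)b[4] << 8) |
                 ((uint32_t)b[5] << 16) | ((uint32_t)b[6] << 24);
  out->immediate = b[7];
  return true;
}

struct pf_error { bool present; bool write; bool user; bool insn_fetch; };

/* x86 page-fault error code: bit 0 = P, bit 1 = W/R, bit 2 = U/S, bit 4 = I/D. */
static struct pf_error decode_pf_error(uint32_t code)
{
  struct pf_error e;
  e.present    = (code & 0x01u) != 0;
  e.write      = (code & 0x02u) != 0;
  e.user       = (code & 0x04u) != 0;
  e.insn_fetch = (code & 0x10u) != 0;
  return e;
}
```

Decoding kFaultInsn yields address 0x0080b000 and immediate 0x02, matching CR2. The immediate is presumably the guest-type value that IsTdxGuest() compares against, but that is an inference from the call chain rather than something the log itself states.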
