Re: [PATCH V7 36/37] UefiCpuPkg: Setting initial-count register as the last step
Henz, Patrick
Hi all,
We (Hewlett Packard Enterprise) are also running into a race condition caused by how InitializeApicTimer initializes the APIC timers, and we figured this would be a good place to report our findings.

Occasionally we see APs get stuck in the timer interrupt handling code after being woken up by the BSP. It appears that if the CurrentCount timer value provided by the BSP is sufficiently small, the brief window between an AP calling InitializeApicTimer and calling DisableApicTimerInterrupt (see SyncLocalApicTimerSetting for an example) leaves enough room for an APIC timer interrupt to occur.

This seems to become more frequent on larger systems with higher processor counts. From what we've gathered, the increase in the number of locking sequence invocations makes this condition far more likely to occur. We work on scaled systems with node controllers and have really only seen this on larger systems, but it seems to us that it could feasibly happen on smaller systems too.

Our current solution is to add an additional argument to InitializeApicTimer, allowing the caller to specify whether or not APIC timer interrupts are to be enabled for the current thread.

Thanks,
Patrick Henz
Enterprise X86 Labs
Hewlett Packard Enterprise
patrick.henz@...

-----Original Message-----
From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Lendacky, Thomas via groups.io
Sent: Friday, May 13, 2022 5:13 PM
To: devel@edk2.groups.io; min.m.xu@...; Ni, Ray <ray.ni@...>
Cc: Yao, Jiewen <jiewen.yao@...>; Gerd Hoffmann <kraxel@...>; Anthony Perard <anthony.perard@...>; Julien Grall <julien@...>; Dong, Eric <eric.dong@...>
Subject: Re: [edk2-devel] [PATCH V7 36/37] UefiCpuPkg: Setting initial-count register as the last step

On 5/11/22 19:52, Min Xu via groups.io wrote:
On May 11, 2022 10:06 PM, Lendacky, Thomas wrote:
I guess you can theoretically miss an interrupt if your initial count expires before you unmask the interrupt, so I think your fix is correct and no changes are needed.

On 5/10/22 21:00, Xu, Min M wrote:
I tested this fix in Td guest and it has no side effect.

On May 11, 2022 4:30 AM, Tom Lendacky wrote:
00000000,00000000,00000000,00000000,00000000,00000000,00000000,000
I'm replying to this patch since I can't find patch V12 46/47.

I need to double check whether I'm properly resetting the APIC when APs are booted multiple times. Since this only occurs with SNP, I think this is on my end.

Thanks,
Tom
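
For readers following the thread, the race described in the message above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: MOCK_APIC_TIMER, MockApicTick, MockInitializeApicTimerEx and the EnableInterrupt argument are hypothetical names made up for illustration, not the EDK2 LocalApicLib API and not HPE's actual change. The model treats the timer as one-shot and assumes the initial count is written last, as the patch under discussion proposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock local APIC timer state. NOT the real register layout. */
typedef struct {
  uint32_t InitialCount;
  uint32_t CurrentCount;
  bool     Masked;               /* models the LVT timer mask bit */
  unsigned InterruptsDelivered;  /* interrupts that reached the CPU */
} MOCK_APIC_TIMER;

/* Advance the mock one-shot timer; on expiry, deliver only if unmasked. */
static void MockApicTick (MOCK_APIC_TIMER *Apic, uint32_t Ticks)
{
  assert (Apic != NULL);
  if (Apic->CurrentCount == 0) {
    return;                      /* timer is not running */
  }
  if (Ticks >= Apic->CurrentCount) {
    Apic->CurrentCount = 0;
    if (!Apic->Masked) {
      Apic->InterruptsDelivered++;
    }
  } else {
    Apic->CurrentCount -= Ticks;
  }
}

/* Hypothetical variant of the initializer with an extra EnableInterrupt
   argument (an assumption for illustration, not an existing signature). */
static void MockInitializeApicTimerEx (MOCK_APIC_TIMER *Apic,
                                       uint32_t InitCount,
                                       bool EnableInterrupt)
{
  assert (Apic != NULL);
  Apic->Masked       = !EnableInterrupt;  /* program the LVT entry first */
  Apic->InitialCount = InitCount;         /* initial count last: starts timer */
  Apic->CurrentCount = InitCount;
}

static void MockDisableApicTimerInterrupt (MOCK_APIC_TIMER *Apic)
{
  assert (Apic != NULL);
  Apic->Masked = true;
}

/* AP flow as described: initialize unmasked, then mask after GapTicks. */
static unsigned SyncTimerUnmaskedThenDisable (uint32_t InitCount, uint32_t GapTicks)
{
  MOCK_APIC_TIMER Apic = { 0 };
  MockInitializeApicTimerEx (&Apic, InitCount, true);
  MockApicTick (&Apic, GapTicks);          /* window before the mask is set */
  MockDisableApicTimerInterrupt (&Apic);
  return Apic.InterruptsDelivered;
}

/* Flow with the extra argument: the timer is masked from the start. */
static unsigned SyncTimerMaskedFromStart (uint32_t InitCount, uint32_t GapTicks)
{
  MOCK_APIC_TIMER Apic = { 0 };
  MockInitializeApicTimerEx (&Apic, InitCount, false);
  MockApicTick (&Apic, GapTicks);
  return Apic.InterruptsDelivered;
}
```

In this model, a count smaller than the gap lets one interrupt through in the unmasked-then-disable flow, while a large count or a timer that is masked from the start delivers none. That matches the observation that the problem only shows up when the count provided by the BSP is sufficiently small.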