Re: [RFC][PATCH v1] UefiCpuPkg/MpInitLib DXE: Reduce AP status check interval


Laszlo Ersek
 

Hi Hao,

On 03/13/20 14:22, Hao A Wu wrote:
This commit will reduce the interval of the AP status check event from 100
milliseconds to 10 milliseconds.

(I searched the history of the 100ms interval, it seems no comment or log
message was mentioned for the choice of this value. Looks like the value
is selected by experience.)

The purpose is to reduce the response time when the BSP calls below
EFI_MP_SERVICES_PROTOCOL services in a non-blocking manner:

* StartupAllAPs()
* StartupThisAP()

Reducing the check interval will benefit the performance for the case when
the BSP uses WaitForEvent() or uses CheckEvent() in a loop to wait for
AP(s) to complete the task, especially when the task can be finished
considerably fast on AP(s).

An example is within function CpuFeaturesInitialize() under
UefiCpuPkg/Library/RegisterCpuFeaturesLib/DxeRegisterCpuFeaturesLib.c,
where BSP will perform the same task with APs and requires all the
processors to finish the task before BSP proceeds to its next task.

Impact:
A. The impact is minimal when there is no non-blocking calls of the
StartupAllAPs/StartupThisAp MP services, because the check function
CheckAndUpdateApsStatus() will return directly when there is no
registered wait event (i.e. no non-blocking request).

B. There will be a performance tradeoff when BSP continues to proceed
other tasks after submitting a non-blocking StartupAllAPs/StartupThisAP
request. If the AP status check takes a good portion of the shortened
interval, BSP will have less time slice working on its own task before
all the APs complete their tasks.

My investigation for Impact B is that it is a rare scenario in the edk2
code base.

Unitests:
A. OS boot successfully.
B. System (with 24 threads) boot time reduced. Almost all the saved time
comes from the above-mentioned case in CpuFeaturesInitialize().

Cc: Eric Dong <eric.dong@...>
Cc: Ray Ni <ray.ni@...>
Cc: Laszlo Ersek <lersek@...>
Cc: Michael D Kinney <michael.d.kinney@...>
Cc: Star Zeng <star.zeng@...>
Signed-off-by: Hao A Wu <hao.a.wu@...>
---
UefiCpuPkg/Library/MpInitLib/DxeMpLib.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
index a987c32109..9ba886e8ed 100644
--- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
@@ -15,7 +15,7 @@

#include <Protocol/Timer.h>

-#define AP_CHECK_INTERVAL (EFI_TIMER_PERIOD_MILLISECONDS (100))
+#define AP_CHECK_INTERVAL (EFI_TIMER_PERIOD_MILLISECONDS (10))
#define AP_SAFE_STACK_SIZE 128

CPU_MP_DATA *mCpuMpData = NULL;
The use case is valid, IMO. And the commit message is helpful.

But I really think this constant should be PCD. Here's why I think a
platform might want to control it:

- The best (finest) possible resolution for timer events is platform
dependent, IIUC. The duration of the "idle tick" is platform-specific.
And, it likely makes no sense to set AP_CHECK_INTERVAL to a duration
that's around, or under, what the arch timer resolution allows for.

- In the other direction, CheckAndUpdateApsStatus() contains a loop that
counts up to CpuMpData->CpuCount. In a very large system (hundreds or
maybe thousands of APs) this function may have non-negligible cost.

I suggest introducing a PCD for this (measured in msecs) and using 100
msecs as the default.

Thanks
Laszlo

Join rfc@edk2.groups.io to automatically receive all group messages.