Topics

[PATCH v8 02/10] OvmfPkg/CpuHotplugSmm: collect hot-unplug events


Ankur Arora
 

Process fw_remove events in QemuCpuhpCollectApicIds() and collect
corresponding APIC IDs for CPUs that are being hot-unplugged.

In addition, we now ignore CPUs which only have remove set. These
CPUs haven't been processed by OSPM yet.

This is based on the QEMU hot-unplug protocol documented here:
https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---

Notes:
Addresses the following review comments from v6:
(1,4) Move (and also rename) QEMU_CPUHP_STAT_EJECTED to patch 8,
where we actually use it.
(2) Downgrade debug mask from DEBUG_INFO to DEBUG_VERBOSE.
(3a,3b,3c) Keep the CurrentSelector increment operation at
the tail of the loop.
() As discussed elsewhere we also need to get the CpuSelector while
collecting ApicIds in QemuCpuhpCollectApicIds(). This patch adds a
separate parameter for the CpuSelector values, because that works
better alongside the hotplug ExtendIds logic.

OvmfPkg/CpuHotplugSmm/QemuCpuhp.h | 1 +
OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h | 1 +
OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 21 +++++-
OvmfPkg/CpuHotplugSmm/QemuCpuhp.c | 84 ++++++++++++++++-------
4 files changed, 79 insertions(+), 28 deletions(-)

diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
index 8adaa0ad91f0..1e23b150910e 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
@@ -55,6 +55,7 @@ QemuCpuhpCollectApicIds (
OUT APIC_ID *PluggedApicIds,
OUT UINT32 *PluggedCount,
OUT APIC_ID *ToUnplugApicIds,
+ OUT UINT32 *ToUnplugSelector,
OUT UINT32 *ToUnplugCount
);

diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
index a34a6d3fae61..2ec7a107a64d 100644
--- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
+++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
@@ -34,6 +34,7 @@
#define QEMU_CPUHP_STAT_ENABLED BIT0
#define QEMU_CPUHP_STAT_INSERT BIT1
#define QEMU_CPUHP_STAT_REMOVE BIT2
+#define QEMU_CPUHP_STAT_FW_REMOVE BIT4

#define QEMU_CPUHP_RW_CMD_DATA 0x8

diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
index bf68fcd42914..3192bfea1f15 100644
--- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
+++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
@@ -52,6 +52,7 @@ STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
//
STATIC APIC_ID *mPluggedApicIds;
STATIC APIC_ID *mToUnplugApicIds;
+STATIC UINT32 *mToUnplugSelector;
//
// Address of the non-SMRAM reserved memory page that contains the Post-SMM Pen
// for hot-added CPUs.
@@ -289,6 +290,7 @@ CpuHotplugMmi (
mPluggedApicIds,
&PluggedCount,
mToUnplugApicIds,
+ mToUnplugSelector,
&ToUnplugCount
);
if (EFI_ERROR (Status)) {
@@ -333,7 +335,9 @@ CpuHotplugEntry (
)
{
EFI_STATUS Status;
+ UINTN Len;
UINTN Size;
+ UINTN SizeSel;

//
// This module should only be included when SMM support is required.
@@ -387,8 +391,9 @@ CpuHotplugEntry (
//
// Allocate the data structures that depend on the possible CPU count.
//
- if (RETURN_ERROR (SafeUintnSub (mCpuHotPlugData->ArrayLength, 1, &Size)) ||
- RETURN_ERROR (SafeUintnMult (sizeof (APIC_ID), Size, &Size))) {
+ if (RETURN_ERROR (SafeUintnSub (mCpuHotPlugData->ArrayLength, 1, &Len)) ||
+ RETURN_ERROR (SafeUintnMult (sizeof (APIC_ID), Len, &Size))||
+ RETURN_ERROR (SafeUintnMult (sizeof (UINT32), Len, &SizeSel))) {
Status = EFI_ABORTED;
DEBUG ((DEBUG_ERROR, "%a: invalid CPU_HOT_PLUG_DATA\n", __FUNCTION__));
goto Fatal;
@@ -405,6 +410,12 @@ CpuHotplugEntry (
DEBUG ((DEBUG_ERROR, "%a: MmAllocatePool(): %r\n", __FUNCTION__, Status));
goto ReleasePluggedApicIds;
}
+ Status = gMmst->MmAllocatePool (EfiRuntimeServicesData, SizeSel,
+ (VOID **)&mToUnplugSelector);
+ if (EFI_ERROR (Status)) {
+ DEBUG ((DEBUG_ERROR, "%a: MmAllocatePool(): %r\n", __FUNCTION__, Status));
+ goto ReleaseToUnplugApicIds;
+ }

//
// Allocate the Post-SMM Pen for hot-added CPUs.
@@ -412,7 +423,7 @@ CpuHotplugEntry (
Status = SmbaseAllocatePostSmmPen (&mPostSmmPenAddress,
SystemTable->BootServices);
if (EFI_ERROR (Status)) {
- goto ReleaseToUnplugApicIds;
+ goto ReleaseToUnplugSelector;
}

//
@@ -472,6 +483,10 @@ ReleasePostSmmPen:
SmbaseReleasePostSmmPen (mPostSmmPenAddress, SystemTable->BootServices);
mPostSmmPenAddress = 0;

+ReleaseToUnplugSelector:
+ gMmst->MmFreePool (mToUnplugSelector);
+ mToUnplugSelector = NULL;
+
ReleaseToUnplugApicIds:
gMmst->MmFreePool (mToUnplugApicIds);
mToUnplugApicIds = NULL;
diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
index 8d4a6693c8d6..36372a5e6193 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
@@ -164,6 +164,9 @@ QemuCpuhpWriteCommand (
@param[out] ToUnplugApicIds The APIC IDs of the CPUs that are about to be
hot-unplugged.

+ @param[out] ToUnplugSelector The QEMU Selectors of the CPUs that are about to
+ be hot-unplugged.
+
@param[out] ToUnplugCount The number of filled-in APIC IDs in
ToUnplugApicIds.

@@ -187,6 +190,7 @@ QemuCpuhpCollectApicIds (
OUT APIC_ID *PluggedApicIds,
OUT UINT32 *PluggedCount,
OUT APIC_ID *ToUnplugApicIds,
+ OUT UINT32 *ToUnplugSelector,
OUT UINT32 *ToUnplugCount
)
{
@@ -204,6 +208,7 @@ QemuCpuhpCollectApicIds (
UINT32 PendingSelector;
UINT8 CpuStatus;
APIC_ID *ExtendIds;
+ UINT32 *ExtendSel;
UINT32 *ExtendCount;
APIC_ID NewApicId;

@@ -245,10 +250,10 @@ QemuCpuhpCollectApicIds (
if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
//
// The "insert" event guarantees the "enabled" status; plus it excludes
- // the "remove" event.
+ // the "fw_remove" event.
//
if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
- (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
+ (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
"inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
CpuStatus));
@@ -259,40 +264,69 @@ QemuCpuhpCollectApicIds (
CurrentSelector));

ExtendIds = PluggedApicIds;
+ ExtendSel = NULL;
ExtendCount = PluggedCount;
- } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
- DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove\n", __FUNCTION__,
- CurrentSelector));
+ } else if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
+ //
+ // "fw_remove" event guarantees "enabled".
+ //
+ if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
+ DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
+ "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
+ CpuStatus));
+ return EFI_PROTOCOL_ERROR;
+ }
+
+ DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: fw_remove\n",
+ __FUNCTION__, CurrentSelector));

ExtendIds = ToUnplugApicIds;
+ ExtendSel = ToUnplugSelector;
ExtendCount = ToUnplugCount;
+ } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
+ //
+ // Let the OSPM deal with the "remove" event.
+ //
+ DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove (ignored)\n",
+ __FUNCTION__, CurrentSelector));
+
+ ExtendIds = NULL;
+ ExtendSel = NULL;
+ ExtendCount = NULL;
} else {
DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: no event\n",
__FUNCTION__, CurrentSelector));
break;
}

- //
- // Save the APIC ID of the CPU with the pending event, to the corresponding
- // APIC ID array.
- //
- if (*ExtendCount == ApicIdCount) {
- DEBUG ((DEBUG_ERROR, "%a: APIC ID array too small\n", __FUNCTION__));
- return EFI_BUFFER_TOO_SMALL;
- }
- QemuCpuhpWriteCommand (MmCpuIo, QEMU_CPUHP_CMD_GET_ARCH_ID);
- NewApicId = QemuCpuhpReadCommandData (MmCpuIo);
- DEBUG ((DEBUG_VERBOSE, "%a: ApicId=" FMT_APIC_ID "\n", __FUNCTION__,
- NewApicId));
- ExtendIds[(*ExtendCount)++] = NewApicId;
+ ASSERT ((ExtendIds == NULL) == (ExtendCount == NULL));
+ if (ExtendIds != NULL) {
+ //
+ // Save the APIC ID of the CPU with the pending event, to the
+ // corresponding APIC ID array.
+ // For unplug events, also save the CurrentSelector.
+ //
+ if (*ExtendCount == ApicIdCount) {
+ DEBUG ((DEBUG_ERROR, "%a: APIC ID array too small\n", __FUNCTION__));
+ return EFI_BUFFER_TOO_SMALL;
+ }
+ QemuCpuhpWriteCommand (MmCpuIo, QEMU_CPUHP_CMD_GET_ARCH_ID);
+ NewApicId = QemuCpuhpReadCommandData (MmCpuIo);
+ DEBUG ((DEBUG_VERBOSE, "%a: ApicId=" FMT_APIC_ID "\n", __FUNCTION__,
+ NewApicId));
+ if (ExtendSel != NULL) {
+ ExtendSel[(*ExtendCount)] = CurrentSelector;
+ }
+ ExtendIds[(*ExtendCount)++] = NewApicId;

- //
- // We've processed the CPU with (known) pending events, but we must never
- // clear events. Therefore we need to advance past this CPU manually;
- // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
- // selected CPU.
- //
- CurrentSelector++;
+ //
+ // We've processed the CPU with (known) pending events, but we must never
+ // clear events. Therefore we need to advance past this CPU manually;
+ // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
+ // selected CPU.
+ //
+ CurrentSelector++;
+ }
} while (CurrentSelector < PossibleCpuCount);

DEBUG ((DEBUG_VERBOSE, "%a: PluggedCount=%u ToUnplugCount=%u\n",
--
2.9.3


Laszlo Ersek
 

On 02/22/21 08:19, Ankur Arora wrote:
Process fw_remove events in QemuCpuhpCollectApicIds() and collect
corresponding APIC IDs for CPUs that are being hot-unplugged.
(1) We also collect selectors for those; please mention the fact here.



In addition, we now ignore CPUs which only have remove set. These
CPUs haven't been processed by OSPM yet.

This is based on the QEMU hot-unplug protocol documented here:
https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---

Notes:
Addresses the following review comments from v6:
(1,4) Move (and also rename) QEMU_CPUHP_STAT_EJECTED to patch 8,
where we actually use it.
(2) Downgrade debug mask from DEBUG_INFO to DEBUG_VERBOSE.
(3a,3b,3c) Keep the CurrentSelector increment operation at
the tail of the loop.
() As discussed elsewhere we also need to get the CpuSelector while
collecting ApicIds in QemuCpuhpCollectApicIds(). This patch adds a
separate parameter for the CpuSelector values, because that works
better alongside the hotplug ExtendIds logic.

OvmfPkg/CpuHotplugSmm/QemuCpuhp.h | 1 +
OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h | 1 +
OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 21 +++++-
OvmfPkg/CpuHotplugSmm/QemuCpuhp.c | 84 ++++++++++++++++-------
4 files changed, 79 insertions(+), 28 deletions(-)

diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
index 8adaa0ad91f0..1e23b150910e 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
@@ -55,6 +55,7 @@ QemuCpuhpCollectApicIds (
OUT APIC_ID *PluggedApicIds,
OUT UINT32 *PluggedCount,
OUT APIC_ID *ToUnplugApicIds,
+ OUT UINT32 *ToUnplugSelector,
OUT UINT32 *ToUnplugCount
);
(2) For consistency with the parameter names "PluggedApicIds" (plural)
and "ToUnplugApicIds" (plural), please rename this parameter to
"ToUnplugSelectors".


diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
index a34a6d3fae61..2ec7a107a64d 100644
--- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
+++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
@@ -34,6 +34,7 @@
#define QEMU_CPUHP_STAT_ENABLED BIT0
#define QEMU_CPUHP_STAT_INSERT BIT1
#define QEMU_CPUHP_STAT_REMOVE BIT2
+#define QEMU_CPUHP_STAT_FW_REMOVE BIT4

#define QEMU_CPUHP_RW_CMD_DATA 0x8

diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
index bf68fcd42914..3192bfea1f15 100644
--- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
+++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
@@ -52,6 +52,7 @@ STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
//
STATIC APIC_ID *mPluggedApicIds;
STATIC APIC_ID *mToUnplugApicIds;
+STATIC UINT32 *mToUnplugSelector;
(3) For consistency with the other two global variable identifiers,
please call this "mToUnplugSelector" (plural), too.

(4) The comment above these global variables only talks about APIC IDs;
please update the comment with a reference / explanation of the selectors.


//
// Address of the non-SMRAM reserved memory page that contains the Post-SMM Pen
// for hot-added CPUs.
@@ -289,6 +290,7 @@ CpuHotplugMmi (
mPluggedApicIds,
&PluggedCount,
mToUnplugApicIds,
+ mToUnplugSelector,
&ToUnplugCount
);
if (EFI_ERROR (Status)) {
@@ -333,7 +335,9 @@ CpuHotplugEntry (
)
{
EFI_STATUS Status;
+ UINTN Len;
UINTN Size;
+ UINTN SizeSel;

//
// This module should only be included when SMM support is required.
@@ -387,8 +391,9 @@ CpuHotplugEntry (
//
// Allocate the data structures that depend on the possible CPU count.
//
- if (RETURN_ERROR (SafeUintnSub (mCpuHotPlugData->ArrayLength, 1, &Size)) ||
- RETURN_ERROR (SafeUintnMult (sizeof (APIC_ID), Size, &Size))) {
+ if (RETURN_ERROR (SafeUintnSub (mCpuHotPlugData->ArrayLength, 1, &Len)) ||
+ RETURN_ERROR (SafeUintnMult (sizeof (APIC_ID), Len, &Size))||
(5) Missing space character before "||".


+ RETURN_ERROR (SafeUintnMult (sizeof (UINT32), Len, &SizeSel))) {
Status = EFI_ABORTED;
DEBUG ((DEBUG_ERROR, "%a: invalid CPU_HOT_PLUG_DATA\n", __FUNCTION__));
goto Fatal;
@@ -405,6 +410,12 @@ CpuHotplugEntry (
DEBUG ((DEBUG_ERROR, "%a: MmAllocatePool(): %r\n", __FUNCTION__, Status));
goto ReleasePluggedApicIds;
}
+ Status = gMmst->MmAllocatePool (EfiRuntimeServicesData, SizeSel,
+ (VOID **)&mToUnplugSelector);
+ if (EFI_ERROR (Status)) {
+ DEBUG ((DEBUG_ERROR, "%a: MmAllocatePool(): %r\n", __FUNCTION__, Status));
+ goto ReleaseToUnplugApicIds;
+ }

//
// Allocate the Post-SMM Pen for hot-added CPUs.
@@ -412,7 +423,7 @@ CpuHotplugEntry (
Status = SmbaseAllocatePostSmmPen (&mPostSmmPenAddress,
SystemTable->BootServices);
if (EFI_ERROR (Status)) {
- goto ReleaseToUnplugApicIds;
+ goto ReleaseToUnplugSelector;
}

//
@@ -472,6 +483,10 @@ ReleasePostSmmPen:
SmbaseReleasePostSmmPen (mPostSmmPenAddress, SystemTable->BootServices);
mPostSmmPenAddress = 0;

+ReleaseToUnplugSelector:
(6) Please call this label "ReleaseToUnplugSelectors" (plural).


+ gMmst->MmFreePool (mToUnplugSelector);
+ mToUnplugSelector = NULL;
+
ReleaseToUnplugApicIds:
gMmst->MmFreePool (mToUnplugApicIds);
mToUnplugApicIds = NULL;
diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
index 8d4a6693c8d6..36372a5e6193 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
@@ -164,6 +164,9 @@ QemuCpuhpWriteCommand (
@param[out] ToUnplugApicIds The APIC IDs of the CPUs that are about to be
hot-unplugged.

+ @param[out] ToUnplugSelector The QEMU Selectors of the CPUs that are about to
+ be hot-unplugged.
+
@param[out] ToUnplugCount The number of filled-in APIC IDs in
ToUnplugApicIds.
(7) Please (a) call the parameter "ToUnplugSelectors" (plural), and (b)
make sure that there are two space characters between the variable name
"column" and the documentation "column". (All in all, please move the
RHS column to the right by two spaces.)


@@ -187,6 +190,7 @@ QemuCpuhpCollectApicIds (
OUT APIC_ID *PluggedApicIds,
OUT UINT32 *PluggedCount,
OUT APIC_ID *ToUnplugApicIds,
+ OUT UINT32 *ToUnplugSelector,
OUT UINT32 *ToUnplugCount
)
{
@@ -204,6 +208,7 @@ QemuCpuhpCollectApicIds (
UINT32 PendingSelector;
UINT8 CpuStatus;
APIC_ID *ExtendIds;
+ UINT32 *ExtendSel;
(8) Please use plural -- "ExtendSels" --, consistently with "ExtendIds".


UINT32 *ExtendCount;
APIC_ID NewApicId;

@@ -245,10 +250,10 @@ QemuCpuhpCollectApicIds (
if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
//
// The "insert" event guarantees the "enabled" status; plus it excludes
- // the "remove" event.
+ // the "fw_remove" event.
//
if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
- (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
+ (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
"inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
CpuStatus));
@@ -259,40 +264,69 @@ QemuCpuhpCollectApicIds (
CurrentSelector));

ExtendIds = PluggedApicIds;
+ ExtendSel = NULL;
ExtendCount = PluggedCount;
- } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
- DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove\n", __FUNCTION__,
- CurrentSelector));
+ } else if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
+ //
+ // "fw_remove" event guarantees "enabled".
+ //
+ if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
+ DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
+ "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
+ CpuStatus));
+ return EFI_PROTOCOL_ERROR;
+ }
+
+ DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: fw_remove\n",
+ __FUNCTION__, CurrentSelector));

ExtendIds = ToUnplugApicIds;
+ ExtendSel = ToUnplugSelector;
ExtendCount = ToUnplugCount;
+ } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
+ //
+ // Let the OSPM deal with the "remove" event.
+ //
+ DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove (ignored)\n",
+ __FUNCTION__, CurrentSelector));
+
+ ExtendIds = NULL;
+ ExtendSel = NULL;
+ ExtendCount = NULL;
} else {
DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: no event\n",
__FUNCTION__, CurrentSelector));
break;
}

- //
- // Save the APIC ID of the CPU with the pending event, to the corresponding
- // APIC ID array.
- //
- if (*ExtendCount == ApicIdCount) {
- DEBUG ((DEBUG_ERROR, "%a: APIC ID array too small\n", __FUNCTION__));
- return EFI_BUFFER_TOO_SMALL;
- }
- QemuCpuhpWriteCommand (MmCpuIo, QEMU_CPUHP_CMD_GET_ARCH_ID);
- NewApicId = QemuCpuhpReadCommandData (MmCpuIo);
- DEBUG ((DEBUG_VERBOSE, "%a: ApicId=" FMT_APIC_ID "\n", __FUNCTION__,
- NewApicId));
- ExtendIds[(*ExtendCount)++] = NewApicId;
+ ASSERT ((ExtendIds == NULL) == (ExtendCount == NULL));
(9) Please follow this ASSERT with another ASSERT(), on a separate line:

ASSERT (ExtendSels == NULL || ExtendIds != NULL);

Explanation:

- I'd like us to capture: if ExtendSels is not NULL, then ExtendIds is
also not NULL.

- Furthermore, express the A-->B implication using its equivalent
formulation (!A)||B.


+ if (ExtendIds != NULL) {
+ //
+ // Save the APIC ID of the CPU with the pending event, to the
+ // corresponding APIC ID array.
+ // For unplug events, also save the CurrentSelector.
+ //
+ if (*ExtendCount == ApicIdCount) {
+ DEBUG ((DEBUG_ERROR, "%a: APIC ID array too small\n", __FUNCTION__));
+ return EFI_BUFFER_TOO_SMALL;
+ }
+ QemuCpuhpWriteCommand (MmCpuIo, QEMU_CPUHP_CMD_GET_ARCH_ID);
+ NewApicId = QemuCpuhpReadCommandData (MmCpuIo);
+ DEBUG ((DEBUG_VERBOSE, "%a: ApicId=" FMT_APIC_ID "\n", __FUNCTION__,
+ NewApicId));
+ if (ExtendSel != NULL) {
+ ExtendSel[(*ExtendCount)] = CurrentSelector;
+ }
+ ExtendIds[(*ExtendCount)++] = NewApicId;
OK thus far...


- //
- // We've processed the CPU with (known) pending events, but we must never
- // clear events. Therefore we need to advance past this CPU manually;
- // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
- // selected CPU.
- //
- CurrentSelector++;
+ //
+ // We've processed the CPU with (known) pending events, but we must never
+ // clear events. Therefore we need to advance past this CPU manually;
+ // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
+ // selected CPU.
+ //
+ CurrentSelector++;
+ }
(10) ... but this is a logic bug: the comment and the CurrentSelector
increment must not depend on (ExtendIds != NULL).

As written, if we ever see a QEMU_CPUHP_STAT_REMOVE event (which we're
supposed to ignore), we'll never move past that CPU, but will be stuck
in an infinite loop. The QEMU_CPUHP_CMD_GET_PENDING command will keep
returning the CPU with the QEMU_CPUHP_STAT_REMOVE event pending.

If this was unclear from my previous feedback, then I apologize, I
should have been clearer.

Thanks,
Laszlo

} while (CurrentSelector < PossibleCpuCount);

DEBUG ((DEBUG_VERBOSE, "%a: PluggedCount=%u ToUnplugCount=%u\n",


Ankur Arora
 

On 2021-02-22 4:27 a.m., Laszlo Ersek wrote:
On 02/22/21 08:19, Ankur Arora wrote:
Process fw_remove events in QemuCpuhpCollectApicIds() and collect
corresponding APIC IDs for CPUs that are being hot-unplugged.
(1) We also collect selectors for those; please mention the fact here.


In addition, we now ignore CPUs which only have remove set. These
CPUs haven't been processed by OSPM yet.

This is based on the QEMU hot-unplug protocol documented here:
https://lore.kernel.org/qemu-devel/20201204170939.1815522-3-imammedo@redhat.com/

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---

Notes:
Addresses the following review comments from v6:
(1,4) Move (and also rename) QEMU_CPUHP_STAT_EJECTED to patch 8,
where we actually use it.
(2) Downgrade debug mask from DEBUG_INFO to DEBUG_VERBOSE.
(3a,3b,3c) Keep the CurrentSelector increment operation at
the tail of the loop.
() As discussed elsewhere we also need to get the CpuSelector while
collecting ApicIds in QemuCpuhpCollectApicIds(). This patch adds a
separate parameter for the CpuSelector values, because that works
better alongside the hotplug ExtendIds logic.

OvmfPkg/CpuHotplugSmm/QemuCpuhp.h | 1 +
OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h | 1 +
OvmfPkg/CpuHotplugSmm/CpuHotplug.c | 21 +++++-
OvmfPkg/CpuHotplugSmm/QemuCpuhp.c | 84 ++++++++++++++++-------
4 files changed, 79 insertions(+), 28 deletions(-)

diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
index 8adaa0ad91f0..1e23b150910e 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.h
@@ -55,6 +55,7 @@ QemuCpuhpCollectApicIds (
OUT APIC_ID *PluggedApicIds,
OUT UINT32 *PluggedCount,
OUT APIC_ID *ToUnplugApicIds,
+ OUT UINT32 *ToUnplugSelector,
OUT UINT32 *ToUnplugCount
);
(2) For consistency with the parameter names "PluggedApicIds" (plural)
and "ToUnplugApicIds" (plural), please rename this parameter to
"ToUnplugSelectors".

diff --git a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
index a34a6d3fae61..2ec7a107a64d 100644
--- a/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
+++ b/OvmfPkg/Include/IndustryStandard/QemuCpuHotplug.h
@@ -34,6 +34,7 @@
#define QEMU_CPUHP_STAT_ENABLED BIT0
#define QEMU_CPUHP_STAT_INSERT BIT1
#define QEMU_CPUHP_STAT_REMOVE BIT2
+#define QEMU_CPUHP_STAT_FW_REMOVE BIT4
#define QEMU_CPUHP_RW_CMD_DATA 0x8
diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
index bf68fcd42914..3192bfea1f15 100644
--- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
+++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
@@ -52,6 +52,7 @@ STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
//
STATIC APIC_ID *mPluggedApicIds;
STATIC APIC_ID *mToUnplugApicIds;
+STATIC UINT32 *mToUnplugSelector;
(3) For consistency with the other two global variable identifiers,
please call this "mToUnplugSelector" (plural), too.
(4) The comment above these global variables only talks about APIC IDs;
please update the comment with a reference / explanation of the selectors.

//
// Address of the non-SMRAM reserved memory page that contains the Post-SMM Pen
// for hot-added CPUs.
@@ -289,6 +290,7 @@ CpuHotplugMmi (
mPluggedApicIds,
&PluggedCount,
mToUnplugApicIds,
+ mToUnplugSelector,
&ToUnplugCount
);
if (EFI_ERROR (Status)) {
@@ -333,7 +335,9 @@ CpuHotplugEntry (
)
{
EFI_STATUS Status;
+ UINTN Len;
UINTN Size;
+ UINTN SizeSel;
//
// This module should only be included when SMM support is required.
@@ -387,8 +391,9 @@ CpuHotplugEntry (
//
// Allocate the data structures that depend on the possible CPU count.
//
- if (RETURN_ERROR (SafeUintnSub (mCpuHotPlugData->ArrayLength, 1, &Size)) ||
- RETURN_ERROR (SafeUintnMult (sizeof (APIC_ID), Size, &Size))) {
+ if (RETURN_ERROR (SafeUintnSub (mCpuHotPlugData->ArrayLength, 1, &Len)) ||
+ RETURN_ERROR (SafeUintnMult (sizeof (APIC_ID), Len, &Size))||
(5) Missing space character before "||".

+ RETURN_ERROR (SafeUintnMult (sizeof (UINT32), Len, &SizeSel))) {
Status = EFI_ABORTED;
DEBUG ((DEBUG_ERROR, "%a: invalid CPU_HOT_PLUG_DATA\n", __FUNCTION__));
goto Fatal;
@@ -405,6 +410,12 @@ CpuHotplugEntry (
DEBUG ((DEBUG_ERROR, "%a: MmAllocatePool(): %r\n", __FUNCTION__, Status));
goto ReleasePluggedApicIds;
}
+ Status = gMmst->MmAllocatePool (EfiRuntimeServicesData, SizeSel,
+ (VOID **)&mToUnplugSelector);
+ if (EFI_ERROR (Status)) {
+ DEBUG ((DEBUG_ERROR, "%a: MmAllocatePool(): %r\n", __FUNCTION__, Status));
+ goto ReleaseToUnplugApicIds;
+ }
//
// Allocate the Post-SMM Pen for hot-added CPUs.
@@ -412,7 +423,7 @@ CpuHotplugEntry (
Status = SmbaseAllocatePostSmmPen (&mPostSmmPenAddress,
SystemTable->BootServices);
if (EFI_ERROR (Status)) {
- goto ReleaseToUnplugApicIds;
+ goto ReleaseToUnplugSelector;
}
//
@@ -472,6 +483,10 @@ ReleasePostSmmPen:
SmbaseReleasePostSmmPen (mPostSmmPenAddress, SystemTable->BootServices);
mPostSmmPenAddress = 0;
+ReleaseToUnplugSelector:
(6) Please call this label "ReleaseToUnplugSelectors" (plural).

+ gMmst->MmFreePool (mToUnplugSelector);
+ mToUnplugSelector = NULL;
+
ReleaseToUnplugApicIds:
gMmst->MmFreePool (mToUnplugApicIds);
mToUnplugApicIds = NULL;
diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
index 8d4a6693c8d6..36372a5e6193 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
@@ -164,6 +164,9 @@ QemuCpuhpWriteCommand (
@param[out] ToUnplugApicIds The APIC IDs of the CPUs that are about to be
hot-unplugged.
+ @param[out] ToUnplugSelector The QEMU Selectors of the CPUs that are about to
+ be hot-unplugged.
+
@param[out] ToUnplugCount The number of filled-in APIC IDs in
ToUnplugApicIds.
Acking the comments above.

(7) Please (a) call the parameter "ToUnplugSelectors" (plural), and (b)
make sure that there are two space characters between the variable name
"column" and the documentation "column". (All in all, please move the
RHS column to the right by two spaces.)
That would make the RHS of ToUnplugSelectors not line up with the other
two out params. (Even though this mail does not seem to show that, they
do line up in the code.) Is that okay?


@@ -187,6 +190,7 @@ QemuCpuhpCollectApicIds (
OUT APIC_ID *PluggedApicIds,
OUT UINT32 *PluggedCount,
OUT APIC_ID *ToUnplugApicIds,
+ OUT UINT32 *ToUnplugSelector,
OUT UINT32 *ToUnplugCount
)
{
@@ -204,6 +208,7 @@ QemuCpuhpCollectApicIds (
UINT32 PendingSelector;
UINT8 CpuStatus;
APIC_ID *ExtendIds;
+ UINT32 *ExtendSel;
(8) Please use plural -- "ExtendSels" --, consistently with "ExtendIds".

UINT32 *ExtendCount;
APIC_ID NewApicId;
@@ -245,10 +250,10 @@ QemuCpuhpCollectApicIds (
if ((CpuStatus & QEMU_CPUHP_STAT_INSERT) != 0) {
//
// The "insert" event guarantees the "enabled" status; plus it excludes
- // the "remove" event.
+ // the "fw_remove" event.
//
if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0 ||
- (CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
+ (CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
"inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
CpuStatus));
@@ -259,40 +264,69 @@ QemuCpuhpCollectApicIds (
CurrentSelector));
ExtendIds = PluggedApicIds;
+ ExtendSel = NULL;
ExtendCount = PluggedCount;
- } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
- DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove\n", __FUNCTION__,
- CurrentSelector));
+ } else if ((CpuStatus & QEMU_CPUHP_STAT_FW_REMOVE) != 0) {
+ //
+ // "fw_remove" event guarantees "enabled".
+ //
+ if ((CpuStatus & QEMU_CPUHP_STAT_ENABLED) == 0) {
+ DEBUG ((DEBUG_ERROR, "%a: CurrentSelector=%u CpuStatus=0x%x: "
+ "inconsistent CPU status\n", __FUNCTION__, CurrentSelector,
+ CpuStatus));
+ return EFI_PROTOCOL_ERROR;
+ }
+
+ DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: fw_remove\n",
+ __FUNCTION__, CurrentSelector));
ExtendIds = ToUnplugApicIds;
+ ExtendSel = ToUnplugSelector;
ExtendCount = ToUnplugCount;
+ } else if ((CpuStatus & QEMU_CPUHP_STAT_REMOVE) != 0) {
+ //
+ // Let the OSPM deal with the "remove" event.
+ //
+ DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: remove (ignored)\n",
+ __FUNCTION__, CurrentSelector));
+
+ ExtendIds = NULL;
+ ExtendSel = NULL;
+ ExtendCount = NULL;
} else {
DEBUG ((DEBUG_VERBOSE, "%a: CurrentSelector=%u: no event\n",
__FUNCTION__, CurrentSelector));
break;
}
- //
- // Save the APIC ID of the CPU with the pending event, to the corresponding
- // APIC ID array.
- //
- if (*ExtendCount == ApicIdCount) {
- DEBUG ((DEBUG_ERROR, "%a: APIC ID array too small\n", __FUNCTION__));
- return EFI_BUFFER_TOO_SMALL;
- }
- QemuCpuhpWriteCommand (MmCpuIo, QEMU_CPUHP_CMD_GET_ARCH_ID);
- NewApicId = QemuCpuhpReadCommandData (MmCpuIo);
- DEBUG ((DEBUG_VERBOSE, "%a: ApicId=" FMT_APIC_ID "\n", __FUNCTION__,
- NewApicId));
- ExtendIds[(*ExtendCount)++] = NewApicId;
+ ASSERT ((ExtendIds == NULL) == (ExtendCount == NULL));
(9) Please follow this ASSERT with another ASSERT(), on a separate line:
ASSERT (ExtendSels == NULL || ExtendIds != NULL);
Explanation:
- I'd like us to capture: if ExtendSels is not NULL, then ExtendIds is
also not NULL.
- Furthermore, express the A-->B implication using its equivalent
formulation (!A)||B.
This is a good check. Will add.


+ if (ExtendIds != NULL) {
+ //
+ // Save the APIC ID of the CPU with the pending event, to the
+ // corresponding APIC ID array.
+ // For unplug events, also save the CurrentSelector.
+ //
+ if (*ExtendCount == ApicIdCount) {
+ DEBUG ((DEBUG_ERROR, "%a: APIC ID array too small\n", __FUNCTION__));
+ return EFI_BUFFER_TOO_SMALL;
+ }
+ QemuCpuhpWriteCommand (MmCpuIo, QEMU_CPUHP_CMD_GET_ARCH_ID);
+ NewApicId = QemuCpuhpReadCommandData (MmCpuIo);
+ DEBUG ((DEBUG_VERBOSE, "%a: ApicId=" FMT_APIC_ID "\n", __FUNCTION__,
+ NewApicId));
+ if (ExtendSel != NULL) {
+ ExtendSel[(*ExtendCount)] = CurrentSelector;
+ }
+ ExtendIds[(*ExtendCount)++] = NewApicId;
OK thus far...

- //
- // We've processed the CPU with (known) pending events, but we must never
- // clear events. Therefore we need to advance past this CPU manually;
- // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
- // selected CPU.
- //
- CurrentSelector++;
+ //
+ // We've processed the CPU with (known) pending events, but we must never
+ // clear events. Therefore we need to advance past this CPU manually;
+ // otherwise, QEMU_CPUHP_CMD_GET_PENDING would stick to the currently
+ // selected CPU.
+ //
+ CurrentSelector++;
+ }
(10) ... but this is a logic bug: the comment and the CurrentSelector
increment must not depend on (ExtendIds != NULL).
As written, if we ever see a QEMU_CPUHP_STAT_REMOVE event (which we're
supposed to ignore), we'll never move past that CPU, but will be stuck
in an infinite loop. The QEMU_CPUHP_CMD_GET_PENDING command will keep
returning the CPU with the QEMU_CPUHP_STAT_REMOVE event pending.
If this was unclear from my previous feedback, then I apologize, I
should have been clearer.
No, my bad. Don't quite know how I missed that -- that was the whole
reason why I had a CurrentSelector++ before the continue statement
in v6.

Will fix.

Thanks
Ankur

Thanks,
Laszlo

} while (CurrentSelector < PossibleCpuCount);
DEBUG ((DEBUG_VERBOSE, "%a: PluggedCount=%u ToUnplugCount=%u\n",


Laszlo Ersek
 

On 02/22/21 23:03, Ankur Arora wrote:
On 2021-02-22 4:27 a.m., Laszlo Ersek wrote:
On 02/22/21 08:19, Ankur Arora wrote:
diff --git a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
index 8d4a6693c8d6..36372a5e6193 100644
--- a/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
+++ b/OvmfPkg/CpuHotplugSmm/QemuCpuhp.c
@@ -164,6 +164,9 @@ QemuCpuhpWriteCommand (
    @param[out] ToUnplugApicIds  The APIC IDs of the CPUs that are
about to be
                                 hot-unplugged.
  +  @param[out] ToUnplugSelector The QEMU Selectors of the CPUs that
are about to
+                               be hot-unplugged.
+
    @param[out] ToUnplugCount    The number of filled-in APIC IDs in
                                 ToUnplugApicIds.
 
Acking the comments above.

(7) Please (a) call the parameter "ToUnplugSelectors" (plural), and (b)
make sure that there are two space characters between the variable name
"column" and the documentation "column". (All in all, please move the
RHS column to the right by two spaces.)
That would make the RHS of ToUnplugSelectors not line up with the other
two out params. (Even though this mail does not seem to show that, they
do line up in the code.) Is that okay?
Sorry, I was unclear. I meant the request as follows: please move the
full-height RHS column, containing *all* the parameter descriptions, to
the right, by 2 spaces; please also re-wrap wherever necessary, to
maintain <= 80 chars width.

Thanks
Laszlo