Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

Igor Mammedov <imammedo@...>

On Thu, 29 Aug 2019 19:01:35 +0200
Laszlo Ersek <lersek@...> wrote:

On 08/27/19 20:31, Igor Mammedov wrote:
On Sat, 24 Aug 2019 01:48:09 +0000
"Yao, Jiewen" <jiewen.yao@...> wrote:
(05) Host CPU: (OS) Port 0xB2 write, all CPUs enter SMM (NOTE: New CPU
will not enter CPU because SMI is disabled)
I think only CPU that does the write will enter SMM
That used to be the case (and it is still the default QEMU behavior, if
broadcast SMI is not negotiated). However, OVMF does negotiate broadcast
SMI whenever QEMU offers the feature. Broadcast SMI is important for the
stability of the edk2 SMM infrastructure on QEMU/KVM, we've found.

and we might not need to pull in all already initialized CPUs into SMM.
That, on the other hand, could be a valid idea. But then the CPU should
use a different method for raising a synchronous SMI for itself (not a
write to IO port 0xB2). Is a "directed SMI for self" possible?
theoretically depending on argument in 0xb3, it should be possible to
rise directed SMI even if broadcast ones are negotiated.

I've tried to read through the procedure with your suggested changes,
but I'm failing at composing a coherent mental image, in this email
response format.

If you have the time, can you write up the suggested list of steps in a
"flat" format? (I believe you are suggesting to eliminate some steps
if I'd sum it up:

(01) On boot firmware maps and initializes SMI handler at default SMBASE (30000)
(using dedicated SMRAM at 30000 would allow us to avoid save/restore
steps and make SMM handler pointer not vulnerable to DMA attacks)

(02) QEMU hotplugs a new CPU in reset-ed state and sends SCI

(03) on receiving SCI, host CPU calls GPE cpu hotplug handler
which writes to IO port 0xB2 (broadcast SMI)

(04) firmware waits for all existing CPUs rendezvous in SMM mode,
new CPU(s) have SMI pending but does nothing yet

(05) host CPU wakes up one new CPU (INIT-INIT-SIPI)
SIPI vector points to RO flash HLT loop.
(how host CPU will know which new CPUs to relocate?
possibly reuse QEMU CPU hotplug MMIO interface???)

(06) new CPU does relocation.
(in case of attacker sends SIPI to several new CPUs,
open question how to detect collision of several CPUs at the same default SMBASE)

(07) once new CPU relocated host CPU completes initialization, returns
from IO port write and executes the rest of GPE handler, telling OS
to online new CPU.

... jumping to another point:

2) Let trusted software (SMM and init code) guarantee SMREBASE one by one (include any code runs before SMREBASE)
that would mean pulling all present CPUs into SMM mode so no attack
code could be executing before doing hotplug. With a lot of present CPUs
it could be quite expensive and unlike physical hardware, guest's CPUs
could be preempted arbitrarily long causing long delays.
I agree with your analysis, but I slightly disagree about the impact:

- CPU hotplug is not a frequent administrative action, so the CPU load
should be temporary (it should be a spike). I don't worry that it would
trip up OS kernel code. (SMI handling is known to take long on physical
platforms oo.) In practice, all "normal" SMIs are broadcast already (for
example when calling the runtime UEFI variable services from the OS kernel).

- The fact that QEMU/KVM introduces some jitter into the execution of
multi-core code (including SMM code) has proved useful in the past, for
catching edk2 regressions.

Again, this is not a strong disagreement from my side. I'm open to
better ways for synching CPUs during muti-CPU-hotplug.


I expect someone could be curious why (a) I find it acceptable (even
beneficial) that "some jitter" injected by the QEMU/KVM scheduling
exposes multi-core regressions in edk2, but at the same time (b) I found
it really important to add broadcast SMI to QEMU and OVMF. After all,
both "jitter" and "unicast SMIs" are QEMU/KVM platform specifics, so why
the different treatment?

The reason is that the "jitter" does not interfere with normal
operation, and it has been good for catching *regressions*. IOW, there
is a working edk2 state, someone posts a patch, works on physical
hardware, but breaks on QEMU/KVM --> then we can still reject or rework
or revert the patch. And we're back to a working state again (in the
best case, with a fixed feature patch).

With the unicast SMIs however, it was impossible to enable the SMM stack
reliably in the first place. There was no functional state to return to.
I don't really get the last statement, but the I know nothing about OVMF.
I don't insist on unicast SMI being used, it's just some ideas about what
we could do. It could be done later, broadcast SMI (might be not the best)
is sufficient to implement CPU hotplug.

Digression ends.)

lets first see if if we can ignore race
Makes me uncomfortable, but if this is the consensus, I'll go along.
same here, as mentioned in another reply as it's only possible in
attack case (multiple SMIs + multiple SIPI) so it could be fine to just
explode in case it happens (point is fw in not leaking anything from SMRAM
and OS did something illegeal).

and if it's not then
we probably end up with implementing some form of #1


Join to automatically receive all group messages.