CPU hotplug using SMM with QEMU+OVMF


Laszlo Ersek
 

Hi,

this message is a problem statement, and an initial recommendation for
solving it, from Jiewen, Paolo, Yingwen, and others. I'm cross-posting
the thread starter to the <devel@edk2.groups.io>, <rfc@edk2.groups.io>
and <qemu-devel@nongnu.org> lists. Please use "Reply All" when
commenting.

In response to the initial posting, I plan to ask a number of questions.

The related TianoCore bugzillas are:

https://bugzilla.tianocore.org/show_bug.cgi?id=1512
https://bugzilla.tianocore.org/show_bug.cgi?id=1515

SMM is used as a security barrier between the OS kernel and the
firmware. When a CPU is plugged into a running system where this barrier
exists fine otherwise, the new CPU can be considered a means to attack
SMM. When the next SMI is raised (globally, or targeted at the new CPU),
the SMBASE for that CPU is still at 0x30000, which is normal RAM, not
SMRAM. Therefore the OS could place attack code in that area prior to
the SMI. Once in SMM, the new CPU would execute OS-owned code (from
normal RAM) with access to SMRAM and to other SMM-protected stuff, such
as flash. [I stole a few words from Paolo here.]

Jiewen summarized the problem as follows:

- Asset: SMM

- Adversary:

- System Software Attacker, who can control any OS memory or silicon
register from OS level, or read write BIOS data.

- Simple hardware attacker, who can hot add or hot remove a CPU.

- Non-adversary: The attacker cannot modify the flash BIOS code or
read only BIOS data. The flash part itself is treated as TCB and
protected.

- Threat: The attacker may hot add or hot remove a CPU, then modify
system memory to tamper the SMRAM content, or trigger SMI to get the
privilege escalation by executing code in SMM mode.

We'd like to solve this problem for QEMU/KVM and OVMF.

(At the moment, CPU hotplug doesn't work with OVMF *iff* OVMF was built
with -D SMM_REQUIRE. SMBASE relocation never happens for the new CPU,
the SMM infrastructure in edk2 doesn't know about the new CPU, and so
when the first SMI is broadcast afterwards, we crash. We'd like this
functionality to *work*, in the first place -- but securely at that, so
that an actively malicious guest kernel can't break into SMM.)

Yingwen and Jiewen suggested the following process.

Legend:

- "New CPU": CPU being hot-added
- "Host CPU": existing CPU
- (Flash): code running from flash
- (SMM): code running from SMRAM

Steps:

(01) New CPU: (Flash) enter reset vector, Global SMI disabled by
default.

(02) New CPU: (Flash) configure memory control to let it access global
host memory.

(03) New CPU: (Flash) send board message to tell host CPU (GPIO->SCI) --
I am waiting for hot-add message. (NOTE: Host CPU can only send
instruction in SMM mode. -- The register is SMM only)

(04) Host CPU: (OS) get message from board that a new CPU is added.
(GPIO -> SCI)

(05) Host CPU: (OS) All CPUs enter SMM (SCI->SWSMI) (NOTE: New CPU will
not enter CPU because SMI is disabled)

(06) Host CPU: (SMM) Save 38000, Update 38000 -- fill simple SMM rebase
code.

(07) Host CPU: (SMM) Send message to New CPU to Enable SMI.

(08) New CPU: (Flash) Get message - Enable SMI.

(09) Host CPU: (SMM) Send SMI to the new CPU only.

(10) New CPU: (SMM) Response first SMI at 38000, and rebase SMBASE to
TSEG.

(11) Host CPU: (SMM) Restore 38000.

(12) Host CPU: (SMM) Update located data structure to add the new CPU
information. (This step will involve CPU_SERVICE protocol)

===================== (now, the next SMI will bring all CPU into TSEG)

(13) New CPU: (Flash) run MRC code, to init its own memory.

(14) New CPU: (Flash) Deadloop, and wait for INIT-SIPI-SIPI.

(15) Host CPU: (OS) Send INIT-SIPI-SIPI to pull new CPU in.

Thanks
Laszlo

Join devel@edk2.groups.io to automatically receive all group messages.