[Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
Laszlo Ersek
On 09/03/19 16:53, Igor Mammedov wrote:
> On Mon, 2 Sep 2019 21:09:58 +0200
> Laszlo Ersek <lersek@...> wrote:
>> On 09/02/19 10:45, Igor Mammedov wrote:
>>> On Fri, 30 Aug 2019 20:46:14 +0200
>>> Laszlo Ersek <lersek@...> wrote:
>>>> On 08/30/19 16:48, Igor Mammedov wrote:
>>>>> (01) On boot, firmware maps and initializes the SMI handler at the default SMBASE (0x30000)
>>>>>      (using dedicated SMRAM at 0x30000 would allow us to avoid save/restore
>>>>>      steps and make the SMM handler pointer not vulnerable to DMA attacks)
>>>>> (02) QEMU hotplugs a new CPU in reset state and sends an SCI
>>>>> (03) on receiving the SCI, the host CPU calls the GPE CPU hotplug handler,
>>>>>      which writes to IO port 0xB2 (broadcast SMI)
>>>>> (04) firmware waits for all existing CPUs to rendezvous in SMM mode;
>>>>>      new CPU(s) have an SMI pending but do nothing yet
>>>>> (05) host CPU wakes up one new CPU (INIT-SIPI-SIPI)
>>>>>      SIPI vector points to a RO flash HLT loop.
>>>>>      (how will the host CPU know which new CPUs to relocate?
>>>>>      possibly reuse the QEMU CPU hotplug MMIO interface???)
>>>>> (06) new CPU does relocation.
>>>>>      (in case an attacker sends SIPI to several new CPUs,
>>>>>      open question: how to detect a collision of several CPUs at the same default SMBASE)
>>>>> (07) once the new CPU is relocated, the host CPU completes initialization, returns
>>>>>      from the IO port write and executes the rest of the GPE handler, telling the OS
>>>>>      to online the new CPU.
>>>>
>>>> In step (03), it is the OS that handles the SCI; it transfers control to
>>>> ACPI. The AML can write to IO port 0xB2 only because the OS allows it.
>>>> If the OS decides to omit that step, and sends an INIT-SIPI-SIPI
>>>> directly to the new CPU, can it steal the CPU?
>>>
>>> It sure can but this way it won't get access to privileged SMRAM
>>> so OS can't subvert firmware.
>>> The next time SMI broadcast is sent the CPU will use SMI handler at
>>> default 0x30000 SMBASE. It's up to us to define behavior here (for example
>>> relocation handler can put such CPU in shutdown state).
>>>
>>> It's in the best interest of OS to cooperate and execute AML
>>> provided by firmware, if it does not follow proper cpu hotplug flow
>>> we can't guarantee that stolen CPU will work.
>>
>> This sounds convincing enough, for the hotplugged CPU; thanks.
>>
>> So now my concern is with step (01). While preparing for the initial
>> relocation (of cold-plugged CPUs), the code assumes the memory at the
>> default SMBASE (0x30000) is normal RAM.
>>
>> Is it not a problem that the area is written initially while running in
>> normal 32-bit or 64-bit mode, but then executed (in response to the
>> first, synchronous, SMI) as SMRAM?
>
> currently there is no SMRAM at 0x30000, so all access falls through
> into RAM address space and we are about to change that.
>
> but firmware doesn't have to use it as RAM, it can check if QEMU
> supports SMRAM at 0x30000 and if supported map it to configure
> and then lock it down.

I'm sure you are *technically* right, but you seem to be assuming that I
can modify or rearrange anything I want in edk2. :)

If we can solve the above in OVMF platform code, that's great. If not
(e.g. UefiCpuPkg code needs to be updated), then things will get tricky.
If we can introduce another platform hook for this, that would help. I
can't say before I try.

>> Basically I'm confused by the alias.
>>
>> TSEG (and presumably, A/B seg) work like this:
>> - when open, looks like RAM to normal mode and SMM
>> - when closed, looks like black-hole to normal mode, and like RAM to SMM
>>
>> The generic edk2 code knows this, and manages the SMRAM areas accordingly.
>>
>> The area at 0x30000 is different:
>> - looks like RAM to both normal mode and SMM
>>
>> If we set up the alias at 0x30000 into A/B seg,
>> - will that *permanently* hide the normal RAM at 0x30000?
>> - will 0x30000 start behaving like A/B seg?
>>
>> Basically my concern is that the universal code in edk2 might or might
>> not keep A/B seg open while initially populating the area at the default
>> SMBASE. Specifically, I can imagine two issues:
>>
>> - if the alias into A/B seg is inactive during the initial population,
>>   then the initial writes go to RAM, but the execution (the first SMBASE
>>   relocation) will occur from A/B seg through the alias
>>
>> - alternatively, if the alias is always active, but A/B seg is closed
>>   during initial population (which happens in normal mode), then the
>>   initial writes go to the black hole, and execution will occur from a
>>   "blank" A/B seg.
>>
>> Am I seeing things? (Sorry, I keep feeling dumber and dumber in this
>> thread.)
>
> I don't really know how firmware uses A/B segments and I'm afraid that
> cannibalizing one for configuring 0x30000 might break something.
>
> Since we are inventing something out of q35 spec anyway, How about
> leaving A/B/TSEG to be and using fwcfg to configure when/where
> SMRAM(0x30000+128K) should be mapped into RAM address space.
>
> I see a couple of options:
>   1: use identity mapping where SMRAM(0x30000+128K) maps into the same
>      range in RAM address space when firmware writes into fwcfg
>      file and unmaps/locks on the second write (until HW reset)
>   2: let firmware choose where to map SMRAM(0x30000+128K) in RAM address
>      space, logic is essentially the same as above only firmware
>      picks and writes into fwcfg an address where SMRAM(0x30000+128K)
>      should be mapped.

Option#1 would be similar to how TSEG works now, correct? IOW normal RAM
(from the QEMU perspective) is exposed as "SMRAM" to the guest, hidden
with a "black hole" overlay (outside of SMM) if SMRAM is closed.

If that's correct, then #1 looks more attractive to me than #2.

Thanks
Laszlo
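
To make the two failure modes in the alias question concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the class `AliasedWindow`, the helper `first_smi_sees_handler` and the black-hole read value are made up for illustration and are not QEMU or edk2 code. It populates the handler in normal mode, then checks what SMM would find once the alias is active and the SMRAM is closed.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    SMM = "smm"


class AliasedWindow:
    """Toy model of 0x30000 aliased into an A/B-seg-like SMRAM area.

    Hypothetical names throughout; this is not QEMU or edk2 code.
    """

    def __init__(self, alias_active, smram_open):
        self.alias_active = alias_active
        self.smram_open = smram_open
        self.ram = {}    # normal RAM behind the window
        self.smram = {}  # SMRAM reached through the alias

    def _backing(self, mode):
        if not self.alias_active:
            return self.ram            # plain RAM for everybody
        if mode is Mode.SMM or self.smram_open:
            return self.smram          # TSEG-like: visible when open or in SMM
        return None                    # closed and not in SMM: black hole

    def write(self, mode, offset, value):
        backing = self._backing(mode)
        if backing is not None:        # writes to the black hole are dropped
            backing[offset] = value

    def read(self, mode, offset):
        backing = self._backing(mode)
        if backing is None:
            return 0xFF                # assumed black-hole read value
        return backing.get(offset, 0x00)


def first_smi_sees_handler(alias_during_populate, open_during_populate):
    """Populate in normal mode, then look at the window from SMM."""
    window = AliasedWindow(alias_during_populate, open_during_populate)
    # The SMI entry point sits at SMBASE + 0x8000; 0xEB stands in for code.
    window.write(Mode.NORMAL, 0x8000, 0xEB)
    # By the time of the first SMI the alias is active and SMRAM is closed.
    window.alias_active = True
    window.smram_open = False
    return window.read(Mode.SMM, 0x8000) == 0xEB
```

With this model, `first_smi_sees_handler(False, True)` (alias inactive while populating) and `first_smi_sees_handler(True, False)` (alias active but closed) both leave SMM looking at a blank area; only populating through an active, open alias puts the handler where the first SMI will execute it.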
Igor Mammedov <imammedo@...>
On Tue, 3 Sep 2019 19:20:25 +0200
Laszlo Ersek <lersek@...> wrote:
> I'm sure you are *technically* right, but you seem to be assuming that I
> can modify or rearrange anything I want in edk2. :)

yep, I'm looking at it from a theoretical perspective so far,
but what could be done in reality might be limited.

> If we can solve the above in OVMF platform code, that's great. If not
> (e.g. UefiCpuPkg code needs to be updated), then things will get tricky.
> If we can introduce another platform hook for this, that would help. I
> can't say before I try.
>
> Option#1 would be similar to how TSEG works now, correct? IOW normal RAM
> (from the QEMU perspective) is exposed as "SMRAM" to the guest, hidden
> with a "black hole" overlay (outside of SMM) if SMRAM is closed.

it could be stolen RAM + black hole like TSEG, assuming fw can live without RAM(0x30000+128K) range
(in this case fwcfg interface would only work for locking down the range)
or
we can actually have a dedicated SMRAM (like in my earlier RFC),
in this case FW can use RAM(0x30000+128K) when SMRAM isn't mapped into RAM address space
(in this case fwcfg would be used to temporarily map SMRAM into normal RAM and unmap/lock
after SMI relocation handler was initialized).

If possible I'd prefer a simpler TSEG like variant.

> If that's correct, then #1 looks more attractive to me than #2.
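
The dedicated-SMRAM variant with a fw_cfg-driven "map, then unmap/lock" sequence can be sketched as a tiny state machine. This is only an illustration in Python under assumptions: the class `SmbaseSmramControl` and its methods are hypothetical, the thread names no fw_cfg file, and nothing here mirrors an actual QEMU device.

```python
class SmbaseSmramControl:
    """Toy model: first fw_cfg write maps SMRAM, second unmaps and locks.

    Hypothetical class; it does not mirror any real QEMU device.
    """

    def __init__(self):
        self.writes = 0  # fw_cfg writes seen since the last reset

    def fwcfg_write(self):
        # First write maps SMRAM into the RAM address space, second write
        # unmaps and locks it. Later writes are ignored until HW reset.
        if self.writes < 2:
            self.writes += 1

    def hw_reset(self):
        # Only a hardware reset clears the lock.
        self.writes = 0

    @property
    def mapped(self):
        return self.writes == 1

    @property
    def locked(self):
        return self.writes == 2

    def normal_mode_sees(self):
        # Dedicated-SMRAM variant: ordinary RAM shows through when unmapped.
        return "SMRAM" if self.mapped else "RAM"


ctl = SmbaseSmramControl()
states = [ctl.normal_mode_sees()]
for _ in range(3):
    ctl.fwcfg_write()
    states.append(ctl.normal_mode_sees())
```

In this sketch firmware would install the relocation handler between the first and second write; the third write shows that the range stays unmapped and locked afterwards. The TSEG-like variant differs in that normal mode would see a black hole rather than RAM once the range is closed.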
Laszlo Ersek
On 09/04/19 11:52, Igor Mammedov wrote:
> it could be stolen RAM + black hole like TSEG, assuming fw can live without RAM(0x30000+128K) range
> (in this case fwcfg interface would only work for locking down the range)
> or
> we can actually have a dedicated SMRAM (like in my earlier RFC),
> in this case FW can use RAM(0x30000+128K) when SMRAM isn't mapped into RAM address space
> (in this case fwcfg would be used to temporarily map SMRAM into normal RAM and unmap/lock
> after SMI relocation handler was initialized).
>
> If possible I'd prefer a simpler TSEG like variant.

I think TSEG-like behavior is between these two. That is, I believe we
should have explicit open/close/lock operations. And, when the range is
closed (meaning, closed+unlocked, or closed+locked), then the black hole
should take effect for code that's not running in SMM.

Put differently, it's like the second choice, except the range never
appears as normal RAM. "When SMRAM isn't mapped into RAM address space",
then the address range shows "nothing" (black hole).

Regarding "fw can live without RAM(0x30000+128K) range" -- do you mean
whether the firmware could use another RAM area for fw_cfg DMA?

If that's the question, then I wouldn't worry about it. I'd remove the
0x30000+128K range from the memory map, so the fw_cfg stuff (or anything
else) would never allocate memory from the range. It's much more
concerning to me however how the SMM infrastructure would deal with a
hole in the memory map right there.

Thanks
Laszlo
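
The open/close/lock semantics described above can be written down as a small model. This is a sketch in Python under assumptions: the class `SmramRange` is hypothetical, and letting `lock()` force the range closed is an assumption modelled on TSEG-like behavior, not something the thread settles.

```python
class SmramRange:
    """Toy model of a TSEG-like range with open/close/lock operations.

    Hypothetical class. The range starts out closed and unlocked.
    """

    def __init__(self):
        self.is_open = False
        self.is_locked = False

    def open(self):
        # Opening is refused once the range has been locked.
        if self.is_locked:
            return False
        self.is_open = True
        return True

    def close(self):
        self.is_open = False

    def lock(self):
        # Assumption: locking also forces the range closed.
        self.is_open = False
        self.is_locked = True

    def visible_as(self, in_smm):
        # SMM always sees SMRAM. Other code sees SMRAM only while the
        # range is open, and a black hole otherwise; never normal RAM.
        if in_smm or self.is_open:
            return "SMRAM"
        return "black hole"
```

Both closed states (closed+unlocked and closed+locked) give the same answer from `visible_as(False)`; they differ only in whether a later `open()` can succeed.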
Igor Mammedov <imammedo@...>
On Thu, 5 Sep 2019 15:08:31 +0200
Laszlo Ersek <lersek@...> wrote:
> I think TSEG-like behavior is between these two. That is, I believe we
> should have explicit open/close/lock operations. And, when the range is
> closed (meaning, closed+unlocked, or closed+locked), then the black hole
> should take effect for code that's not running in SMM.
>
> Put differently, it's like the second choice, except the range never
> appears as normal RAM. "When SMRAM isn't mapped into RAM address space",
> then the address range shows "nothing" (black hole).

I guess we are at the point where a patch is better than words, I'll send one as a reply here shortly.
I've just implemented a subset of the above (opened, closed+locked).

> Regarding "fw can live without RAM(0x30000+128K) range" -- do you mean
> whether the firmware could use another RAM area for fw_cfg DMA?
>
> If that's the question, then I wouldn't worry about it. I'd remove the
> 0x30000+128K range from the memory map, so the fw_cfg stuff (or anything
> else) would never allocate memory from the range. It's much more
> concerning to me however how the SMM infrastructure would deal with a
> hole in the memory map right there.

I didn't mean fwcfg in this context; what I meant was whether firmware would be able
to avoid using the RAM(0x30000+128K) range (since it becomes unusable after locking).
Looks like you just answered it here.
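
Removing the 0x30000+128K window from the memory map amounts to splitting any RAM range that overlaps it. A minimal sketch in Python, assuming the memory map is a list of half-open `(start, end)` tuples; the function `carve_out` is hypothetical and is not an edk2 memory-map service.

```python
def carve_out(ranges, hole_start, hole_size):
    """Return the RAM ranges with [hole_start, hole_start + hole_size) removed.

    Ranges are half-open (start, end) tuples. Hypothetical helper.
    """
    hole_end = hole_start + hole_size
    result = []
    for start, end in ranges:
        if end <= hole_start or start >= hole_end:
            result.append((start, end))         # no overlap: keep as is
            continue
        if start < hole_start:
            result.append((start, hole_start))  # part below the hole
        if end > hole_end:
            result.append((hole_end, end))      # part above the hole
    return result


# Low RAM up to 0xA0000, with the default-SMBASE window taken out.
low_ram = carve_out([(0x0, 0xA0000)], 0x30000, 128 * 1024)
```

Here `low_ram` ends up as two ranges, one below 0x30000 and one starting at 0x50000, which is the hole in the memory map whose effect on the SMM infrastructure is left open above.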