Re: [edk2-rfc] [edk2-devel] RFC: design review for TDVF in OVMF


Min Xu
 

On June 4, 2021 12:12 AM, Laszlo wrote:
On 06/03/21 15:51, Yao, Jiewen wrote:
Hi, All
We plan to do a design review for TDVF in OVMF package.


The TDVF Design slides for TinaoCore Design Review Meeting (Jun 11) is
now available in blow link:
https://edk2.groups.io/g/devel/files/Designs/2021/0611.

The Bugzilla is https://bugzilla.tianocore.org/show_bug.cgi?id=3429



You can have an offline review first. You comments will be warmly
welcomed and we will continuously update the slides based on the
feedbacks.
Resending my earlier comments in this mailing list thread, with the feedback
inserted at the proper spots that has been given in the off-list thread since
then:
Continue my comments from here.

*** Slide 35-36: DXE phase

(16) "Some DXE Drivers not allowed to load/start in Td guest -- Network
stack, RNG, ..."

Same comment as (several times) above. The Linuxboot project is a good
example for eliminating cruft from DXEFV (in fact, for eliminating most of the
DXE phase). In a TDX environment, why include drivers in the firmware binary
that are never used? Meanwhile, DXEFV in OVMF grows by a MB every 1.5
years or so. Again, remove these drivers from the DSC/FDF then, and it needs
to be a separate platform.
It is because of the *one binary* so that we have to include all the drivers in the
firmware binary. Some of the DXE drivers are not called in TDX, but they're dispatched
in Non-Tdx environment.
I will explain more about this topic in next version of design slides.
Also I would wait for the conclusion to the *one binary*.

(17) "Other DXE Phase drivers -- [...] AcpiPlatformDxe"

I'm not sure what this section is supposed to mean. Other DXE phase drivers
included, or excluded? Without AcpiPlatformDxe, the guest OS will not see
QEMU's ACPI content, and will almost certainly malfunction.
Sorry for the confusion about the title. This slides means Other DXE phase drivers
including AcpiPlatformDxe is included. I will update the slides to be more clear
and accurate.

... Back from slide 48: ignore this for now, I'll comment in more detail later.


*** Slide 37: DXE Core

(18) says "SMM is not supported in Td guest" -- how is the variable store
protected from direct hardware (pflash) access from the guest OS?
Without SMM, the guest OS need not go through gRT->SetVariable() to
update authenticated non-volatile UEFI variables, and that undermines
Secure Boot.
Let me explain the SMM and Secure boot in TDX like below:
1) TDX doesn't support virtual SMM in guest. Virtual SMI cannot be injected
into TD guest.
2) SMI/SMM is used to manage variable update to avoid expose Flash direct.
So SMM is not must-to-have for secure boot, but help to mitigate the security risk.
3) We don't trust VMM. That is why we need TDX.
4) If you trust VMM to emulate SMM, then you don't need TDX.

Note that, while SEV-ES has the same limitation wrt. SMM, the
"OvmfPkg/AmdSev/AmdSevX64.dsc" platform doesn't even include the
Secure Boot firmware feature. For another example, the OVMF firmware
binary in RHEL that's going to support SEV-ES is such a build of
"OvmfPkgX64.dsc"
that does not include the Secure Boot feature either.

But in this TDX slide deck, Secure Boot certificates are embedded in the CFV
(configuration firmware volume) -- see slide 11 and slide 16 --, which
suggests that this platform does want secure boot.
Yes, TDVF does support Secure Boot.

... Back from slide 48: I'm going to make *additional* comments on this,
when I'm at slide 48, too.

The rest of this slide (slide 37) looks reasonable (generic DXE Core changes --
possibly PI spec changes too).
Yes, there is new attribute (EFI_RESOURCE_ATTRIBUTE_ENCRYPTED) in PiHob.h



*** Slides 38 through 39:

These seem reasonable (TdxDxe assumes some responsibilities of
OvmfPkg/PlatformPei)


*** Slides 40 through 42:

*If* you really can implement TDX support for the IOMMU driver *this*
surgically, then I'm OK with it. The general logic in the IOMMU driver was
truly difficult to write, and I'd be seriously concerned if those parts would
have to be modified. Customizing just the page encryption/decryption
primitives for TDX vs. SEV is OK.
Actually all the changes we do in IOMMU is to customize the page encryption
/ decryption primitive for TDX and SEV. We will probe the Td or Non-Td in
run-time, then the corresponding APIs will be called.



*** Slides 43 through 47:

(19) Slide 46 and slide 47 are almost identical. Please consolidate them into a
single slide.
Slide 46 is of Td measurement, like TpmMeasurement.
SecurityPkg/Library/DxeTpmMeasurementLib
Slide 47 is of Measure boot.
SecurityPkg/Library/DxeTpm2MeasureBootLib
I will refine the two slides to make it more clear. Thanks for reminder.

(20) the TPM2 infrastructure in edk2 is baroque (over-engineered), in my
opinion. It has so many layers that I can never keep them in mind. When we
added TPM support to OVMF, I required commit messages that would help us
recall the layering. In particular, please refer to commit 0c0a50d6b3ff
("OvmfPkg: include Tcg2Dxe module", 2018-03-09). Here's an
excerpt:

TPM 2 consumer driver
|
v
Tpm2DeviceLib class: Tpm2DeviceLibTcg2 instance
|
v
TCG2 protocol interface
|
v
TCG2 protocol provider: Tcg2Dxe.inf driver
|
v
Tpm2DeviceLib class: Tpm2DeviceLibRouter instance
|
v
NULL class: Tpm2InstanceLibDTpm instance
(via earlier registration)
|
v
TPM2 chip (actual hardware)

The slide deck says that EFI_TD_PROTOCOL is supposed to reuse the
EFI_TCG2_PROTOCOL definition. If that's the case, then why don't we push
the TDX specifics (more or less: the replacement of PCRs with RTMR) down
to the lowest possible level?

Can we have "Tpm2InstanceLibTdxRtmr", plugged into the same Tcg2Dxe.inf
driver?

If not, can we have a new TdTcg2Dxe.inf driver, but make it so that it install
the same protocol as before (EFI_TCG2_PROTOCOL -- same old protocol
GUID)? Then DxeTpmMeasurementLib doesn't have to change.

As long as there is *at most* one EFI_TCG2_PROTOCOL instance published in
the protocol database, DxeTpmMeasurementLib should be fine. In SEV*
guests, the standard Tcg2Dxe driver provides that protocol. In TDX guests,
TdTcg2Dxe.inf should provide the protocol. Arbitration between the two can
be implemented with the pattern seen in the following
commits:

1 05db0948cc60 EmbeddedPkg: introduce EDKII Platform Has ACPI GUID
2 786f476323a6 EmbeddedPkg: introduce PlatformHasAcpiLib
3 65a69b214840 EmbeddedPkg: introduce EDKII Platform Has Device Tree
GUID
4 2558bfe3e907 ArmVirtPkg: add PlatformHasAcpiDtDxe

The basic idea is that Tcg2Dxe can have a depex on a new "running in SEV"
protocol GUID, and TdTcg2Dxe can have a depex on a new "running in TDX"
protocol GUID. A separate platform driver can install the proper GUID --
possibly *neither* of those GUIDs.

And, we don't have to change the depex section of
"SecurityPkg/Tcg/Tcg2Dxe/Tcg2Dxe.inf" for this; we can implement a library
instance with an empty constructor, but a non-empty depex, and then hook
this lib instance into Tcg2Dxe.inf via module scope NULL lib class override in
the DSC file. Basically we could forcibly restrict Tcg2Dxe's DEPEX by making it
inherit the new DEPEX from the library.
This is a big topic. I need to discuss it internally first then give my comments.
Thanks for your patience.


*** Slide 48: DXE Phase -- Other Modules

Regarding IncompatiblePciDeviceSupportDxe: the proposed update sounds
plausible and simple enough.

(21) "AcpiPlatformDxe: Support for MADT/ACPI addition to report Td Mailbox
entry"

Firmware-owned tables must not be installed from this driver.

Please refer to my "Xen removal" patch set again, for
<https://bugzilla.tianocore.org/show_bug.cgi?id=2122>, which I mention
above in point (14). As part of the Xen removal, the AcpiPlatformDxe driver in
OvmfPkg is significantly trimmed: all unused (dead) cruft is removed,
including any ACPI table templates that are built into the firmware.

OvmfPkg/AcpiPlatformDxe is responsible *solely* for implementing the
client side of the QEMU ACPI linker/loader.

If you need to prepare & install different ACPI tables, please do it elsewhere,
in another DXE driver. A bunch of other firmware modules do that (NFIT, IBFT,
BGRT, ...).

For example, the OvmfPkg/TdxDxe DXE_DRIVER is supposed to be launched
early in the DXE phase, via APRIORI section -- please consider registering a
protocol notify in that driver, for EFI_ACPI_TABLE_PROTOCOL, and when it
becomes available, install whatever
*extra* tables you need.

Note that if you need to *customize* an ACPI table that QEMU already
provides, then you will have to modify the ACPI generator on the QEMU side.
It is a design tenet between QEMU and OVMF that OVMF include no business
logic for either parsing or fixing up ACPI tables that QEMU provides.
AcpiPlatformDxe contains the minimum (which is already a whole lot,
unfortunately) that's necessary for implementing the QEMU ACPI
linker/loader client in the UEFI environment.

The slide deck mentions MADT, which is also known as the "APIC" table --
and indeed, QEMU does provide that. (See acpi_build_madt()
[hw/i386/acpi-common.c].) So if TDX needs MADT customizations, that
should go into QEMU.
Thanks for reminder. We will exam the implementation of MADT/ACPI carefully.

(22) EmuVariableFvbRuntimeDxe

Ouch, this is an unpleasant surprise.

First, if you know for a fact that pflash is not part of the *board* in any TDX
setup, then pulling

OvmfPkg/QemuFlashFvbServicesRuntimeDxe/FvbServicesRuntimeDxe.inf

into the firmware platform is useless, as it is mutually exclusive with

OvmfPkg/EmuVariableFvbRuntimeDxe/Fvb.inf

(via dynamic means -- a dynamic PCD).

Note that the FDF file places QemuFlashFvbServicesRuntimeDxe in APRIORI
DXE when SMM_REQUIRE is FALSE. This driver checks for pflash presence,
and lets EmuVariableFvbRuntimeDxe perform its in-RAM flash emulation only
in case pflash is not found.

So this is again in favor of a separate platform -- if we know pflash is never
part of the board, then QemuFlashFvbServicesRuntimeDxe is never needed,
but you cannot remove it from the traditional DSC/FDF files.

Second, EmuVariableFvbRuntimeDxe consumes the PlatformFvbLib class, for
the PlatformFvbDataWritten() API (among other things). This lib class is
implemented by two instances in OvmfPkg, PlatformFvbLibNull and
EmuVariableFvbLib. The latter instance allows Platform BDS to hook an event
(for signaling) via "PcdEmuVariableEvent" into the
EmuVariableFvbRuntimeDxe driver.

In old (very old) QEMU board configurations, namely those without pflash,
this (mis)feature is used by OVMF's PlatformBootManagerLib to write out all
variables to the EFI system partition in a regular file called \NvVars, with the
help of NvVarsFileLib, whenever EmuVariableFvbRuntimeDxe writes out an
emulated "flash" block. For this purpose, the traditional OVMF DSC files link
EmuVariableFvbLib into EmuVariableFvbRuntimeDxe.

But it counts as an absolute disaster nowadays, and should not be revived in
any platform. If you don't have pflash in TDX guests, just accept that you
won't have non-volatile variables. And link PlatformFvbLibNull into
EmuVariableFvbRuntimeDxe. You're going to need a separate
PlatformBootManagerLib instance anyway.

(We should have removed EmuVariableFvbRuntimeDxe a long time ago from
the traditional OVMF platforms, i.e. made pflash a hard requirement, even
when SMM is not built into the platform -- but whenever I tried that, Jordan
always shot me down.)

My point is: using EmuVariableFvbRuntimeDxe in the TDX platform may be
defensible per se, but we must be very clear that it will never provide a
standards-conformant service for non-volatile UEFI variables, and we must
keep as much of the \NvVars mess out of EmuVariableFvbRuntimeDxe as
possible. This will require a separate PlatformBootManagerLib instance for
TDX anyway (or maybe share PlatformBootManagerLibGrub with
"OvmfPkg/AmdSev/AmdSevX64.dsc").

Apart from the volatility aspect, let's assume we have this in-RAM emulated
"flash device", storing authenticated UEFI variables for Secure Boot purposes.
And we don't have SMM.

What protects this in-RAM variable store from tampering by the guest OS?
It's all guest RAM only, after all. What provides the privilege barrier between
the guest firmware and the guest OS?
Thanks Laszlo. I will carefully read your comments and discuss it internally first.


*** Slide 50: Library

(23) Should we introduce Null instances for all (or most) new lib classes here?
Code size is a concern (static linking). If we extend a common OvmfPkg
module with new hook points, it's one thing to return from that hook point
early *dynamically*, but it's even better (given separate platforms) to allow
the traditional platform firmware to use a Null lib instance, to cut out all the
dead code statically.
Yes, agree. We will introduce the NULL instance for the new libs.


*** Slides 51 through 52

Seems OK.


*** Slide 53:

(24) It might be worth noting that BaseIoLibIntrinsic already has some SEV
enlightenment, as the FIFO IO port operations (which normally use the REP
prefix) are not handled on SEV. I don't have an immediate idea why this
might matter, we should just minimize code duplication if possible.
Agree that we should minimize the code duplication if possible.

Before we start to enable IoLib for Tdx, we search out the EDK2 and find:
For BaseIoLibIntrinsicSev.inf, it is imported in OvmfPkg, such as
- OvmfPkg/OvmfXen.dsc
- OvmfPkg/Bhyve/BhyveX64.dsc
- OvmfPkg/OvmfPkgIa32.dsc
- OvmfPkg/OvmfPkgX64.dsc
- OvmfPkg/OvmfPkgIa32X64.dsc
- OvmfPkg/AmdSev/AmdSevX64.dsc

But for BaseIoLibIntrinsic.inf it seems it is not imported in any dsc in OvmfPkg.

BaseIoLibIntrinsic is the right IoLib base that we can enable TDX. But
it doesn't support SEV for the FIFO IO port operations.
BaseIoLibIntrinsicSev handles the FIFO IO on SEV. But we have an open about
the name.
Anyway we agree code duplication should be minimized.


*** Slides 54-56:

No comments, this stuff seems reasonable.


*** Slide 57: MpInitLib

I don't know enough to give a summary judgement.


All in all, I see the controversial / messy parts in the platform bringup, and
how all that differs from the traditional ("legacy") OVMF platforms. I admit I
*may* be biased in favor of SEV, possibly because SEV landed first -- if you
find signs of such a bias in my comments, please don't hesitate to debunk
those points. Yet my general impression is that the early bringup stuff is
significantly different from everything before, and because of this, a
separate platform is justified.

Definitely separate from the traditional OVMF IA32, IA32X64, and X64
platforms, and *possibly* separate from the "remote attestation"
AmdSevX64.dsc platform. I would approach the TDX feature-set in complete
isolation (exactly how Intel commenced the work, if I understand correctly),
modulo obviously shareable / reusable parts, and then slowly & gradually
work on extracting / refactoring commonalities.

(But, given my stance on Xen for example, I could disagree even with the
latter, retroactive kind of unification -- it all boils down to shared developer
and user base. Component sharing should reflect the community structure,
otherwise maintenance will be a nightmare.)

Thanks
Laszlo
Some of the comments need be discussed internally in intel first. So I cannot answer
all your comments right now.
Thanks again Laszlo for your valuable review comments!

Min

Join devel@edk2.groups.io to automatically receive all group messages.