Re: [edk2-rfc] [edk2-devel] RFC: design review for TDVF in OVMF


Min Xu
 

On 06/04/2021 12:12 AM, Laszlo wrote:
On 06/03/21 15:51, Yao, Jiewen wrote:
Hi, All
We plan to do a design review for TDVF in OVMF package.


The TDVF Design slides for TinaoCore Design Review Meeting (Jun 11) is
now available in blow link:
https://edk2.groups.io/g/devel/files/Designs/2021/0611.

The Bugzilla is https://bugzilla.tianocore.org/show_bug.cgi?id=3429



You can have an offline review first. You comments will be warmly
welcomed and we will continuously update the slides based on the
feedbacks.
Resending my earlier comments in this mailing list thread, with the feedback
inserted at the proper spots that has been given in the off-list thread since
then:


*** Slides 4, 6, 7: the "one binary requirement".
I would like to hear other reviewers' comments on the "one binary requirement".

(1) The presentation refers to "OvmfPkgX64.dsc" as "the one" on slide#4, but
then the explanation for the requirement, given on slide 7, speaks about
"common attestation interface".

I think we have a misunderstanding here. The "OvmfPkgX64.dsc" platform
indeed contains SEV, SEV-ES, and (in the future, will contain) SEV-SNP
support. In that sense, adding TDX support to the same platform should be
(hopefully) possible, at the cost of ugly gymnastics in the reset vector.

But "OvmfPkgX64.dsc" is *already* different from the remotely attested
OVMF platform, namely "OvmfPkg/AmdSev/AmdSevX64.dsc".

The latter has some additional modules (secret PEIM and DXE driver), it has
the Grub binary built in, and -- probably most importantly -- it trusts host-
originated information less than "OvmfPkgX64.dsc".

For example, "OvmfPkg/AmdSev/AmdSevX64.dsc" has a dedicated
PlatformBootManagerLib instance, one that ignores the non-volatile UEFI
variables Boot#### and BootOrder, and ignores (thus far) the fw_cfg
originated kernel/initrd/cmdline as well.

It remains an "area of research" to see what else should be removed from
the traditional host-guest integration (which integration is usually desirable
for management and convenience), in the remotely-attested boot scenario.
See e.g.
<https://bugzilla.tianocore.org/show_bug.cgi?id=3180>.

My point is that the "one binary" that the slide deck refers to (i.e.,
OvmfPkgX64.dsc) might prove OK for utilizing the Intel TDX *technology* in
itself. Simply enabling OVMF + a guest OS to boot in a TDX domain.

But "OvmfPkgX64.dsc" is *not* the "one binary" that is suitable for remote
attestation, which has a much broader scope, involving multiple computers,
networking, deployment, guest-owner/host-owner separation, whatever.
For the latter, we needed a separate platform anyway, even with only SEV in
mind; that's why "OvmfPkg/AmdSev/AmdSevX64.dsc" exists.


*** Slides 8-9: general boot flow -- TDVF; TDVF Flow

I'm likely missing a big bunch of background here, so just a few
questions:

(2) Not sure what RTMR is, but it's associated with "Enable TrustedBoot"
-- so is a virtual TPM a hard requirement?

... Back from slide 10: "TCG measurement and event log framework w/o
TPM" -- that's curious.
Comments of Dave & Erdem are pretty good to explain what RTMR is.
[response from Dave Gilbert:]
My reading of this is that the RTMR (and another set of similar
registers) are a TDX thing that is like the PCRs from a TPM but
without the rest of the TPM features; so you can do the one-way
measurement into the RTMRs just like you do into a TPM PCR, and the
measurements pop out somewhere in the TDX quote. Just like a TPM you
need the event log to make any sense of how the final hashed value
supposedly got to where it did.
[response from Erdem Aktas:]
+1 to David on this. TDX provides 2 kinds of measurement registers:
MRTDs and RTMRs
(https://software.intel.com/content/dam/develop/external/us/en/documen
ts/tdx-module-1eas-v0.85.039.pdf section 10.1.2) . MRTDs are
build-time measurement registers which are updated when TDX is being
created. Once TDX is finalized (before the first run), the MRTDs are
finalized and cannot be updated anymore. On the other hand, while the
TD is running, TD can extend RTMRs through TDCALLs which will provide
TPM PCR kind of capabilities.
... Back from slide 43: feel free to skip this now; I will comment in more detail
below.

(3) Prepare AcpiTable -- OVMF fetches ACPI tables from QEMU; is this a new
(firmware originated) table that is installed in addition, or does it replace
QEMU's tables?

... Ignore for now, will comment on the MADT stuff later.

(4) Does DMA management mean a different IOMMU protocol? That is going
to conflict with the SEV IOMMU protocol. Depexes in OVMF expect one or
zero IOMMU protocols to be present.
There is only one IOMMU protocol. We will merge TDX features into the current SEV IOMMU implementation. Td or Non-Td will be probed in run-time, so that the corresponding APIs will be called to clear/set the memory encryption mask for SEV or the shared bit for TDX.
... Back from slide 40: feel free to skip this now; I'll comment on this
separately, below.

(5) Enumerate VirtIO -- virtio enumeration is PCI based on x86. But I see no
mention of PCI. If you mean VirtioMmioDeviceLib, that's no good, because it
does not support virtio-1.0, only virtio-0.9.5.

... Back from slide 42: I got my answer to this on slide 42, so don't worry
about this point.
I will comment it in my later response.
[...]
(6) The PEI phase is skipped as a whole. I don't see how that can be
reasonably brought together with either "OvmfPkgX64.dsc" or
"OvmfPkg/AmdSev/AmdSevX64.dsc". I guess you can always modify SEC to
jump into DXE directly, but then why keep the PEI core and a bunch of PEIMs
in the firmware binary?
This is because of the *one binary*. In non-Td guest, Legacy OVMF still need PEI core and the PEIMs.

Also touched on by slide 9, TDVF Flow -- PEI is eliminated but SEC becomes
more heavy-weight.
I have to admit SEC is more heavy-weight in Td guest. TdxStartupLib wraps the whole stuff in SEC phase.

Wouldn't this deserve a dedicated, separate platform DSC? The 8-bit/32-bit
branching at the front of the reset vector is a smaller complication in
comparison.

Slide 6 references the mailing list archive:

https://edk2.groups.io/g/devel/topic/81969494#74319

and in that message, I wrote:

I'm doubtful that this is a unique problem ("just fix the reset
vector") the likes of which will supposedly never return during the
integration of SEV and TDX

See also:

https://listman.redhat.com/archives/edk2-devel-archive/2021-
April/msg00784.html

where I said:

It's not lost on me that we're talking about ~3 instructions.

Let's keep a close eye on further "polymorphisms" that would require
hacks.

The first 9 slides in the presentation introduce much-much more intrusive
problems than the "polymorphism" of the reset vector. Would I be correct to
say that my concern in the above messages was right? I think I was only given
a fraction of the necessary information when I was asked "do you agree 'one
binary' is viable".

[response from Erdem Aktas:]
Let's not worry about this for now. We want the one binary solution
for practical reasons and also for simplicity. In the end, we want to
do what is right and good for everyone.
Those are legit concerns and I think what Intel is trying to do (sorry
for mind reading) is to discuss all those concerns and questions to
make the right decision. I really appreciate their effort on preparing
those slides and bringing it to the community to review.

I will also read your comments more carefully and provide my thoughts
on them.
Sorry for being a little slow on this.

*** Slide 10 -- Key impact to firmware

(7) "CPUs are left running in long mode after exiting firmware" -- what kind of
"pen" are we talking about? Does a HLT loop still work?
It is expected that TDX-VMM lauchs all CPUs to the ResetVector(0xfffffff0). After reset, all CPUs run the same initialization code (protected mode -> long mode) until Sec Entrypoint. (see slides 22)
After that APs spin at TdMailBox waiting for the commands (from BSP). Before jumping to DXE phase, TdMailBox will be relocated to a new address which is recorded in MADT table. APs then are waked up by BSP and spin at that new TdMailbox and wait for command.
After exiting firmware, APs are spinning and waiting for OS to wake them up. OS send the wake up command to the TdMailbox by reading the MADT table.

(8) "I/O, MMIO, MSR accesses are different" -- those are implemented by
low-level libraries in edk2; how can they be configured dynamically?
I will add more description of the basic IoLib in next version design slides.
Simply say we probe Td or Non-Td in run-time and then call the corresponding IO operation.
UINT8
EFIAPI
MmioRead8 (
IN UINTN Address
)
{
UINT8 Value;

if (IsTdGuest ()) {
Value = TdMmioRead8 (Address);
return Value;
}

MemoryFence ();
Value = *(volatile UINT8*)Address;
MemoryFence ();

return Value;
}

... Back from slide 53: I'll comment on slide 53 separately; ignore this.


*** Slide 11 -- TDVF Image (1)

(9) CFV -- Configuration Firmware Volume (VarStore.fdf.inc), containing SB
keys -- how is this firmware volume populated (at build time)? Is this a
hexdump?
CFV is populated in post build. We can provide such python scripts to do the SB keys enrollment.

I will continue my comments in my next mail.
[To be continued]
Thanks!

Join devel@edk2.groups.io to automatically receive all group messages.