[edk2-devel] [edk2-discuss] GSoC Proposal

Brian J. Johnson

On 4/15/22 23:23, Desimone, Nathaniel L wrote:
Hi Marvin,

-----Original Message-----
From: Marvin Häuser <mhaeuser@...>
Sent: Friday, April 15, 2022 9:44 AM
To: discuss@edk2.groups.io; Johnson, Brian <brian.johnson@...>;
Desimone, Nathaniel L <nathaniel.l.desimone@...>; Andrew Fish
<afish@...>; devel@edk2.groups.io
Cc: Pedro Falcato <pedro.falcato@...>; adachristine18@...;
Shi, Steven <steven.shi@...>
Subject: Re: [edk2-devel] [edk2-discuss] GSoC Proposal

Hey Brian,

On 15.04.22 18:22, Brian J. Johnson wrote:
Nate, Andrew, Marvin, Pedro, Ada, et al,

This is a great discussion.  I've been debating where to weigh in...
Personally, I don't think this topic can get enough attention. Thanks! :)

I agree that some sort of library sharing to reduce image size would
be very helpful.  Although some parts of the BIOS are built
separately, large parts of it are compiled at the same time, so there
should be ample opportunity for tool-directed sharing.  Some form of
pre-linking modules together may be the easiest way to do that.  But
however it gets implemented, we should try to make it automatic:
requiring library writers to manually add a lot of metadata or write
thunks or shims would be a barrier to adoption.
My suggestion to use a private function array would indeed require shims,
however this would be the "better than nothing I guess" solution where only
the most used and most painful to duplicate functions are shared, e.g.,
memory and maybe common string printing functions. This would quickly and
efficiently achieve the goal of "reduce size". And some form of sophisticated
prelinking would be my proposal for something more systematic, which
might come a lot cheaper with the security features to be added.
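
To make the "private function array" idea a bit more concrete, here is a minimal sketch in plain C. Everything in it is hypothetical: DEMO_SHARED_FUNCS, DemoGetSharedFuncs and the Demo* shims are names invented for illustration and are not existing edk2 types or APIs, and the C library calls merely stand in for whatever the core would really implement.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table of shared routines; this is not an existing edk2 type. */
typedef struct {
    void *(*CopyMem)(void *Dst, const void *Src, size_t Len);
    void *(*SetMem)(void *Buf, size_t Len, unsigned char Value);
} DEMO_SHARED_FUNCS;

/* "Core" side: the single real implementation of each routine. */
static void *CoreCopyMem(void *Dst, const void *Src, size_t Len)
{
    return memmove(Dst, Src, Len);
}

static void *CoreSetMem(void *Buf, size_t Len, unsigned char Value)
{
    return memset(Buf, Value, Len);
}

static const DEMO_SHARED_FUNCS mCoreFuncs = { CoreCopyMem, CoreSetMem };

/* Assumed way for a module to obtain the table (e.g. passed in at entry). */
const DEMO_SHARED_FUNCS *DemoGetSharedFuncs(void)
{
    return &mCoreFuncs;
}

/* "Module" side: cache the table once, then call through thin shims. */
static const DEMO_SHARED_FUNCS *mShared;

void DemoModuleInit(const DEMO_SHARED_FUNCS *Shared)
{
    mShared = Shared;
}

/* The shims keep library-style signatures, so callers need no changes. */
void *DemoCopyMem(void *Dst, const void *Src, size_t Len)
{
    return mShared->CopyMem(Dst, Src, Len);
}

void *DemoSetMem(void *Buf, size_t Len, unsigned char Value)
{
    return mShared->SetMem(Buf, Len, Value);
}
```

Each module then carries only the few bytes of shim code instead of a full copy of the routine, at the price of one indirect call per use.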

I also agree that PEI has acquired far too much functionality over
time, and we badly need to find some way to reverse that.  PEI is
pulling in more and more of the h/w initialization, leaving DXE mainly
to provide the boot interface (ACPI, UEFI services, setup screens,
etc.)  That puts more and more pressure on XIP storage and
cache-as-RAM.  How can we encourage a change?
I'm not in the loop enough to comment on the XIP and CAR issues.
However, HW init moving to PEI (or more generally, into a single stage) is not
necessarily a bad thing, is it? As I've said before, I'm not sure what else to
think of PEI than "DXE light". Both "initialise some of the hardware" with no
clear scope boundary but with similar interfaces. If they were sort of fused to
provide a minimal "pre-memory PEI" and a more DXE-like "post-memory PEI"
without having a clear boundary between HW init steps, would that be too
bad? Now I'm *really* just blindly guessing because I barely looked into this
project, but isn't that somewhat the idea of SlimBootloader?

Your description of DXE sort of has it collapsing into an advanced BDS.
Nate earlier said he'd like a clear distinction between DXE and BDS, which I
didn't really get the point of.
I guess what I'm trying to express is where does DXE end and where
does BDS begin? A lot of drivers that are only used in BDS get
dispatched early in DXE. Conversely, it is entirely possible for some
DXE drivers to only run after we have technically entered the BDS
phase. The line where DXE ends and BDS begins is blurry and not
really a line but more like a rectangle.
To Isaac's point, there is a good argument to be made for beefing up
PEI's capabilities to the point that DXE and BDS more or less become
one thing. But what does that mean for PEI? Do we change from the
1-D PPI database to the 2-D Handle database in order to express
device topologies better for example? That would be useful for stuff
like the PEI UFS drivers as they run into difficulties expressing
LUN IDs that the DXE version does not for example. Not sure if I
would actually formally propose that in the PI WG but interesting
stuff to think about regardless.
Andrew pointed out that PEI was originally intended for memory init, and DXE for the rest. One nice aspect of that is that there's a simple, architected, consolidated handoff of state between them: the HOB list. That makes it easier to do "unusual" things in SEC+PEI (special security, address map changes, etc.) but have it all hidden from DXE, where the majority of "standard" code lives. The trend of moving more and more into PEI is eroding that advantage.
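
For anyone who has not looked at the HOB list, a rough sketch of why it works well as a handoff: it is a flat run of self-describing records that the consumer can walk without knowing anything about the producer. The header below is loosely modeled on the generic HOB header (type, length, reserved) with an end-of-list type of 0xFFFF; the DEMO_* names and the function are hypothetical and this is not the edk2 HobLib API.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define DEMO_HOB_TYPE_END 0xFFFFu

/* Simplified record header: every record starts with its type and length. */
typedef struct {
    uint16_t Type;
    uint16_t Length;    /* Size of the whole record, header included. */
    uint32_t Reserved;
} DEMO_HOB_HEADER;

/*
 * Walk the list and return the first record of the requested type,
 * or NULL if the end marker is reached first or a record is malformed.
 */
const uint8_t *DemoFindHob(const uint8_t *List, uint16_t Type)
{
    for (;;) {
        DEMO_HOB_HEADER Hdr;

        /* Copy the header out so alignment of List does not matter. */
        memcpy(&Hdr, List, sizeof Hdr);

        if (Hdr.Type == Type) {
            return List;
        }
        if (Hdr.Type == DEMO_HOB_TYPE_END) {
            return NULL;
        }
        if (Hdr.Length < sizeof Hdr) {
            return NULL;    /* Would loop forever; treat as corrupt. */
        }
        List += Hdr.Length;
    }
}
```

The consumer only needs the list pointer and the record definitions, which is what lets SEC+PEI do "unusual" things without DXE having to know about them.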

Maybe a similar handoff of state is needed between pre-memory and post-memory PEI? Or between DXE and BDS?

One could argue that we already have that for BDS: the UEFI boot services.

But if PEI consumed the HW init portions of DXE,
a more advanced BDS (which also powers e.g. the HID stuff on the way)
would be the logical remainder stage to separately support UEFI booting as a
payload (for SlimBootloader, coreboot, ...).

All of that was logically speaking. Even if the development was driven to its
extreme, PEI and DXE cores would still share a bunch of code and it might
make sense to share a foundation technically.

Best regards,

Brian J. Johnson

On 4/15/22 03:15, Nate DeSimone wrote:
Hi Andrew,

On 4/14/22, 7:43 PM, "Andrew Fish" <afish@...> wrote:

On Apr 14, 2022, at 6:06 PM, Nate DeSimone
<nathaniel.l.desimone@...> wrote:

Hi Marvin,

-----Original Message-----
From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of
Marvin Häuser
Sent: Thursday, April 14, 2022 12:56 AM
To: discuss@edk2.groups.io; Desimone, Nathaniel L
Cc: Pedro Falcato <pedro.falcato@...>; edk2-devel-groups-io
<devel@edk2.groups.io>; adachristine18@...; Shi, Steven
Subject: Re: [edk2-devel] [edk2-discuss] GSoC Proposal

I feel like there are so many much easier solutions to this
problem that are at most limited by the clear specification. The
UEFI specification with regards to booting and all of that
obviously is of utmost importance.
If you have a better idea that retains compatibility with the
existing UEFI PI then I would be happy to hear it. Ultimately
anything we do needs to be a pure extension that retains
compatibility with old code. Given that restriction having the
ability to coalesce all the LibraryClasses into a single module and
have cross-module linking seems like the best way to handle it to me.

The UEFI PI specification parts that deal with internal
structure, as far as I know, are only in place to make it easy to
integrate Intel IP.
It's not just for Intel. The biggest reason for it is to increase the
standardization of the boot flow across the PC ecosystem. We have
learned from experience that firmware is super critical to get a
product out the door but it is also difficult to write. So we try
to make it as reusable as humanly possible.

In fact, I don’t *know*, but I’m pretty sure the very strict
separation between PEI and DXE was preserved mostly because MRC
was 32-bit-only for a long time. Glad that seems to have been
resolved, AMD does memory init by PSP nowadays.
Having less complex early stages chain load more complex later
stages is a common design pattern in firmware, not just UEFI. For
example, your typical ARM system loads kinda like this:

PBL (SoC ROM) --> SBL (RAM Init) --> ARM TrustZone --> Hypervisor
--> EDK II or U-Boot or LittleKernel (which runs Android fastboot)

Comparing relative complexity I believe the Intel UEFI PI design is
actually pretty simple when you consider how much it gets done:

Ucode ROM --> SEC + PEI --> DXE + SMM + BDS

My biggest criticism of the design is that the strict separation
between PEI and DXE doesn't exist between DXE, SMM, and BDS 😊

There are a few reasons why PEI was 32-bit for quite some time. The
biggest one is the code size increase, 64-bit x86 code is 20-30%
larger than 32-bit x86 code. Since the only RAM Pre-Memory code has
access to is the cache onboard the processor, and for security
reasons all that code has to fit within that RAM, we generally do
everything we can to make that image as small as possible. Second,
64-bit requires a page table and since we desired to keep PEI
simple we tried to avoid that. Finally, the PI spec didn't allow a
64-bit PEI until recently. MRC is 32-bit code just because that is
what PEI happens to be. Porting it to 64-bit is not terribly

Ultimately the mix of 32/64-bit does cause some difficulties. Any
data structures that get shared between PEI and DXE (HOBs, PCDs,
etc.) need to resolve to the same size and packing. LibraryClasses
need to be written to compile properly in both modes. In the case
of FSP API mode you need to resort to thunking between 32 and
64-bit modes during DXE. More or less we decided that the costs are
starting to outweigh the advantages.
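
A small sketch of the "same size and packing" problem, assuming a made-up handoff structure (DEMO_BAD_HANDOFF, DEMO_HANDOFF and DemoFillHandoff are hypothetical, not edk2 definitions). A structure containing a native pointer changes size between a 32-bit producer and a 64-bit consumer, so shared structures tend to use fixed-width fields, explicit padding and packing, with the layout checked at compile time:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hazardous: sizeof() and field offsets depend on the pointer width. */
typedef struct {
    void     *Buffer;
    uint32_t Size;
} DEMO_BAD_HANDOFF;

/* Safer: fixed-width fields and explicit padding, identical in both modes. */
#pragma pack(push, 1)
typedef struct {
    uint64_t BufferAddress;   /* Address stored as 64-bit in either mode. */
    uint32_t Size;
    uint32_t Reserved;        /* Explicit padding instead of implicit. */
} DEMO_HANDOFF;
#pragma pack(pop)

/* Fail the build in either mode if the layout ever drifts. */
static_assert(sizeof(DEMO_HANDOFF) == 16, "handoff size must not change");
static_assert(offsetof(DEMO_HANDOFF, Size) == 8, "unexpected field offset");

/* Producer side: widen the native pointer into the fixed-width field. */
void DemoFillHandoff(DEMO_HANDOFF *Out, const void *Buffer, uint32_t Size)
{
    Out->BufferAddress = (uint64_t)(uintptr_t)Buffer;
    Out->Size          = Size;
    Out->Reserved      = 0;
}
```

This covers only the data side; the thunking needed in FSP API mode is a separate cost.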
I’d also point out that x86 VMs use X64 (x86-64) PEI today and it
works so the 32-bit/64-bit mix has nothing to do with UEFI/PI/edk2.

In the PC ecosystem a single chipset family can power thousands of
unique designs. So the DRAM memory needs to be external, support
lots of different chipset packages (signal integrity...), support the
lowest cost through the highest cost DRAM and thousands of different
board layouts. So programming DRAM takes a master's degree in antenna
design. I’ve seen MRC (Memory Reference Code) with over a MiB of
DEBUG prints in it, and it literally is printing histograms of what
it is tuning. So all this code has to run before the system has any
DRAM, thus it is running using the cache as RAM. I’ve not looked at
the x86 architecture specs from the vendors in a while, but back in
the day they did not support page tables in ROM or pinned cache.
Now it might work, but if it breaks your CPU vendor blames you so
you don’t code PEI in X64….

We contributed the 1st edk2 ARM platform, Beagle Board, and it was a
long time ago but I seem to remember the mask ROM used a table in
NOR FLASH to init memory and then copied an image from NOR FLASH
into DRAM and jumped to it. So PEI is kind of not really needed and
we implemented a PrePEI and jumped directly to DXE.

Given I was around back in the day when all this stuff was designed
I can say SEC was always a place holder for security code, as
security code always has to run 1st. PEI (Pre EFI) was designed to
get DRAM programmed and then jump to DXE. It kind of also fell in
naturally to ACPI S3 flow since that was turning memory back on.
When we designed PEI we kind of thought of it more like a boot loader
stage for the firmware that turned on memory and all the work would
happen in DXE. Then reality strikes and the existing BIOS assembly
programmers start learning C (lots of cranky people) and they start
having to learn all about PEI to turn on memory. They had to write a
big chunk of code for the memory init in PEI. These folks had never
written any EFI code, so to them it was easier to move a lot of the
chipset init code into PEI as that is the world they had to figure
out to get memory turned on. I mean why learn EFI if you don’t have
to? So that is how we get so much code in IA32 (i386) on some
platforms. This start kind of biased future choices and how to
enable non edk2 code bases….
One of the big reasons a lot of code that should have been written in
DXE ended up in PEI is unfortunately due to the FSP and its inability
to support DXE code.


Andrew Fish

For many good reasons, Linux does not provide a stable kernel API.
This allows breaking changes to be deployed easily across the entire codebase.
Obviously, this is infeasible at a large scale when you need to
integrate IP blobs, but it would already help to break the
expectation that UEFI PI is a perfectly forwards- and
backwards-compatible system. DXE has SetMem and CopyMem as part of
gBS. Maybe I don’t like it that much as part of the spec itself,
but it’s a good idea technically. I’d probably opt to have a
quickly accessible implementation detail of important function
pointers appended to PEI and DXE services, used by in-tree
modules. This may break both forwards- and
backwards-compatibility, but as it only applies to
in-tree modules, that is fine so long as we let go of the
compatibility notions. PPIs and protocols are an option too of
course, but they have a lookup performance penalty. Compared to
dynamic linking, that should hopefully be negligible however.

Absolutely optional to read, I don’t intend to waste anyone’s time
much, some philosophical stuff about my rationale:

If you started UEFI from scratch right now, would it have strictly
separated PEI and DXE?
For sure a clean slate design started today would look a lot
different than PEI/DXE... which was a clean slate design circa 1998
😊. In my opinion, if we were to suddenly go back to the drawing
board we would build something that is much closer to a full OS now
than we did back then. There have been cases where not being able
to use interrupt handlers and not having thread scheduling has
prevented implementation of desired features. The ARM guys built
LittleKernel (https://github.com/littlekernel/lk) for a lot of
these reasons. In the data center world some have decided to go to
the extreme of putting an entire copy of Linux in SPI so they can
do a network boot that downloads the OS image using BitTorrent!

The duplication between PEI and DXE core, and by extension MM
core, would be my most obvious place to start reducing size. I
would probably opt for a PEI/DXE hybrid where it starts in
„minimal mode“ (maybe think of PEI more like a microkernel here)
and after memory is initialised, the rest of DXE is loaded. And MM
core would get no loading at all, this requirement has gladly been
dropped ages ago. Just one prelinked snapshot of the address space
with a relocation table and a safe self-relocator on entry (this
is needed at the very least for ARM).
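
As a sketch of what "a relocation table and a safe self-relocator" could look like, under heavy assumptions: the table here is just an array of byte offsets to 64-bit absolute addresses inside the snapshot, which is a hypothetical format invented for this example (not PE/COFF base relocations), and DemoRelocate is an invented name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Rebase a prelinked snapshot from LinkBase to LoadBase.
 * Relocs[] holds offsets of 64-bit absolute addresses within Image.
 * Returns 0 on success, -1 if any entry falls outside the image.
 */
int DemoRelocate(uint8_t *Image, size_t ImageSize,
                 const uint32_t *Relocs, size_t RelocCount,
                 uint64_t LinkBase, uint64_t LoadBase)
{
    uint64_t Delta = LoadBase - LinkBase;
    size_t   Index;

    /* Validate every entry first, so a bad table leaves Image untouched. */
    for (Index = 0; Index < RelocCount; Index++) {
        if (Relocs[Index] > ImageSize ||
            ImageSize - Relocs[Index] < sizeof(uint64_t)) {
            return -1;
        }
    }

    /* Apply the fixups; memcpy avoids unaligned 64-bit accesses. */
    for (Index = 0; Index < RelocCount; Index++) {
        uint64_t Value;

        memcpy(&Value, Image + Relocs[Index], sizeof Value);
        Value += Delta;
        memcpy(Image + Relocs[Index], &Value, sizeof Value);
    }
    return 0;
}
```

The "safe" part in this sketch is only the bounds check before any write; a real self-relocator would also have to avoid depending on anything that itself needs relocating.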

Ironically, with my idea for MM, dynamic loading would be free as
everything is prelinked anyway. The same is true for PEI XIP, it
is prelinked by nature.
Actually Post-Memory PEI can have non-prelinked PEIMs. And that
does get used for the PEI GOP driver.

What I do not like is the additional dynamic linking code at
load-time for non-XIP modules. Though, the more I think about it,
the more I wonder whether the entirety of UEFI could not be
composed of prelinked, relocatable memory snapshots without
traditional image loading entirely (for in-FW stuff). macOS has a
similar concept with its “Kernel Collections”.
Well, way too much off-topic now. :)
If you make the assumption that 100% of the code is compiled all at
once then yes that works. UEFI was designed so that assumption does
not need to be true. There are good use cases for it: OpROMs,
generic OS loaders, network boot, etc.


Why am I explaining all this despite the fact everyone knows this
will never happen? Because I don’t like the notion of fixing
issues of an already overcomplicated system by adding even more
complicated stuff.
when the existing overcomplicated stuff is already uncomfortably

Best regards,

For XIP PEI code… it will really help and would be very timely
since the transition of PEI from 32-bit to 64-bit is going to
increase the size of PEI by ~20%.


From: Pedro Falcato <pedro.falcato@...>
Sent: Wednesday, April 13, 2022 11:43 AM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Marvin Häuser
Cc: discuss@edk2.groups.io; adachristine18@...; Desimone,
Nathaniel L <nathaniel.l.desimone@...>; Shi, Steven
Subject: Re: [edk2-devel] [edk2-discuss] GSoC Proposal

Hi Marvin, Ada,

Some comments:

I don't think the purpose of the dynamic linker is to treat EFI
as a complete operating system, but to try to eliminate the
static linking that may be needlessly duplicating code that could
instead be put in a single dynamic library. For instance, MdePkg and
MdeModulePkg are linked into a *lot* of .efi files, instead of being
just a library. It'd be nice to see some numbers on this (something
like Google's bloaty could be run on every .efi file, in order to
understand how much file space we would actually save).

Other comments inline.

On Wed, Apr 13, 2022 at 4:15 PM Marvin Häuser
<mhaeuser@...> wrote:

On 13. Apr 2022, at 16:38, Ada Christine
i was replying via the groups.io web
interface, I'm guessing that messed up the thread? i haven't used
mailing lists before and don't know how they work. I'll use my
mail client from here on.

I'm on board with not treating EFI as an operating system. the
more i think about it the more it looks like scope creep.


I'm not quite as enthusiastic
about it as i was at first glance.

I'm still keen on doing my gsoc proposal for edk, though, and
even if this task and the acpica application are decided to be
out of scope unit testing,

How about fuzz-testing? This is also something edk2 needs quite
badly. At Acidanthera, we compile edk2 code in userspace outside the
edk2 build system and fuzz with dummy applications.

Note: fuzzing is also part of the LLVM instrumentation suite (see
https://llvm.org/docs/LibFuzzer.html ) and is something I could
happily mentor.
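
For reference, a minimal libFuzzer harness looks roughly like this. LLVMFuzzerTestOneInput is the real libFuzzer entry point; DemoParseRecord is a hypothetical stand-in for whatever edk2 code would be compiled in userspace and put under test.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical code under test: validates a record laid out as
 * { tag (1 byte), length (1 byte), payload[length] }.
 * Returns 0 and the payload length on success, -1 on malformed input.
 */
int DemoParseRecord(const uint8_t *Buf, size_t Size, size_t *PayloadLen)
{
    if (Size < 2) {
        return -1;
    }
    if ((size_t)Buf[1] > Size - 2) {
        return -1;    /* Declared length exceeds the data we were given. */
    }
    *PayloadLen = Buf[1];
    return 0;
}

/* libFuzzer calls this repeatedly with mutated inputs. */
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size)
{
    size_t Len = 0;

    if (DemoParseRecord(Data, Size, &Len) == 0) {
        /* Touch the accepted payload so ASan flags any over-read. */
        volatile uint8_t Sink = 0;
        size_t           Index;

        for (Index = 0; Index < Len; Index++) {
            Sink ^= Data[2 + Index];
        }
        (void)Sink;
    }
    return 0;
}
```

Built with something like `clang -g -fsanitize=fuzzer,address harness.c`, libFuzzer supplies main() itself, so the harness only has to hand the bytes to the code under test.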
clang integration

Pedro and Vitaly are looking for someone to finish ASan:
There are working UBSan concepts, but they also need to be

Is Vitaly going to be a mentor? I was assuming it was going to be
me and some other, more senior, mentor (possibly Steven Shi, whom I
included in the task).
Anyway, re: ASAN, if the project includes ASAN, UBSAN and
possibly some other sanitizer it's quite possible that it could
be considered a large project (which means more hours but a larger
stipend too). Fuzzing + coverage could be very nice additions to
this project idea.
Also, is stress-testing a decent idea?

and source-level debugging are all relevant to my interests.

how about your ideas for security stuff?

I want the entirety of MM to leverage SmmMemLib and to support
SmmMemLib would then handle UEFI->MMRAM and BaseMemoryLib
only work on MMRAM. Also evaluation of how to best avoid pointers
in MM communication buffers would be nice.

There also is a bunch of other stuff, like working out moving a
part of CpuDxe into DxeCore to have memory protection live immediately,
memory protection in PEI, a replacement for the TE format (it’s
buggy and most platforms mostly abandoned it over various issues),
and alternatives to guarding critical code with SMM (like allowing
NVRAM commits only as part of a reboot).

I personally find all of those projects very important, but I
cannot promise many people agree. Especially those that impose
global changes (most notably the TE replacement) may be very
tedious to submit.
Gladly, I believe you can submit multiple proposals (?)

Best regards,

I'm not very knowledgeable about
trusted platform or secure boot but I'm willing to learn whatever
is necessary to get something spun up for my proposal.

On Wed, Apr 13, 2022, 12:05 Marvin Häuser
<mhaeuser@...> wrote:

Do you use the “reply all” option in your mail client? Looks like
my CCs have been dropped again. Comments inline.

On 13. Apr 2022, at 12:54, Ada Christine
Hi, Marvin

Its similarity to my own latest experiment is the key to what
grabbed my attention. I have no particular use case in mind for
it, but I see its potential for anybody developing larger
applications in that when a library is changed there's no need to
distribute a new version of the whole binary, just the relevant
library module.

I really do not like the trend of treating UEFI as a full-fledged
OS - it is not. The most used UEFI applications, OS loaders, are
really not that huge and are distributed as part of the OS image.
Even for less used applications, you will always get a full snapshot anyhow.
Gladly we don’t have auto-update and package management yet. :)

I slept on it and it occurred to me that the whole thing could work
similarly to the shell protocol in that the linker/loader is itself an
application that does a LoadImage() on the application needing
linking facilities.

That would mean the linker itself is shipped with every application
that requires it? Otherwise it doesn’t make much sense for it to be an
app and the problems below apply.

If however the whole plan is making the linker as a DXE and
it with the firmware, that I'm not quite as sure about. That would
necessarily tie any applications using dynamic linking to TianoCore or
any firmware distribution that derives from it.

I think that was the idea referred to as “edk2 core” by Steven, but
I’d like to hear his proposal to be sure. Virtually everyone uses
edk2, so that itself is not the problem, but versioning is. Vendors
are slow to update their snapshots or have just given up doing that
entirely. Distributing it for external applications like OS loaders
would mean this can be leveraged probably no earlier than 10 years
from now. And for in-firmware things, I have a hard time thinking
about a use-case that outweighs the drawbacks.

To shift the topic slightly back to GSoC, however, I'm willing to
work on other items on the task list. Unit testing and an ACPICA
application are the alternative projects I had thought about. I need
to choose fairly soon as the proposal deadline is next Tuesday. I
a tiny bit about porting ACPICA as I also have plans to
incorporate it into my own project.

I have a few more ideas for security stuff, but Nate did not confirm
them as appropriate yet and I’m not here to drive you away from
specific task (or the others). However, I’m still curious and
concerned. :)

Best regards,

Pedro Falcato