GSoC Proposal


Ada Christine <adachristine18@...>
 

Hello, edk2 developers!

I've registered as a contributor candidate for GSoC 2022 and am interested in working on one of the items from the Tasks list here https://github.com/tianocore/tianocore.github.io/wiki/Tasks. Specifically, adding dynamic linking support caught my attention as this is something I've been investigating and learning more about in one of my own personal projects. As a little background, my personal project is an experiment in OS development and I use a very small subset of the boot services to get started and loaded. It can be found here: https://github.com/adachristine/sophia. Recently I've started investigating (and begun to implement) using ELF's dynamic facilities to dynamically load kernel modules. I know PE is slightly different to ELF, but the principles seem similar enough.

I've had a few glances at the EDKII source code in the past and have a general idea of how it all fits together. What I have in mind to implement this would be the following:
- create a dynamic linker as a module package to be compiled into the main application
(alternatively, implement dynamic linking as a runtime service driver?)
- adjust the build system to enable building as DLLs and dynamic linking of module packages to the main application
(module packages could be per-application and optionally site packages in a subdirectory of the ESP?)
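
The core step of the plan above, binding an application's imports to the symbols a separately built module exports, can be sketched in plain C. This is only an illustration under stated assumptions: `EXPORT_ENTRY`, `IMPORT_ENTRY`, `FindExport` and `ResolveImports` are hypothetical names, not EDK II or PE/COFF definitions, and a real linker would read these tables from the image's import and export directories instead of from C arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One symbol a library module makes available (hypothetical layout). */
typedef struct {
  const char *Name;
  void       *Address;
} EXPORT_ENTRY;

/* One symbol the application needs; the linker writes the address to *Slot. */
typedef struct {
  const char  *Name;
  void       **Slot;
} IMPORT_ENTRY;

/* Linear search of a module's export table; returns NULL if Name is absent. */
void *FindExport(const EXPORT_ENTRY *Exports, size_t Count, const char *Name)
{
  for (size_t i = 0; i < Count; i++) {
    if (strcmp(Exports[i].Name, Name) == 0) {
      return Exports[i].Address;
    }
  }
  return NULL;
}

/* Patch every import slot that can be resolved.
   Returns the number of imports left unresolved. */
size_t ResolveImports(const IMPORT_ENTRY *Imports, size_t ImportCount,
                      const EXPORT_ENTRY *Exports, size_t ExportCount)
{
  size_t Unresolved = 0;

  assert(Imports != NULL || ImportCount == 0);
  assert(Exports != NULL || ExportCount == 0);

  for (size_t i = 0; i < ImportCount; i++) {
    void *Address = FindExport(Exports, ExportCount, Imports[i].Name);
    if (Address == NULL) {
      Unresolved++;          /* leave the slot untouched */
      continue;
    }
    *Imports[i].Slot = Address;
  }
  return Unresolved;
}
```

A loader would refuse to start the application when the return value is non-zero, since a call through an unpatched slot would jump to an invalid address.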

I know the details of how this would all fit together are a little more involved, but this is just the rough first idea that came to my mind. Happy to hear feedback, and if my idea seems feasible I can get to work on a more in-depth plan to put this together.

Thanks!

- Ada


Marvin Häuser <mhaeuser@...>
 

CC Nate (GSoC admin)
CC Steven (task mentor)
CC edk2-devel (you picked the logically correct list, but it’s pretty dead and barely anyone reads it)

Hey Ada,

Out of mere curiosity, why did you pick this item? :)

Hey Steven,

I feel like there is more to your proposal than is given on the task page. Why is it “ELF first”, is it something useful for UefiPayloadPkg or Linux somehow?

As for supporting it in the EDK II core, I personally feel like this is much too late. The entire ecosystem is centred around protocols (and the services tables) already. “Loading only when necessary” doesn’t sound very important to me personally, as the firmware image is already supposed to be fairly minimal. I’d rather like to see the introduction of “lazy protocols” (which do not require any new fundamental concepts), e.g., for network and HID stuff like mice and touch, which go through the driver connection procedure only when a protocol function is called for the first time. A big issue with this of course are non-function pointers in the protocol structure.
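
A minimal sketch of the "lazy protocol" idea, assuming a protocol that exposes only function pointers. All names here (`LAZY_POINTER_PROTOCOL`, `LazyGetState`, `RealGetState`, `InitLazyProtocol`) are hypothetical and the "driver connection" is reduced to a counter; this is not how any existing EDK II driver is written.

```c
#include <assert.h>
#include <stddef.h>

typedef struct LAZY_POINTER_PROTOCOL LAZY_POINTER_PROTOCOL;
typedef int (*LAZY_GET_STATE)(LAZY_POINTER_PROTOCOL *This, int *State);

struct LAZY_POINTER_PROTOCOL {
  LAZY_GET_STATE GetState;      /* function pointer: can be redirected */
  unsigned       ConnectCount;  /* bookkeeping for this sketch only */
};

/* The real implementation, usable only once the driver is connected. */
int RealGetState(LAZY_POINTER_PROTOCOL *This, int *State)
{
  (void)This;
  *State = 42;                  /* placeholder device state */
  return 0;
}

/* Stub installed at boot: connects on first use, then gets out of the way. */
int LazyGetState(LAZY_POINTER_PROTOCOL *This, int *State)
{
  assert(This != NULL && State != NULL);
  This->ConnectCount++;             /* stand-in for the driver connection */
  This->GetState = RealGetState;    /* later calls bypass the stub */
  return This->GetState(This, State);
}

void InitLazyProtocol(LAZY_POINTER_PROTOCOL *Protocol)
{
  Protocol->GetState     = LazyGetState;
  Protocol->ConnectCount = 0;
}
```

The stub only runs because the caller goes through a function pointer. A data member that callers read directly gives the stub no chance to run, which is the problem with non-function pointers noted above.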

This will not only require a dynamic linker in the firmware to maybe double the size of the already disgusting and vastly unmaintained PE loader, it will also require further format conversion from ELF and Mach-O, both of which already are buggy (the former much more so than the latter). This is a tremendous effort in my opinion and introducing partial support will cause more awkward toolchain limitations.

Can you please outline why this (in my opinion, big) tradeoff is worth it? Just curious. :)

Best regards,
Marvin


Ada Christine <adachristine18@...>
 

Hi, Marvin

Its similarity to my own latest experiment is the key to what grabbed my attention. I have no particular use case in mind for it, but I see its potential for anybody developing larger applications in that when a library is changed there's no need to distribute a new version of the whole binary, just the relevant library module.

I slept on it and it occurred to me that the whole thing could operate similarly to the shell protocol in that the linker/loader is itself an application that does a LoadImage() on the application needing dynamic linking facilities. If however the whole plan is making the linker as a DXE and including it with the firmware, that I'm not quite as sure about. That would necessarily tie any applications using dynamic linking to TianoCore or any firmware distribution that derives from it.

To shift the topic slightly back to GSoC, however, I'm willing to work on other items on the task list. Unit testing and an ACPICA application are the alternative projects I had thought about. I need to choose fairly soon as the proposal deadline is next Tuesday. I know a tiny bit about porting ACPICA as I also have plans to incorporate it into my own project.


Marvin Häuser <mhaeuser@...>
 

Do you use the “reply all” option in your mail client? Looks like my CCs have been dropped again. Comments inline.

On 13. Apr 2022, at 12:54, Ada Christine <adachristine18@...> wrote:
> Hi, Marvin
>
> Its similarity to my own latest experiment is the key to what grabbed my attention. I have no particular use case in mind for it, but I see its potential for anybody developing larger applications in that when a library is changed there's no need to distribute a new version of the whole binary, just the relevant library module.

I really do not like the trend of treating UEFI as a full-fledged OS - it is not. The most used UEFI applications, OS loaders, are really not that huge and are distributed as part of the OS image anyway. Even for less used applications, you will always get a full snapshot anyhow. Gladly we don’t have auto-update and package management yet. :)

> I slept on it and it occurred to me that the whole thing could operate similarly to the shell protocol in that the linker/loader is itself an application that does a LoadImage() on the application needing dynamic linking facilities.

That would mean the linker itself is shipped with every application that requires it? Otherwise it doesn’t make much sense for it to be an app and below’s problems apply.

> If however the whole plan is making the linker as a DXE and including it with the firmware, that I'm not quite as sure about. That would necessarily tie any applications using dynamic linking to TianoCore or any firmware distribution that derives from it.

I think that was the idea referred to as “edk2 core” by Steven, but I’d like to hear his proposal to be sure. Virtually everyone uses edk2, so that itself is not the problem, but versioning is. Vendors are slow to update their snapshots or have just given up doing that entirely. Distributing it for external applications like OS loaders would mean this can be leveraged probably no earlier than 10 years from now. And for in-firmware things, I have a hard time thinking about a use-case that outweighs the drawbacks.

> To shift the topic slightly back to GSoC, however, I'm willing to work on other items on the task list. Unit testing and an ACPICA application are the alternative projects I had thought about. I need to choose fairly soon as the proposal deadline is next Tuesday. I know a tiny bit about porting ACPICA as I also have plans to incorporate it into my own project.

I have a few more ideas for security stuff, but Nate did not confirm them as appropriate yet and I’m not here to drive you away from this specific task (or the others). However, I’m still curious and concerned. :)

Best regards,
Marvin


Ada Christine <adachristine18@...>
 

I was replying via the groups.io web interface; I'm guessing that messed up
the thread? I haven't used mailing lists before and don't know how they
work. I'll use my mail client from here on.

I'm on board with not treating EFI as an operating system. The more I think
about it the more it looks like scope creep. I'm not quite as enthusiastic
about it as I was at first glance.

I'm still keen on doing my GSoC proposal for edk2, though, and even if this
task and the ACPICA application are decided to be out of scope, unit
testing, clang integration and source-level debugging are all relevant to
my interests.

How about your ideas for security stuff? I'm not very knowledgeable about
trusted platform or secure boot but I'm willing to learn whatever is
necessary to get something spun up for my proposal.


Marvin Häuser <mhaeuser@...>
 

On 13. Apr 2022, at 16:38, Ada Christine <adachristine18@...> wrote:
> i was replying via the groups.io web interface, I'm guessing that messed up
> the thread? i haven't used mailing lists before and don't know how they
> work. I'll use my mail client from here on.
>
> I'm on board with not treating EFI as an operating system. the more i think
> about it the more it looks like scope creep.

Agreed.

> I'm not quite as enthusiastic
> about it as i was at first glance.
>
> I'm still keen on doing my gsoc proposal for edk, though, and even if this
> task and the acpica application are decided to be out of scope unit
> testing,

How about fuzz-testing? This is also something edk2 needs quite badly. At Acidanthera, we compile edk2 code in userspace outside the edk2 build system and fuzz with dummy applications.

> clang integration

Pedro and Vitaly are looking for someone to finish ASan: https://edk2.groups.io/g/devel/topic/90010978#87991
There are working UBSan concepts, but they also need to be mainlined.

> and source-level debugging are all relevant to
> my interests.
>
> how about your ideas for security stuff?

I want the entirety of MM to leverage SmmMemLib and to support SMAP. SmmMemLib would then handle UEFI->MMRAM and BaseMemoryLib would only work on MMRAM. Also evaluation of how to best avoid pointers in MM communication buffers would be nice.

There also is a bunch of other stuff, like working out moving a part of CpuDxe into DxeCore to have memory protection live immediately, memory protection in PEI, a replacement for the TE format (it’s buggy and most platforms mostly abandoned it over various issues), and alternatives to guarding critical code with SMM (like allowing NVRAM commits only as part of a reboot).

I personally find all of those projects very important, but I cannot promise many people agree. Especially those that impose global changes (most notably the TE replacement) may be very tedious to submit. Gladly, I believe you can submit multiple proposals (?)

Best regards,
Marvin


Ada Christine <adachristine18@...>
 

I can submit up to three proposals. I'll give all of this some thought and
a little research and start putting at least two proposals together
starting Friday. :)


Nate DeSimone
 

Hi Ada,

Great to meet you and welcome to the TianoCore project! Great to hear you are interested! Despite Marvin's misgivings, I think dynamic linking would be an excellent addition to EDK II! Marvin is right that we would not want to use it in UEFI spec compliant applications or OpROMs, at least for now, but even if you make the assumption that this feature would only be used with EDK II it is still valuable.

The primary value I see is reducing the size of BIOS images. Every DXE driver and PEIM includes a lot of statically linked libraries from MdePkg and/or MdeModulePkg depending on the exact driver and build configuration, and it adds up very quickly. On debug builds, the infrastructure for debug messages becomes particularly large. Every PEIM/DXE driver needs to have a copy of the parser for printf() style format strings. The overhead works out to ~12KB for every PEIM and DXE driver. Most BIOS images have ~250 DXE drivers and ~30 PEIMs these days. Uncompressed that works out to >3MB of overhead.

The DXE drivers are typically compressed, but LZMA isn't perfect at deduplication, and as optimizing compilers have gotten better the deduplication has gotten worse, which is causing this to become more important over time. I did some experiments with the Intel reference BIOS and I found that we can save roughly 1.5-2MB of space with dynamic linking, which is >10% of the flash budget. For PEI, where there is no compression, the flash budget savings is >20%, and it has the extra benefit of reducing usage of precious NEM memory.

Another thing to consider is that linking OpenSSL for doing code signing checks adds ~65KB to the size of the PEIM doing the signature check. When you consider that moving PEI from 32-bit to 64-bit is going to increase code size by ~20%, this feature is extremely valuable for many reasons.
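
The uncompressed overhead estimate above can be checked with a few lines of C. This is a back-of-the-envelope sketch: the helper name `EstimateOverheadKb` is made up for illustration, and the inputs are the approximate figures quoted in the message, not measurements.

```c
#include <assert.h>
#include <limits.h>

/* Total duplicated library overhead in KB, assuming every PEIM and DXE
   driver carries the same statically linked copy (a simplification). */
unsigned EstimateOverheadKb(unsigned PerModuleKb, unsigned DxeCount,
                            unsigned PeimCount)
{
  unsigned Modules;

  /* Guard the additions and multiplication against wrap-around. */
  assert(DxeCount <= UINT_MAX - PeimCount);
  Modules = DxeCount + PeimCount;
  assert(Modules == 0 || PerModuleKb <= UINT_MAX / Modules);

  return PerModuleKb * Modules;
}
```

With the figures from the message, `EstimateOverheadKb(12, 250, 30)` gives 3360 KB, roughly 3.3 MB, which is consistent with the ">3MB" estimate.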

More forward looking, if the project to add Rust is successful then suddenly the rich set of libraries available in Rust crates becomes available to EDK II. Those are going to be big and the only way we will be able to use them is with some deduplication facility like dynamic linking. That will require having some way of using type safe Rust objects across driver boundaries... which is completely out of scope for this GSoC project, but at the same time this GSoC project would be a necessary prerequisite before we could even start thinking about that 😊.

I agree with Marvin that PE/COFF should be the preferred starting point. I believe ELF to be a security risk. It looks easy to parse on the surface but it has been well documented that the devil is in the details and as the CVE history shows it is very easy to build an ELF parser with unintended overflows and other security vulnerabilities. Perhaps most worrying to me is that I don't believe the security community has done enough research on ELF yet, whereas PE/COFF has been very heavily researched and has a mature literature.
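
To make the overflow concern concrete, here is a sketch of one common pitfall that applies to any image parser, ELF or PE/COFF: validating a header-described table against the file size. The function names are hypothetical. The fields being checked correspond to untrusted values such as ELF's `e_shoff`, `e_shnum` and `e_shentsize`. The naive test `Offset + Count * EntrySize <= FileSize` can wrap around and accept a malicious header; dividing avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if Count entries of EntrySize bytes starting at Offset lie entirely
   inside a file of FileSize bytes. No intermediate value can overflow. */
bool TableInBounds(uint64_t FileSize, uint64_t Offset, uint64_t Count,
                   uint64_t EntrySize)
{
  if (EntrySize == 0) {
    return false;               /* malformed header; also avoids div by 0 */
  }
  if (Offset > FileSize) {
    return false;               /* table starts past the end of the file */
  }
  /* FileSize - Offset cannot underflow after the check above. */
  return Count <= (FileSize - Offset) / EntrySize;
}

/* End offset of a table that has already been validated. */
uint64_t TableEnd(uint64_t FileSize, uint64_t Offset, uint64_t Count,
                  uint64_t EntrySize)
{
  assert(TableInBounds(FileSize, Offset, Count, EntrySize));
  return Offset + Count * EntrySize;   /* safe: bounded by FileSize */
}
```

For a 1024-byte file, a header claiming `UINT64_MAX` entries of 64 bytes at offset 64 is rejected here, whereas the naive sum wraps to 0 and would pass.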

Let me know if any of this piques your interest, I would be happy to answer any questions you have!

Hope this helps and welcome to the project!

With Best Regards,
Nate

-----Original Message-----
From: discuss@edk2.groups.io <discuss@edk2.groups.io> On Behalf Of Ada
Christine
Sent: Tuesday, April 12, 2022 11:09 AM
To: discuss@edk2.groups.io
Subject: [edk2-discuss] GSoC Proposal


Nate DeSimone
 

Hi Marvin,

-----Original Message-----
From: discuss@edk2.groups.io <discuss@edk2.groups.io> On Behalf Of
Marvin Häuser
Sent: Wednesday, April 13, 2022 5:06 AM
To: Ada Christine <adachristine18@...>
Cc: discuss@edk2.groups.io; Desimone, Nathaniel L
<nathaniel.l.desimone@...>; Shi, Steven <steven.shi@...>;
devel@edk2.groups.io
Subject: Re: [edk2-discuss] GSoC Proposal

> I have a few more ideas for security stuff, but Nate did not confirm them as
> appropriate yet and I’m not here to drive you away from this specific task (or
> the others). However, I’m still curious and concerned. :)

I think I must have missed that message, could you forward it to me again? 😊

Thanks,
Nate


Rebecca Cran
 

On 4/13/22 00:54, Marvin Häuser wrote:
> CC edk2-devel (you picked the logically correct list, but it’s pretty dead and barely anyone reads it)

Just a minor correction - it's edk2-discuss (discuss@edk2.groups.io) that was cc'd but that's pretty dead.
edk2-devel (devel@edk2.groups.io) is much more active!

--
Rebecca Cran