Re: VariablePolicy: Final Changes Thread 1 - TPL Ordering
Bret Barkelew <bret.barkelew@...>
The problem with the other solution is that it doesn’t give the platform any flexibility to move which event the lock is attached to. For EDK2 default, we’re making it EndOfDxe, because that matches most platforms’ threat models. For Mu, we use ReadyToBoot.
The current “hack” makes that more difficult without multiple patches that we have to carry to remove the “fixed” ordering of the events.
That said, I do think I’ll clean that up and submit it, unless Sean reads this and stops me. 😉
From: Michael D Kinney via groups.io
Sent: Tuesday, October 13, 2020 8:05 AM
To: Bret Barkelew; Andrew Fish; edk2-devel-groups-io; Kinney, Michael D
Subject: [EXTERNAL] Re: [edk2-devel] VariablePolicy: Final Changes Thread 1 - TPL Ordering
If the other solution more directly expresses the dependency and guarantees ordering, then it may not be considered a hack.
I prefer solutions that minimize the chances of failures and the need for platform developers to do extra work to integrate a feature or do extra work when they add an unrelated feature.
I see Laszlo’s proposal to add a new tag GUID protocol to detect and enforce ordering for this specific use case.
The network stack added a DPC (Deferred Procedure Call) Protocol to help with ordering issues and remove the introduction of extra TPL levels in an earlier version of the network stack. I am wondering of there is a more generic solution to this specific problem through the use of DPC.
From: Bret Barkelew <Bret.Barkelew@...>
Like I said, I’m also happy to go with the lesser solution of replacing the hack that was already in the code. The last person didn’t care to solve this problem, and I’m good to not solve it, too. I mean, I think it’s turtles all the way down no matter what.
It was actually the ASSERT in the code that highlighted this problem to being with, so I would say that it’s doing its job. It’s incumbent upon the code author to determine what resource they’re trying to access and whether they’ve accessed it successfully, and I agree that it seems like an appropriate use of ASSERTs so long as it’s backed up with some OTHER appropriate action (even if that action is ignoring it).
From: Kinney, Michael D
How to platform creators know for the complete set of drivers if there is anything then need to worry about and why and what they need to address the concern? This is about order that events are signaled for a given event trigger. When a platform adds more driver that may use the same event triggers, how do they know if there is a potential for a race condition or not? If event notification functions are design to be independent of signaling order, then there is no issue. As soon as there are requirements for event notification functions to be executed in a specific order at a specific event trigger, we have to make sure the platform creator knows and preferably, the FW can tell them if they got it wrong.
Can your data/device manipulators and data/device protectors use case generate an ASSERT() if they are signaled in the wrong order?
Agreed with your concern, Mike. This mechanism (and we can document it as such) should NOT be used to accomplish an explicit ordering (a la the “apriori list”). It’s just to provide a little separation for two patterns that we’ve seen time and again in our code: data/device manipulators and data/device protectors. It does not eliminate the necessity for platform creators to understand things like driver ordering if they have one driver that requires a protocol be installed or a bus connected.
From: Kinney, Michael D
I agree DXE drivers could use a PCD to make it configurable and prevent collisions with UEFI defined TPL levels.
Bret’s suggestion of adding a DXE scoped services to create events using non-UEFI defined TPL levels could be used with these TPL levels from PCDs. Would also allow DXE drivers to use TPL levels associated with the firmware interrupts in the range 17..30. Perhaps extensions to the DXE Services Table?
Still does not address my concern that many DXE drivers using these extra TPL levels may run into race conditions if more than one DXE driver selects the same TPL level. Platform integrators will need to understand the relative priorities of all DXE drivers that use extra TPL levels so they can assign values that both avoid collisions with future UEFI specs and prevent race conditions among DXE drivers.
When I’ve done things like this in the past I think of making them configurable like via a PCD.
In terms of the #defines I think it makes more sense to just do math on the spec defined values. So TPL_CALLBACK + 1 or TPL_CALLBACK - 1 etc. I’ve got an lldb type formatter for TPL and it prints out <UEFI Spec TPL> [+ <extra>] as I think this is the clearest way to do it.