Date   

Re: MemoryFence()

Ni, Ray
 

So if I understand correctly
1) volatile basically tells C to always read the memory. So it impacts the C memory model.
2) CompilerFence() tells the C serialize and complete everything before the barrier, and read memory again the 1st time after the barrier.
3) MemoryFence() is really dealing with speculative fetches and maybe coherency. Basically the CPU reordering things.
I agree with this summary.
Volatile using in MpInitLib or PiSmmCpu driver cannot be emitted because different CPU threads change the same memory content to synchronize with each other.
I don’t quite understand if removing “volatile” what mechanism can be used to force the compiler generating code that always reads from memory?


Thanks,
Ray


Re: MemoryFence()

Laszlo Ersek
 

On 02/04/21 22:31, Andrew Fish wrote:


On Feb 4, 2021, at 10:09 AM, Laszlo Ersek <lersek@...> wrote:

Hi All,

here's the declaration of MemoryFence(), from <BaseLib.h>:

/**
Used to serialize load and store operations.

All loads and stores that proceed calls to this function are guaranteed to be
globally visible when this function returns.

**/
VOID
EFIAPI
MemoryFence (
VOID
);
The IA32 and X64 implementations of MemoryFence() are not memory
barriers, only compiler barriers, however.

This is a problem for two reasons:

- code that believes to add a memory barrier only adds a compiler
barrier, possibly creating an unsafe situation,

- code that knowingly calls MemoryFence() for the purposes of a compiler
barrier -- while it will hopefully work safely, relying on tacit
knowledge / assumptions about the x86 architecture -- results in a
source reading experience that mismatches known lockless patterns.


This topic has repeatedly come up in the past; most recently in the
review of Ankur's v6 series for
<https://bugzilla.tianocore.org/show_bug.cgi?id=3132>, where some
inter-processor communication needs to happen.

I asked Paolo for help with understanding some of the synchronziation
patterns in PiSmmCpuDxeSmm, the reason being that Ankur's series is
going to rely on, and/or imitate, some of those patterns. From this
discussion with Paolo, the following have emerged (I'm going to
liberally steal / paraphrase Paolo's words):


(1) We should introduce finer-grained fence primitives:

ARM AARCH64 i386

CompilerFence() asm("") asm("") asm("")
AcquireMemoryFence() dmb ish dmb ishld asm("")
ReleaseMemoryFence() dmb ish dmb ish asm("")
MemoryFence() dmb ish dmb ish mfence

"where AcquireMemoryFence() is used on the read side (i.e. between
reads) and ReleaseMemoryFence() is used on the write side (i.e. between
writes)".

(Note that this step would make MemoryFence() heavier-weight on x86 than
it currently is.)


(2) We should audit and rework existent uses (combinations) of
MemoryFence() into matched acquire / release pairs.

Less importantly, this would restore the x86 behavior (performance) to
the one seen with the current MemoryFence() implementation.

More importantly, it would fix what the code *means*: "it's very
important to stick to known lockless patterns, and having matching
acquire/release fences makes it much more intuitive to understand what's
going on".

--*--

I'm not proposing this as a pre-requisite to merging Ankur's series
(i.e., I don't expect the series to use acquire/release -- it can be
converted later with the rest of the audit).

However, I'd really like us to stop wasting more time on MemoryFence()
doubts such as:

- Hey it's not a memory fence at all, is it safe?

- But on x86 it kinda is, *dependent on* the spinlocks / lock prefixes
we use for synchronization, between the other accesses.

- Is it interchangeable with volatile? Yes? No? which one should we use?

In the longer term, multi-processing code like MpInitLib (CpuDxe,
CpuMpPei) and PiSmmCpuDxeSmm should likely abandon all use of
"volatile", and use the explicit barriers instead, for *clarity*.


Step (2) would take quite a bit of thinking (if not much code).

Comments?
Laszlo,

Thanks for looking into this. I noticed most of the MemoryFence() usage[1] is in the virtual platforms so changes seem manageable. I guess 3rd party code that defines its own low level primitive could be impacted, but they chose to reinvent the wheel.

It looks like the generic platform performance risk is BaseIoLibIntrinsic. I know from debugging crashes a long time ago the the MemoryFence() for x86 is really a CompilerFence() in the new scheme. If the new AcquireMemoryFence()/ReleaseMemoryFence()/CompilerFence() replace MemoryFence() this likely just works. If not then we might need an architecture specific pattern to maintain current performance, we could always make a function internal to the lib that does the right thing for the given architecture. Also making it more precise might actually help ARM/AARCH64 performance?

So if I understand correctly
1) volatile basically tells C to always read the memory. So it impacts the C memory model.
2) CompilerFence() tells the C serialize and complete everything before the barrier, and read memory again the 1st time after the barrier.
3) MemoryFence() is really dealing with speculative fetches and maybe coherency. Basically the CPU reordering things.

So for example in BaseIoLibIntrinsic() the volatile is used to force the memory read and the CompilerFence() is used to enforce the order of that read relative to other operations. Thus wrapping the volatile access in CompilerFence() is in some ways redundant as long as there is only a single access inside the CompilerFence() wrapper.
Right, I agree with everything you said thus far. (I don't claim to be
an expert on this, of course.) CompilerFence() is a "better volatile"
(for RAM only, not for MMIO), where we can allow the compiler to
optimize, up to a certain location.

I like the idea of properly annotating the code. Given x86 defaults to so much cache coherency for compatibility with older simpler times it can tend to hide issues. I’m not sure removing the volatile keyword is required, and it does imply this variable has special rules. I guess we could add a REQUIRE_FENCE or some such? But it it feels like defining the variables in some way to imply the require special handling has some value?
Objects that are shared between processors, or between processor and
some device, should always be highlighted in some way (naming, comments,
...).


The only reason you would need a volatile is if you had more than a single read inside of the fences. Something like:

UINT8 *Address;
UINT8. Value

MemoryFence ();
do {
Value = *Address
} while (Value == 0xFF) ;
MemoryFence ();

The compiler could optimize the above code (current MemoryFence() model) to a single read and infinite loop if 0xFF. Yes I realize you could move the MemoryFence () into the loop, but what if you did not care about that? Mostly just trying to think this through we some examples….
I think it depends on what Address refers to. If it refers to an MMIO
register, then volatile is needed, IMO (and the MemoryFence()s are
unnecessary). If Address points to a byte in RAM that's manipulated by
multiple CPUs, then I think:

- volatile is not needed

- the two MemoryFence() calls should be removed

- an AcquireMemoryFence() call should be inserted into the loop, after
the read access. Because, presumably, once the flag assumes the
appropriate value (it has been "released")), we'll proceed to reading
some memory resource that was protected by the flag until then.


Regarding the many MemoryFence() calls in the virt platforms, most of
them do not exist for inter-CPU communication, but for communication
with devices through RAM. I admit I'm unsure what we should do with them.

Thanks
Laszlo


[1] /Volumes/Case/edk2-github(master)>git grep MemoryFence | grep ';'
ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c:320: MemoryFence ();
ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c:334: MemoryFence ();
ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c:344: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:1493: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:1497: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4702: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4707: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4712: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4765: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4770: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4775: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4828: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4833: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4838: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4892: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4894: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:86: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:88: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:115: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:117: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:147: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:149: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:179: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:181: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:211: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:213: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:243: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:245: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:275: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:277: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:305: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:307: MemoryFence ();
MdePkg/Library/BaseLib/AArch64/MemoryFence.asm:3:; MemoryFence() for AArch64
MdePkg/Library/BaseLib/AArch64/MemoryFence.asm:23:;MemoryFence (
MdePkg/Library/BaseLib/Arm/MemoryFence.asm:3:; MemoryFence() for AArch64
MdePkg/Library/BaseLib/Arm/MemoryFence.asm:24:;MemoryFence (
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c:412: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c:425: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c:438: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgPei.c:173: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgPei.c:186: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgPei.c:199: MemoryFence ();
OvmfPkg/Library/VirtioLib/VirtioLib.c:305: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:312: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:327: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:334: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:337: MemoryFence();
OvmfPkg/Library/VmgExitLib/VmgExitLib.c:122: MemoryFence ();
OvmfPkg/Library/VmgExitLib/VmgExitLib.c:124: MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:117: MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:122: MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:171: MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:177: MemoryFence ();
OvmfPkg/PvScsiDxe/PvScsi.c:595: MemoryFence ();
OvmfPkg/PvScsiDxe/PvScsi.c:625: MemoryFence ();
OvmfPkg/VirtioNetDxe/Events.c:56: MemoryFence ();
OvmfPkg/VirtioNetDxe/Events.c:58: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpGetStatus.c:99: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpGetStatus.c:102: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:257: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:384: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:426: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:437: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:101: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:103: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:173: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:176: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpTransmit.c:169: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpTransmit.c:172: MemoryFence ();
OvmfPkg/XenBusDxe/GrantTable.c:86: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:496: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:503: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:569: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:580: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:587: MemoryFence ();
OvmfPkg/XenPvBlkDxe/BlockFront.c:473: MemoryFence ();
OvmfPkg/XenPvBlkDxe/BlockFront.c:528: MemoryFence ();
OvmfPkg/XenPvBlkDxe/BlockFront.c:578: MemoryFence ();
UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c:204: MemoryFence ();


Thanks,

Andrew Fish

Would there be interest in reviewing such work?

Thanks
Laszlo


Re: [RFC] Request for new git repository for EdkRepo

Michael D Kinney
 

Hi Nate,

I have created the following Tianocore repositories:

* edk2-edkrepo
* edk2-edkrepo-manifest

I have added you and Ashley to the EDK II Tools Maintainers team with write access to both of these new repositories.

The EDK II Maintainers team has also been grated write access to the edk2-edkrepo-manifest repo.

Best regards,

Mike

-----Original Message-----
From: rfc@edk2.groups.io <rfc@edk2.groups.io> On Behalf Of Nate DeSimone
Sent: Friday, November 13, 2020 3:18 PM
To: Bjorge, Erik C <erik.c.bjorge@...>; devel@edk2.groups.io; rfc@edk2.groups.io
Cc: Desimone, Ashley E <ashley.e.desimone@...>
Subject: Re: [edk2-rfc] [RFC] Request for new git repository for EdkRepo

Hi Everyone,

This RFC has been sitting for a while and it seems like we have reached a conclusion. I would like to make a request to
the TianoCore stewards to create 2 new git repositories in the project:

1. edkrepo
2. edkrepo-manifest

Based on the current Maintainers.txt content, the following individuals should initially be given write access to these
repositories:

Nate DeSimone <nathaniel.l.desimone@...>
Ashley DeSimone <ashley.e.desimone@...>

I have taken a poll between myself and the other maintainer, and given that the project is currently in transition to
using GitHub pull requests we both prefer to start out using the pull request method of code review for these new
repositories. If there is any setup needed to enable pull request code review please do so.

Let me know if the stewards need any more information to complete this request.

Thanks!
Nate

-----Original Message-----
From: Desimone, Nathaniel L
Sent: Wednesday, September 30, 2020 4:21 PM
To: Bjorge, Erik C <erik.c.bjorge@...>; devel@edk2.groups.io; rfc@edk2.groups.io
Cc: Desimone, Ashley E <ashley.e.desimone@...>
Subject: RE: [RFC] Request for new git repository for EdkRepo

Hi Erik,

A separate manifest repository would probably be a good idea. The manifest repo is used for a lot of EdkRepo's operations
so having a smaller repo dedicated repo could be beneficial from a performance standpoint.

Thanks,
Nate

-----Original Message-----
From: Bjorge, Erik C <erik.c.bjorge@...>
Sent: Wednesday, September 30, 2020 3:29 PM
To: Desimone, Nathaniel L <nathaniel.l.desimone@...>; devel@edk2.groups.io; rfc@edk2.groups.io
Cc: Desimone, Ashley E <ashley.e.desimone@...>
Subject: RE: [RFC] Request for new git repository for EdkRepo

I am fine with a new repo. This also supports a good workflow to get a tool that then lets you pull full platforms. In
theory you would only ever really need to clone a single repo manually (assuming reasonable manifest support).

Are you also looking at creating a separate manifest repo as well or just creating a manifest branch in the new EdkRepo
repository?

Thanks,
-Erik

-----Original Message-----
From: Desimone, Nathaniel L <nathaniel.l.desimone@...>
Sent: Wednesday, September 30, 2020 2:56 PM
To: devel@edk2.groups.io; rfc@edk2.groups.io
Cc: Desimone, Ashley E <ashley.e.desimone@...>; Bjorge, Erik C <erik.c.bjorge@...>
Subject: [RFC] Request for new git repository for EdkRepo

Hi Everyone,

Given that EdkRepo has existed in the project for over a year now (in edk2-staging) I think it is time to get out of
staging. I have considered multiple possible landing zones:

1. edk2
2. edk2-pytool-library
3. A new repository

Edk2 does not seem like a good location as EdkRepo isn't strictly necessary to build EDK II, and I think all of us would
prefer that edk2 not become a dumping ground. I have talked with the other maintainers of edk2-pytool-library and they
would prefer that EdkRepo not enter that repository because EdkRepo does not have a robust set of unit tests yet and they
don't want their test coverage metrics to decline. Therefore, the best choice seems to be a new repository. As always if
anyone has comments they are welcome!

Thanks,
Nate




Re: MemoryFence()

Ethin Probst
 

Yep, the point on volatile is pretty much correct. An object that has
the volatile type qualifier may be manipulated in ways unknown to the
implementation (the compiler) and, therefore, the implementation shall
not optimize away actions on that object or reorder those actions
except as permitted by expression evaluation rules. An implementation
needn't evaluate an expression (or part of one) if it determines that
its value is unused and that no needed side effects are produced. This
page (https://en.cppreference.com/w/c/language/volatile) explains the
volatile type qualifier in a lot more detail than I will here
(otherwise we'd get way off topic and I don't think we want that to
happen). I can't really provide much input to the other two points
though.

On 2/4/21, Andrew Fish via groups.io <afish@...> wrote:


On Feb 4, 2021, at 10:09 AM, Laszlo Ersek <lersek@...> wrote:

Hi All,

here's the declaration of MemoryFence(), from <BaseLib.h>:

/**
Used to serialize load and store operations.

All loads and stores that proceed calls to this function are guaranteed
to be
globally visible when this function returns.

**/
VOID
EFIAPI
MemoryFence (
VOID
);
The IA32 and X64 implementations of MemoryFence() are not memory
barriers, only compiler barriers, however.

This is a problem for two reasons:

- code that believes to add a memory barrier only adds a compiler
barrier, possibly creating an unsafe situation,

- code that knowingly calls MemoryFence() for the purposes of a compiler
barrier -- while it will hopefully work safely, relying on tacit
knowledge / assumptions about the x86 architecture -- results in a
source reading experience that mismatches known lockless patterns.


This topic has repeatedly come up in the past; most recently in the
review of Ankur's v6 series for
<https://bugzilla.tianocore.org/show_bug.cgi?id=3132>, where some
inter-processor communication needs to happen.

I asked Paolo for help with understanding some of the synchronziation
patterns in PiSmmCpuDxeSmm, the reason being that Ankur's series is
going to rely on, and/or imitate, some of those patterns. From this
discussion with Paolo, the following have emerged (I'm going to
liberally steal / paraphrase Paolo's words):


(1) We should introduce finer-grained fence primitives:

ARM AARCH64 i386

CompilerFence() asm("") asm("") asm("")
AcquireMemoryFence() dmb ish dmb ishld asm("")
ReleaseMemoryFence() dmb ish dmb ish asm("")
MemoryFence() dmb ish dmb ish mfence

"where AcquireMemoryFence() is used on the read side (i.e. between
reads) and ReleaseMemoryFence() is used on the write side (i.e. between
writes)".

(Note that this step would make MemoryFence() heavier-weight on x86 than
it currently is.)


(2) We should audit and rework existent uses (combinations) of
MemoryFence() into matched acquire / release pairs.

Less importantly, this would restore the x86 behavior (performance) to
the one seen with the current MemoryFence() implementation.

More importantly, it would fix what the code *means*: "it's very
important to stick to known lockless patterns, and having matching
acquire/release fences makes it much more intuitive to understand what's
going on".

--*--

I'm not proposing this as a pre-requisite to merging Ankur's series
(i.e., I don't expect the series to use acquire/release -- it can be
converted later with the rest of the audit).

However, I'd really like us to stop wasting more time on MemoryFence()
doubts such as:

- Hey it's not a memory fence at all, is it safe?

- But on x86 it kinda is, *dependent on* the spinlocks / lock prefixes
we use for synchronization, between the other accesses.

- Is it interchangeable with volatile? Yes? No? which one should we use?

In the longer term, multi-processing code like MpInitLib (CpuDxe,
CpuMpPei) and PiSmmCpuDxeSmm should likely abandon all use of
"volatile", and use the explicit barriers instead, for *clarity*.


Step (2) would take quite a bit of thinking (if not much code).

Comments?
Laszlo,

Thanks for looking into this. I noticed most of the MemoryFence() usage[1]
is in the virtual platforms so changes seem manageable. I guess 3rd party
code that defines its own low level primitive could be impacted, but they
chose to reinvent the wheel.

It looks like the generic platform performance risk is BaseIoLibIntrinsic. I
know from debugging crashes a long time ago the the MemoryFence() for x86 is
really a CompilerFence() in the new scheme. If the new
AcquireMemoryFence()/ReleaseMemoryFence()/CompilerFence() replace
MemoryFence() this likely just works. If not then we might need an
architecture specific pattern to maintain current performance, we could
always make a function internal to the lib that does the right thing for the
given architecture. Also making it more precise might actually help
ARM/AARCH64 performance?

So if I understand correctly
1) volatile basically tells C to always read the memory. So it impacts the C
memory model.
2) CompilerFence() tells the C serialize and complete everything before the
barrier, and read memory again the 1st time after the barrier.
3) MemoryFence() is really dealing with speculative fetches and maybe
coherency. Basically the CPU reordering things.

So for example in BaseIoLibIntrinsic() the volatile is used to force the
memory read and the CompilerFence() is used to enforce the order of that
read relative to other operations. Thus wrapping the volatile access in
CompilerFence() is in some ways redundant as long as there is only a single
access inside the CompilerFence() wrapper.

I like the idea of properly annotating the code. Given x86 defaults to so
much cache coherency for compatibility with older simpler times it can tend
to hide issues. I’m not sure removing the volatile keyword is required, and
it does imply this variable has special rules. I guess we could add a
REQUIRE_FENCE or some such? But it it feels like defining the variables in
some way to imply the require special handling has some value?

The only reason you would need a volatile is if you had more than a single
read inside of the fences. Something like:

UINT8 *Address;
UINT8. Value

MemoryFence ();
do {
Value = *Address
} while (Value == 0xFF) ;
MemoryFence ();

The compiler could optimize the above code (current MemoryFence() model) to
a single read and infinite loop if 0xFF. Yes I realize you could move the
MemoryFence () into the loop, but what if you did not care about that?
Mostly just trying to think this through we some examples….

[1] /Volumes/Case/edk2-github(master)>git grep MemoryFence | grep ';'
ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c:320: MemoryFence ();
ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c:334: MemoryFence ();
ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c:344: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:1493: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:1497: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4702: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4707: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4712: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4765: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4770: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4775: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4828: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4833: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4838: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4892: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4894: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:86: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:88: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:115: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:117: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:147: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:149: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:179: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:181: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:211: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:213: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:243: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:245: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:275: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:277: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:305: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:307: MemoryFence ();
MdePkg/Library/BaseLib/AArch64/MemoryFence.asm:3:; MemoryFence() for
AArch64
MdePkg/Library/BaseLib/AArch64/MemoryFence.asm:23:;MemoryFence (
MdePkg/Library/BaseLib/Arm/MemoryFence.asm:3:; MemoryFence() for AArch64
MdePkg/Library/BaseLib/Arm/MemoryFence.asm:24:;MemoryFence (
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c:412: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c:425: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c:438: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgPei.c:173: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgPei.c:186: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgPei.c:199: MemoryFence ();
OvmfPkg/Library/VirtioLib/VirtioLib.c:305: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:312: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:327: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:334: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:337: MemoryFence();
OvmfPkg/Library/VmgExitLib/VmgExitLib.c:122: MemoryFence ();
OvmfPkg/Library/VmgExitLib/VmgExitLib.c:124: MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:117:
MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:122:
MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:171:
MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:177:
MemoryFence ();
OvmfPkg/PvScsiDxe/PvScsi.c:595: MemoryFence ();
OvmfPkg/PvScsiDxe/PvScsi.c:625: MemoryFence ();
OvmfPkg/VirtioNetDxe/Events.c:56: MemoryFence ();
OvmfPkg/VirtioNetDxe/Events.c:58: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpGetStatus.c:99: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpGetStatus.c:102: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:257: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:384: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:426: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:437: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:101: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:103: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:173: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:176: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpTransmit.c:169: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpTransmit.c:172: MemoryFence ();
OvmfPkg/XenBusDxe/GrantTable.c:86: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:496: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:503: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:569: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:580: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:587: MemoryFence ();
OvmfPkg/XenPvBlkDxe/BlockFront.c:473: MemoryFence ();
OvmfPkg/XenPvBlkDxe/BlockFront.c:528: MemoryFence ();
OvmfPkg/XenPvBlkDxe/BlockFront.c:578: MemoryFence ();
UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c:204:
MemoryFence ();


Thanks,

Andrew Fish

Would there be interest in reviewing such work?

Thanks
Laszlo





--
Signed,
Ethin D. Probst


Re: MemoryFence()

Andrew Fish <afish@...>
 

On Feb 4, 2021, at 10:09 AM, Laszlo Ersek <lersek@...> wrote:

Hi All,

here's the declaration of MemoryFence(), from <BaseLib.h>:

/**
Used to serialize load and store operations.

All loads and stores that proceed calls to this function are guaranteed to be
globally visible when this function returns.

**/
VOID
EFIAPI
MemoryFence (
VOID
);
The IA32 and X64 implementations of MemoryFence() are not memory
barriers, only compiler barriers, however.

This is a problem for two reasons:

- code that believes to add a memory barrier only adds a compiler
barrier, possibly creating an unsafe situation,

- code that knowingly calls MemoryFence() for the purposes of a compiler
barrier -- while it will hopefully work safely, relying on tacit
knowledge / assumptions about the x86 architecture -- results in a
source reading experience that mismatches known lockless patterns.


This topic has repeatedly come up in the past; most recently in the
review of Ankur's v6 series for
<https://bugzilla.tianocore.org/show_bug.cgi?id=3132>, where some
inter-processor communication needs to happen.

I asked Paolo for help with understanding some of the synchronziation
patterns in PiSmmCpuDxeSmm, the reason being that Ankur's series is
going to rely on, and/or imitate, some of those patterns. From this
discussion with Paolo, the following have emerged (I'm going to
liberally steal / paraphrase Paolo's words):


(1) We should introduce finer-grained fence primitives:

ARM AARCH64 i386

CompilerFence() asm("") asm("") asm("")
AcquireMemoryFence() dmb ish dmb ishld asm("")
ReleaseMemoryFence() dmb ish dmb ish asm("")
MemoryFence() dmb ish dmb ish mfence

"where AcquireMemoryFence() is used on the read side (i.e. between
reads) and ReleaseMemoryFence() is used on the write side (i.e. between
writes)".

(Note that this step would make MemoryFence() heavier-weight on x86 than
it currently is.)


(2) We should audit and rework existent uses (combinations) of
MemoryFence() into matched acquire / release pairs.

Less importantly, this would restore the x86 behavior (performance) to
the one seen with the current MemoryFence() implementation.

More importantly, it would fix what the code *means*: "it's very
important to stick to known lockless patterns, and having matching
acquire/release fences makes it much more intuitive to understand what's
going on".

--*--

I'm not proposing this as a pre-requisite to merging Ankur's series
(i.e., I don't expect the series to use acquire/release -- it can be
converted later with the rest of the audit).

However, I'd really like us to stop wasting more time on MemoryFence()
doubts such as:

- Hey it's not a memory fence at all, is it safe?

- But on x86 it kinda is, *dependent on* the spinlocks / lock prefixes
we use for synchronization, between the other accesses.

- Is it interchangeable with volatile? Yes? No? which one should we use?

In the longer term, multi-processing code like MpInitLib (CpuDxe,
CpuMpPei) and PiSmmCpuDxeSmm should likely abandon all use of
"volatile", and use the explicit barriers instead, for *clarity*.


Step (2) would take quite a bit of thinking (if not much code).

Comments?
Laszlo,

Thanks for looking into this. I noticed most of the MemoryFence() usage[1] is in the virtual platforms so changes seem manageable. I guess 3rd party code that defines its own low level primitive could be impacted, but they chose to reinvent the wheel.

It looks like the generic platform performance risk is BaseIoLibIntrinsic. I know from debugging crashes a long time ago the the MemoryFence() for x86 is really a CompilerFence() in the new scheme. If the new AcquireMemoryFence()/ReleaseMemoryFence()/CompilerFence() replace MemoryFence() this likely just works. If not then we might need an architecture specific pattern to maintain current performance, we could always make a function internal to the lib that does the right thing for the given architecture. Also making it more precise might actually help ARM/AARCH64 performance?

So if I understand correctly
1) volatile basically tells C to always read the memory. So it impacts the C memory model.
2) CompilerFence() tells the C serialize and complete everything before the barrier, and read memory again the 1st time after the barrier.
3) MemoryFence() is really dealing with speculative fetches and maybe coherency. Basically the CPU reordering things.

So for example in BaseIoLibIntrinsic() the volatile is used to force the memory read and the CompilerFence() is used to enforce the order of that read relative to other operations. Thus wrapping the volatile access in CompilerFence() is in some ways redundant as long as there is only a single access inside the CompilerFence() wrapper.

I like the idea of properly annotating the code. Given x86 defaults to so much cache coherency for compatibility with older simpler times it can tend to hide issues. I’m not sure removing the volatile keyword is required, and it does imply this variable has special rules. I guess we could add a REQUIRE_FENCE or some such? But it it feels like defining the variables in some way to imply the require special handling has some value?

The only reason you would need a volatile is if you had more than a single read inside of the fences. Something like:

UINT8 *Address;
UINT8. Value

MemoryFence ();
do {
Value = *Address
} while (Value == 0xFF) ;
MemoryFence ();

The compiler could optimize the above code (current MemoryFence() model) to a single read and infinite loop if 0xFF. Yes I realize you could move the MemoryFence () into the loop, but what if you did not care about that? Mostly just trying to think this through we some examples….

[1] /Volumes/Case/edk2-github(master)>git grep MemoryFence | grep ';'
ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c:320: MemoryFence ();
ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c:334: MemoryFence ();
ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c:344: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:1493: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:1497: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4702: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4707: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4712: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4765: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4770: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4775: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4828: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4833: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4838: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4892: MemoryFence ();
MdeModulePkg/Universal/EbcDxe/EbcExecute.c:4894: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:86: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:88: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:115: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:117: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:147: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:149: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:179: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:181: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:211: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:213: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:243: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:245: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:275: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:277: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:305: MemoryFence ();
MdePkg/Library/BaseIoLibIntrinsic/IoLib.c:307: MemoryFence ();
MdePkg/Library/BaseLib/AArch64/MemoryFence.asm:3:; MemoryFence() for AArch64
MdePkg/Library/BaseLib/AArch64/MemoryFence.asm:23:;MemoryFence (
MdePkg/Library/BaseLib/Arm/MemoryFence.asm:3:; MemoryFence() for AArch64
MdePkg/Library/BaseLib/Arm/MemoryFence.asm:24:;MemoryFence (
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c:412: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c:425: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c:438: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgPei.c:173: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgPei.c:186: MemoryFence ();
OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgPei.c:199: MemoryFence ();
OvmfPkg/Library/VirtioLib/VirtioLib.c:305: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:312: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:327: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:334: MemoryFence();
OvmfPkg/Library/VirtioLib/VirtioLib.c:337: MemoryFence();
OvmfPkg/Library/VmgExitLib/VmgExitLib.c:122: MemoryFence ();
OvmfPkg/Library/VmgExitLib/VmgExitLib.c:124: MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:117: MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:122: MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:171: MemoryFence ();
OvmfPkg/Library/XenConsoleSerialPortLib/XenConsoleSerialPortLib.c:177: MemoryFence ();
OvmfPkg/PvScsiDxe/PvScsi.c:595: MemoryFence ();
OvmfPkg/PvScsiDxe/PvScsi.c:625: MemoryFence ();
OvmfPkg/VirtioNetDxe/Events.c:56: MemoryFence ();
OvmfPkg/VirtioNetDxe/Events.c:58: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpGetStatus.c:99: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpGetStatus.c:102: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:257: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:384: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:426: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpInitialize.c:437: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:101: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:103: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:173: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpReceive.c:176: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpTransmit.c:169: MemoryFence ();
OvmfPkg/VirtioNetDxe/SnpTransmit.c:172: MemoryFence ();
OvmfPkg/XenBusDxe/GrantTable.c:86: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:496: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:503: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:569: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:580: MemoryFence ();
OvmfPkg/XenBusDxe/XenStore.c:587: MemoryFence ();
OvmfPkg/XenPvBlkDxe/BlockFront.c:473: MemoryFence ();
OvmfPkg/XenPvBlkDxe/BlockFront.c:528: MemoryFence ();
OvmfPkg/XenPvBlkDxe/BlockFront.c:578: MemoryFence ();
UefiCpuPkg/Library/BaseXApicX2ApicLib/BaseXApicX2ApicLib.c:204: MemoryFence ();


Thanks,

Andrew Fish

Would there be interest in reviewing such work?

Thanks
Laszlo


Re: MemoryFence()

Paolo Bonzini <pbonzini@...>
 

Il gio 4 feb 2021, 21:09 Ethin Probst <harlydavidsen@...> ha scritto:

For fences x86 has the LFENCE instruction for load fences and
SFENCE for store fences. If I'm not mistaken, MFENCE, then, is just a
combination of the other two. Would it not then be more appropriate to
use an LFENCE for a load (release?) fence, and SFENCE for a store
(acquire?) fence?

No, LFENCE and SFENCE are only useful with special load and store
instructions and are not needed in edk2. In addition, neither of the two
orders earlier stores against subsequent loads; MFENCE is needed for that.

Paolo

(I ask for release/acquire because I might have
these mixed up.) I might of course be entirely wrong too.
I'd love to, perhaps, become a contributor of EDK2 sometime, but I
honestly have no idea where I'd even begin. :-)

On 2/4/21, Ard Biesheuvel <ardb@...> wrote:
On Thu, 4 Feb 2021 at 19:09, Laszlo Ersek <lersek@...> wrote:

Hi All,

here's the declaration of MemoryFence(), from <BaseLib.h>:

/**
Used to serialize load and store operations.

All loads and stores that proceed calls to this function are
guaranteed to be
globally visible when this function returns.

**/
VOID
EFIAPI
MemoryFence (
VOID
);
The IA32 and X64 implementations of MemoryFence() are not memory
barriers, only compiler barriers, however.

This is a problem for two reasons:

- code that believes to add a memory barrier only adds a compiler
barrier, possibly creating an unsafe situation,

- code that knowingly calls MemoryFence() for the purposes of a compiler
barrier -- while it will hopefully work safely, relying on tacit
knowledge / assumptions about the x86 architecture -- results in a
source reading experience that mismatches known lockless patterns.


This topic has repeatedly come up in the past; most recently in the
review of Ankur's v6 series for
<https://bugzilla.tianocore.org/show_bug.cgi?id=3132>, where some
inter-processor communication needs to happen.

I asked Paolo for help with understanding some of the synchronziation
patterns in PiSmmCpuDxeSmm, the reason being that Ankur's series is
going to rely on, and/or imitate, some of those patterns. From this
discussion with Paolo, the following have emerged (I'm going to
liberally steal / paraphrase Paolo's words):


(1) We should introduce finer-grained fence primitives:

ARM AARCH64 i386

CompilerFence() asm("") asm("") asm("")
AcquireMemoryFence() dmb ish dmb ishld asm("")
ReleaseMemoryFence() dmb ish dmb ish asm("")
MemoryFence() dmb ish dmb ish mfence

"where AcquireMemoryFence() is used on the read side (i.e. between
reads) and ReleaseMemoryFence() is used on the write side (i.e. between
writes)".
Acquire semantics typically order writes before reads, not /between/
reads. Similarly, release semantics imply that all outstanding writes
complete before a barrier with acquire semantics is permitted to
complete.

Note that reasoning about this only makes sense in the context of
concurrency, i.e., different CPUs operating on the same memory (or
coherent DMA masters)

For non-coherent DMA, the 'ish' variants are not appropriate, and
given the low likelihood that any of this is creates a performance
bottleneck, I would suggest to only use full barriers on ARM.

(Note that this step would make MemoryFence() heavier-weight on x86 than
it currently is.)


(2) We should audit and rework existent uses (combinations) of
MemoryFence() into matched acquire / release pairs.
It is very important to be *aware* of the acquire/release semantics,
but I don't think it is necessary to use all the fine grained barrier
types in EDK2.

Less importantly, this would restore the x86 behavior (performance) to
the one seen with the current MemoryFence() implementation.

More importantly, it would fix what the code *means*: "it's very
important to stick to known lockless patterns, and having matching
acquire/release fences makes it much more intuitive to understand what's
going on".

--*--

I'm not proposing this as a pre-requisite to merging Ankur's series
(i.e., I don't expect the series to use acquire/release -- it can be
converted later with the rest of the audit).

However, I'd really like us to stop wasting more time on MemoryFence()
doubts such as:

- Hey it's not a memory fence at all, is it safe?
Who decides what 'memory fence' means anyway?

- But on x86 it kinda is, *dependent on* the spinlocks / lock prefixes
we use for synchronization, between the other accesses.

- Is it interchangeable with volatile? Yes? No? which one should we use?
'volatile' affects code generation only, i.e., which loads and stores
are issued. The *order* in which those loads observe stores done by
other masters is what is managed by barrier instructions. So they are
two sides of the same coin: 'volatile' may help to guarantee that
every assignment through a pointer variable results in a store
instruction, but how these are observed by other CPUs still needs to
be managed if it needs to occur in a certain way.

I think volatile is the wrong tool in most cases: the ordering of
accesses are not a property of the type or of the variable, but of the
code sequence. So using asm("") to ensure that an assignment through a
pointer is emitted as a store instruction, followed by a barrier
instruction is a typical pattern. And given the fact that barrier
instructions require asm sequences in most cases anyway, the asm ("")
will often be implied.


In the longer term, multi-processing code like MpInitLib (CpuDxe,
CpuMpPei) and PiSmmCpuDxeSmm should likely abandon all use of
"volatile", and use the explicit barriers instead, for *clarity*.


Step (2) would take quite a bit of thinking (if not much code).

Comments?

Would there be interest in reviewing such work?

Thanks
Laszlo





--
Signed,
Ethin D. Probst


Re: MemoryFence()

Ethin Probst
 

I agree with Ard (I'm not a contributor of EDK2 but thought I'd add my
2 sense anyway). `volatile` should only be used if a pointer
load/store is utilized in a context where you expect the external
environment (e.g.: hardware) to alter the result of such an op. In the
case of fences, volatile, I think, doesn't really do much in terms of
helping. For fences x86 has the LFENCE instruction for load fences and
SFENCE for store fences. If I'm not mistaken, MFENCE, then, is just a
combination of the other two. Would it not then be more appropriate to
use an LFENCE for a load (release?) fence, and SFENCE for a store
(acquire?) fence? (I ask for release/acquire because I might have
these mixed up.) I might of course be entirely wrong too.
I'd love to, perhaps, become a contributor of EDK2 sometime, but I
honestly have no idea where I'd even begin. :-)

On 2/4/21, Ard Biesheuvel <ardb@...> wrote:
On Thu, 4 Feb 2021 at 19:09, Laszlo Ersek <lersek@...> wrote:

Hi All,

here's the declaration of MemoryFence(), from <BaseLib.h>:

/**
Used to serialize load and store operations.

All loads and stores that proceed calls to this function are
guaranteed to be
globally visible when this function returns.

**/
VOID
EFIAPI
MemoryFence (
VOID
);
The IA32 and X64 implementations of MemoryFence() are not memory
barriers, only compiler barriers, however.

This is a problem for two reasons:

- code that believes to add a memory barrier only adds a compiler
barrier, possibly creating an unsafe situation,

- code that knowingly calls MemoryFence() for the purposes of a compiler
barrier -- while it will hopefully work safely, relying on tacit
knowledge / assumptions about the x86 architecture -- results in a
source reading experience that mismatches known lockless patterns.


This topic has repeatedly come up in the past; most recently in the
review of Ankur's v6 series for
<https://bugzilla.tianocore.org/show_bug.cgi?id=3132>, where some
inter-processor communication needs to happen.

I asked Paolo for help with understanding some of the synchronziation
patterns in PiSmmCpuDxeSmm, the reason being that Ankur's series is
going to rely on, and/or imitate, some of those patterns. From this
discussion with Paolo, the following have emerged (I'm going to
liberally steal / paraphrase Paolo's words):


(1) We should introduce finer-grained fence primitives:

ARM AARCH64 i386

CompilerFence() asm("") asm("") asm("")
AcquireMemoryFence() dmb ish dmb ishld asm("")
ReleaseMemoryFence() dmb ish dmb ish asm("")
MemoryFence() dmb ish dmb ish mfence

"where AcquireMemoryFence() is used on the read side (i.e. between
reads) and ReleaseMemoryFence() is used on the write side (i.e. between
writes)".
Acquire semantics typically order writes before reads, not /between/
reads. Similarly, release semantics imply that all outstanding writes
complete before a barrier with acquire semantics is permitted to
complete.

Note that reasoning about this only makes sense in the context of
concurrency, i.e., different CPUs operating on the same memory (or
coherent DMA masters)

For non-coherent DMA, the 'ish' variants are not appropriate, and
given the low likelihood that any of this is creates a performance
bottleneck, I would suggest to only use full barriers on ARM.

(Note that this step would make MemoryFence() heavier-weight on x86 than
it currently is.)


(2) We should audit and rework existent uses (combinations) of
MemoryFence() into matched acquire / release pairs.
It is very important to be *aware* of the acquire/release semantics,
but I don't think it is necessary to use all the fine grained barrier
types in EDK2.

Less importantly, this would restore the x86 behavior (performance) to
the one seen with the current MemoryFence() implementation.

More importantly, it would fix what the code *means*: "it's very
important to stick to known lockless patterns, and having matching
acquire/release fences makes it much more intuitive to understand what's
going on".

--*--

I'm not proposing this as a pre-requisite to merging Ankur's series
(i.e., I don't expect the series to use acquire/release -- it can be
converted later with the rest of the audit).

However, I'd really like us to stop wasting more time on MemoryFence()
doubts such as:

- Hey it's not a memory fence at all, is it safe?
Who decides what 'memory fence' means anyway?

- But on x86 it kinda is, *dependent on* the spinlocks / lock prefixes
we use for synchronization, between the other accesses.

- Is it interchangeable with volatile? Yes? No? which one should we use?
'volatile' affects code generation only, i.e., which loads and stores
are issued. The *order* in which those loads observe stores done by
other masters is what is managed by barrier instructions. So they are
two sides of the same coin: 'volatile' may help to guarantee that
every assignment through a pointer variable results in a store
instruction, but how these are observed by other CPUs still needs to
be managed if it needs to occur in a certain way.

I think volatile is the wrong tool in most cases: the ordering of
accesses are not a property of the type or of the variable, but of the
code sequence. So using asm("") to ensure that an assignment through a
pointer is emitted as a store instruction, followed by a barrier
instruction is a typical pattern. And given the fact that barrier
instructions require asm sequences in most cases anyway, the asm ("")
will often be implied.


In the longer term, multi-processing code like MpInitLib (CpuDxe,
CpuMpPei) and PiSmmCpuDxeSmm should likely abandon all use of
"volatile", and use the explicit barriers instead, for *clarity*.


Step (2) would take quite a bit of thinking (if not much code).

Comments?

Would there be interest in reviewing such work?

Thanks
Laszlo




--
Signed,
Ethin D. Probst


Re: MemoryFence()

Paolo Bonzini <pbonzini@...>
 

Il gio 4 feb 2021, 20:46 Ard Biesheuvel <ardb@...> ha scritto:

(1) We should introduce finer-grained fence primitives:

ARM AARCH64 i386

CompilerFence() asm("") asm("") asm("")
AcquireMemoryFence() dmb ish dmb ishld asm("")
ReleaseMemoryFence() dmb ish dmb ish asm("")
MemoryFence() dmb ish dmb ish mfence

"where AcquireMemoryFence() is used on the read side (i.e. between
reads) and ReleaseMemoryFence() is used on the write side (i.e. between
writes)".
Acquire semantics typically order writes before reads, not /between/
reads. Similarly, release semantics imply that all outstanding writes
complete before a barrier with acquire semantics is permitted to
complete.
Acquire fences are barriers between earlier loads and subsequent loads and
stores; those earlier loads then synchronize with release stores in other
threads.

Release fences are barriers been earlier loads and stores against
subsequent stores, and those subsequent stores synchronize with acquire
loads in other threads.

In both cases, however, fences only make sense between memory operations.
So something like "after reads" and "before writes" would have been more
precise in some sense, but in practice the usual idiom is "between"
reads/writes as Laszlo wrote.

Note that reasoning about this only makes sense in the context of
concurrency, i.e., different CPUs operating on the same memory (or
coherent DMA masters)

For non-coherent DMA, the 'ish' variants are not appropriate, and
given the low likelihood that any of this is creates a performance
bottleneck, I would suggest to only use full barriers on ARM.
Sure, that's a matter of how to implement the primitives. If you think that
non-coherent DMA is important, a full dmb can be used too.

As far as the compiler is concerned, an asm in the macros *should* block
enough optimizations, even without making the accesses volatile.
CompilerFence (or the edk2 equivalent of cpu_relax, whose name escapes me
right now) would be necessary in the body of busy-waiting loops. However
somebody should check the MSVC docs for asm, too.

It is very important to be *aware* of the acquire/release semantics,
but I don't think it is necessary to use all the fine grained barrier
types in EDK2.
I agree as long as the primitives are self-documenting. A single
MemoryFence() does not make it clear in which direction the data is flowing
(whether from other threads to this one, or vice versa).

Paolo


Less importantly, this would restore the x86 behavior (performance) to
the one seen with the current MemoryFence() implementation.

More importantly, it would fix what the code *means*: "it's very
important to stick to known lockless patterns, and having matching
acquire/release fences makes it much more intuitive to understand what's
going on".

--*--

I'm not proposing this as a pre-requisite to merging Ankur's series
(i.e., I don't expect the series to use acquire/release -- it can be
converted later with the rest of the audit).

However, I'd really like us to stop wasting more time on MemoryFence()
doubts such as:

- Hey it's not a memory fence at all, is it safe?
Who decides what 'memory fence' means anyway?

- But on x86 it kinda is, *dependent on* the spinlocks / lock prefixes
we use for synchronization, between the other accesses.

- Is it interchangeable with volatile? Yes? No? which one should we use?
'volatile' affects code generation only, i.e., which loads and stores
are issued. The *order* in which those loads observe stores done by
other masters is what is managed by barrier instructions. So they are
two sides of the same coin: 'volatile' may help to guarantee that
every assignment through a pointer variable results in a store
instruction, but how these are observed by other CPUs still needs to
be managed if it needs to occur in a certain way.

I think volatile is the wrong tool in most cases: the ordering of
accesses are not a property of the type or of the variable, but of the
code sequence. So using asm("") to ensure that an assignment through a
pointer is emitted as a store instruction, followed by a barrier
instruction is a typical pattern. And given the fact that barrier
instructions require asm sequences in most cases anyway, the asm ("")
will often be implied.


In the longer term, multi-processing code like MpInitLib (CpuDxe,
CpuMpPei) and PiSmmCpuDxeSmm should likely abandon all use of
"volatile", and use the explicit barriers instead, for *clarity*.


Step (2) would take quite a bit of thinking (if not much code).

Comments?

Would there be interest in reviewing such work?

Thanks
Laszlo


Re: MemoryFence()

Ard Biesheuvel <ardb@...>
 

On Thu, 4 Feb 2021 at 19:09, Laszlo Ersek <lersek@...> wrote:

Hi All,

here's the declaration of MemoryFence(), from <BaseLib.h>:

/**
Used to serialize load and store operations.

All loads and stores that proceed calls to this function are guaranteed to be
globally visible when this function returns.

**/
VOID
EFIAPI
MemoryFence (
VOID
);
The IA32 and X64 implementations of MemoryFence() are not memory
barriers, only compiler barriers, however.

This is a problem for two reasons:

- code that believes to add a memory barrier only adds a compiler
barrier, possibly creating an unsafe situation,

- code that knowingly calls MemoryFence() for the purposes of a compiler
barrier -- while it will hopefully work safely, relying on tacit
knowledge / assumptions about the x86 architecture -- results in a
source reading experience that mismatches known lockless patterns.


This topic has repeatedly come up in the past; most recently in the
review of Ankur's v6 series for
<https://bugzilla.tianocore.org/show_bug.cgi?id=3132>, where some
inter-processor communication needs to happen.

I asked Paolo for help with understanding some of the synchronziation
patterns in PiSmmCpuDxeSmm, the reason being that Ankur's series is
going to rely on, and/or imitate, some of those patterns. From this
discussion with Paolo, the following have emerged (I'm going to
liberally steal / paraphrase Paolo's words):


(1) We should introduce finer-grained fence primitives:

ARM AARCH64 i386

CompilerFence() asm("") asm("") asm("")
AcquireMemoryFence() dmb ish dmb ishld asm("")
ReleaseMemoryFence() dmb ish dmb ish asm("")
MemoryFence() dmb ish dmb ish mfence

"where AcquireMemoryFence() is used on the read side (i.e. between
reads) and ReleaseMemoryFence() is used on the write side (i.e. between
writes)".
Acquire semantics typically order writes before reads, not /between/
reads. Similarly, release semantics imply that all outstanding writes
complete before a barrier with acquire semantics is permitted to
complete.

Note that reasoning about this only makes sense in the context of
concurrency, i.e., different CPUs operating on the same memory (or
coherent DMA masters)

For non-coherent DMA, the 'ish' variants are not appropriate, and
given the low likelihood that any of this is creates a performance
bottleneck, I would suggest to only use full barriers on ARM.

(Note that this step would make MemoryFence() heavier-weight on x86 than
it currently is.)


(2) We should audit and rework existent uses (combinations) of
MemoryFence() into matched acquire / release pairs.
It is very important to be *aware* of the acquire/release semantics,
but I don't think it is necessary to use all the fine grained barrier
types in EDK2.

Less importantly, this would restore the x86 behavior (performance) to
the one seen with the current MemoryFence() implementation.

More importantly, it would fix what the code *means*: "it's very
important to stick to known lockless patterns, and having matching
acquire/release fences makes it much more intuitive to understand what's
going on".

--*--

I'm not proposing this as a pre-requisite to merging Ankur's series
(i.e., I don't expect the series to use acquire/release -- it can be
converted later with the rest of the audit).

However, I'd really like us to stop wasting more time on MemoryFence()
doubts such as:

- Hey it's not a memory fence at all, is it safe?
Who decides what 'memory fence' means anyway?

- But on x86 it kinda is, *dependent on* the spinlocks / lock prefixes
we use for synchronization, between the other accesses.

- Is it interchangeable with volatile? Yes? No? which one should we use?
'volatile' affects code generation only, i.e., which loads and stores
are issued. The *order* in which those loads observe stores done by
other masters is what is managed by barrier instructions. So they are
two sides of the same coin: 'volatile' may help to guarantee that
every assignment through a pointer variable results in a store
instruction, but how these are observed by other CPUs still needs to
be managed if it needs to occur in a certain way.

I think volatile is the wrong tool in most cases: the ordering of
accesses are not a property of the type or of the variable, but of the
code sequence. So using asm("") to ensure that an assignment through a
pointer is emitted as a store instruction, followed by a barrier
instruction is a typical pattern. And given the fact that barrier
instructions require asm sequences in most cases anyway, the asm ("")
will often be implied.


In the longer term, multi-processing code like MpInitLib (CpuDxe,
CpuMpPei) and PiSmmCpuDxeSmm should likely abandon all use of
"volatile", and use the explicit barriers instead, for *clarity*.


Step (2) would take quite a bit of thinking (if not much code).

Comments?

Would there be interest in reviewing such work?

Thanks
Laszlo


MemoryFence()

Laszlo Ersek
 

Hi All,

here's the declaration of MemoryFence(), from <BaseLib.h>:

/**
Used to serialize load and store operations.

All loads and stores that proceed calls to this function are guaranteed to be
globally visible when this function returns.

**/
VOID
EFIAPI
MemoryFence (
VOID
);
The IA32 and X64 implementations of MemoryFence() are not memory
barriers, only compiler barriers, however.

This is a problem for two reasons:

- code that believes to add a memory barrier only adds a compiler
barrier, possibly creating an unsafe situation,

- code that knowingly calls MemoryFence() for the purposes of a compiler
barrier -- while it will hopefully work safely, relying on tacit
knowledge / assumptions about the x86 architecture -- results in a
source reading experience that mismatches known lockless patterns.


This topic has repeatedly come up in the past; most recently in the
review of Ankur's v6 series for
<https://bugzilla.tianocore.org/show_bug.cgi?id=3132>, where some
inter-processor communication needs to happen.

I asked Paolo for help with understanding some of the synchronziation
patterns in PiSmmCpuDxeSmm, the reason being that Ankur's series is
going to rely on, and/or imitate, some of those patterns. From this
discussion with Paolo, the following have emerged (I'm going to
liberally steal / paraphrase Paolo's words):


(1) We should introduce finer-grained fence primitives:

ARM AARCH64 i386

CompilerFence() asm("") asm("") asm("")
AcquireMemoryFence() dmb ish dmb ishld asm("")
ReleaseMemoryFence() dmb ish dmb ish asm("")
MemoryFence() dmb ish dmb ish mfence

"where AcquireMemoryFence() is used on the read side (i.e. between
reads) and ReleaseMemoryFence() is used on the write side (i.e. between
writes)".

(Note that this step would make MemoryFence() heavier-weight on x86 than
it currently is.)


(2) We should audit and rework existent uses (combinations) of
MemoryFence() into matched acquire / release pairs.

Less importantly, this would restore the x86 behavior (performance) to
the one seen with the current MemoryFence() implementation.

More importantly, it would fix what the code *means*: "it's very
important to stick to known lockless patterns, and having matching
acquire/release fences makes it much more intuitive to understand what's
going on".

--*--

I'm not proposing this as a pre-requisite to merging Ankur's series
(i.e., I don't expect the series to use acquire/release -- it can be
converted later with the rest of the audit).

However, I'd really like us to stop wasting more time on MemoryFence()
doubts such as:

- Hey it's not a memory fence at all, is it safe?

- But on x86 it kinda is, *dependent on* the spinlocks / lock prefixes
we use for synchronization, between the other accesses.

- Is it interchangeable with volatile? Yes? No? which one should we use?

In the longer term, multi-processing code like MpInitLib (CpuDxe,
CpuMpPei) and PiSmmCpuDxeSmm should likely abandon all use of
"volatile", and use the explicit barriers instead, for *clarity*.


Step (2) would take quite a bit of thinking (if not much code).

Comments?

Would there be interest in reviewing such work?

Thanks
Laszlo


Re: [RFC] Update NASM tool version to 2.15.05 for build image

Laszlo Ersek
 

On 02/03/21 08:14, Sheng Wei wrote:
Hi All,
This email is to collect feedback about update NASM version to 2.15.05 for build image.

NASM (https://nasm.us/) is an assembler for the x86 CPU architecture platform.
Current open CI is using NSAM version 2.14.02.
And we have below information in edk2 project.
# - NASM 2.10 or later for use with the GCC toolchain family
# - NASM 2.12.01 or later for use with all other toolchain families
The latest stable NASM version is 2.15.05.
The latest stable NASM version 2.15.05 is released at 2020-08-28.
The first 2.15.x NASM version is released at 2020-06-04.
The last 2.14.x NASM version is released at 2018-12-30.
From NASM 2.14.x to NASM 2.15.05, according to the release history (https://nasm.us/doc/nasmdocc.html), it fixes many bugs and add more new instructions support.
For example, it adds instructions for Intel Control Flow Enforcement Technology (CET) since NASM version 2.15.01.
Actually, we have met such unsupported instruction issue in the development.

There would be 2 possible options:

* Option 1: Update the NASM version to 2.15.05
* 2.15.05 is declared as a stable NASM version.
* New instructions will be supported automatically.
* But need effort to verify new NASM version on the project.
* Option 2: Keep stay at current NASM version
* we have to use hardcode binaries in the source code like "DB xx xx xx xx" for these new instructions.
* It would be not friendly to these new instructions and easy to make mistake in the new feature development.
I suppose option 1 will be better to us in the development.

I'd like to hear the community's feedback about which option is preferred, and let me know if you have any concerns with this change. Thanks!
From a RHEL perspective, I'd be okay with a nasm-2.15.03 version
requirement.

Thanks
Laszlo


[RFC] Update NASM tool version to 2.15.05 for build image

Sheng Wei
 

Hi All,
This email is to collect feedback about update NASM version to 2.15.05 for build image.

NASM (https://nasm.us/) is an assembler for the x86 CPU architecture platform.
Current open CI is using NSAM version 2.14.02.
And we have below information in edk2 project.
# - NASM 2.10 or later for use with the GCC toolchain family
# - NASM 2.12.01 or later for use with all other toolchain families
The latest stable NASM version is 2.15.05.
The latest stable NASM version 2.15.05 is released at 2020-08-28.
The first 2.15.x NASM version is released at 2020-06-04.
The last 2.14.x NASM version is released at 2018-12-30.
From NASM 2.14.x to NASM 2.15.05, according to the release history (https://nasm.us/doc/nasmdocc.html), it fixes many bugs and add more new instructions support.
For example, it adds instructions for Intel Control Flow Enforcement Technology (CET) since NASM version 2.15.01.
Actually, we have met such unsupported instruction issue in the development.

There would be 2 possible options:

* Option 1: Update the NASM version to 2.15.05
* 2.15.05 is declared as a stable NASM version.
* New instructions will be supported automatically.
* But need effort to verify new NASM version on the project.
* Option 2: Keep stay at current NASM version
* we have to use hardcode binaries in the source code like "DB xx xx xx xx" for these new instructions.
* It would be not friendly to these new instructions and easy to make mistake in the new feature development.
I suppose option 1 will be better to us in the development.

I'd like to hear the community's feedback about which option is preferred, and let me know if you have any concerns with this change. Thanks!

Best Regards
Sheng Wei


Re: [edk2-devel] RFC: Adding support for ARM (RNDR etc.) to RngDxe

Sami Mujawar <sami.mujawar@...>
 

Hi All,

I have shared some initial thoughts on the RNG implementation updates at https://edk2.groups.io/g/devel/files/Designs/2021/0116/EDKII%20-%20Proposed%20update%20to%20RNG%20implementation.pdf

Kindly let me know your feedback or if you have any queries.

Regards,

Sami Mujawar

-----Original Message-----
From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Rebecca Cran via groups.io
Sent: 14 January 2021 09:05 PM
To: Sami Mujawar <Sami.Mujawar@...>; devel@edk2.groups.io; Samer El-Haj-Mahmoud <Samer.El-Haj-Mahmoud@...>; Ard Biesheuvel <Ard.Biesheuvel@...>; leif@...
Cc: rfc@edk2.groups.io; Jiewen Yao <jiewen.yao@...>; Rahul Kumar <rahul1.kumar@...>; nd <nd@...>
Subject: Re: [edk2-devel] RFC: Adding support for ARM (RNDR etc.) to RngDxe

On 12/10/20 4:26 AM, Sami Mujawar wrote:

I am working on the TRNG FW API interface and will share more details
for the discussion soon.

We had some thoughts about streamlining the RngDxe implementations and
would like to share some diagrams for the discussion.

My diagrams are in Visio that I can export as JPG images. However, I am
open to switching to any other suggested tool.
Hi Sami,

I don't see any further discussions on this. Have you made any progress
with sharing the design documents or scheduling a review?

--
Rebecca Cran


Re: [edk2-devel] RFC: Adding support for ARM (RNDR etc.) to RngDxe

Rebecca Cran <rebecca@...>
 

On 12/10/20 4:26 AM, Sami Mujawar wrote:

I am working on the TRNG FW API interface and will share more details for the discussion soon.
We had some thoughts about streamlining the RngDxe implementations and would like to share some diagrams for the discussion.
My diagrams are in Visio that I can export as JPG images. However, I am open to switching to any other suggested tool.
Hi Sami,

I don't see any further discussions on this. Have you made any progress with sharing the design documents or scheduling a review?

--
Rebecca Cran


回复: [edk2-devel] [edk2-rfc] [RFC] UnitTestFrameworkPkg cmocka submodule alternatives

gaoliming
 

Rebecca:
My new mail address is Liming Gao <gaoliming@...>. It has updated in Maintainers.txt.

Thanks
Liming

-----邮件原件-----
发件人: bounce+27952+69347+4905953+8761045@groups.io
<bounce+27952+69347+4905953+8761045@groups.io> 代表 Rebecca Cran
发送时间: 2020年12月22日 4:45
收件人: rfc@edk2.groups.io; michael.d.kinney@...; gaoliming
<gaoliming@...>; devel@edk2.groups.io; 'Bret Barkelew'
<Bret.Barkelew@...>; 'Laszlo Ersek' <lersek@...>; 'Gao,
Liming' <liming.gao@...>
主题: Re: [edk2-devel] [edk2-rfc] [RFC] UnitTestFrameworkPkg cmocka
submodule alternatives

On 12/21/20 1:42 PM, Rebecca Cran wrote:
On 12/21/20 1:14 PM, Michael D Kinney wrote:
Rebecca,

Are you ok with using a GitHub mirror?
Yes, that's fine with me.
By the way, I've started getting bounces from Liming's Intel email address.

--
Rebecca Cran





Re: [edk2-devel] [edk2-rfc] [RFC] UnitTestFrameworkPkg cmocka submodule alternatives

Rebecca Cran
 

On 12/21/20 1:42 PM, Rebecca Cran wrote:
On 12/21/20 1:14 PM, Michael D Kinney wrote:
Rebecca,

Are you ok with using a GitHub mirror?
Yes, that's fine with me.
By the way, I've started getting bounces from Liming's Intel email address.

--
Rebecca Cran


Re: [edk2-devel] [edk2-rfc] [RFC] UnitTestFrameworkPkg cmocka submodule alternatives

Rebecca Cran
 

On 12/21/20 1:14 PM, Michael D Kinney wrote:
Rebecca,
Are you ok with using a GitHub mirror?
Yes, that's fine with me.

--
Rebecca Cran


Re: [edk2-devel] [edk2-rfc] [RFC] UnitTestFrameworkPkg cmocka submodule alternatives

Michael D Kinney
 

Rebecca,

Are you ok with using a GitHub mirror?

Mike

-----Original Message-----
From: gaoliming <gaoliming@...>
Sent: Sunday, December 20, 2020 5:18 PM
To: devel@edk2.groups.io; rebecca@...; rfc@edk2.groups.io; Kinney, Michael D <michael.d.kinney@...>; 'Bret
Barkelew' <Bret.Barkelew@...>; 'Laszlo Ersek' <lersek@...>; 'Gao, Liming' <liming.gao@...>
Subject: 回复: [edk2-devel] [edk2-rfc] [RFC] UnitTestFrameworkPkg cmocka submodule alternatives

I also prefer to fetch all submodules from github. Then, all git repos will have the same behavior.

Thanks
Liming
-----邮件原件-----
发件人: bounce+27952+69261+4905953+8761045@groups.io
<bounce+27952+69261+4905953+8761045@groups.io> 代表 Rebecca Cran
发送时间: 2020年12月20日 9:06
收件人: rfc@edk2.groups.io; michael.d.kinney@...;
devel@edk2.groups.io; 'Bret Barkelew' <Bret.Barkelew@...>;
Laszlo Ersek <lersek@...>; Gao, Liming <liming.gao@...>
主题: Re: [edk2-devel] [edk2-rfc] [RFC] UnitTestFrameworkPkg cmocka
submodule alternatives

On 12/19/20 11:58 AM, Michael D Kinney wrote:
There have been a few suggestions to create a mirror of cmocka in
TianoCore
org in GitHub.

I have found a GitHub action that can do a repo sync.

https://github.com/marketplace/actions/github-repo-sync

I have created a temporary mirror of cmocka in my personal GitHub area
that
uses this GitHub action to sync all branches and all tags once a day.

https://github.com/mdkinney/mirror-cmocka

Here is the GitHub workflow file. It must be in the default branch for the
repo using a branch name that is not present in the repo being mirrored.
In this case, I used a branch name of 'repo-sync'.

https://github.com/mdkinney/mirror-cmocka/blob/repo-sync/.github/workflo
ws/repo-sync.yml

Please provide feedback on this approach. If we like this approach, then
I suggest we create a new repo in TianoCore called edk2-cmocka that is a
mirror that is synced once a day and we update the cmocka submodule in
the
edk2 repo to use edk2-cmocka.
I'd suggest just using the Gitlab mirror. Unlike cryptomilk.org, Gitlab
should be just as reliable as Github and won't introduce another
potential failure point.

--
Rebecca Cran





回复: [edk2-devel] [edk2-rfc] [RFC] UnitTestFrameworkPkg cmocka submodule alternatives

gaoliming
 

I also prefer to fetch all submodules from github. Then, all git repos will have the same behavior.

Thanks
Liming

-----邮件原件-----
发件人: bounce+27952+69261+4905953+8761045@groups.io
<bounce+27952+69261+4905953+8761045@groups.io> 代表 Rebecca Cran
发送时间: 2020年12月20日 9:06
收件人: rfc@edk2.groups.io; michael.d.kinney@...;
devel@edk2.groups.io; 'Bret Barkelew' <Bret.Barkelew@...>;
Laszlo Ersek <lersek@...>; Gao, Liming <liming.gao@...>
主题: Re: [edk2-devel] [edk2-rfc] [RFC] UnitTestFrameworkPkg cmocka
submodule alternatives

On 12/19/20 11:58 AM, Michael D Kinney wrote:
There have been a few suggestions to create a mirror of cmocka in
TianoCore
org in GitHub.

I have found a GitHub action that can do a repo sync.

https://github.com/marketplace/actions/github-repo-sync

I have created a temporary mirror of cmocka in my personal GitHub area
that
uses this GitHub action to sync all branches and all tags once a day.

https://github.com/mdkinney/mirror-cmocka

Here is the GitHub workflow file. It must be in the default branch for the
repo using a branch name that is not present in the repo being mirrored.
In this case, I used a branch name of 'repo-sync'.

https://github.com/mdkinney/mirror-cmocka/blob/repo-sync/.github/workflo
ws/repo-sync.yml

Please provide feedback on this approach. If we like this approach, then
I suggest we create a new repo in TianoCore called edk2-cmocka that is a
mirror that is synced once a day and we update the cmocka submodule in
the
edk2 repo to use edk2-cmocka.
I'd suggest just using the Gitlab mirror. Unlike cryptomilk.org, Gitlab
should be just as reliable as Github and won't introduce another
potential failure point.

--
Rebecca Cran





回复: [edk2-rfc] [RFC V3] Create supported branch from edk2-stable* tag (Required to address critical bug BZ3111)

gaoliming
 

Mike:
It is OK to me. I have no other comments.

Thanks
Liming

-----邮件原件-----
发件人: bounce+34240+470+4905953+8776774@groups.io
<bounce+34240+470+4905953+8776774@groups.io> 代表 Michael D
Kinney
发送时间: 2020年12月19日 10:03
收件人: rfc@edk2.groups.io; devel@edk2.groups.io; Kinney, Michael D
<michael.d.kinney@...>
主题: [edk2-rfc] [RFC V3] Create supported branch from edk2-stable* tag
(Required to address critical bug BZ3111)

Hello,



The following bug has been fixed on edk2/master



https://bugzilla.tianocore.org/show_bug.cgi?id=3111

https://github.com/tianocore/edk2/pull/1226



This bug is also considered a critical bug against edk2-stable202011. The
behavior

of the Variable Lock Protocol was changed in a non-backwards compatible
manner in

edk2-stable202011 and this is impacting some downstream platforms. The
following

2 commits on edk2/master restore the original behavior of the Variable Lock
Protocol.




https://github.com/tianocore/edk2/pull/1226/commits/893cfe2847b83da74f
53858d6acaa15a348bad7c


https://github.com/tianocore/edk2/pull/1226/commits/16491ba6a6e9a91ce
deeed45bc0fbdfde49f7968



The request here is to create a supported branch from edk2-stable202011 tag
and apply

these 2 commits as critical bug fixes on the supported branch.



Since we started using the edk2-stable* tag process, there has not been a
request to create

a supported branch from one of those tags. As a result, there are a couple
opens that

need to be addressed:



1) Supported branch naming convention.



Branch Proposal: stable/<YYYY><MM>

Branch Example: stable/202011



2) CI requirements for supported branches.



Proposal: Update .azurepipelines yml files to trigger on stable/*
branches,

update .mergify configuration file to support merging of stable/*
branches,

and update GitHub branch protection settings to protect stable/*
branches.



3) A stable/* branch is only supported until the next edk2-stable tag release.



4) Release requirements for supported branches.



Proposal: If there are a significant number of critical fixes applied to

a stable/* branch, then a request for a release can be made that would

trigger focused testing of the supported branch and creation of a new

release. If all testing passes, then a tag is created on the stable/*

branch and a release is created on GitHub that summarizes the set of

critical fixes and the testing performed.



Tag Proposal: edk2-stable<YYYY><MM>.<XX>

Tag Example : edk2-stable202011.01



Please let me know if you have any feedback or comments on this proposal.
The goal

is to close on this topic this week.



Thank you,



Mike




281 - 300 of 772