Re: MemoryFence()


Ni, Ray
 

Wow! Thank you Paolo for explaining everything!

I read a bit about the C memory model.
https://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync

Do you have any documentation that can explain how the different models are implemented (using what instructions)?

I would like to know if there is a better primitive API than the fences, because MSVC deprecates the barrier APIs.
https://docs.microsoft.com/en-us/cpp/intrinsics/readwritebarrier?view=msvc-160



thanks,
ray
________________________________
From: Paolo Bonzini <pbonzini@...>
Sent: Saturday, February 6, 2021 2:17:53 AM
To: Ni, Ray <ray.ni@...>
Cc: edk2-rfc-groups-io <rfc@edk2.groups.io>; Laszlo Ersek <lersek@...>; Ard Biesheuvel <ardb@...>; Andrew Fish <afish@...>; Kinney, Michael D <michael.d.kinney@...>; Leif Lindholm (Nuvia address) <leif@...>; Dong, Eric <eric.dong@...>; Liming Gao (Byosoft address) <gaoliming@...>; Ankur Arora <ankur.a.arora@...>
Subject: Re: [edk2-rfc] MemoryFence()

On Fri, Feb 5, 2021, 19:12 Ni, Ray <ray.ni@...> wrote:
I saw the proposal of fences in the first mail by Laszlo. Please forgive my ignorance. What is asm("") on x86? A nop? Then how can a nop help as a processor-level load/store barrier?

On x86 load-load, load-store and store-store ordering is already guaranteed by the processor. Therefore on x86 the AcquireMemoryFence and ReleaseMemoryFence are just like CompilerFence: they only have to block compiler-level reordering. MemoryFence is the only one that blocks store-load reordering and needs to emit an MFENCE instruction.

On ARM (either 32- or 64-bit) the processor-level guarantees are weaker, and you need to emit a "dmb" instruction for acquire and release fences as well.
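
As an illustration of the mapping described above, here is a minimal sketch in C11. The four wrapper names mirror the fences proposed in this thread, but building them on <stdatomic.h> is an assumption made for illustration, not how edk2 implements them; Publish, TryConsume, mPayload and mReady are hypothetical names. With GCC or Clang, the acquire and release fences typically compile to no instruction on x86 and to a "dmb" variant on ARM, while the full fence typically emits MFENCE or an equivalent locked instruction on x86.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the proposed fences, built on C11 atomics. */
static inline void CompilerFence(void)      { atomic_signal_fence(memory_order_seq_cst); }
static inline void AcquireMemoryFence(void) { atomic_thread_fence(memory_order_acquire); }
static inline void ReleaseMemoryFence(void) { atomic_thread_fence(memory_order_release); }
static inline void MemoryFence(void)        { atomic_thread_fence(memory_order_seq_cst); }

static uint32_t    mPayload;  /* plain data handed from producer to consumer */
static atomic_uint mReady;    /* flag that publishes the payload */

/* Producer: write the data, then make it visible before raising the flag. */
void Publish(uint32_t Value)
{
  mPayload = Value;
  ReleaseMemoryFence();  /* earlier stores may not move below this point */
  atomic_store_explicit(&mReady, 1, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *Value if the payload has been published. */
int TryConsume(uint32_t *Value)
{
  if (atomic_load_explicit(&mReady, memory_order_relaxed) == 0) {
    return 0;
  }
  AcquireMemoryFence();  /* later loads may not move above this point */
  *Value = mPayload;
  return 1;
}
```

CompilerFence and MemoryFence are defined only to show the full set; the message-passing pattern itself needs just the release/acquire pair.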

Paolo



thanks,
ray
________________________________
From: rfc@edk2.groups.io <rfc@edk2.groups.io> on behalf of Paolo Bonzini <pbonzini@...>
Sent: Saturday, February 6, 2021 2:01:14 AM
To: Ni, Ray <ray.ni@...>; Laszlo Ersek <lersek@...>; Ard Biesheuvel <ardb@...>
Cc: Andrew Fish <afish@...>; edk2 RFC list <rfc@edk2.groups.io>; Kinney, Michael D <michael.d.kinney@...>; Leif Lindholm (Nuvia address) <leif@...>; Dong, Eric <eric.dong@...>; Liming Gao (Byosoft address) <gaoliming@...>; Ankur Arora <ankur.a.arora@...>
Subject: Re: [edk2-rfc] MemoryFence()

On 05/02/21 18:53, Ni, Ray wrote:
Without calling _ReadWriteBarrier, is it possible that compiler
generates the assembly in the wrong location? I mean the compiler may
in-line the LibWaitForSemaphore and call cmpxchg earlier than the
desired location.
Similar to LibReleaseSemaphore.

So my understanding is the _ReadWriteBarrier in ReleaseSpinLock is required.
The proposed ReleaseMemoryFence() should already have that effect. All
the proposed fences except CompilerFence() are both compiler
optimization barriers and processor barriers.
InterlockedCompareExchange() is also both a compiler optimization
barrier and a processor barrier.

CompilerFence() is just a better name for _ReadWriteBarrier(); it blocks
optimizations but it has no effect at the processor level. It should
only be used (instead of volatile) in busy-waiting loops that do not
always go through an InterlockedCompareExchange.
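
A minimal sketch of such a busy-waiting loop, assuming GCC or Clang, where an empty asm statement with a "memory" clobber acts as a compiler-only barrier and emits no instruction. This CompilerFence definition, the WaitForFlag helper and its MaxSpins bound are hypothetical and only serve the illustration.

```c
#include <stdint.h>

/* Assumed GCC/Clang definition: blocks compiler reordering and caching of
   memory values, but generates no processor instruction. */
static inline void CompilerFence(void)
{
  __asm__ __volatile__("" ::: "memory");
}

/* Polls *Flag until it becomes nonzero or MaxSpins polls have failed.
   Returns the number of failed polls. Flag is deliberately not volatile. */
uint32_t WaitForFlag(uint32_t *Flag, uint32_t MaxSpins)
{
  uint32_t Spins = 0;

  while (*Flag == 0 && Spins < MaxSpins) {
    CompilerFence();  /* forces *Flag to be reloaded on the next iteration */
    Spins++;
  }
  return Spins;
}
```

Without the fence, the compiler would be free to read *Flag once and hoist the test out of the loop.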

Paolo

_ReadWriteBarrier ();
*SpinLock = SPIN_LOCK_RELEASED;
_ReadWriteBarrier ();

return SpinLock;
it *is* buggy because it is missing a
(processor) barrier on non-x86 architectures and has a useless barrier
after the store. Instead it should be just this:

ReleaseMemoryFence ();
*SpinLock = SPIN_LOCK_RELEASED;
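
For comparison, a self-contained sketch of the same release pattern in C11. It assumes <stdatomic.h> as a stand-in for the edk2 primitives: the SPIN_LOCK typedef, the constant values and TryAcquireSpinLock are illustrative, and atomic_compare_exchange_strong plays the role of InterlockedCompareExchange().

```c
#include <stdatomic.h>

/* Illustrative values and type; not taken from the edk2 headers. */
#define SPIN_LOCK_RELEASED 1u
#define SPIN_LOCK_ACQUIRED 2u

typedef atomic_uint SPIN_LOCK;

/* Hypothetical stand-in for the proposed ReleaseMemoryFence(). */
static inline void ReleaseMemoryFence(void)
{
  atomic_thread_fence(memory_order_release);
}

/* Returns 1 if the lock was taken, 0 if it was already held. The
   compare-exchange is both a compiler barrier and a processor barrier. */
int TryAcquireSpinLock(SPIN_LOCK *SpinLock)
{
  unsigned Expected = SPIN_LOCK_RELEASED;

  return atomic_compare_exchange_strong(SpinLock, &Expected, SPIN_LOCK_ACQUIRED);
}

/* Release: one fence before the store, nothing after it. */
SPIN_LOCK *ReleaseSpinLock(SPIN_LOCK *SpinLock)
{
  ReleaseMemoryFence();  /* critical-section accesses complete first */
  atomic_store_explicit(SpinLock, SPIN_LOCK_RELEASED, memory_order_relaxed);
  return SpinLock;
}
```

The fence before the store keeps accesses from the critical section from leaking past the unlock; no barrier is needed after the store.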
