On Fri, 5 Feb 2021, 19:12 Ni, Ray <ray.ni@...> wrote:
I saw the proposal of fences in the first mail by Laszlo. Please forgive my
ignorance. What is asm("") in x86? A nop? Then how can a nop help as a
processor-level load/store barrier?
On x86 load-load, load-store and store-store ordering is already guaranteed
by the processor. Therefore on x86 the AcquireMemoryFence and
ReleaseMemoryFence are just like CompilerFence: they only have to block
compiler-level reordering. MemoryFence is the only one that blocks
store-load reordering and needs to emit an MFENCE instruction.
On ARM (either 32- or 64-bit) the processor-level guarantees are weaker,
and you need to emit a "dmb" instruction for acquire and release fences as
well.
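
To make the x86/ARM difference concrete, here is a minimal sketch of how
the four proposed fences could map onto instructions. It assumes GCC or
Clang inline assembly and is not the edk2 implementation; the function
bodies and the generic fallback branch are my own illustration. Note that
the empty asm with a "memory" clobber emits no instruction at all (not
even a nop); it only constrains the compiler.

```c
/* Sketch only: assumes GCC/Clang extended asm, not taken from edk2. */

/* Emits no instruction; the "memory" clobber stops the compiler from
   moving or caching memory accesses across this point. */
static inline void CompilerFence (void)
{
  __asm__ __volatile__ ("" ::: "memory");
}

#if defined (__x86_64__) || defined (__i386__)

/* x86 already orders load-load, load-store and store-store, so acquire
   and release only need to block compiler reordering. */
static inline void AcquireMemoryFence (void) { CompilerFence (); }
static inline void ReleaseMemoryFence (void) { CompilerFence (); }

/* Only store-load reordering needs a real instruction. */
static inline void MemoryFence (void)
{
  __asm__ __volatile__ ("mfence" ::: "memory");
}

#elif defined (__aarch64__)

/* Weaker hardware ordering: every fence needs a dmb. A full
   inner-shareable barrier is used everywhere here for simplicity. */
static inline void AcquireMemoryFence (void) { __asm__ __volatile__ ("dmb ish" ::: "memory"); }
static inline void ReleaseMemoryFence (void) { __asm__ __volatile__ ("dmb ish" ::: "memory"); }
static inline void MemoryFence (void)        { __asm__ __volatile__ ("dmb ish" ::: "memory"); }

#else

/* Assumed generic fallback: let the compiler builtins pick the barrier. */
static inline void AcquireMemoryFence (void) { __atomic_thread_fence (__ATOMIC_ACQUIRE); }
static inline void ReleaseMemoryFence (void) { __atomic_thread_fence (__ATOMIC_RELEASE); }
static inline void MemoryFence (void)        { __atomic_thread_fence (__ATOMIC_SEQ_CST); }

#endif
```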
Paolo
thanks,
ray
------------------------------
*From:* rfc@edk2.groups.io <rfc@edk2.groups.io> on behalf of Paolo Bonzini <
pbonzini@...>
*Sent:* Saturday, February 6, 2021 2:01:14 AM
*To:* Ni, Ray <ray.ni@...>; Laszlo Ersek <lersek@...>; Ard
Biesheuvel <ardb@...>
*Cc:* Andrew Fish <afish@...>; edk2 RFC list <rfc@edk2.groups.io>;
Kinney, Michael D <michael.d.kinney@...>; Leif Lindholm (Nuvia
address) <leif@...>; Dong, Eric <eric.dong@...>; Liming
Gao (Byosoft address) <gaoliming@...>; Ankur Arora <
ankur.a.arora@...>
*Subject:* Re: [edk2-rfc] MemoryFence()
On 05/02/21 18:53, Ni, Ray wrote:
Without calling _ReadWriteBarrier, is it possible that the compiler
generates the assembly in the wrong location? I mean the compiler may
in-line the LibWaitForSemaphore and call cmpxchg earlier than the
desired location.
Similar to LibReleaseSemaphore.
So my understanding is the _ReadWriteBarrier in ReleaseSpinLock is
required.
The proposed ReleaseMemoryFence() should already have that effect. All
the proposed fences except CompilerFence() are both compiler
optimization barriers and processor barriers.
InterlockedCompareExchange() is also both a compiler optimization
barrier and a processor barrier.
CompilerFence() is just a better name for _ReadWriteBarrier(); it blocks
compiler optimizations but has no effect at the processor level. It should
only be used (instead of volatile) in busy-waiting loops that do not
always go through an InterlockedCompareExchange.
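
As an illustration of that last case, a busy-wait loop on a plain
(non-volatile) flag could look roughly like the sketch below. The helper
name SketchPollFlag, its bounded spin count, and the use of the GCC/Clang
__atomic_thread_fence builtin as a stand-in for AcquireMemoryFence() are
assumptions made for the example, not part of the proposal.

```c
#include <stdbool.h>
#include <stddef.h>

/* Compiler-only barrier (GCC/Clang syntax assumed); emits no instruction. */
static inline void CompilerFence (void)
{
  __asm__ __volatile__ ("" ::: "memory");
}

/* Hypothetical helper: polls *Flag up to MaxSpins times and reports
   whether it was seen non-zero. Without the fence in the loop body the
   compiler could read *Flag once and spin on the stale value. */
static bool SketchPollFlag (const unsigned *Flag, size_t MaxSpins)
{
  for (size_t Spin = 0; Spin < MaxSpins; Spin++) {
    if (*Flag != 0) {
      /* Stand-in for AcquireMemoryFence(): order later reads of the
         data guarded by the flag after the read of the flag itself. */
      __atomic_thread_fence (__ATOMIC_ACQUIRE);
      return true;
    }
    CompilerFence ();  /* force *Flag to be re-read on the next pass */
  }
  return false;
}
```

The processor-level ordering comes only from the acquire fence taken once
the flag is observed; the fence inside the loop costs nothing at run time.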
Paolo
  _ReadWriteBarrier ();
  *SpinLock = SPIN_LOCK_RELEASED;
  _ReadWriteBarrier ();

  return SpinLock;
This code *is* buggy because it is missing a
(processor) barrier on non-x86 architectures and has a useless barrier
after the store. Instead it should be just this:
ReleaseMemoryFence ();
*SpinLock = SPIN_LOCK_RELEASED;
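
For context, a self-contained sketch of a release path written this way,
paired with a try-acquire, might look as follows. The Sketch* function
names, the SPIN_LOCK typedef, the constant values, and the use of
GCC/Clang __atomic builtins in place of InterlockedCompareExchange() and
ReleaseMemoryFence() are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed definitions, for the sketch only. */
typedef volatile uintptr_t SPIN_LOCK;
#define SPIN_LOCK_RELEASED  ((uintptr_t) 1)
#define SPIN_LOCK_ACQUIRED  ((uintptr_t) 2)

/* Stand-in for the proposed fence: compiler barrier plus, where the
   architecture needs one, a processor barrier. */
static inline void ReleaseMemoryFence (void)
{
  __atomic_thread_fence (__ATOMIC_RELEASE);
}

/* Stand-in for InterlockedCompareExchange(): a single attempt to move
   the lock from RELEASED to ACQUIRED. Returns true on success. */
static bool SketchTryAcquireSpinLock (SPIN_LOCK *SpinLock)
{
  uintptr_t Expected = SPIN_LOCK_RELEASED;

  return __atomic_compare_exchange_n (SpinLock, &Expected, SPIN_LOCK_ACQUIRED,
                                      false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* One fence before the store: everything written inside the critical
   section becomes visible before the lock is seen as released. No
   barrier is needed after the store. */
static SPIN_LOCK *SketchReleaseSpinLock (SPIN_LOCK *SpinLock)
{
  ReleaseMemoryFence ();
  *SpinLock = SPIN_LOCK_RELEASED;
  return SpinLock;
}
```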