Re: MemoryFence()


Laszlo Ersek
 

rant warning

On 02/05/21 19:29, Ni, Ray wrote:

> I read a bit about c memory model.
> https://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync
Thanks.

I think what happened here is a catastrophic impedance mismatch.

The underlying, physical architecture is a message-passing one.

Yet the mental abstraction that is built atop, for programmers to deal
with, is "shared memory", which suggests a global shared state.

This is *suicidal*.

There has always been a debate of tastes: one camp prefers the message
passing API, and another camp prefers the shared memory / locking API.
There are pros and cons; usually it is pointed out that message passing
is safer but less performant, and also that message passing APIs are not
difficult to implement on top of shared memory / locking. What people
(IME) don't spend many words on, however, is an attempt to implement
shared memory / locking on top of message passing. "Why would you want
to do that, in an *application*?" is how the thinking would go, I guess.

But that's *exactly* what's being sold to us with the shared memory /
locking API *itself*. The hardware underneath is message-passing!

All this complexity about "non-commutativity" in the release/acquire
pattern -- see: "There has been no synchronization between threads 1 and
3" -- *only* makes sense if the programmer thinks about *messages in
transit* between the threads (well, the processing units that execute
them). Whereas, the concepts that the programmer deals with *in the
source code* are called "x" and "y" -- the same names in each thread,
the same addresses, the same storage.
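
For reference, here is roughly what the thread 1 / thread 2 half of that
pattern looks like in C11. This is only a sketch under my own
assumptions: it uses <stdatomic.h> and POSIX threads, it leaves thread 3
out entirely, and run_release_acquire_demo() is a made-up driver name.
Note that the source only ever mentions "x" and "y"; nothing in it says
"a message is in flight from thread 1 to thread 2".

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* The same two shared names every thread sees; both start at 0. */
static atomic_int x;
static atomic_int y;

/* "Thread 1": write the payload, then publish it with a release store. */
static void *thread1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 20, memory_order_relaxed);
    atomic_store_explicit(&x, 10, memory_order_release);
    return NULL;
}

/* "Thread 2": wait until the acquire load observes the release store. */
static void *thread2(void *arg)
{
    int *seen_y = arg;

    while (atomic_load_explicit(&x, memory_order_acquire) != 10)
        ; /* busy-wait; acceptable for a demo only */

    /* The acquire load above synchronized with thread 1's release store,
     * so the earlier store to y must be visible here. */
    *seen_y = atomic_load_explicit(&y, memory_order_relaxed);
    assert(*seen_y == 20);
    return NULL;
}

/* Hypothetical driver: returns the y that thread 2 observed, or -1 if a
 * thread could not be created. */
int run_release_acquire_demo(void)
{
    pthread_t t1, t2;
    int seen_y = -1;

    atomic_store(&x, 0);
    atomic_store(&y, 0);

    if (pthread_create(&t1, NULL, thread1, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, thread2, &seen_y) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return seen_y;
}
```

Calling run_release_acquire_demo() from main() should return 20. The
guarantee comes purely from the release/acquire pairing on "x", which is
exactly the kind of pairwise relationship the shared names hide.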

So you've got a message-passing architecture at the bottom, then build a
shared memory abstraction on top, which leaks like a sieve, and then you
ask the programmer to mentally deconstruct that abstraction, while he or
she still has to stare at code that uses the shared memory terminology.
It's *insane*. It's self-defeating. If an alien came by and you claimed
this was all invented to deceive programmers on purpose, they'd have no
problem believing it.

Sequential consistency is the *only* model that halfway makes sense
(whatever the execution cost), as long as we have a shared-anything
programming model.

If we replaced "x" and "y" in the "Overall Summary" example, with the
following objects:

- thread1_thread2_channel_a
- thread1_thread2_channel_b
- thread2_thread3_channel_c
- thread2_thread3_channel_d
- thread3_thread1_channel_e
- thread3_thread1_channel_f

in other words, if we explicitly instantiated both "x" and "y" in every
particular inter-thread relationship, and we considered "store" as
"send", and "load" as "receive", and each such channel were
bi-directional, then folks like me might actually get a fleeting chance
at scratching the surface.
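
To make the renaming concrete, here is a toy, single-threaded sketch of
the channel picture. Everything in it is my own assumption, not how the
hardware or the C memory model is specified: "struct channel",
channel_send(), channel_receive() and the scenario function are invented
names, each channel is simplified to a one-way FIFO rather than a
bi-directional one, and I arbitrarily let the "_a" and "_e" channels
carry what used to be "x" and the "_b" channel carry "y".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CHANNEL_CAPACITY 4

/* Toy FIFO standing in for one per-pair channel (hypothetical type). */
struct channel {
    int    slots[CHANNEL_CAPACITY];
    size_t head;   /* index of the next message to receive */
    size_t count;  /* number of messages currently in transit */
};

/* "store" becomes "send"; returns false if the channel is full. */
bool channel_send(struct channel *ch, int value)
{
    if (ch->count == CHANNEL_CAPACITY)
        return false;
    ch->slots[(ch->head + ch->count) % CHANNEL_CAPACITY] = value;
    ch->count++;
    return true;
}

/* "load" becomes "receive"; returns false if nothing has arrived. */
bool channel_receive(struct channel *ch, int *value)
{
    if (ch->count == 0)
        return false;
    *value = ch->slots[ch->head];
    ch->head = (ch->head + 1) % CHANNEL_CAPACITY;
    ch->count--;
    return true;
}

/* Three of the six channels listed above; globals start out empty. */
struct channel thread1_thread2_channel_a; /* assumed to carry "x" */
struct channel thread1_thread2_channel_b; /* assumed to carry "y" */
struct channel thread3_thread1_channel_e; /* assumed to carry "x" */

/* Hypothetical scenario: thread 1 talks to thread 2 only.
 * Returns true if thread 3 finds a message from thread 1 waiting. */
bool scenario_thread3_hears_thread1(void)
{
    int x_seen = 0, y_seen = 0, ignored = 0;
    bool ok;

    /* Thread 1: was "y.store(20); x.store(10);" */
    ok = channel_send(&thread1_thread2_channel_b, 20);
    assert(ok);
    ok = channel_send(&thread1_thread2_channel_a, 10);
    assert(ok);

    /* Thread 2: was "if (x.load() == 10) assert(y.load() == 20);" */
    if (channel_receive(&thread1_thread2_channel_a, &x_seen) && x_seen == 10) {
        ok = channel_receive(&thread1_thread2_channel_b, &y_seen);
        assert(ok && y_seen == 20);
    }
    (void)ok;

    /* Thread 3: its channel with thread 1 is a different object. */
    return channel_receive(&thread3_thread1_channel_e, &ignored);
}
```

With these names, "no synchronization between threads 1 and 3" stops
being a riddle: scenario_thread3_hears_thread1() returns false simply
because nobody ever sent anything on thread3_thread1_channel_e.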

The shared memory model is a *wreck*, given how the hardware operates
underneath.

Laszlo
