Laszlo, I think it makes sense to “break this up”. It seems we need correct primitives plus documentation on how to use them. Once those exist, code review is enough to keep new code “correct”, but as you say, fixing the general volatile usage (especially given some of the VC++ behavior) is a big undertaking. Let us not allow the pursuit of perfection to block us from doing it correctly going forward. Changing the CPU drivers (MP libraries) is a complex undertaking because we really need to test against the various compilers. I’ve tracked down quite a few MP bugs in proprietary CPU drivers that did not show up under VC++ but did under clang.
I think the first step should be to introduce the new fence primitives and fix MemoryFence so that it does what it says on the tin.
I can volunteer for the work of removing volatile once those are in.
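To make the direction concrete, here is a rough sketch of the kind of pattern I have in mind: plain (non-volatile) data published through a flag, with explicit release and acquire fences doing the ordering. It uses C11 <stdatomic.h> purely as a stand-in. The real primitives would have to be implemented per compiler and architecture, and every name below (ReleaseFenceSketch, AcquireFenceSketch, MAILBOX_SKETCH, Publish, TryConsume) is hypothetical, not an existing or proposed API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fence wrappers; C11 fences stand in for the real primitives. */
static inline void ReleaseFenceSketch (void)
{
  atomic_thread_fence (memory_order_release);
}

static inline void AcquireFenceSketch (void)
{
  atomic_thread_fence (memory_order_acquire);
}

/* Hypothetical mailbox shared between two processors. */
typedef struct {
  uint32_t     Payload;  /* plain data, deliberately not volatile */
  atomic_bool  Ready;    /* flag, accessed with relaxed atomic operations */
} MAILBOX_SKETCH;

/* Producer: write the payload, then publish it by setting the flag. */
static void Publish (MAILBOX_SKETCH *Box, uint32_t Value)
{
  assert (Box != NULL);
  Box->Payload = Value;
  ReleaseFenceSketch ();  /* payload store is ordered before the flag store */
  atomic_store_explicit (&Box->Ready, true, memory_order_relaxed);
}

/* Consumer: returns true and fills *Value only once the flag is observed. */
static bool TryConsume (MAILBOX_SKETCH *Box, uint32_t *Value)
{
  assert (Box != NULL && Value != NULL);
  if (!atomic_load_explicit (&Box->Ready, memory_order_relaxed)) {
    return false;
  }
  AcquireFenceSketch ();  /* flag load is ordered before the payload load */
  *Value = Box->Payload;
  return true;
}
```

The point of the sketch is that the ordering lives in the two fences and the atomic flag accesses, so the payload itself needs no volatile qualifier. That is the property that should make the later volatile removal a mechanical, reviewable change.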