Would it be correct to say that acquire-release is in some sense "higher level" than memory fences, with acq-rel implemented in terms of fences (and restrictions on code-gen?) and fences being all that the CPU actually knows about at least in the case of x86_64?
Acquire/release is higher level because it is more of a theoretical description than an hardware description. But acq/rel is not implemented using fences on x86. On this arch all loads and stores have implicit acquire and release semantics respectively, while all RMW are sequentially consistent acq+rel. The compiler will need to emit explicit fences very rarely on x86.