I may be completely missing something here (I hope not :-) but, according to my mental picture of what's going on, that does not address the problem in any way.
The loads which occur in the reader thread are c0 and c1. They indeed will not be moved above the load barrier, but the problem is not that they are moved; the problem is that the reader thread pauses after the load barrier while the writer thread continues.
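To make that concrete, here is a rough sketch of the kind of interleaving I have in mind. It is not the real code: treating c0 and c1 as two counters which the writer bumps together is an assumption, as are the names `writer_step`, `reader`, `pause_point` and `demo`, and so is placing the pause between the two loads. It runs in a single thread and simply replays one schedule, so the result is deterministic.

```cpp
#include <atomic>
#include <cassert>

// Assumed shared state: the thread only names c0 and c1, so treating them
// as two counters that the writer bumps together is a guess.
std::atomic<int> c0{0};
std::atomic<int> c1{0};

struct Snapshot {
    int c0;
    int c1;
};

// Writer: updates both counters. Nothing in the reader's fence delays this.
void writer_step() {
    c0.fetch_add(1, std::memory_order_release);
    c1.fetch_add(1, std::memory_order_release);
}

// Reader: the fence, then the two loads. pause_point stands in for the
// scheduler suspending the reader; in this sketch it sits between the loads.
Snapshot reader(void (*pause_point)()) {
    std::atomic_thread_fence(std::memory_order_acquire);
    int a = c0.load(std::memory_order_relaxed);
    pause_point();  // the writer runs while the reader is paused
    int b = c1.load(std::memory_order_relaxed);
    return Snapshot{a, b};
}

// A pause point during which the writer does nothing.
void no_pause() {}

// Replays the one interleaving and checks what the reader ends up holding.
Snapshot demo() {
    c0.store(0);
    c1.store(0);
    Snapshot s = reader(writer_step);
    assert(s.c0 == 0 && s.c1 == 1);  // old c0 paired with new c1
    return s;
}
```

Both loads stay below the fence, yet in this schedule the reader ends up with a c0 from before the writer's update and a c1 from after it. The fence orders the reader's own operations; it does not hold the writer still.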
One minor point - being an acquire fence, it will only prevent *loads* from moving up above the fence, not stores as well.
Would you describe how you see the code in the two threads working (your version of the code, with the acquire right after the label), with regard to loads, stores, barriers, etc.?