aligned_storage failure - any undefined behaviour here?
While working on a std::function-like class, I managed to get incorrect code generation from GCC. As far as I can tell, the code is tricky, but I can't identify any undefined behaviour that would justify GCC failing.
In summary, the class layout is
class Callback {
using Store = std::aligned_
Store storage;
void (*call)();
};
In the failing case, assignment into the Callback did
memset(
call = X;
new (&storage) F(std::move(f));
That is: zero-fill the complete storage, then use placement new to copy- or move-construct into (some of) the storage. F is trivially copyable and suitably sized and aligned.
This Callback was then copied to another Callback. The stored object and Callback are both trivially copyable, so the copy used the defaulted (trivial) copy assignment operator.
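For readers who want something concrete to experiment with, here is a minimal sketch of the assignment pattern described above. It is not the real class: `CallbackSketch`, `AddOne`, the `StoreSize` value, the `int`-returning thunk signature and the use of a plain `unsigned char` array for the store are all assumptions made for illustration. The stated preconditions on F are written as `static_assert`s, and `std::launder` (C++17) is used when reading the stored object back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the real Callback; names and sizes are assumptions.
class CallbackSketch {
public:
    static constexpr std::size_t StoreSize = 4 * sizeof(void *);

    template <typename F>
    void assign(F f) {
        // The preconditions stated in the post, checked at compile time.
        static_assert(std::is_trivially_copyable<F>::value, "F must be trivially copyable");
        static_assert(sizeof(F) <= StoreSize, "F must fit in the storage");
        static_assert(alignof(F) <= alignof(std::max_align_t), "F must not be over-aligned");

        std::memset(storage, 0, sizeof storage);               // zero-fill the complete storage
        call = &invoke<F>;                                      // select the thunk for F
        ::new (static_cast<void *>(storage)) F(std::move(f));  // construct F in (part of) the storage
    }

    int operator()() const { return call(storage); }

private:
    template <typename F>
    static int invoke(const void *p) {
        // Obtain a pointer to the F object that placement new created in the storage.
        return (*std::launder(static_cast<const F *>(p)))();
    }

    alignas(std::max_align_t) unsigned char storage[StoreSize];
    int (*call)(const void *) = nullptr;
};

// The wrapper itself is trivially copyable, so its defaulted copy is a plain byte copy.
static_assert(std::is_trivially_copyable<CallbackSketch>::value, "sketch must be trivially copyable");

// Hypothetical trivially-copyable functor used only for illustration.
struct AddOne {
    int base;
    int operator()() const { return base + 1; }
};
```

Calling `assign(AddOne{41})` on a `CallbackSketch` and then invoking it exercises the memset, thunk store and placement new in the order given above.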
The complete sequence is then
memset(
cb.call = X;
new (&cb.storage) F(std::move(f));
cb2 = cb;
But the compiler reorders to:
memset(
cb.call = X;
cb2 = cb;
new (&cb.storage) F(std::move(f));
So cb2 ends up with incorrect contents.
Some interaction of the memset, the placement new and the trivial copy apparently falls foul of some sort of aliasing problem - GCC decides the placement new and the trivial copy don't interact.
Many hours of staring at the C++ standard make me think this should be fine.
1) You can store trivially-copyable objects in character arrays, if aligned enough.
2) aligned_storage is specified as being usable as a store (although it isn't explicitly a character array, and changing it to be a character array doesn't change anything).
3) Copying a trivially-copyable object as its component characters preserves its value.
4) The defaulted copy assignment of Callback should copy the characters of storage, and hence the value of the stored object.
5) Performing a placement new into storage creates a new object with new lifetime that should persist as long as the Callback holding the storage.
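Points 1 and 3 can be illustrated in isolation with a small sketch that only ever touches the object through `memcpy`, which is the uncontroversial way to move a trivially-copyable value through character storage. `Payload` and `round_trip` are hypothetical names introduced for this example; nothing here reproduces the failing sequence.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical trivially-copyable type used only for illustration.
struct Payload {
    int id;
    float weight;
};
static_assert(std::is_trivially_copyable<Payload>::value, "Payload must be trivially copyable");

// Copy the bytes of src into a suitably aligned character buffer, copy that
// buffer to a second one, then copy the bytes back out into a Payload.
inline Payload round_trip(const Payload &src) {
    alignas(Payload) unsigned char first[sizeof(Payload)];
    alignas(Payload) unsigned char second[sizeof(Payload)];

    std::memcpy(first, &src, sizeof src);     // object -> characters
    std::memcpy(second, first, sizeof first); // characters -> characters
    Payload out;
    std::memcpy(&out, second, sizeof out);    // characters -> object
    return out;
}
```

The value that comes back compares equal, member by member, to the value that went in.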
The problem has gone away in my tests since I removed the full-storage memset and replaced it with a memset of the padding only, but I want to pin down whether I'm doing anything wrong or whether this is a GCC bug. The failing code worked fine with ARMC6 and IAR, fwiw.
Replacing the placement-new with `memcpy(_storage, &f, sizeof f)` also made the problem go away while I had the memset.
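A rough sketch of what a memcpy-based variant might look like follows. It is an illustration under assumptions, not the code from the failing project: `MemcpyCallbackSketch`, `AddTwo`, `StoreSize` and the thunk signature are hypothetical, and unlike the original this version also reads the functor back out with `memcpy`, which is why it additionally requires F to be default-constructible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical variant that stores and loads the functor purely as bytes.
class MemcpyCallbackSketch {
public:
    static constexpr std::size_t StoreSize = 4 * sizeof(void *);

    template <typename F>
    void assign(const F &f) {
        static_assert(std::is_trivially_copyable<F>::value, "F must be trivially copyable");
        static_assert(std::is_default_constructible<F>::value, "sketch limitation: F is rebuilt from bytes");
        static_assert(sizeof(F) <= StoreSize, "F must fit in the storage");

        std::memset(storage, 0, sizeof storage); // full-storage zero fill, as in the post
        call = &invoke<F>;                        // select the thunk for F
        std::memcpy(storage, &f, sizeof f);       // byte copy instead of placement new
    }

    int operator()() const { return call(storage); }

private:
    template <typename F>
    static int invoke(const unsigned char *bytes) {
        F local;                                  // rebuild the functor from its bytes
        std::memcpy(&local, bytes, sizeof local);
        return local();
    }

    unsigned char storage[StoreSize];
    int (*call)(const unsigned char *) = nullptr;
};

// Hypothetical trivially-copyable functor used only for illustration.
struct AddTwo {
    int base;
    int operator()() const { return base + 2; }
};
```

Because every access to the stored bytes goes through `memcpy` or through `unsigned char`, a defaulted copy of the wrapper followed by a call on the copy does not depend on any object living in the storage.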
Failing shown here on compiler explorer: https:/
mov r0, r4 // r4 = sp+24
bl mbed::detail:
ldr r3, .L16+12
str r3, [sp, #36]
ldm r4, {r0, r1, r2, r3}
stm r5, {r0, r1, r2, r3}
mov r0, r4
str r6, [sp, #24] // XXX these two stores should have happened
str sp, [sp, #28] // XXX before the ldm r4, {} copy
bl mbed::detail:
Failure seen on both GCC 8.3 and 9.2. It appears to be A32/T32-specific. This may be because ARM32 code generation for trivial structures passed by value is markedly inferior to ARM64 or x64 - the copy seen here doesn't occur at all for them.
GCC performs far more copying when generating ARM32 code for things like `std::span` or this `Callback` passed by value, suggesting some backend small-structure
In Godbolt, GCC 8.2 produces apparently OK code.
Same basic failing code seen with Cortex-M4 and arm7tdmi (arm or thumb). Cortex-A7 (arm or thumb) reorders a bit due to scheduling, but still half-wrong.
Question information
- Language:
- English
- Status:
- Solved
- Assignee:
- No assignee
- Solved by:
- Kevin Bracey
- Solved:
- Last query:
- Last reply: