Let's start with some context.

A custom memory pool was using code similar to the following:

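    #include <cstdlib>  // malloc
    #include <cstring>  // memset
    #include <new>      // placement new

    struct FastInitialization {};

    template <typename T>
    T* create() {
        static FastInitialization const F = {};

        void* ptr = malloc(sizeof(T));
        memset(ptr, 0, sizeof(T));  // zero the storage before construction

        new (ptr) T(F);

        return reinterpret_cast<T*>(ptr);
    }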

The idea is that, when called with FastInitialization, a constructor can assume that the storage is already zero-initialized and therefore only initialize those members that need a different value.
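As a minimal sketch of the intended usage, a constructor written against this convention might look like the following. The Counter type and the counter_step helper are hypothetical and not part of the original pool, and create() gains a null check that the original snippet does not have. Note that the sketch only ever reads the member the constructor explicitly wrote; whether the untouched members can be relied on to still be zero is exactly what is in question here.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <new>

struct FastInitialization {};

// Same shape as the pool's create(), with a null check added (assumption).
template <typename T>
T* create() {
    void* ptr = std::malloc(sizeof(T));
    if (ptr == nullptr) return nullptr;
    std::memset(ptr, 0, sizeof(T));  // zero the storage before construction
    return new (ptr) T(FastInitialization{});
}

// Hypothetical type for illustration only.
struct Counter {
    // Only mStep is written. mCount and mTotal are deliberately left
    // untouched, on the assumption that the pool already zeroed them.
    explicit Counter(FastInitialization) : mStep(1) {}

    unsigned mCount;
    unsigned mStep;
    double mTotal;
};

// Creates a Counter and returns the one member the constructor set.
unsigned counter_step() {
    Counter* c = create<Counter>();
    assert(c != nullptr);
    unsigned step = c->mStep;
    c->~Counter();
    std::free(c);
    return step;
}
```

Here counter_step() returns 1, the value assigned in the constructor. The sketch does not read mCount or mTotal, since their values after construction depend on the answer to the question asked here.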

GCC (6.2 and 6.3, at least), however, has an "interesting" optimization which kicks in here.

    struct Memset {
        Memset(FastInitialization) { memset(this, 0, sizeof(Memset)); }

        double mDouble;
        unsigned mUnsigned;
    };

    Memset* make_memset() {
        return create<Memset>();
    }

Compiles down to:

    make_memset():
            sub     rsp, 8
            mov     edi, 16
            call    malloc
            mov     QWORD PTR [rax], 0
            mov     QWORD PTR [rax+8], 0
            add     rsp, 8
            ret

But:

    struct DerivedMemset: Memset {
        DerivedMemset(FastInitialization f): Memset(f) {}

        double mOther;
        double mYam;
    };

    DerivedMemset* make_derived_memset() {
        return create<DerivedMemset>();
    }

Compiles down to:

    make_derived_memset():
            sub     rsp, 8
            mov     edi, 32
            call    malloc
            mov     QWORD PTR [rax], 0
            mov     QWORD PTR [rax+8], 0
            add     rsp, 8
            ret

That is, only the first 16 bytes of the struct, the part corresponding to its base, have been initialized. Debugging information confirms that the call to memset(ptr, 0, sizeof(T)); has been completely elided.

On the other hand, both ICC and Clang zero the full size of the object. Here is Clang's result, with the memset inlined as two 16-byte stores:

    make_derived_memset():               # @make_derived_memset()
            push    rax
            mov     edi, 32
            call    malloc
            xorps   xmm0, xmm0
            movups  xmmword ptr [rax + 16], xmm0
            movups  xmmword ptr [rax], xmm0
            pop     rcx
            ret

So the behavior of GCC and Clang differs, and the question becomes: is GCC right and producing better assembly, or is Clang right and GCC buggy?

Or, in terms of language lawyering:

Under which circumstances may a constructor rely on the previous value stored in its allocated storage?