Look at the code for _DEQUESIZ (number of elements per block):
#define _DEQUESIZ (sizeof (_Ty) <= 1 ? 16 \
: sizeof (_Ty) <= 2 ? 8 \
: sizeof (_Ty) <= 4 ? 4 \
: sizeof (_Ty) <= 8 ? 2 : 1) /* elements per block (a power of 2) */
It gets smaller if the element is larger. Only for elements larger than 8 bytes will you get the expected behavior (percentual decrease of overhead with increase of element size).