.. currentmodule:: numpy

.. _alignment:

****************
Memory alignment
****************

There are three use-cases related to memory alignment in NumPy (as of 1.14):

1. Creating :term:`structured datatypes <structured data type>` with
   :term:`fields <field>` aligned like in a C-struct.
2. Speeding up copy operations by using ``uint`` assignment instead of
   ``memcpy``.
3. Guaranteeing safe aligned access for ufuncs/setitem/casting code.

NumPy uses two different forms of alignment to achieve these goals:
"True alignment" and "Uint alignment".

"True" alignment refers to the architecture-dependent alignment of an
equivalent C-type in C. For example, on x64 systems :attr:`float64` is
equivalent to ``double`` in C. On most systems, this has an alignment of either
4 or 8 bytes (and this can be controlled in GCC by the option
``-malign-double``). A variable is aligned in memory if its memory offset is a
multiple of its alignment. On some systems (e.g. SPARC) memory alignment is
required; on others, it gives a speedup.

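The definition of "aligned" above can be checked from Python. The following is a minimal sketch that uses the standard-library ``ctypes`` module to ask the platform's C ABI for the alignment of ``double``. The helper ``is_aligned`` is a hypothetical name introduced only for illustration; it is not part of NumPy.

```python
import ctypes


def is_aligned(address: int, alignment: int) -> bool:
    """Return True if `address` is a multiple of `alignment` (hypothetical helper)."""
    return address % alignment == 0


# Alignment that the platform's C ABI assigns to `double` (commonly 4 or 8).
double_alignment = ctypes.alignment(ctypes.c_double)

# Allocate one C double and look at where its storage lives in memory.
value = ctypes.c_double(1.0)
address = ctypes.addressof(value)

print("alignment of double:", double_alignment)
print("address is aligned:", is_aligned(address, double_alignment))
```

The printed alignment depends on the architecture and compiler options, so no fixed output is shown here.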
"Uint" alignment depends on the size of a datatype. It is defined to be the
"True alignment" of the uint used by NumPy's copy-code to copy the datatype, or
undefined/unaligned if there is no equivalent uint. Currently, NumPy uses
``uint8``, ``uint16``, ``uint32``, ``uint64``, and ``uint64`` to copy data of
size 1, 2, 4, 8, 16 bytes respectively, and datatypes of any other size cannot
be uint-aligned.

For example, on a typical Linux x64 GCC system, the NumPy :attr:`complex64`
datatype is implemented as ``struct { float real, imag; }``. This has "true"
alignment of 4 and "uint" alignment of 8 (equal to the true alignment of
``uint64``).

Some cases where uint and true alignment are different (default GCC Linux):

======   =========  ========  ========
arch     type       true-aln  uint-aln
======   =========  ========  ========
x86_64   complex64         4         8
x86_64   float128         16         8
x86      float96           4         \-
======   =========  ========  ========

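The size-to-uint rule described above can be written down as a small lookup. This is only a sketch that mirrors the rule as stated in the text: the names ``COPY_UINT_FOR_ITEMSIZE`` and ``uint_alignment`` are hypothetical and are not NumPy API, and the values printed depend on the platform.

```python
import numpy as np

# Itemsize in bytes -> unsigned integer type the text says the copy code uses.
# Per the text, 16-byte items are also copied using uint64.
COPY_UINT_FOR_ITEMSIZE = {
    1: np.uint8,
    2: np.uint16,
    4: np.uint32,
    8: np.uint64,
    16: np.uint64,
}


def uint_alignment(dtype):
    """Return the "uint" alignment of `dtype`, or None if it is undefined.

    Hypothetical helper: applies the lookup above, then reports the "true"
    alignment of the chosen unsigned integer type.
    """
    uint_type = COPY_UINT_FOR_ITEMSIZE.get(np.dtype(dtype).itemsize)
    if uint_type is None:
        return None
    return np.dtype(uint_type).alignment


# Compare both kinds of alignment for a few dtypes ("S3" is a 3-byte string).
for name in ("complex64", "float64", "S3"):
    dt = np.dtype(name)
    print(name, "true:", dt.alignment, "uint:", uint_alignment(dt))
```

On a system matching the table, ``complex64`` would be expected to report a true alignment of 4 and a uint alignment of 8, while the 3-byte string type has no uint alignment.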
There are four relevant uses of the word ``align`` in NumPy:

* The :attr:`dtype.alignment` attribute (``descr->alignment`` in C). This is
  meant to reflect the "true alignment" of the type. It has arch-dependent
  default values for all datatypes, except for the structured types created
  with ``align=True`` as described below.
* The ``ALIGNED`` flag of an ndarray, computed in ``IsAligned`` and checked
  by :c:func:`PyArray_ISALIGNED`. This is computed from
  :attr:`dtype.alignment`.
  It is set to ``True`` if every item in the array is at a memory location
  consistent with :attr:`dtype.alignment`, which is the case if the
  data pointer and all strides of the array are multiples of that alignment.
* The ``align`` keyword of the dtype constructor, which only affects
  :ref:`structured_arrays`. If the structure's field offsets are not manually
  provided, NumPy determines offsets automatically. In that case,
  ``align=True`` pads the structure so that each field is "true" aligned in
  memory and sets :attr:`dtype.alignment` to be the largest of the field
  "true" alignments. This is like what C-structs usually do. Otherwise, if
  offsets or itemsize were manually provided, ``align=True`` simply checks that
  all the fields are "true" aligned and that the total itemsize is a multiple
  of the largest field alignment. In either case :attr:`dtype.isalignedstruct`
  is also set to ``True``.
* ``IsUintAligned`` is used to determine if an ndarray is "uint aligned" in
  an analogous way to how ``IsAligned`` checks for true alignment.

Here is how the variables above are used:

* Creating aligned structs: To know how to offset a field when
  ``align=True``, NumPy looks up ``field.dtype.alignment``. This includes
  fields that are nested structured arrays.
* Ufuncs: If the ``ALIGNED`` flag of an array is False, ufuncs will
  buffer/cast the array before evaluation. This is needed since ufunc inner
  loops access raw elements directly, which might fail on some architectures
  if the elements are not true-aligned.
* Getitem/setitem/copyswap functions: Similar to ufuncs, if ``ALIGNED`` is
  False they will use a code path that buffers the arguments so they are
  true-aligned.
* Strided copy code: Here, "uint" alignment is used instead. If the itemsize
  of an array is 1, 2, 4, 8 or 16 bytes and the array is uint-aligned, NumPy
  copies by doing ``*(uintN*)dst = *(uintN*)src`` for
  appropriate N. Otherwise, NumPy copies by doing ``memcpy(dst, src, N)``.
* Cast code: This checks for "true" alignment, as it does
  ``*dst = CASTFUNC(*src)`` if aligned. Otherwise, it does
  ``memmove(srcval, src); dstval = CASTFUNC(srcval); memmove(dst, dstval)``
  where dstval/srcval are aligned.

Note that the strided-copy and strided-cast code are deeply intertwined, so
any arrays being processed by them must be both uint and true aligned, even
though the copy code only needs uint alignment and the cast code only true
alignment. If there is ever a big rewrite of this code it would be good to
allow them to use different alignments.
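
The effect of ``align=True`` on structured dtypes and the behaviour of the ``ALIGNED`` flag can both be observed from Python. The sketch below is illustrative only: the field names ``flag`` and ``value`` and the byte-buffer construction are assumptions chosen for the example, and the printed offsets and itemsizes depend on the platform.

```python
import numpy as np

# --- Structured dtypes: packed layout versus align=True -------------------
fields = [("flag", np.uint8), ("value", np.float64)]
packed = np.dtype(fields)              # fields stored back to back
padded = np.dtype(fields, align=True)  # fields padded like a C struct

for name, dt in (("packed", packed), ("padded", padded)):
    value_offset = dt.fields["value"][1]  # fields maps name -> (dtype, offset)
    print(name, "itemsize:", dt.itemsize, "value offset:", value_offset,
          "alignment:", dt.alignment, "isalignedstruct:", dt.isalignedstruct)

# --- ALIGNED flag: aligned and misaligned float64 views of one buffer -----
alignment = np.dtype(np.float64).alignment
nbytes = 4 * np.dtype(np.float64).itemsize
raw = np.zeros(nbytes + alignment, dtype=np.uint8)

# Offset of the first byte in `raw` whose address is a multiple of `alignment`.
start = (-raw.ctypes.data) % alignment

aligned_view = raw[start:start + nbytes].view(np.float64)
# Shifting the window by one byte makes every element misaligned.
misaligned_view = raw[start + 1:start + 1 + nbytes].view(np.float64)

print("aligned view flag:", aligned_view.flags.aligned)
print("misaligned view flag:", misaligned_view.flags.aligned)

# A ufunc still computes the correct result on the misaligned view.
result = misaligned_view + 1.0
print(result, result.flags.aligned)
```

The misaligned view is built by hand here only to make the flag observable. Arrays created through the usual NumPy constructors are normally aligned.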