According to this thread: https://forum.kde.org/viewtopic.php?f=74&t=125884&sid=ee8bc135815a4e1fd1c79accd3937209, and some new unit tests (http://manao.inria.fr/CDash/viewTest.php?onlyfailed&buildid=19166), and in contrast to what I read everywhere else, MSVC/win32 fail to align doubles on the stack a 8 bytes boundary.
This means that if someone map a statically allocated buffer of doubles, then all our strategies to extract the first aligned entry will fail.
The easiest fix would be to set packet_traits<double>::AlignedOnScalar to false on win32. The downside is a loss of performance for objects allocated by Eigen and heap allocated buffers.
Another solution would to detect such cases at runtime. I already have a fix for sliced evaluation there: https://forum.kde.org/viewtopic.php?f=74&t=125884&sid=ee8bc135815a4e1fd1c79accd3937209#p333129, but there might be other places where such a fix would be needed. If go with that approach, then we can also wonder whether such runtime detection should be enabled all the time, or only for win32 with sizeof(Scalar)>4 ??
I'm not sure if the following is worth the effort: Could/should we distinguish between buffers that are:
a) 16 byte aligned
b) sizeof(Scalar) aligned
c) entirely unaligned
b) would be the case for segments of vectors allocated by Eigen. And we could enforce b) for stack-allocated Eigen objects.
For Maps we need to think about alignment anyways, wrt AVX support (Bug 973), and if we want to support 32/16 and unaligned Maps, also distinguishing between 8 and 4 byte alignment sounds logical. This might justify an independent template parameter, rather than additional flags, however that would break API-compatibility. Perhaps, we can reserve 2-4 bits of the Options to encode alignment, instead? Preferably in a way that the current values correspond to 16 byte and 4 byte alignment (or 16 byte and sizeof(Scalar)?).
This issue is really annoying. As if the vectorization logic was not complicated enough!
Perhaps the simplest would be to properly assert (in Map) and ask users to enforce scalar alignment by providing a cross platform EIGEN_ALIGN8 macro.
(In reply to Gael Guennebaud from comment #2)
> Perhaps the simplest would be to properly assert (in Map) and ask users to
> enforce scalar alignment by providing a cross platform EIGEN_ALIGN8 macro.
I implemented this as this is the least we can do for now:
Finally, I realized that first_aligned already checked the scalar alignment of the given pointer at runtime, regardless of packet_traits<>::AlignedOnScalar, so to be consistent, we should do so in slice-vectorization and remove the previously introduced assert.