Bug 761 - Memory Not Properly Aligned (Incorrect Assumptions About std::malloc)
Status: RESOLVED WONTFIX
Alias: None
Product: Eigen
Classification: Unclassified
Component: Core - general
Version: 3.2
Hardware: All All
Importance: High Crash
Assignee: Nobody
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 3.3 779
Reported: 2014-03-13 20:05 UTC by ensafi
Modified: 2016-02-05 21:13 UTC



Attachments
Patch for Eigen/Core/util/Memory.h (wraps malloc and realloc to test allocated memory for proper alignment) (2.48 KB, patch)
2014-03-13 20:06 UTC, ensafi
Patched Eigen/Core/util/Memory.h for Eigen 3.2.1 (37.61 KB, text/plain)
2014-03-13 20:10 UTC, ensafi

Description ensafi 2014-03-13 20:05:09 UTC
We have experienced crashes with Eigen 3.2.1 (and as far back as Eigen 3.1.2) under 64-bit OS variants, including Windows 7 and 8 (VS 2010 and VS 2012) and Red Hat Enterprise Linux 6.5 (glibc-2.12, standard GCC 4.4, devtoolset-1.1/GCC 4.7, and devtoolset-2/GCC 4.8). We are using Intel Parallel Studio XE 2013 SP1 on all platforms. The header file Eigen/Core/util/Memory.h incorrectly asserts that memory returned by std::malloc/realloc/etc. under 64-bit Windows and 64-bit Linux is always properly aligned. This simply is not true, as can be demonstrated by the supplied patch. The same assumption is made about Apple, but we have not verified this.

If you replace Eigen/Core/util/Memory.h with the version provided, or if you apply the provided patch, you will see warnings about std::malloc not being aligned.  Clearly, the only solution on these systems is to always call posix_memalign (Linux) or _aligned_malloc (Windows).
Comment 1 ensafi 2014-03-13 20:06:47 UTC
Created attachment 427
Patch for Eigen/Core/util/Memory.h (wraps malloc and realloc to test allocated memory for proper alignment)

Wraps std::malloc and std::realloc to test allocated memory for proper alignment.
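
For readers without the attachment, the kind of check described here can be sketched roughly as follows. This is not the attached patch: the names is_aligned and checked_malloc are hypothetical, and the 16-byte requirement is assumed from the discussion in this report.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: true if ptr is a multiple of `alignment`,
// which must be a nonzero power of two.
inline bool is_aligned(const void* ptr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical wrapper: forwards to std::malloc and prints a warning
// on stderr when the returned block is not 16-byte aligned.
inline void* checked_malloc(std::size_t size) {
    void* ptr = std::malloc(size);
    if (ptr != nullptr && !is_aligned(ptr, 16)) {
        std::fprintf(stderr,
                     "warning: std::malloc(%zu) returned %p, not 16-byte aligned\n",
                     size, ptr);
    }
    return ptr;
}
```

Calling checked_malloc with small or odd sizes (for example 3 * sizeof(float)) while the suspect allocator is loaded is one way to see whether the warning fires.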
Comment 2 ensafi 2014-03-13 20:10:12 UTC
Created attachment 428
Patched Eigen/Core/util/Memory.h for Eigen 3.2.1

For your convenience, my already patched version of the header file.
Comment 3 Gael Guennebaud 2014-03-13 20:53:55 UTC
Hm, you're the first one to encounter such an issue. The problem probably comes from Intel Parallel Studio, which likely generates code bypassing the system's malloc.

Indeed:

"The address of a block returned by malloc or realloc in GNU systems is always a multiple of eight (or sixteen on 64-bit systems). "

Source: http://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html
Comment 4 Gael Guennebaud 2014-03-13 21:00:06 UTC
and for windows:

http://msdn.microsoft.com/en-us/library/ycsb6wwf.aspx
Comment 5 ensafi 2014-03-13 21:07:11 UTC
For Windows, alignment is only guaranteed to be "fundamental" (8-byte or 16-byte boundary) for VS 2013.  However, if you look at the documentation for VS 2010 or 2012, this is not the case:

"malloc is guaranteed to return memory that's aligned on a boundary that's suitable for storing any object that could fit in the amount of memory that's allocated. For example, a four-byte allocation would be aligned on a boundary that supports any four-byte or smaller object. Memory alignment on a boundary that's suitable for a larger object than will fit in the allocation is not guaranteed."

Therefore, a block allocated for an odd number of single-precision floats is not necessarily 16-byte aligned on a 64-bit processor when compiling with earlier versions of MSVC.
Comment 6 Gael Guennebaud 2014-03-13 22:02:45 UTC
For Visual Studio 2008, we have:

"malloc is required to return memory on a 16-byte boundary."

I doubt they changed this behaviour for 64-bit systems, otherwise all our unit tests would fail.

Which compiler flags are you using?
Comment 7 Gael Guennebaud 2014-03-13 22:05:34 UTC
What about the following fix (lines 51-58):

#if (defined(__APPLE__) \
 || defined(_WIN64) \
 || EIGEN_GLIBC_MALLOC_ALREADY_ALIGNED \
 || EIGEN_FREEBSD_MALLOC_ALREADY_ALIGNED) \
 && !defined(__INTEL_CXXLIB_ICC)
  #define EIGEN_MALLOC_ALREADY_ALIGNED 1
#else
  #define EIGEN_MALLOC_ALREADY_ALIGNED 0
#endif

It basically bypasses malloc when using Intel's C++ runtime library. Does it work for you?
Comment 8 ensafi 2014-03-13 22:30:28 UTC
Your suspicion about Intel Parallel Studio drove me to do some more digging. On the Linux side, I have potentially traced the problem to libtbbmalloc_proxy.so from the Intel TBB library. The same may apply to Windows, but unless you are using VS 2013, I don't find the documentation for VS 2012 and below to be very comforting. If this turns out to be an Intel problem, I don't think you should have to patch anything by default. Rather, people should probably define a macro to force Eigen to assume that memory allocation will not be properly aligned. Is there such a macro at present?
Comment 9 ensafi 2014-03-13 22:33:41 UTC
To answer your question, we are using Intel MKL and TBB libraries, but we prefer to use the standard, native compilers on each system (MSVC or GCC) so __INTEL_CXXLIB_ICC will not be defined.  I'd say let's leave well alone, but perhaps devise a way to force posix_memalign or _aligned_malloc to be called when the user knows that his application will clash with your otherwise correct #ifdef logic.

Thank you.  You guys are very responsive, and Eigen rocks!
Comment 10 Christoph Hertzberg 2014-03-13 22:37:48 UTC
You can compile with 
 -DEIGEN_MALLOC_ALREADY_ALIGNED=0
to force posix_memalign/_aligned_malloc

I wonder, is there a measurable advantage of using malloc instead of one of the aligned mallocs?
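
As an illustration of what going through posix_memalign/_aligned_malloc looks like, here is a minimal sketch of a portable wrapper. It is not Eigen's implementation: the names portable_aligned_malloc and portable_aligned_free are hypothetical, and the default of 16 bytes is assumed from this report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign and free on POSIX systems
#if defined(_WIN32)
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

// Hypothetical helper: returns `size` bytes aligned to `alignment`,
// or nullptr on failure. `alignment` must be a power of two and at
// least sizeof(void*), as posix_memalign requires.
inline void* portable_aligned_malloc(std::size_t size,
                                     std::size_t alignment = 16) {
    assert(alignment >= sizeof(void*) &&
           (alignment & (alignment - 1)) == 0);
#if defined(_WIN32)
    return _aligned_malloc(size, alignment);
#else
    void* ptr = nullptr;
    if (posix_memalign(&ptr, alignment, size) != 0) {
        return nullptr;
    }
    return ptr;
#endif
}

// Hypothetical helper: releases memory from portable_aligned_malloc.
// On Windows, _aligned_malloc blocks must go to _aligned_free.
inline void portable_aligned_free(void* ptr) {
#if defined(_WIN32)
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}
```

Note that the allocation and release functions must be paired: on Windows, passing a pointer from _aligned_malloc to plain free is an error.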
Comment 11 ensafi 2014-03-13 22:46:31 UTC
I wish I could answer that, but I simply do not know.
Comment 12 Gael Guennebaud 2014-03-14 13:09:29 UTC
(In reply to comment #10)
> I wonder, is there a measurable advantage of using malloc instead of one of the
> aligned mallocs?

The assumption is that standard malloc can only be better or in the worst case equivalent to the aligned versions.

Moreover, a few libraries bypass standard malloc calls for memory debugging or performance purposes. libtbbmalloc_proxy is one example. Therefore, using malloc instead of aligned_malloc or posix_memalign makes this process easier on systems where malloc is already aligned.

That being said, with this last argument, we should probably enable only two paths: standard malloc and the handmade one....
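
For context, a "handmade" aligned allocation usually means over-allocating with plain malloc and aligning the pointer by hand, so that a replaced malloc is still the only allocator involved. The following is a minimal sketch of that general technique, not Eigen's code: the function names are hypothetical, and reserving an extra sizeof(void*) bytes is a conservative choice made here so the sketch stays valid even if malloc returns a poorly aligned address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical sketch: over-allocate with std::malloc, round up to the
// next aligned address, and store the original pointer in the slot
// just before the aligned block so it can be freed later.
inline void* handmade_aligned_malloc_sketch(std::size_t size,
                                            std::size_t alignment = 16) {
    assert(alignment >= sizeof(void*) &&
           (alignment & (alignment - 1)) == 0);
    // Extra room: up to alignment - 1 bytes of padding plus one pointer slot.
    void* original = std::malloc(size + alignment + sizeof(void*));
    if (original == nullptr) {
        return nullptr;
    }
    // Leave space for the saved pointer, then round up to the boundary.
    std::uintptr_t raw =
        reinterpret_cast<std::uintptr_t>(original) + sizeof(void*);
    std::uintptr_t aligned =
        (raw + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
    // Remember where the real allocation starts.
    void** slot = reinterpret_cast<void**>(aligned) - 1;
    *slot = original;
    return reinterpret_cast<void*>(aligned);
}

// Hypothetical sketch: recover the original pointer and free it.
inline void handmade_aligned_free_sketch(void* ptr) {
    if (ptr != nullptr) {
        std::free(*(reinterpret_cast<void**>(ptr) - 1));
    }
}
```

The cost of this approach is a few wasted bytes per allocation; the benefit is that it makes no assumption about the alignment malloc provides.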
Comment 13 Gael Guennebaud 2016-02-05 21:13:44 UTC
Won't fix, but at least now the remedy should be easier to find:

https://bitbucket.org/eigen/eigen/commits/93bc52c5c4e7/
