Bug 554 - Eigen segfaults when the aligned free operator is called
Eigen segfaults when the aligned free operator is called
Status: RESOLVED FIXED
Product: Eigen
Classification: Unclassified
Component: Core - general
unspecified
All All
: Normal Unknown
Assigned To: Nobody
:
Depends on:
Blocks: 3.2
  Show dependency treegraph
 
Reported: 2013-02-20 00:54 UTC by bifo.10.Toranaga-San
Modified: 2013-06-27 11:59 UTC (History)
6 users (show)



Attachments
Program illustrating the bug (371 bytes, text/x-c++src)
2013-02-20 00:56 UTC, bifo.10.Toranaga-San
no flags Details
A small example for which the crash occurs (30.00 KB, application/x-bzip)
2013-04-03 17:42 UTC, bifo.10.Toranaga-San
no flags Details
The gdb test session (1.68 KB, text/plain)
2013-04-04 22:20 UTC, bifo.10.Toranaga-San
no flags Details
include unistd.h if __unix (1.78 KB, patch)
2013-06-26 10:36 UTC, Gael Guennebaud
no flags Details | Diff

Description bifo.10.Toranaga-San 2013-02-20 00:54:16 UTC
The attached program gives the following output:

    Operator delete[]
    Segmentation fault (core dumped)

The operator delete[] is taken from the EIGEN_MAKE_ALIGNED_OPERATOR_NEW
macro that is supposed to be used for classes with Eigen structures as members.
I compiled it using g++ `pkg-config --libs --cflags eigen3` -o main main.cc
I am using eigen 3.1.1.
Comment 1 bifo.10.Toranaga-San 2013-02-20 00:56:06 UTC
Created attachment 314 [details]
Program illustrating the bug
Comment 2 Gael Guennebaud 2013-02-20 09:12:49 UTC
You cannot only overload operator delete to use aligned_free without overloading operator new to create the object  with aligned_malloc.
Comment 3 Christoph Hertzberg 2013-02-21 16:41:01 UTC
As Gael said, you need to overload operator new[] as well. Simply using EIGEN_MAKE_ALIGNED_OPERATOR_NEW somewhere public in your class will give you the intended behavior, except for not printing to cout.
Comment 4 bifo.10.Toranaga-San 2013-04-03 16:27:25 UTC
Ok, I know that this program does not illustrate the bug correctly.
Nevertheless, my problem persists with my setup. I have a class that contains the EIGEN_MAKE_ALIGNED_OPERATOR_NEW macro. Everytime I launch the program I get a SIGABRT and valgrind shows me something like this:

==7262== Invalid free() / delete / delete[] / realloc()
==7262==    at 0x402A45C: free (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==7262==    by 0x827F66A: Eigen::internal::aligned_free(void*) (Memory.h:233)
==7262==    by 0x828668E: void Eigen::internal::conditional_aligned_free<true>(void*) (Memory.h:305)
==7262==    by 0x827FB85: Point::operator delete[](void*) (point.hh:171)
...

Could you help me to fix this problem? I could send you my compiled program or  the source code as well...
Comment 5 Christoph Hertzberg 2013-04-03 16:57:38 UTC
(In reply to comment #4)
> Could you help me to fix this problem? I could send you my compiled program or 
> the source code as well...

Can you attach a minimal example which properly uses EIGEN_MAKE_ALIGNED_OPERATOR_NEW and fails? 
Your previously attached program works fine after replacing your implementation of operator::delete[] by EIGEN_MAKE_ALIGNED_OPERATOR_NEW.
Comment 6 bifo.10.Toranaga-San 2013-04-03 17:42:40 UTC
Created attachment 324 [details]
A small example for which the crash occurs

To reproduce the crash, I first create the Makefile using cmake via

cmake . -DCMAKE_CXX_FLAGS="-O0 -g -Wall -Wno-unused-local-typedefs" -DCMAKE_BUILD_TYPE=Debug

Then I call ./src/grid_test

The error occurs in the EIGEN_MAKE_ALIGNED_OPERATOR_NEW call of the Point class which has a Vector2d member.

The gdb backtrace is:

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7ceb296 in raise () from /usr/lib/libc.so.6
#2  0xb7ceca23 in abort () from /usr/lib/libc.so.6
#3  0xb7d2a635 in __libc_message () from /usr/lib/libc.so.6
#4  0xb7d3086a in malloc_printerr () from /usr/lib/libc.so.6
#5  0xb7d314ac in _int_free () from /usr/lib/libc.so.6
#6  0x08158c1b in Eigen::internal::aligned_free (ptr=0x81f3880) at /usr/include/eigen3/Eigen/src/Core/util/Memory.h:233
#7  0x0815ef1a in Eigen::internal::conditional_aligned_free<true> (ptr=0x81f3880) at /usr/include/eigen3/Eigen/src/Core/util/Memory.h:305
#8  0x08158c2e in Point::operator delete[] (ptr=0x81f3880) at ./example/src/point.hh:56
#9  0x0815ef56 in Array2D<Point>::~Array2D (this=0xbfffdf74, __in_chrg=<optimized out>) at ./example/src/util.hh:100
#10 0x08158c5a in Grid::~Grid (this=0xbfffdf60, __in_chrg=<optimized out>) at ./example/src/grid.hh:11
#11 0x08158d3e in Fixture::~Fixture (this=0xbfffdf00, __in_chrg=<optimized out>) at ./example/src/test/grid_fixtue.hh:7
Comment 7 bifo.10.Toranaga-San 2013-04-03 17:46:12 UTC
Granted, this is not exactly minimal, but I had some problems finding the exact source of the error... If I call the program by hand, I get the following relevant output:

*** Error in `./src/grid_test': double free or corruption (out): 0x08302860 ***
======= Backtrace: =========
...

I was running the example on an "Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz"
using a standard x86 linux kernel if you need this information.
Comment 8 Gael Guennebaud 2013-04-03 18:13:49 UTC
cannot reproduce, it works for me.
Comment 9 bifo.10.Toranaga-San 2013-04-04 01:50:24 UTC
I just looked through the code with gdb once more. It occured to me that the new[] operator internally calls handmade_aligned_malloc(size) whereas the delete[] operator calls std::free(ptr) and not "handmade_aligned_free(ptr)".
Therefore the pointer passed to std::free is the aligned pointer and not the original one (from handmade_aligned_malloc). Could that be the reason for my problem?

(I am pretty sure that there are some #define s different on our machines)
Comment 10 Gael Guennebaud 2013-04-04 14:22:49 UTC
Indeed, that would explain the segfault.

However, I don't understand how aligned_malloc would call handmade_aligned_malloc and aligned_free would call std::free.

Moreover, your backtrace points to Memory.h:233 which is a call to handmade_aligned_free. Could you check your Eigen version in /usr/include/eigen3/Eigen/src/Core/util/Macros.h and make sure you are using eigen 3.1.1?
Comment 11 bifo.10.Toranaga-San 2013-04-04 14:46:40 UTC
I am using eigen 3.1.2. My package is compiled using the file from

http://bitbucket.org/eigen/eigen/get/3.1.2.tar.bz2

The macros give me

#define EIGEN_WORLD_VERSION 3
#define EIGEN_MAJOR_VERSION 1
#define EIGEN_MINOR_VERSION 91

Also I just compiled the program using clang++ and oddly enough, in this case everything works whereas in the case of using g++ the problem persists.
Comment 12 Christoph Hertzberg 2013-04-04 16:04:08 UTC
(In reply to comment #11)
> http://bitbucket.org/eigen/eigen/get/3.1.2.tar.bz2

For that package I get:
#define EIGEN_WORLD_VERSION 3
#define EIGEN_MAJOR_VERSION 1
#define EIGEN_MINOR_VERSION 2

3.1.91 is more likely this or a recent hg version:
http://bitbucket.org/eigen/eigen/get/3.2-beta1.tar.bz2

In that package Memory.h:233 is 
  #elif EIGEN_HAS_POSIX_MEMALIGN
    std::free(ptr); // L233

and the corresponding allocation function is: 
  #elif EIGEN_HAS_POSIX_MEMALIGN
    if(posix_memalign(&result, 16, size)) result = 0;


Does the following work on your system? (maybe after including some standard headers)

int main(){
  void* result;
  posix_memalign(&result, 16, 32);
  std::free(result);
  return 0;
}

If not, does it work with free instead of std::free?
Comment 13 bifo.10.Toranaga-San 2013-04-04 22:20:22 UTC
Created attachment 326 [details]
The gdb test session
Comment 14 bifo.10.Toranaga-San 2013-04-04 22:26:31 UTC
I think the gdb file makes it clear that the #ifdefs are not processed correctly by g++. I replaced the functions in Memory.h like this:

inline void* aligned_malloc(size_t size)
{
  check_that_malloc_is_allowed();

  void *result;

  if(posix_memalign(&result, 16, size)) result = 0;

  if(!result && size)
    throw_std_bad_alloc();

  return result;
}

inline void aligned_free(void *ptr)
{
  std::free(ptr);
}

This resolves the issue which is good enough for me. Seems like a bug in gcc 4.8.0 to me (could someone confirm this?). I don't think there is anything you can do about this.
Comment 15 Gael Guennebaud 2013-04-05 13:52:36 UTC
hm the line numbers are different again. Anyway, could you comment a bit on your system? Linux? 32/64 bits? glibc version? CPU?
Comment 16 bifo.10.Toranaga-San 2013-04-06 15:29:40 UTC
I am using archlinux (32 bits) on a i686. I am using gcc 4.8.0 as well as gcc-libs 4.8.0 and glibc 2.17. My CPU is a "Intel(R) Core(TM)2 Duo CPU     L9400  @ 1.86GHz"
Comment 17 Iain 2013-06-18 12:09:58 UTC
Dear all,

I have had exactly the same error as Toranga - a segmentation fault with aligned free, but this occurs when an eigen vector is deleted, and not when a class containing a fixed size vector (although these do exist in my code). I have lost several days to this now. I am using eigen 3.1.3 with gcc 4.6.2 on corei7, x86_64, ubuntu 12.04 virtualbox (running in x86_64 opensuse - although I have also got the same error on native ubuntu 12.04 x86_64 hardware).

I have tried to create a minimal working example, but cannot reduce the problem and still reproduce the error. My code roughly looks like:

VectorXd *dataAlignedptr;
...
class A(){
   EIGEN_MAKE_ALIGNED_OPERATOR_NEW
   Vector3d tmp;
   double whatever;
};
...
int main(){
   int N = 16380;//some number
   dataAlignedptr=new VectorXd(N);
   A* a = new A();
 
   for (i=0;i<LOTS;i++)
         doStuff(*dataAlignedptr,*a);

   delete dataAlignedptr; //<------crashes here at line 182
   delete a;
}

The gdb backtrace gives:


#0  0x00002b5996688425 in __GI_raise (sig=<optimised out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00002b599668bb8b in __GI_abort () at abort.c:91
#2  0x00002b59966c639e in __libc_message (do_abort=2, fmt=0x2b59967d0008 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:201
#3  0x00002b59966d0b96 in malloc_printerr (action=3, str=0x2b59967d0168 "double free or corruption (out)", ptr=<optimised out>)
    at malloc.c:5018
#4  0x00002b59a8ac1862 in Eigen::internal::aligned_free (ptr=0xd70b1b0) at /usr/local/include/eigen3/Eigen/src/Core/util/Memory.h:230
#5  0x00002b59a8ac601a in Eigen::internal::conditional_aligned_free<true> (ptr=0xd70b1b0)
    at /usr/local/include/eigen3/Eigen/src/Core/util/Memory.h:304
#6  0x00002b59a8ac7ac2 in Eigen::internal::conditional_aligned_delete_auto<double, true> (ptr=0xd70b1b0, size=16368)
    at /usr/local/include/eigen3/Eigen/src/Core/util/Memory.h:430
#7  0x00002b59a8ac60a7 in Eigen::DenseStorage<double, -0x00000000000000001, -0x00000000000000001, 1, 0>::~DenseStorage (
    this=0x6e68aa0, __in_chrg=<optimised out>) at /usr/local/include/eigen3/Eigen/src/Core/DenseStorage.h:286
#8  0x00002b59a8ac29ca in Eigen::PlainObjectBase<Eigen::Matrix<double, -0x00000000000000001, 1, 0, -0x00000000000000001, 1> >::~PlainObjectBase (this=0x6e68aa0, __in_chrg=<optimised out>) at /usr/local/include/eigen3/Eigen/src/Core/PlainObjectBase.h:72
#9  0x00002b59a8ac29e4 in Eigen::Matrix<double, -0x00000000000000001, 1, 0, -0x00000000000000001, 1>::~Matrix (this=0x6e68aa0, 
    __in_chrg=<optimised out>) at /usr/local/include/eigen3/Eigen/src/Core/Matrix.h:127
#10 0x00002b59a8abe935 in tracking (fileName=0x21bccb4 "Data/201210190820GMT_PRNS_31_25_30_21_MUNGED.bin", satdata=0x61c5a10, 
    SampleRate=16368000, num_ms=10027, out=0x6e4ca00, dTr=2.2664873040417137, alpha=..., beta=..., TR=..., sats=..., Pos=..., Vel=..., 
    userOutputs=..., Nproc=1) at VFDLL.cpp:182
#11 0x00002b59a8abdcdb in VFDLL (self=0x0, args=0x3405520) at VFDLL.cpp:76


I tried Toranga's fix from comment 15, without success. In Memory.h, I have the following flag values:
EIGEN_GLIBC_MALLOC_ALREADY_ALIGNED 1 
EIGEN_FREEBSD_MALLOC_ALREADY_ALIGNED 0 
EIGEN_MALLOC_ALREADY_ALIGNED 1 
EIGEN_HAS_POSIX_MEMALIGN 1 
EIGEN_HAS_MM_MALLOC 1 
EIGEN_ALIGN 1

For what it is worth I checked the memory that the dataAlignedptr points to, and all 16380 values seem correct, so the vector is still using that address space when it is deleted.

I found Memory.h extremely hard to understand, so any help you can offer would be appreciated, although I realise that it is hard without a minimal example. I am haemorraging time to this, and I really don't want to have to ditch eigen.
Comment 18 Christoph Hertzberg 2013-06-18 12:39:29 UTC
It's hard to check that if all your minimal examples work. Are you sure you didn't do some kind of double-free yourself?
The fact that the memory still points to valid elements does not say much, if it has not been overwritten after deallocation.

Furthermore, is there a reason you dynamically allocate `a` and `dataAlignedptr`?
At least your reduced example should work fine with stack variables, since both A and VectorXd are quite small objects.
Comment 19 Gael Guennebaud 2013-06-18 13:29:53 UTC
Please run your program with valgrind to ca
Comment 20 Gael Guennebaud 2013-06-18 13:30:26 UTC
Please run your program with valgrind to catch all the memory errors.
Comment 21 Iain 2013-06-20 18:01:39 UTC
I had originally tried to run under valgrind but after 15 hours plus of it running I gave up - the eigen code is used in a python extension, and the python/c++/valgrind combination was ridiculously slow for a 1 minute program. 

I've now written an interface such that the code could be called by C++ and therefore cut python out of the loop. valgrind+gdb then quickly showed two issues in the code: one where a VectorXd array was accessed by an index one longer than the array, and another to do with numpy's hideous c API. I believe that it was the overshooting that was causing the segfault reported above. 

Is there a out-of-bounds indices checker in eigen? If not, this would be a useful feature to enable with a "-DCHECK_ARRAY_BOUNDS" flag. It would have saved me a while...

Summary: problem in my code, not yours! Thanks for the suggestions.

P.S. Last question that has been niggling me for the last day - where is "VectorXd" actually defined?? Both grep and ctags refused to show me where it is actually typedef'ed, and I am perplexed.
Comment 22 Jitse Niesen 2013-06-20 18:20:19 UTC
(In reply to comment #21)
> Is there a out-of-bounds indices checker in eigen? If not, this would be a
> useful feature to enable with a "-DCHECK_ARRAY_BOUNDS" flag. It would have
> saved me a while...

There should be an assertion that guards against accessing vectors or matrices out of bounds (unless you turned off asserts with NDEBUG or EIGEN_NO_DEBUG). For instances, the following code aborts with an error message:

  Eigen::MatrixXd m(6,6);
  std::cout << m(7,1) << "\n";
 
> P.S. Last question that has been niggling me for the last day - where is
> "VectorXd" actually defined?? Both grep and ctags refused to show me where it
> is actually typedef'ed, and I am perplexed.

It's in an ugly macro ... search for EIGEN_MAKE_TYPEDEFS, it should be in Matrix.h if I remember correctly.
Comment 23 Christoph Hertzberg 2013-06-20 19:02:25 UTC
(In reply to comment #21)
W.r.t. out-of-bound checking: If you found a method which circumvents the out-of-bounds checks this is likely a bug. But I agree with Jitse that it is more likely that you accidentally disabled assertions.

> P.S. Last question that has been niggling me for the last day - where is
> "VectorXd" actually defined?? Both grep and ctags refused to show me where it
> is actually typedef'ed, and I am perplexed.

Eclipse CDT goes to a line (currently 395) in Matrix.h showing:

  EIGEN_MAKE_TYPEDEFS_ALL_SIZES(double,               d)

which it expands to:

typedef Matrix<double, 2, 2> Matrix2d;  \
/** many more lines ... */                                    \
typedef Matrix<double, Dynamic, 1>    VectorXd;  \
/** ... */                                    \
typedef Matrix<double, Dynamic, 4> MatrixX4d;
Comment 24 Marios Visvardis 2013-06-25 12:10:01 UTC
I have similar problems. 


I am using Eigen 3.1 with g++-4.7.2/ binutils-2.23 on CentoOS 4 and 5.

The following block crashes(not always...) for 32bit builds:
{
    data = Eigen::MatrixXd::Zero(10, 10);
}



The valgrind::memcheck output for the complete example is:

==26032== Invalid read of size 4
==26032==    at 0x804AC53: Eigen::internal::handmade_aligned_free(void*) (in /path/to/replicate)
==26032==    by 0x804AC6F: Eigen::internal::aligned_free(void*) (in /path/to/replicate)
==26032==    by 0x804ACFB: void Eigen::internal::conditional_aligned_free<true>(void*) (in /path/to/replicate)
==26032==    by 0x804ACE8: void Eigen::internal::conditional_aligned_delete_auto<double, true>(double*, unsigned int) (in /path/to/replicate)
==26032==    by 0x804ACD5: Eigen::DenseStorage<double, -1, -1, -1, 0>::~DenseStorage() (in /path/to/replicate)
==26032==    by 0x804AC82: Eigen::PlainObjectBase<Eigen::Matrix<double, -1, -1, 0, -1, -1> >::~PlainObjectBase() (in /path/to/replicate)
==26032==    by 0x804AC96: Eigen::Matrix<double, -1, -1, 0, -1, -1>::~Matrix() (in /path/to/replicate)
Comment 25 Marios Visvardis 2013-06-25 15:35:31 UTC
My code was split into two compilation units. For some reason, for each compilation the _POSIX_ADVISORY_INFO had different value(the results were different if some boost headers were added/removed). This caused allocating and freeing of memory to be done using different methods(posix_memalign vs handmade_aligned) and naturally a segmentation fault.

A possible workaround to this problem is to enforce the same definition of _POSIX_ADVISORY_INFO right before including Eigen/Core, by including <unistd.h>.

So this worked for me:

#include <unistd.h>
#include <Eigen/Core>
Comment 26 Gael Guennebaud 2013-06-26 10:24:03 UTC
(In reply to comment #25)

Nice discovery. Indeed, it seems important to #include <unistd.h> before checking _POSIX_ADVISORY_INFO. So now the question is to safely detect that this header file is really present without configure step. One option would to be very conservative and include it only if __unix__ is defined, and do not check _POSIX_ADVISORY_INFO if we have not included <unistd.h>.

Another option would be to simply remove the possibility of using posix_memalign.
Comment 27 Gael Guennebaud 2013-06-26 10:36:39 UTC
Created attachment 357 [details]
include unistd.h if __unix

Here is a patch proposal.
Comment 28 Jitse Niesen 2013-06-26 13:59:11 UTC
Comment on attachment 357 [details]
include unistd.h if __unix

Review of attachment 357 [details]:
-----------------------------------------------------------------

Conservative seems to be the right approach here. Looks fine to me, except for the typo.

::: Eigen/src/Core/util/Memory.h
@@ +60,5 @@
>  
> +// See bug 554 (http://eigen.tuxfamily.org/bz/show_bug.cgi?id=554)
> +// It seems to be unsafe to check _POSIX_ADVISORY_INFO without including unistd.h first.
> +// Currently, let's include it only on unix systems:
> +#ifdef __unix___

you have three underscores at the end while you want only two
Comment 29 Gael Guennebaud 2013-06-26 22:52:30 UTC
thanks for the typo. this commit checks for both __unix and __unix__ and I'll assume this close this bug. If not feel free to reopen.

https://bitbucket.org/eigen/eigen/commits/21273ebd6b4d/
Changeset:   21273ebd6b4d
User:        ggael
Date:        2013-06-26 22:49:14
Summary:     Fix bug 554: include unistd.h before checking the presence of posix_memalign.
Comment 30 Marios Visvardis 2013-06-27 11:59:25 UTC
Thanks for your quick response. The proposed patch should fix the problem.

Note You need to log in before you can comment on or make changes to this bug.