Bug 363 - Eigen segfaults on huge matrices
: Eigen segfaults on huge matrices
Status: RESOLVED FIXED
Product: Eigen
Classification: Unclassified
Component: General
: 3.0
: x86 - 32-bit Linux
: --- normal
Assigned To: Nobody
:
:
:
:
:
  Show dependency treegraph
 
Reported: 2011-10-15 14:17 UTC by Etienne Vouga
Modified: 2011-11-06 21:30 UTC (History)
2 users (show)



Attachments
in Memory.h, check byte sizes for overflow (5.41 KB, patch)
2011-10-15 17:09 UTC, Benoit Jacob
no flags Details | Diff | Splinter Review
force inlining of overflow helpers (5.66 KB, patch)
2011-11-06 20:35 UTC, Benoit Jacob
no flags Details | Diff | Splinter Review

Description Etienne Vouga 2011-10-15 14:17:31 UTC
This simple program segfaults inside setZero():

#include <Eigen/Core>

int main()
{
  Eigen::MatrixXd M(19604, 29601);
  M.setZero();
}



Here's the backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x080491fb in Eigen::DenseCoeffsBase<Eigen::Matrix<double, -1, -1, 0, -1, -1>,
1>::copyCoeff<Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>,
Eigen::Matrix<double, -1, -1, 0, -1, -1> > > (this=0xbffff1ec, index=43428860,
other=...) at eigen/Eigen/src/Core/DenseCoeffsBase.h:508
508          derived().coeffRef(index) = other.derived().coeff(index);
(gdb) bt
#0  0x080491fb in Eigen::DenseCoeffsBase<Eigen::Matrix<double, -1, -1, 0, -1,
-1>,
1>::copyCoeff<Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>,
Eigen::Matrix<double, -1, -1, 0, -1, -1> > > (this=0xbffff1ec, index=43428860,
other=...) at eigen/Eigen/src/Core/DenseCoeffsBase.h:508
#1  0x08049109 in Eigen::internal::assign_impl<Eigen::Matrix<double, -1, -1, 0,
-1, -1>, Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>,
Eigen::Matrix<double, -1, -1, 0, -1, -1> >, 1, 0>::run (dst=..., src=...) at
eigen/Eigen/src/Core/Assign.h:316
#2  0x0804900d in Eigen::DenseBase<Eigen::Matrix<double, -1, -1, 0, -1, -1>
>::lazyAssign<Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>,
Eigen::Matrix<double, -1, -1, 0, -1, -1> > > (this=0xbffff1ec, other=...) at
eigen/Eigen/src/Core/Assign.h:511
#3  0x08048f6a in Eigen::PlainObjectBase<Eigen::Matrix<double, -1, -1, 0, -1,
-1>
>::lazyAssign<Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>,
Eigen::Matrix<double, -1, -1, 0, -1, -1> > > (this=0xbffff1ec, other=...) at
eigen/Eigen/src/Core/PlainObjectBase.h:371
#4  0x08048f35 in Eigen::internal::assign_selector<Eigen::Matrix<double, -1,
-1, 0, -1, -1>,
Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>,
Eigen::Matrix<double, -1, -1, 0, -1, -1> >, false, false>::run (dst=...,
other=...) at eigen/Eigen/src/Core/Assign.h:534
#5  0x08048ecd in Eigen::PlainObjectBase<Eigen::Matrix<double, -1, -1, 0, -1,
-1>
>::_set_noalias<Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>,
Eigen::Matrix<double, -1, -1, 0, -1, -1> > > (this=0xbffff1ec, other=...) at
eigen/Eigen/src/Core/PlainObjectBase.h:576
#6  0x08048e7e in Eigen::PlainObjectBase<Eigen::Matrix<double, -1, -1, 0, -1,
-1>
>::_set_selector<Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>,
Eigen::Matrix<double, -1, -1, 0, -1, -1> > > (this=0xbffff1ec, other=...) at
eigen/Eigen/src/Core/PlainObjectBase.h:561
#7  0x08048d95 in Eigen::PlainObjectBase<Eigen::Matrix<double, -1, -1, 0, -1,
-1> >::_set<Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>,
Eigen::Matrix<double, -1, -1, 0, -1, -1> > > (this=0xbffff1ec, other=...) at
eigen/Eigen/src/Core/PlainObjectBase.h:553
#8  0x08048c68 in Eigen::Matrix<double, -1, -1, 0, -1,
-1>::operator=<Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op<double>,
Eigen::Matrix<double, -1, -1, 0, -1, -1> > > (this=0xbffff1ec, other=...) at
eigen/Eigen/src/Core/Matrix.h:189
#9  0x08048b0c in Eigen::DenseBase<Eigen::Matrix<double, -1, -1, 0, -1, -1>
>::setConstant (this=0xbffff1ec, value=@0xbffff1b8) at
eigen/Eigen/src/Core/CwiseNullaryOp.h:341
#10 0x08048a1b in Eigen::DenseBase<Eigen::Matrix<double, -1, -1, 0, -1, -1>
>::setZero (this=0xbffff1ec) at eigen/Eigen/src/Core/CwiseNullaryOp.h:490
#11 0x08048806 in main () at test.cpp:7
Comment 1 Benoit Jacob 2011-10-15 15:30:38 UTC
How's that surprising?

You allocate a too big matrix so that the allocation fails, apparently you had
exceptions disabled or else you get

bjacob@cahouette:~$ g++ oom.cpp -o oom -I eigen
bjacob@cahouette:~$ ./oom
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted

Even with exceptions disabled you could still easily check that allocation
succeeded (M.data() != null).
Comment 2 Etienne Vouga 2011-10-15 15:53:34 UTC
I don't think allocation is failing, at least not on my machine. If I compile
and run

#include <Eigen/Core>
#include <iostream>

int main()
{
  Eigen::MatrixXd M(19604, 29601);
  if(M.data() == NULL)
    return 0;
  M.setZero();
}


I get the same segfault, with the same stack trace.
Comment 3 Benoit Jacob 2011-10-15 16:11:54 UTC
Ok sorry, I get it now. In 32bit mode, we use 32bit integers for sizes, and

  19604*29601*8 = 4642384032

is too big.

(19604 * 29601 * 8) mod (2^32) = 347416736

Setting a breakpoint in Eigen::internal::aligned_malloc I get:

Breakpoint 1, Eigen::internal::aligned_malloc (size=347416736)
    at eigen/Eigen/src/Core/util/Memory.h:199
199       check_that_malloc_is_allowed();
(gdb) bt
#0  Eigen::internal::aligned_malloc (size=347416736) at
eigen/Eigen/src/Core/util/Memory.h:199
#1  0x08048f8d in Eigen::internal::conditional_aligned_malloc<true>
(size=347416736)
    at eigen/Eigen/src/Core/util/Memory.h:288
#2  0x08048ec1 in Eigen::internal::conditional_aligned_new_auto<double, true>
(size=580298004)
    at eigen/Eigen/src/Core/util/Memory.h:397
#3  0x08048d6f in Eigen::DenseStorage<double, -1, -1, -1, 0>::resize
(this=0xffffd34c, 
    size=580298004, rows=19604, cols=29601) at
eigen/Eigen/src/Core/DenseStorage.h:230
#4  0x08048c6e in Eigen::PlainObjectBase<Eigen::Matrix<double, -1, -1, 0, -1,
-1> >::_init2<int, int> (this=0xffffd34c, rows=19604, cols=29601) at
eigen/Eigen/src/Core/PlainObjectBase.h:584
#5  0x08048b70 in Eigen::Matrix<double, -0x00000000000000001,
-0x00000000000000001, 0, -0x00000000000000001,
-0x00000000000000001>::Matrix<int, int> (this=0xffffd34c, x=@0xffffd35c, 
    y=@0xffffd358) at eigen/Eigen/src/Core/Matrix.h:252
#6  0x0804898a in main () at oom.cpp:5


What I really don't get is where the 580298004 comes from.
Comment 4 Benoit Jacob 2011-10-15 16:13:35 UTC
oh ok 580298004 = the number of entries in the matrix.
Comment 5 Benoit Jacob 2011-10-15 16:18:59 UTC
So to be clear, fixing this bug means properly checking for integer overflow
when computing buffer sizes.

In Mozilla I have code to do that:
http://hg.mozilla.org/mozilla-central/file/6c8a909977d3/xpcom/ds/CheckedInt.h

in our case it's pretty simple (multiplications only, unsigned integers, don't
try to use a twice bigger integer type)

http://hg.mozilla.org/mozilla-central/file/6c8a909977d3/xpcom/ds/CheckedInt.h#l266
Comment 6 Benoit Jacob 2011-10-15 17:09:36 UTC
Created attachment 218 [details] [review]
in Memory.h, check byte sizes for overflow

This takes care of half of the problem, and is already enough to address your
case.

This makes us throw a std::bad_alloc on integer overflow when computing the
byte size. My understanding of the c++ standard is that that's all what the
regular operator new is supposed to be doing on bad allocation. Likewise our
own allocation routines already have a comment saying:

  * On allocation error, the returned pointer is undefined,
  * but if exceptions are enabled then a std::bad_alloc is thrown.

So at this point we're not guaranteeing a null pointer on error anyways (so I
was wrong in my first reply when I said that M.data()==null could be used as a
test).

The other half of the problem will be to handle the case where width*height
overflows.
Comment 7 Benoit Jacob 2011-10-16 22:13:21 UTC
Bug fixed by these 3 changesets:

changeset:   4283:f76f77b7d086
tag:         tip
user:        Benoit Jacob <jacob.benoit.1@gmail.com>
date:        Sun Oct 16 16:12:19 2011 -0400
summary:     Bug 363 - add test for integer overflow in size computations

changeset:   4282:a07628375b86
user:        Benoit Jacob <jacob.benoit.1@gmail.com>
date:        Sun Oct 16 16:12:19 2011 -0400
summary:     Bug 363 - check for integer overflow in size=rows*cols
computations

changeset:   4281:25ba289d5292
user:        Benoit Jacob <jacob.benoit.1@gmail.com>
date:        Sun Oct 16 16:12:19 2011 -0400
summary:     Bug 363 - check for integer overflow in byte-size computations
Comment 8 Benoit Jacob 2011-11-06 20:35:05 UTC
Created attachment 225 [details] [review]
force inlining of overflow helpers

Here's a follow-up attempting to force the compiler to inline these helper
functions.

This fixes for me the regression reported to the eigen mailing list in the
"Significant perf regression probably due to bug 363 patches" thread.

Let me know if you like the EIGEN_ALWAYS_INLINE changes in particular. I think
it makes more sense this way.
Comment 9 Benoit Jacob 2011-11-06 21:30:04 UTC
Comment on attachment 225 [details] [review]
force inlining of overflow helpers

Pushed: c29d777b278c on default branch and e5f87fa9e5c2 on 3.0 branch.

Note You need to log in before you can comment on or make changes to this bug.