Bug 556 - Matrix multiplication crashes using mingw 4.7
Product: Eigen
Classification: Unclassified
Component: General
Hardware: x86 - 32-bit Windows
Importance: Normal major
Assigned To: Nobody
Duplicates: 617
Depends on:
Blocks: 3.2
Reported: 2013-02-21 02:03 UTC by djurikom
Modified: 2014-02-03 19:13 UTC
CC List: 10 users

Disable ipa-cp-clone flag with mingw >= 4.6 (1.18 KB, patch)
2013-06-28 23:48 UTC, Gael Guennebaud

Description djurikom 2013-02-21 02:03:43 UTC
The code compiles without any problems. However, when executed, the program crashes in this function:

MatrixXd sqdist(MatrixXd A, MatrixXd B)
{
	MatrixXd aa = (A.cwiseProduct(A)).colwise().sum();
	MatrixXd bb = (B.cwiseProduct(B)).colwise().sum();
	MatrixXd aSquare(aa.cols(), bb.cols());
	MatrixXd bSquare(aa.cols(), bb.cols());
	for (int i = 0; i < bb.cols(); i++)
		aSquare.col(i) = aa.transpose();
	for (int i = 0; i < aa.cols(); i++)
		bSquare.row(i) = bb;
	MatrixXd dist = aSquare + bSquare - 2.0 * A.transpose() * B;
	return dist;
}

The function is supposed to find a distance matrix between rows in A and B. The program crashes at "2.0 * A.transpose() * B". It is interesting that it DOES NOT crash when I compile using
> gcc -v
Using built-in specs.
Target: i686-w64-mingw32
Configured with: ../gcc44-svn/configure --target=i686-w64-mingw32 --host=i686-w64-mingw32 --disable-multilib --disable-nls --disable-win32-registry --prefix=/mingw32 --with-gmp=/mingw32 --with-mpfr=/mingw32 --enable-languages=c,c++
Thread model: win32
gcc version 4.4.3 (GCC)

while it DOES crash using
>gcc -v
Using built-in specs.
Target: i686-pc-mingw32
Configured with: ../src/configure --prefix=/c/temp/gcc/dest --with-gmp=/c/temp/gcc/gmp --with-mpfr=/c/temp/gcc/mpfr --with-mpc=/c/temp/gcc/mpc --enable-languages=c,c++ --with-arch=i686 --with-tune=generic --disable-libstdcxx-pch --disable-nls --disable-shared --disable-sjlj-exceptions --disable-win32-registry --enable-checking=release --enable-lto
Thread model: win32
gcc version 4.7.0 (GCC)

I am compiling using:
g++ -Wall -Wconversion -O3
Comment 1 djurikom 2013-02-21 02:12:12 UTC
I forgot to mention that it crashed for input matrices A and B of various sizes. You can check, for example, sizes 40x123 and 32561x123, respectively, since those are the sizes for which I first noticed the bug. E.g., it crashed when I generated the matrices using:

MatrixXd A = MatrixXd::Zero(40, 123);
MatrixXd B = MatrixXd::Zero(32561, 123);
MatrixXd C = sqdist(A, B);
Comment 2 Gael Guennebaud 2013-02-21 09:17:44 UTC
I cannot reproduce this. Can you paste the backtrace? Also, which Eigen version are you using? Thanks.
Comment 3 Gael Guennebaud 2013-02-25 18:54:09 UTC
btw, I thought that the line abs(dist); was a typo from copy/pasting, but just to be sure: abs(dist) is invalid and does not compile unless you have your own abs function for MatrixXd...
Comment 4 djurikom 2013-02-25 20:46:47 UTC

I have trouble locating the exact version of Eigen in the documentation, sorry about that, but the first version I used was from, and then switched to in hope to fix the bug, I hope that helps. As for the abs() function, I implemented the following

void abs(MatrixXd &A)
{
	for (int i = 0; i < A.rows(); i++)
		for (int j = 0; j < A.cols(); j++)
			if (A(i, j) < 0)
				A(i, j) = -1.0 * A(i, j);
}

Sorry about late responses, I had/have some deadlines, but will post other information that might help asap (some additional tests, backtrace, ...).
Comment 5 Christoph Hertzberg 2013-03-16 13:03:24 UTC
Could you at least specify how the multiplication "crashes" (segfault, bad_alloc, ...)?
Otherwise, or until someone else can reproduce this, I'd vote for changing to invalid.

Besides that, there is some room for optimization:
Pass matrices A and B by const reference, not by value.
No need to explicitly transpose A and B.
aa and bb are vectors and can be computed as A.rowwise().squaredNorm();
No need for the temporary matrices {a,b}Square: you can use the replicate function or add the vectors directly into the result.
Your abs function can be replaced by A = A.cwiseAbs();
Comment 6 djurikom 2013-03-17 00:05:47 UTC
Again, sorry about late responses, and thank you very much for helpful comments!
The error still persists, and here is some more info. I am compiling the code in the command prompt of Windows NT 5.1 Build: 2600, if that is of any help. I just get the "Send Error Report" window when I run the program. The exception is the following:

Exception Information:
Code: 0xc0000005          Flags: 0x00000000
Record: 0x0000000         Address: 0x000426837

By googling the exception code it seems that it might also crash due to my hardware.
Comment 7 Christoph Hertzberg 2013-03-17 14:28:05 UTC
Can you somehow rule out other problems? (e.g. run your program on a different machine)
I hardly do any Windows-programming so I can't be of much help here.
Comment 8 djurikom 2013-03-18 20:05:34 UTC
It appears that it works fine on a Linux machine, at least on the computer I had at my disposal. So I guess it's something very specific; hopefully it will not manifest for other users.
Thanks for help, if I notice something else I will report it.
Comment 9 Gael Guennebaud 2013-06-14 15:29:49 UTC
We cannot reproduce with mingw 4.7, so I'm closing this bug.
Comment 10 Desire NUENTSA 2013-06-14 16:17:35 UTC
Seems like a MinGW bug.
It works with -O2 but crashes with -O3.
Comment 11 Gael Guennebaud 2013-06-21 22:11:10 UTC
*** Bug 617 has been marked as a duplicate of this bug. ***
Comment 12 Gael Guennebaud 2013-06-21 22:13:31 UTC
It should be reported to mingw. On our side, could we detect this compiler or level of optimization and issue a #error?
Comment 13 Gael Guennebaud 2013-06-21 22:27:33 UTC
hm, I don't see how to detect the optimization level. However, it would be nice to test individually each of these flags that are enabled by -O3:

-finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-vectorize, -fvect-cost-model, -ftree-partial-pre and -fipa-cp-clone 

and remove the problematic one using:

 #pragma GCC optimize ("-fno-blabla")
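
A minimal sketch of how one candidate flag could be switched off for a single function while the rest of the file keeps its -O3 settings. The function dot is a hypothetical stand-in for the code that crashes, and -fno-tree-vectorize is just one candidate from the list above, not the known culprit. Note that GCC defines __OPTIMIZE__ at any level above -O0, so the preprocessor cannot tell -O2 from -O3.

```cpp
#include <cstddef>

// Assumption: GCC or a GCC-based MinGW toolchain; other compilers skip the pragmas.
#if defined(__GNUC__) && !defined(__clang__)
#  pragma GCC push_options
// Candidate under test; swap in -fno-inline-functions, -fno-ipa-cp-clone, etc.
#  pragma GCC optimize ("-fno-tree-vectorize")
#endif

// Hypothetical hot function, compiled with the candidate optimization disabled.
double dot(const double* x, const double* y, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

#if defined(__GNUC__) && !defined(__clang__)
// Restore the command-line options for everything defined after this point.
#  pragma GCC pop_options
#endif
```

If the crash disappears with one particular flag disabled, that flag is the likely trigger. Without push_options/pop_options the pragma applies to every function defined after it in the translation unit.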
Comment 14 Pedro Tabacof 2013-06-28 17:20:00 UTC

I'm having a similar bug: my code crashes whenever I use any linear solver (except LLT, for some reason) with the -O3 flag on MinGW g++. Everything works perfectly with -O2.

It's interesting that I tried to reproduce this bug using the same code but with randomly generated matrices, but the bug didn't happen. It only happens when I use the specific matrix of the problem I'm trying to solve, so unfortunately I cannot post the whole program here. 

I also tried what Gael said on his last post and I found out the culprit flag: -fipa-cp-clone. 

It's also interesting to note that MinGW still does not have the "-ftree-partial-pre" flag implemented, so it couldn't be tested.

Just to be clear, I'm using the latest Eigen and MinGW versions. I tried some preprocessor directives such as EIGEN_DONT_ALIGN but it didn't change anything.

Comment 15 Gael Guennebaud 2013-06-28 23:48:06 UTC
Created attachment 363 [details]
Disable ipa-cp-clone flag with mingw >= 4.6

Thank you for finding the right flag. Here is a patch that should work around the problem. Please tell us if it works for you before I push it upstream. Thanks.
Comment 16 Gael Guennebaud 2013-07-05 23:49:21 UTC
ok, let's assume that does the trick:
Changeset:   fe8f1f3060f3
User:        ggael
Date:        2013-07-05 23:47:40
Summary:     Bug 556: workaround mingw bug with -O3 or -fipa-cp-clone
Comment 17 Sandro Mani 2013-08-05 22:27:05 UTC
Just a note, in case other people are suddenly battling huge gcc memory usage: using #pragma GCC optimize exposes an odd gcc bug:
