We have some performance regressions with recent compiler versions of both clang and gcc. The problem is bad register allocation leading to spilling. More precisely, here is what I found:

With AVX (no FMA):
- gcc 5: OK
- gcc >= 6: spilling in 2pX4 kernel
- clang: OK

With AVX+FMA:
- gcc: OK
- clang 5: OK
- clang >= 6: spilling in 3pX4 kernel

With AVX512f (+FMA):
- gcc: OK
- clang: OK
For gcc, I managed to trick it with more aggressive asm comments to isolate each "EIGEN_GEBP_ONESTEP":

    asm("" : [a0] "+x" (A0), [a1] "+x" (A1) ); \

Such a trick does not work for clang.
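To illustrate the idea behind the empty asm statement, here is a minimal sketch, not the actual Eigen kernel. The macro PIN_IN_VECTOR_REGS and the function two_accumulator_dot are hypothetical names made up for this example. It assumes GCC-style extended asm on x86-64, where the "x" constraint asks for an SSE register; on other targets the macro expands to nothing. The "+x" operands tell the compiler that the values are read and written by the (empty) asm, so they must be live in vector registers at that point, and computations on them cannot be moved across it.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only (not Eigen's macro).
// An empty asm with "+x" operands makes the compiler treat both values
// as read and modified here, while requiring them to sit in SSE registers.
#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
#define PIN_IN_VECTOR_REGS(a, b) __asm__("" : "+x"(a), "+x"(b))
#else
#define PIN_IN_VECTOR_REGS(a, b) ((void)0)  // no-op on other targets
#endif

// Toy stand-in for an unrolled kernel step: two independent accumulators,
// with a barrier after each step so the steps stay isolated from each other.
inline float two_accumulator_dot(const float* lhs, const float* rhs, std::size_t n) {
  float acc0 = 0.0f;
  float acc1 = 0.0f;
  std::size_t i = 0;
  for (; i + 1 < n; i += 2) {
    acc0 += lhs[i] * rhs[i];
    PIN_IN_VECTOR_REGS(acc0, acc1);
    acc1 += lhs[i + 1] * rhs[i + 1];
    PIN_IN_VECTOR_REGS(acc0, acc1);
  }
  if (i < n) {
    acc0 += lhs[i] * rhs[i];  // odd-length tail
  }
  return acc0 + acc1;
}
```

The barrier emits no instructions; it only constrains the optimizer. Whether it actually avoids spilling depends on the compiler and version, which matches the observation above that the trick helps gcc but not clang.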
For gcc: https://bitbucket.org/eigen/eigen/commits/9f52fde0348
For clang: https://bitbucket.org/eigen/eigen/commits/40e26d3f60fb/

I backported the two workarounds to 3.3.
-- GitLab Migration Automatic Message -- This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/1637.