Bug 667 - Functions must be DECLARED __forceinline.
Product: Eigen
Classification: Unclassified
Component: General
Hardware: x86 - 32-bit Windows
Importance: Normal Unknown
Assigned To: Nobody
Depends on:
Blocks: 3.4
Reported: 2013-10-01 16:04 UTC by panda-34
Modified: 2016-09-07 20:04 UTC


Description panda-34 2013-10-01 16:04:32 UTC
It turns out that Intel Composer 14 does NOT respect the __forceinline directive when it appears only on out-of-class member function definitions; it honors it only on member declarations. For example, DenseBase::lazyAssign, which is defined separately in Assign.h, is NOT inlined for any but the simplest expression templates. This wrecks performance for expressions with a large number of scalars, as they are all pushed onto the stack. Visual Studio, on the other hand, requires __forceinline on definitions (but has no problem with it also being present on declarations), so please consider duplicating EIGEN_STRONG_INLINE onto member declarations.
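
To illustrate the request, here is a minimal sketch of the pattern with the inline macro on both the in-class declaration and the out-of-class definition. Everything in it is hypothetical: SKETCH_STRONG_INLINE only imitates what EIGEN_STRONG_INLINE is assumed to expand to, and SketchBase, SketchVec4 and sketch_copy_sum are made-up stand-ins, not Eigen code.

```cpp
// Hypothetical stand-in for EIGEN_STRONG_INLINE (name and expansion are
// assumptions made for this sketch, not copied from Eigen).
#if defined(_MSC_VER) || defined(__INTEL_COMPILER)
#  define SKETCH_STRONG_INLINE __forceinline
#else
#  define SKETCH_STRONG_INLINE inline
#endif

template <typename Derived>
struct SketchBase {
  Derived& derived() { return *static_cast<Derived*>(this); }

  // Macro on the in-class DECLARATION: the form ICC 14 is reported to honor.
  template <typename Other>
  SKETCH_STRONG_INLINE Derived& lazyAssign(const Other& other);
};

// Macro repeated on the out-of-class DEFINITION: the form Visual Studio is
// reported to require.
template <typename Derived>
template <typename Other>
SKETCH_STRONG_INLINE Derived& SketchBase<Derived>::lazyAssign(const Other& other) {
  for (int i = 0; i < Derived::Size; ++i)
    derived().data[i] = other.data[i];
  return derived();
}

// Made-up fixed-size vector type, used only to exercise the pattern.
struct SketchVec4 : SketchBase<SketchVec4> {
  enum { Size = 4 };
  double data[Size];
};

// Copies {1, 2, 3, 4} through lazyAssign and returns the sum of the copy.
inline double sketch_copy_sum() {
  SketchVec4 src, dst;
  for (int i = 0; i < SketchVec4::Size; ++i) {
    src.data[i] = i + 1.0;
    dst.data[i] = 0.0;
  }
  dst.lazyAssign(src);
  double sum = 0.0;
  for (int i = 0; i < SketchVec4::Size; ++i)
    sum += dst.data[i];
  return sum;
}
```

Whether a given compiler actually inlines the call still has to be checked in the generated assembly; the sketch only shows where the macro would be duplicated.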
Comment 1 Gael Guennebaud 2016-01-31 14:59:05 UTC
Sorry for not looking at this issue earlier. ICC is indeed that stupid, and pretty bad at inlining in general. I have examples where it fails to inline the trivial copy-constructor that it generated itself. For instance for CwiseUnaryOp, it introduces calls to functions with a body as trivial as:

  movq      (%rsi), %rax 
  movq      %rax, (%rdi) 
  movq      8(%rsi), %rdx
  movq      %rdx, 8(%rdi)

I'll try to fix as many of them as possible, but I guess that we should also recommend that users compile with -inline-forceinline (or use gcc or clang ;).

Regarding the discrepancies between declarations and definitions, since there are more than 2000 occurrences of EIGEN_STRONG_INLINE we would need an automatic way to detect them... any ideas?
Comment 2 Gael Guennebaud 2016-01-31 20:10:00 UTC
Here is a first bunch of fixes limiting the damage:

I haven't included the explicit copy-ctor because I'd prefer to find another workaround, hopefully...
