This bugzilla service is closed. All entries have been migrated to https://gitlab.com/libeigen/eigen

Bug 1001

Summary: infer dimensions of Dynamic-sized temporaries from the entire expression (if possible)
Product: Eigen Reporter: Emily <el>
Component: Core - matrix productsAssignee: Nobody <eigen.nobody>
Status: RESOLVED WONTFIX    
Severity: Optimization CC: chtz, gael.guennebaud
Priority: Low    
Version: 3.4 (development)   
Hardware: All   
OS: All   
Whiteboard:
Bug Depends on: 99    
Bug Blocks: 814    

Description Emily 2015-04-20 15:33:50 UTC
I'm calculating "scalar = vector1.transpose() * matrix * vector2". This is the inner most time critical portion of my code, the expression is evaluated using a temporary matrix on the heap. Vectors 1 & 2 are compile-time fixed size, and the matrix is a block to a dynamically allocated matrix. There should be no need to allocate a temporary on the heap in this case. The heap allocation attributes to over 50% of my total runtime.

I would very much like if this could be optimized to not cause a temporary on the heap.

The following code illustrates my use case:

    Eigen::internal::set_is_malloc_allowed(true);
    Eigen::ArrayXXd data(1000, 1000);
    data.setRandom();

    Eigen::internal::set_is_malloc_allowed(false);

    Eigen::Ref<const Eigen::Array<double, Eigen::Dynamic, Eigen::Dynamic>> m_coeff(data.block(30,30,4, 4));

    double x0 = 0.5;
    double x1 = 0.5;

    Eigen::Vector4d X0;
    X0[0] = x0*(4 - 3 * x0) - 1;
    X0[1] = x0*(9 * x0 - 10);
    X0[2] = x0*(8 - 9 * x0) + 1;
    X0[3] = x0*(3 * x0 - 2);

    Eigen::Vector4d X1;
    X1[0] = x1*((2 - x1)*x1 - 1);
    X1[1] = x1*x1*(3 * x1 - 5) + 2;
    X1[2] = x1*((4 - 3 * x1)*x1 + 1);
    X1[3] = x1*x1*(x1 - 1);

    // Will trigger assert as transpose() is evaluated to a temporary on the heap.
    double ans = double(X0.transpose() * (m_coeff.matrix() * X1)) / 4.0;
Comment 1 Christoph Hertzberg 2015-04-22 02:23:41 UTC
X0.transpose() is not allocated on the heap, but m_coeff.matrix() * X1 is because it results in a Dynamic-sized vector.
This will eventually be fixed with evaluator-tree optimizations, but certainly not in 3.2.
If you know the size of the block (and the Ref) at compile-time, write:
  Ref<const Array<double, 4, 4> > m_coeff(data.block<4,4>(30,30));
If you access m_coeff always in a matrix-like manner, you can also use
  Ref<const Matrix4d> m_coeff(data.block<4,4>(30,30));

Furthermore, you may want to replace the last line by
  X0.dot(m_coeff.matrix() * X1) / 4.0;
Comment 2 Emily 2015-04-22 09:14:40 UTC
Thanks, those tips gave me another 30% on top of the 50% or so I gained from removing the temporary by hand (evaluated into a vector4d temporary).

I love you guys!
Comment 3 Christoph Hertzberg 2015-04-22 09:19:22 UTC
I renamed the bug to describe the remaining issue.

Further examples, that might eventually be influenced:
  MatrixXd A, B;
  Matrix4d C;
  Vector4d x,y;

  C+A*B;   // A*B must be 4x4
  x+A*y;   // A must be 4x4, A*y must be 4x1


Probably not important enough to block 3.3
Comment 4 Gael Guennebaud 2019-02-18 21:27:07 UTC
I'm not sure this is worth the effort: increase in internal logic complexity, increase compilation time for very little gain in practice because this is something that is expected to be handled by user code.

There are many more interesting optimization opportunities, e.g., mat*mat*vec just to name one.
Comment 5 Nobody 2019-12-04 14:32:34 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/1001.