This is a feature I've in mind for a while and that I've mentioned at few occasions in the past. Basically, the idea would be to implement our own stack for allocating temporaries with dynamic sizes, and thus save a huge amount of running time for small matrices. For temporaries that are local to a function, then alloca + Map<> should be enough, as I just implemented there: https://bitbucket.org/eigen/eigen/commits/5007ff66c9f6/ In theory, it only remains to replace nested_eval<T,N>::type V(X) by ei_declare_local_nested_eval(T,X,N,V) :) But some temporaries require a longer lifetime, typically the one created in the ctor of the evaluators. One possibility would be to keep using alloca by delegating the allocation in the function that creates/destroys the evaluator (e.g. in the assignment) and then propagate the allocated buffer down to the tree: evaluator<Src> src_eval(src); void* buffer = alloca(src_eval.nbTemporyBytes()); src_eval.init(buffer); for(i=...) dst_eval.coeffRef(i) = src_eval.coeff(i); nbTemporyBytes() would sum up the requires bytes, and init(void* &buffer) would pass the remaining buffer to the leaves, e.g.: binary_evaluator::init(void* &buffer) { lhs_eval.init(buffer); rhs_eval.init(buffer); } and an allocator that requested a temporary buffer would do: product_evaluator::init(void* &buffer) { ::new(&m_tmp) (buffer, rows, cols); generic_product_impl(m_tmp, lhs, rhs); buffer += sizeof(Scalar)*rows*cols; } Of course if the total amount of requested memory is too large, then we would allocate on the heap, but only once, which is still a win if there are multiple temporaries. Another option would be to implement our own stack-based allocator that would have the advantages to be cross-platform and that would possibly be much larger than the stack. It could even be dynamically expanded to amortize heap allocations across expressions. Another advantage is that the above three pass approach (ctor/nbTemporyBytes/init) could be avoided by allocating directly in the ctors. The main difficulty of this approach is multi-threading for which I have no clue how to deal with.
-- GitLab Migration Automatic Message -- This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/1568.