Bugzilla – Bug 707

In-Place decompositions

Last modified: 2016-07-04 15:25:21 UTC

Most (dense) decompositions copy the to-be-decomposed matrix into an internal variable and decompose this variable (usually using an in-place decomposition). Often it would be nice to expose the in-place decomposition to the public interface, to save unnecessary copies, if the original matrix is not need afterwards. For decompositions not depending on additional data (such as permutations) this should be more or less straight-forward by providing a static `computeInPlace` function. If additional data is needed, maybe something like this could be implemented: Decomposition<Ref<MatrixXd> > decomp(mat);

The first issue to fix is whether we want to share the *input* matrix only (or matrices, e.g., for general eigen problems) or all *outputs*? Since the main use case is to save memory and copies when the input is not needed anymore, it seems to me that sharing inputs only would be enough. Regarding the implementation, we could add a computeInPlace() non-static member function. Internally, the decomposition object would possesses both a plain Matrix<> m_mat object and a Ref<> m_ref object. If compute() is called, then the m_mat will be resized and m_ref will reference it. If computeInPlace is called, the m_ref will reference the input matrix. The main advantage of this strategy is its flexibility as the same decomposition object can be used for both usages. On the other hand, this strategy does not permit to save memory for fixed size objects (no big deal), and the options of the Ref<> object have to be fixed by us. They include: - Alignment (default OFF) - OuterStride (default Dynamic) - InnerStride (default 1) With your suggestions, i.e., Decomposition<Ref<MatrixXd> >, then the power-user would be able to customize these options but the "inplace" nature of a call to compute() would be less obvious.

This will actually be necessary if we want an allocation-free LAPACK implementation (we'd also need to pass permutation vectors by reference for that).

Actually, this is already the case for Cholesky and LU for which our LAPACK interface uses our internal in-place routines and completely ignore the respective public classes.

Bump. This is becoming a major pain point for certain memory constrained problems that I work on.

As discussed there: https://forum.kde.org/viewtopic.php?f=74&t=133417&p=358470#p358470, declaring PartialPivLU<Ref<MatrixXd> > almost already works. We only have to add a non const ctor like: template<typename MatrixType> template<typename InputType> PartialPivLU<MatrixType>::PartialPivLU(EigenBase<InputType>& matrix) : m_lu(matrix.derived()), m_p(matrix.rows()), m_rowsTranspositions(matrix.rows()), m_l1_norm(0), m_det_p(0), m_isInitialized(false) { compute(matrix.derived()); } Of course, this does not cover the other objects like the permutations and workspace, but those are usually much less concerning. Another limitation is that you cannot call compute() with another matrix to work inplace in the new one: MatrixXd A, B; PartialPivLU<Ref<MatrixXd> > lu(A); // works inplace in A lu.compute(B); // copy B in A and works in A Actually, it is not clear what this second line should do, and copying in A could be the desired behavior. We could address this issue by adding a computeInPlace() method, which would be enabled only if the MatrixType is a Ref<>.

Created attachment 714 [details] Patch for inplace decomposition via SolverName<Ref<Matrix>> This patch adds support for inplace decomposition for: all LU, Cholesky, and QR decompositions.

Created attachment 715 [details] Usage example Here is a simple exemple checking and demonstrating inplace decomposition.

Since the required changes are so small, I'd like to add it for 3.3.

> template<typename InputType> > explicit LDLT(const EigenBase<InputType>& matrix) > : m_matrix(matrix.rows(), matrix.cols()), Can't we just replace the line above by m_matrix(matrix.derived()), and save the additional constructor? As this would cause a second copy (unless the compiler can optimize it away), how about adding another (protected/private) compute() method without parameters, and let the current compute(const EigenBase<...>&) method just make a copy and call that new method? This would also allow calling in-place decompositions on blocks of matrices or temporarily created Maps. The same goes for the other decompositions, of course.

The "problem" is that we need a non-const ctor for the inplace version. The extra copy already exists because the non-const ctor can still be called even without a Ref<> as MatrixType, so yes we should fix it and your suggestion is a good one.

Done: https://bitbucket.org/eigen/eigen/commits/c4e12820b422/ Summary: Bug 707: add inplace decomposition through Ref<> for Cholesky, LU and QR decompositions. Doc: https://bitbucket.org/eigen/eigen/commits/b2e2f917f79b/ Good enough for me. If someone needs more control, such as sharing storage of permutations and Householder coefficients, then let's open a new bug entry.