Created attachment 850 [details]
Minimal reproduction showing error using real life data
LDLT convergence loop fails to converge after commit
Steps to reproduce:
Build and run l.cpp with:
  g++ l.cpp -Ipath/to/Eigen && ./a.out
Actual output:
  minimizeGroupL1 failed to converge in 100 iterations
Build date & Hardware:
Build 2018-04-15 on Ubuntu 17.10
Still present on master as of today.
Reverting the specified commit resolves the issue.
The problem already seems to be present in the Matrix*Diagonal*Vector product. The following minimal example also exposes it:
#include <iostream>
#include <Eigen/Dense>

int main() {
  Eigen::MatrixXd A; A.setRandom(100,4);
  Eigen::VectorXd d, e; d.setRandom(100); e.setRandom(100);
  Eigen::MatrixXd D = d.asDiagonal();
  Eigen::VectorXd B1 = A.transpose()*d.asDiagonal()*e; // affected code path
  Eigen::VectorXd B2 = A.transpose()*D*e;              // reference result
  std::cout << B1 << "\n\n" << B2 << "\n";             // B1 and B2 should be identical
}
The currently failing sparse_product and permutationmatrices unit tests may be related to this.
Gael: Can you have a look at this?
The issue seems to be that `A.transpose()*d.asDiagonal()*e;` is evaluated (slightly simplified) like so:
for(Index i=0; i<rows; ++i)
  dest.coeffRef(i) = (lhs.row(i).cwiseProduct(actual_rhs.transpose())).sum();
Here `lhs` is `A.transpose()*d.asDiagonal()`, and the corresponding `.row(i)` claims to be a block with linear access. However, for the (dense*diagonal).row(i) this evaluates to:
A.packet(linear_start + j) * d.packet(linear_start + j),
and the second factor is obviously wrong: the diagonal must be indexed by the column within the row, not by the linear offset of the block within the full expression.
A possible solution would be to make sure that `Block<Xpr,...>` only sets the `InnerPanel` flag if `Xpr` has the `LinearAccessBit` set. Unfortunately, this flag does not seem to be set consistently -- sometimes it is set in `traits<Xpr>::Flags`, sometimes in `traits<Xpr>::EvaluatorFlags`, and sometimes in `evaluator<Xpr>::Flags`.
An even simpler test is:
sq_m2 = v1.asDiagonal();
sq_m2 = sq_m1 * sq_m2;
VERIFY_IS_APPROX( (sq_m1*v1.asDiagonal()).col(i), sq_m2.col(i) );
I added it in test/diagonalmatrices.cpp
Fix in default and 3.3 branches:
It is simply that the condition for forwarding linear indexing was not strict enough. Hopefully I did not overlook anything this time.
hm, perhaps it would be better to simply define
ForwardLinearAccess = bool(evaluator<ArgType>::Flags&LinearAccessBit)
thus ignoring the InnerPanel flag, which seems irrelevant here. This way `(vec1+vec2).segment(i,n)` would also forward linear indices.
I assume the idea of the InnerPanel flag was that, e.g., A.middleCols(...) can be accessed linearly (if A itself is column-major and can be accessed linearly), but A.block(...) cannot.
For the record, there are 3 cases:
1) Block<Sparse>: enable write access
2) Block<Dense,DirectAccess>: linearization+alignment
3) Block<Dense,Generic>: linearization
For cases 2 and 3 this is welcome information when dealing with small matrices. For case 1 it is vital.
I improved the ForwardLinearAccess flag to activate it for any vector-expression: