Created attachment 575 [details]
There is floating point comparison issue with the routine in SelfAdjointEigenSolver.h (Version 3.2.4) for fixed-size 3x3 matrices.
A tolerance is created as follows:
Scalar safeNorm2 = Eigen::NumTraits<Scalar>::epsilon();
In an older version (3.2.2), if followed the line
safeNorm2 *= safeNorm2;
which has been omitted. For matrices which are close to a multiple of the identity matrix, this change becomes crucial in some comparisons.
Created attachment 576 [details]
this seems to conflict with bug 854
Created attachment 577 [details]
Make direct eigenvalue computation of 3x3 matrices more stable by shifting the eigenvalues
We can make the computation significantly more stable by shifting the matrix by the mean of the eigenvalues (i.e. mat.trace()/3) -- note that (in exact math) this shifts the eigenvalues but does not influence the eigenvectors. We could then also safe the calculation of c2 in computeRoots.
I did not check yet if we could also save some cases of the eigenvector computation, i.e., is it even possible that the 3rd cross product is small? If the original matrix had NaNs or Infs, computeRoot should have failed already.
This shift definitely helps a lot in computing more accurate eigenvalues. I'd only suggest doing ".diagonal().array() -= shift" instead of using Identity.
(In reply to Christoph Hertzberg from comment #3)
> I did not check yet if we could also save some cases of the eigenvector
> computation, i.e., is it even possible that the 3rd cross product is small?
> If the original matrix had NaNs or Infs, computeRoot should have failed
I've checked for several matrices, and any one or twos of the 3 cases can be arbitrarily small.
After more tests, the current cross-product based approach is not very good. It is really reliable only if we compute all three possibilities, and then pick the best, but such a approach kills the performance (recall that high performance is the main motivation for this "direct" path).
So here is an alternative to the cross product approach based on the direct extraction of the nullspace:
// Compute the kernel of 'tmp' using a LU factorization of tmp with symmetric pivoting.
// Note that we only need to compute the factor U, which is of the form:
// x x x
// 0 x x
// 0 0 0
Index i0, i1, i2;
Scalar p0, p1, p2, f;
// Find the first pivot of index i0:
p0 = Scalar(1)/tmp.coeff(i0,i0);
// The other two rows/columns are:
i1 = (i0+1)%3;
i2 = (i0+2)%3;
// Update the first row of U:
tmp.coeffRef(i0,i1) *= p0;
tmp.coeffRef(i0,i2) *= p0;
// Thanks to symmetry, the remaining 2x2 block is of the form:
// p1 f
// f p2
p1 = tmp.coeffRef(i1,i1) - tmp.coeff(i1,i0) * tmp.coeff(i0,i1);
p2 = tmp.coeffRef(i2,i2) - tmp.coeff(i2,i0) * tmp.coeff(i0,i2);
f = tmp.coeffRef(i1,i2) - tmp.coeff(i1,i0) * tmp.coeff(i0,i2);
// Second pivoting:
if(std::abs(p1) < std::abs(p2))
p1 = p2;
// Back-subtitutions to form the 1D kernel of 'tmp':
eivecs.col(k).coeffRef(i2) = 1;
eivecs.col(k).coeffRef(i1) = - f / p1;
eivecs.col(k).coeffRef(i0) = - (tmp.coeff(i0,i2) + tmp.coeff(i0,i1) * eivecs.col(k).coeff(i1) );
It is clearly more costly than doing a single cross product, but it should be faster than doing two.
With both patches, I still haven't find a failing case after running the unit tests for hours...
(In reply to Gael Guennebaud from comment #6)
> So here is an alternative to the cross product approach based on the direct
> extraction of the nullspace: [...]
Sounds good to me. Better a reliable, reasonably fast implementation than a slightly faster implementation which occasionally fails. We will still provide a determinable maximal running time.
A possible flaw with the proposed implementation is that diagonal().maxCoeff() might be zero (though probably that's caught outside the fragment you posted).
> With both patches, I still haven't find a failing case after running the
> unit tests for hours...
We probably also should extend our unit tests by some 'trivial' border cases (e.g., diagonal matrices with very close eigenvalues, as in the original report -- probably with small off-diagonal noise).
I'll provide an updated patch for root computation in the next comment.
Created attachment 578 [details]
Updated root computation
For the computation of roots exploit that m.trace()==0 after shifting the matrix.
Actually, the computation should work for complex (self-adjoint) matrices as well, after replacing every other m(i,j) by conj(m(i,j)) [which of course can be done implicitly], and knowing that imag(m(i,i))==0.
Created attachment 579 [details]
Compute 3x3 eigenvectors using nullspace from specialized LU
Here is a more complete patch for the new eigenvector extraction method.
In the two calls to extract_kernel we are guaranteed that there is at least one meaningful non zero on the diagonal (because we are shifting wrt the minimum and maximum eigenvalues and the case were all eigenvalues are the same is already handled).
However, I'm 100% sure that a symmetric pivoting is guaranteed to succeed (compared to a full pivoting), that is, that we are guaranteed to find a second meaningful pivot along the diagonal...
Regarding the unit tests:
Summary: Make test matrices for eigensolver/selfadjoint even more tricky
Summary: Add regression test for bugs 854 and 1014, and check that the eigenvector matrix is unitary.
I haven't tested it yet, but your updated computeRoots looks good to me.
I found a problem (with both patches applied) when running:
$ ./test/eigensolver_selfadjoint_13 s1000
In that test a matrix is generated with a difference in the diagonal elements of just 4 ULPs. This makes the shift inexact and results in a scaledMatrix with a trace of about 0.66, thus returning bogus eigenvalues (with the patched computeRoots). This barely matters for the final eigenvalues but causes problems computing the eigenvectors.
A pragmatic solution would be to short-circuit the eigenvector computation if scale < abs(shift)*epsilon*8 (maybe 16) and return an identity matrix for the eigenvectors with the shift value as eigenvalues.
Or we just keep the current computeRoots function (it's not too many additional computations ...).
Right, I also observed this kind of problem. I've also locally added a comparison between the "direct" eigenvalues and the ones coming from the iterative algorithm and assuming that trace==0 severely affects the accuracy of the eigenvalues, not only the one of the eigenvector computation.
I've also run some benchmark, and the gain of assuming that trace==0 is negligible: computeRoots is dominated by the trigonometric function calls.
I've also observed that my new eigenvector computation is really slower, so I came up with the following version where the idea is to pick the column having the biggest diagonal element, and then pick the best of the two cross-product possibilities: (seems to be 15% faster even though this imply more muls and adds):
Scalar n0, n1;
VectorType c0, c1;
eivecs.col(l) = tmp.col(i0);
n0 = (c0 = tmp.col(i0).cross(tmp.col((i0+1)%3))).squaredNorm();
n1 = (c1 = tmp.col(i0).cross(tmp.col((i0+2)%3))).squaredNorm();
if(n0>n1) eivecs.col(k) = c0/std::sqrt(n0);
else eivecs.col(k) = c1/std::sqrt(n1);
We can also save some redundant computation between the two cross products (at the cost of readability, so I'll check whether the compiler already managed to get rid of them).
I guess to determine i0 we rather need to select the column with the largest element (or the largest norm), e.g. consider this matrix (with e=epsilon):
2e 0 0
0 -e 1
0 1 -e
But then I'm wondering if simply computing all 3 cross-products may not be faster? (exploiting the symmetry)
Created attachment 580 [details]
Stablely working version
It seems my last concerns are not justified. A near-zero diagonal can only occur for the "middle" eigenvector, which is calculated differently anyways.
I attached a patch combining Gael's last proposal with only minor improvements in computeRoots.
extractKernel now always returns true, so the if-branches should actually be removed. Also, we can save some computations and copies by exploiting the symmetry better.
Furthermore, all computations could be ported to complex self-adjoint matrices.
I pushed the last patch after some clean-up here:
Still must be backported to 3.2 and probably optimized before that.
(In reply to Christoph Hertzberg from comment #15)
> It seems my last concerns are not justified. A near-zero diagonal can only
> occur for the "middle" eigenvector, which is calculated differently anyways.
Yes, that's exactly why I reversed the computation of the closest two eigenvectors.
Thanks for pushing the changes.
I backported it to 3.2, the optimization opportunities are very marginal, so good enough for 3.2.5, and not a priority for 3.3 either.
To summarize, it remains to add support for complexes, and check for some speed improvement opportunities.
I'm opening a dedicated entry for complex.