Summary: | More robust quaternion from matrix | ||
---|---|---|---|
Product: | Eigen | Reporter: | Michael Norel <minorlogic> |
Component: | Geometry | Assignee: | Nobody <eigen.nobody> |
Status: | NEW --- | ||
Severity: | Accuracy Problem | CC: | chtz, gael.guennebaud, hauke.heibel, jacob.benoit.1 |
Priority: | Normal | ||
Version: | 3.4 (development) | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Bug Depends on: | 458 | ||
Bug Blocks: | 814 |
Description
Michael Norel
2016-09-29 15:27:36 UTC
hm, with clang or gcc, after 10000 iterations, I get: 1.0 - qc.squaredNorm(): -4.44089e-16 1.0 - qc.squaredNorm(): 1.11022e-16 1.0 - qc.squaredNorm(): -2.22045e-16 1.0 - qc.squaredNorm(): -2.22045e-16 1.0 - qc.squaredNorm(): 0 1.0 - qc.squaredNorm(): -2.22045e-16 1.0 - qc.squaredNorm(): 1.11022e-16 1.0 - qc.squaredNorm(): 0 1.0 - qc.squaredNorm(): 1.11022e-16 1.0 - qc.squaredNorm(): 0 1.0 - qc.squaredNorm(): 0 1.0 - qc.squaredNorm(): 0 1.0 - qc.squaredNorm(): 2.22045e-16 1.0 - qc.squaredNorm(): 0 1.0 - qc.squaredNorm(): 1.11022e-16 1.0 - qc.squaredNorm(): 1.11022e-16 1.0 - qc.squaredNorm(): 0 Arf, sorry for the noise, I still had the fix of bug 458 when testing your code snippet (i.e., implicit re-normalization in Quaternion::toRotationMatrix()). Actually, simply doing qc *= qc; is enough to get significant accuracy loss as it quickly converges to qc=0; It is still unclear to me how to address these issue without huge impact on performance when unitary constraints of quaternions and matrices is already handled by the user. My rationale is that if a method can be generalized with reasonable impact on performance (say 20%), then let's go for it. Otherwise, e.g. for Quaternion::toRotationMatrix() as in bug 458, we could add some variants or optional parameters or decorators to indicate whether we guarantee the input is unitary or not. For instance, if we default to slow but safe: Matrix3 m; Quaternion q; q = m; // slow conversion q = tagged<unitary>(m); // fast conversion AngleAxis aa(theta, tagged<unit>(vec)); // fast AngleAxis aa(theta, vec); // shortcut for AngleAxis(theta,tagged<unit>(vec.normalized()) m = q.toRotationMatrix(); m = tagged<unit>(q).toRotationMatrix(); This a bit verbose and might look cumbersome, but this has the advantage to work with operator=, and it is easily extensible and generalizable. (In reply to Gael Guennebaud from comment #2) > It is still unclear to me how to address these issue without huge impact on > performance when unitary constraints of quaternions and matrices is already > handled by the user. Here , we can have different situation than "toRotationMatrix", i expect lower slow-down than 2x. Sorry, this week, i cant run benchmarks with different compilers. But my code introduce a bit more operations that can be vectorized (normalize()). If you can check its performance, ill be happy. It should look like template<typename Other> struct quaternionbase_assign_impl<Other,3,3> { typedef typename Other::Scalar Scalar; typedef DenseIndex Index; template<class Derived> static inline void run(QuaternionBase<Derived>& q, const Other& mat) { using std::sqrt; // This algorithm comes from "Quaternion Calculus and Fast Animation", // Ken Shoemake, 1987 SIGGRAPH course notes Scalar t = mat.trace(); if (t > Scalar(0)) { q.w() = t + Scalar(1.0); q.x() = mat.coeff(2,1) - mat.coeff(1,2)); q.y() = mat.coeff(0,2) - mat.coeff(2,0)); q.z() = mat.coeff(1,0) - mat.coeff(0,1)); } else { DenseIndex i = 0; if (mat.coeff(1,1) > mat.coeff(0,0)) i = 1; if (mat.coeff(2,2) > mat.coeff(i,i)) i = 2; DenseIndex j = (i+1)%3; DenseIndex k = (j+1)%3; q.coeffs().coeffRef(i) = mat.coeff(i,i)-mat.coeff(j,j)-mat.coeff(k,k) + Scalar(1.0); q.w() = mat.coeff(k,j)-mat.coeff(j,k); q.coeffs().coeffRef(j) = mat.coeff(j,i)+mat.coeff(i,j); q.coeffs().coeffRef(k) = mat.coeff(k,i)+mat.coeff(i,k); } q.normalize(); } P.S. i wander if trick with indexes is faster than explicit hardcode of all 4 cases, is it correct? i mean , the indexes magic can break the CPU prediction pipeline. I mean some kind of explicit code like .... // cut from public sources, matrix can be transposed if (tr > 0.0f){ set( m[1][2] - m[2][1], m[2][0] - m[0][2], m[0][1] - m[1][0], tr+1.0f ); }else if( (m[0][0] > m[1][1] ) && ( m[0][0] > m[2][2]) ) { set( 1.0f + m[0][0] - m[1][1] - m[2][2], m[1][0] + m[0][1], m[2][0] + m[0][2], m[1][2] - m[2][1] ); }else if ( m[1][1] > m[2][2] ){ set( m[1][0] + m[0][1], 1.0f + m[1][1] - m[0][0] - m[2][2], m[2][1] + m[1][2], m[2][0] - m[0][2] ); }else{ set( m[2][0] + m[0][2], m[2][1] + m[1][2], 1.0f + m[2][2] - m[0][0] - m[1][1], m[0][1] - m[1][0] ); } ... can be faster ? It's even worse, like more than x3 slower: Current: float: 0.000559502s double: 0.00140823s Proposal:./a.out float: 0.0026088s double: 0.00404474s There is no measurable difference between the index trick and if/then/else. and if you're puzzled, here is the generated ASM for the q.normalize() part: movaps (%rdi), %xmm0 movaps %xmm0, %xmm1 mulps %xmm1, %xmm1 haddps %xmm1, %xmm1 haddps %xmm1, %xmm1 xorps %xmm2, %xmm2 ucomiss %xmm2, %xmm1 jbe LBB1_7 xorps %xmm2, %xmm2 movss %xmm1, %xmm2 ## xmm2 = xmm1[0],xmm2[1,2,3] sqrtss %xmm2, %xmm2 shufps $0, %xmm2, %xmm2 ## xmm2 = xmm2[0,0,0,0] divps %xmm2, %xmm0 movaps %xmm0, (%rdi) LBB1_7: Sorry , you wasted time for this. Again big surprise. Same situation as "toRotationMatrix". -- GitLab Migration Automatic Message -- This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/1318. |