New user self-registration is disabled due to spam. Please email eigen-core-team @ lists.tuxfamily.org if you need an account.
Before reporting a bug, please make sure that your Eigen version is up-to-date!
Bug 1318 - More robust quaternion from matrix
Summary: More robust quaternion from matrix
Status: NEW
Alias: None
Product: Eigen
Classification: Unclassified
Component: Geometry (show other bugs)
Version: 3.4 (development)
Hardware: All All
: Normal Accuracy Problem
Assignee: Nobody
URL:
Whiteboard:
Keywords:
Depends on: 458
Blocks: 3.4
  Show dependency treegraph
 
Reported: 2016-09-29 15:27 UTC by Michael Norel
Modified: 2016-09-30 15:21 UTC (History)
4 users (show)



Attachments

Description Michael Norel 2016-09-29 15:27:36 UTC
Conversion code, assumes strong orthonormal 3x3 matrix as input. But in real data our matrix can have small errors. This errors produce nonunity of quaternion.

To demonstrate, how important to keep quaternion normalized, we can use this code:

 Eigen::Quaterniond qc = Eigen::Quaterniond(1.1, 2.2, 3.3, 4.4).normalized();
        for (int i = 0; i < 1000; i++)
        {
            qc *= qc;
            qc = Eigen::Quaterniond(qc.toRotationMatrix()*qc.toRotationMatrix());

            std::cout << "1.0 - qc.squaredNorm(): " << 1.0 - qc.squaredNorm() << std::endl;
        }

Using VC12, i have INF result with only 1000 iterations.


I propose use normalization instead of analytic scale of quaternion in this conversion. It can be slower, but produce unit quaternion. The performance depends on performance of "normalize" and its vectorization.

In Eigen it can look:
{
      ...
      q.w() = t + Scalar(1.0);
      q.x() = mat.coeff(2,1) - mat.coeff(1,2));
      q.y() = mat.coeff(0,2) - mat.coeff(2,0));
      q.z() = mat.coeff(1,0) - mat.coeff(0,1));

      // same with other cases
      ... 
 
      
      normalize(); // normalize on exit    
}
I can provide full version of function if required.
Comment 1 Gael Guennebaud 2016-09-30 07:31:38 UTC
hm, with clang or gcc, after 10000 iterations, I get:

1.0 - qc.squaredNorm(): -4.44089e-16
1.0 - qc.squaredNorm(): 1.11022e-16
1.0 - qc.squaredNorm(): -2.22045e-16
1.0 - qc.squaredNorm(): -2.22045e-16
1.0 - qc.squaredNorm(): 0
1.0 - qc.squaredNorm(): -2.22045e-16
1.0 - qc.squaredNorm(): 1.11022e-16
1.0 - qc.squaredNorm(): 0
1.0 - qc.squaredNorm(): 1.11022e-16
1.0 - qc.squaredNorm(): 0
1.0 - qc.squaredNorm(): 0
1.0 - qc.squaredNorm(): 0
1.0 - qc.squaredNorm(): 2.22045e-16
1.0 - qc.squaredNorm(): 0
1.0 - qc.squaredNorm(): 1.11022e-16
1.0 - qc.squaredNorm(): 1.11022e-16
1.0 - qc.squaredNorm(): 0
Comment 2 Gael Guennebaud 2016-09-30 08:04:04 UTC
Arf, sorry for the noise, I still had the fix of bug 458 when testing your code snippet (i.e., implicit re-normalization in Quaternion::toRotationMatrix()).

Actually, simply doing qc *= qc; is enough to get significant accuracy loss as it quickly converges to qc=0;

It is still unclear to me how to address these issue without huge impact on performance when unitary constraints of quaternions and matrices is already handled by the user. 

My rationale is that if a method can be generalized with reasonable impact on performance (say 20%), then let's go for it. Otherwise, e.g. for Quaternion::toRotationMatrix() as in bug 458, we could add some variants or optional parameters or decorators to indicate whether we guarantee the input is unitary or not. For instance, if we default to slow but safe:

Matrix3 m;
Quaternion q;

q = m; // slow conversion
q = tagged<unitary>(m); // fast conversion
AngleAxis aa(theta, tagged<unit>(vec)); // fast
AngleAxis aa(theta, vec); // shortcut for AngleAxis(theta,tagged<unit>(vec.normalized())
m = q.toRotationMatrix();
m = tagged<unit>(q).toRotationMatrix();

This a bit verbose and might look cumbersome, but this has the advantage to work with operator=, and it is easily extensible and generalizable.
Comment 3 Michael Norel 2016-09-30 08:22:39 UTC
(In reply to Gael Guennebaud from comment #2)
> It is still unclear to me how to address these issue without huge impact on
> performance when unitary constraints of quaternions and matrices is already
> handled by the user. 

Here , we can have different situation than "toRotationMatrix", i expect lower slow-down than 2x. Sorry, this week, i cant run benchmarks with different compilers. But my code introduce a bit more operations that can be vectorized (normalize()). 

If you can check its performance, ill be happy. It should look like 

template<typename Other>
struct quaternionbase_assign_impl<Other,3,3>
{
  typedef typename Other::Scalar Scalar;
  typedef DenseIndex Index;
  template<class Derived> static inline void run(QuaternionBase<Derived>& q, const Other& mat)
  {
    using std::sqrt;
    // This algorithm comes from  "Quaternion Calculus and Fast Animation",
    // Ken Shoemake, 1987 SIGGRAPH course notes
    Scalar t = mat.trace();
    if (t > Scalar(0))
    {
      q.w() = t + Scalar(1.0);
      q.x() = mat.coeff(2,1) - mat.coeff(1,2));
      q.y() = mat.coeff(0,2) - mat.coeff(2,0));
      q.z() = mat.coeff(1,0) - mat.coeff(0,1));
    }
    else
    {
      DenseIndex i = 0;
      if (mat.coeff(1,1) > mat.coeff(0,0))
        i = 1;
      if (mat.coeff(2,2) > mat.coeff(i,i))
        i = 2;
      DenseIndex j = (i+1)%3;
      DenseIndex k = (j+1)%3;

      q.coeffs().coeffRef(i) = mat.coeff(i,i)-mat.coeff(j,j)-mat.coeff(k,k) + Scalar(1.0);
      q.w() = mat.coeff(k,j)-mat.coeff(j,k);
      q.coeffs().coeffRef(j) = mat.coeff(j,i)+mat.coeff(i,j);
      q.coeffs().coeffRef(k) = mat.coeff(k,i)+mat.coeff(i,k);
    }

    q.normalize();
  }
Comment 4 Michael Norel 2016-09-30 08:26:11 UTC
P.S. i wander if trick with indexes is faster than explicit hardcode of all 4 cases, is it correct? i mean , the indexes magic can break the CPU prediction pipeline. 

I mean some kind of explicit code like 
 .... // cut from public sources, matrix can be transposed
if (tr > 0.0f){
            set( m[1][2] - m[2][1], m[2][0] - m[0][2], m[0][1] - m[1][0], tr+1.0f );
        }else                
        if( (m[0][0] > m[1][1] ) && ( m[0][0] > m[2][2]) ) {
            set( 1.0f + m[0][0] - m[1][1] - m[2][2], m[1][0] + m[0][1],
                 m[2][0] + m[0][2], m[1][2] - m[2][1] );
         }else 
        if ( m[1][1] > m[2][2] ){
            set( m[1][0] + m[0][1], 1.0f + m[1][1] - m[0][0] - m[2][2],
                 m[2][1] + m[1][2], m[2][0] - m[0][2] ); 
         }else{
            set( m[2][0] + m[0][2], m[2][1] + m[1][2],
                 1.0f + m[2][2] - m[0][0] - m[1][1], m[0][1] - m[1][0] );
         }
...

can be faster ?
Comment 5 Gael Guennebaud 2016-09-30 13:39:12 UTC
It's even worse, like more than x3 slower:

Current:
float:  0.000559502s
double: 0.00140823s

Proposal:./a.out
float:  0.0026088s
double: 0.00404474s

There is no measurable difference between the index trick and if/then/else.
Comment 6 Gael Guennebaud 2016-09-30 13:44:36 UTC
and if you're puzzled, here is the generated ASM for the q.normalize() part:

	movaps	(%rdi), %xmm0
	movaps	%xmm0, %xmm1
	mulps	%xmm1, %xmm1
	haddps	%xmm1, %xmm1
	haddps	%xmm1, %xmm1
	xorps	%xmm2, %xmm2
	ucomiss	%xmm2, %xmm1
	jbe	LBB1_7
	xorps	%xmm2, %xmm2
	movss	%xmm1, %xmm2            ## xmm2 = xmm1[0],xmm2[1,2,3]
	sqrtss	%xmm2, %xmm2
	shufps	$0, %xmm2, %xmm2        ## xmm2 = xmm2[0,0,0,0]
	divps	%xmm2, %xmm0
	movaps	%xmm0, (%rdi)
LBB1_7:
Comment 7 Michael Norel 2016-09-30 15:21:44 UTC
Sorry , you wasted time for this.
Again big surprise. Same situation as "toRotationMatrix".

Note You need to log in before you can comment on or make changes to this bug.