New user self-registration is currently disabled. Please email eigen-core-team @ lists.tuxfamily.org if you need an account.
Bug 824 - angularDistance accuracy
angularDistance accuracy
Status: RESOLVED FIXED
Product: Eigen
Classification: Unclassified
Component: Geometry
3.2
All All
: Normal Accuracy Problem
Assigned To: Nobody
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2014-06-12 17:35 UTC by Michael Norel
Modified: 2015-03-04 17:06 UTC (History)
4 users (show)



Attachments

Description Michael Norel 2014-06-12 17:35:01 UTC
using "acos" function near 1.0 provides a terrible inaccuracy in result angle. Small inacuracy of dot product or quaternion length produce awesome angle error.

I recommend to use atan2 function having good accuracy in whole range

fix can look like 

{
  using std::atan2;
  const QuaternionBase<OtherDerived> d = *this * other.conjugate(); 
  return static_cast<Scalar>( Scalar(2) * atan2( d.vec().norm(), d.w() ));
}
Comment 1 Michael Norel 2014-06-12 18:19:08 UTC
 the "w()"  must be positive for positive angle 

atan2( d.vec().norm(), fabs(d.w()) ));
Comment 2 Christoph Hertzberg 2014-06-13 17:59:43 UTC
Yes, that's certainly more stable and I guess the extra cost is justifiable.
However, it should be:
  using std::abs; 
and
  abs(d.w()) instead of fabs 

I'll submit that if nobody objects.
Comment 3 Gael Guennebaud 2014-06-16 10:40:29 UTC
Because of the quaternion product it is more than 5 times slower, though I admit its superior accuracy.
Comment 4 Michael Norel 2014-06-16 10:52:24 UTC
We can use branch, and apply slower algorithm near 1.0 and -1.0, or provide both methods separately.
Comment 5 Christoph Hertzberg 2014-06-16 13:21:40 UTC
(In reply to comment #3)
> Because of the quaternion product it is more than 5 times slower, though I
> admit its superior accuracy.

I admit that I did not benchmark this -- I wrongly assumed that acos or atan would be the significant cost producers here.
Then the question is, if there are people for which the speed of this method is more important than the accuracy.

I'll make this dependent on bug 560 so maybe we can offer different paths for unit and non-unit quaternions.
Unfortunately, even for unit quaternions we still have numerical inaccuracies for quaternions that are close to each other.
Comment 6 Christoph Hertzberg 2014-06-16 13:45:11 UTC
(In reply to comment #4)
> We can use branch, and apply slower algorithm near 1.0 and -1.0, or provide
> both methods separately.

I'm not really in favor of adding a branch for this. 
I think this method is likely to be applied to quaternions close to each other, so the slower branch would be applied most of the time. We'd also introduce a discontinuity for values near the branch.
Comment 7 Gael Guennebaud 2014-06-16 14:33:47 UTC
I observed a significant decrease of accuracy only when the quaternions are very very close. In my benchmark I generated pairs of quaternions from a random angle-axis and a slightly perturbed one as follow:

    float a1 = internal::random<float>(-M_PI, M_PI);
    Vector3f v1 = Vector3f::Random().normalized();
    float a2 = a1+internal::random<float>(-1e-6,1e6);
    Vector3f v2 = (v1+1e-6*Vector3f::Random()).normalized(); 

Then I compute the angular distance using float and the two variants and I compare them to the result of the stable version using double precision. For 100000 runs and using the absolute difference as the error measure,  I get:

                acos-based   ; atan2-based
% >1e-5: 0.53%            ; 0
% >1e-6: 5.3%              ; 0
max      : 0.000716858 ; 3.66889e-07
sum      : 0.0565818     ; 0.00610966

"% >1e-X" is the percentage of runs with an error greater than 1e-X. So that's not that bad.


A compromise would be to make angularDistance accurate using atan2, and offer a "cosineDistance" function which is enough when the distance is used for comparison to some thresholds, and if someone want speed, he has to take his responsibilities and simply do: std::acos(q1.cosineDistance(q2)).
Comment 8 Gael Guennebaud 2014-06-16 14:38:00 UTC
For the record, the new function would return:

using std::min;
return min(Scalar(1), Scalar(2) * numext::abs2(q1.dot(q2)) - Scalar(1))
Comment 9 Christoph Hertzberg 2014-06-16 16:45:13 UTC
If the precision-loss is not that bad either, how about implementing the more precise function under a different name?

The precision loss does somehow depend on the form of error, however.
E.g., if you have two near-unit-quaternions whose product of norms is y with angular distance of x, the error of the current method is approximately 
  acos(cos(x)*y)-x

http://www.wolframalpha.com/input/?i=plot+acos%28cos%28x%29*y%29-x+for+y+from+0.95+to+1.05+and+x+from+0+to+pi
Comment 10 Michael Norel 2015-02-20 14:41:50 UTC
Hi all, sorry for pause.

Gael, there is a problem with your benchmark, cause it uses just normalized quaternions. 

Often , there is a quaternions with small "magnitude" drift from 1.0. For example , after quaternion multiplication, lerping and so on. All this inaccuracy in quaternion length, will produce significant error in angle calculation. So i propose take into account not only acos() inaccuracy but quaternion length too. 

And we return back to the discussion about unit quaternions.
Comment 11 Christoph Hertzberg 2015-02-20 18:00:31 UTC
(In reply to Michael Norel from comment #10)
> Often , there is a quaternions with small "magnitude" drift from 1.0. 

That's exactly what I wanted to illustrate with y=norm(q1)*norm(q2) in comment 9.
Comment 12 Gael Guennebaud 2015-03-02 20:16:13 UTC
After this changeset: https://bitbucket.org/eigen/eigen/commits/4984501c3db8/, I observe a factor 2x only between the two versions.

If I enforce a re-normalization of the input quaternion in the acos based version, then both versions are as fast (or as slow) to each other.

After-all, I'm not against changing for the accurate version by default, and, if needed, add a fast-version under a new name.
Comment 13 Christoph Hertzberg 2015-03-04 15:38:55 UTC
I would prefer accuracy over speed here.
And if the atan and acos versions are equally fast/slow, then your benchmark in comment 7 suggests that the atan2 path is preferable.

I'm also ok with backporting to 3.2

N.B.:
We could think about specializing the implementation of q1*q2.conjugate() and q1.conjugate()*q2, but I guess this is hardly worth the effort, now that conjugate() is vectorized.
Comment 14 Gael Guennebaud 2015-03-04 16:57:52 UTC
Before vectorizing conjugate() I tried a fully vectorized version of angularAngle which is only marginally faster if any, so vectorizing conjugate product seems to be pointless.

For the record, here it is:


  Packet4f a = q1.coeffs().packet<Aligned>(0);
  Packet4f b = q2.coeffs().packet<Aligned>(0);
  Packet4f c = padd(pmul( vec4f_swizzle1(a,3,3,3,3), vec4f_swizzle1(b,3,0,1,2)),
                    pmul( vec4f_swizzle1(a,1,1,2,0), vec4f_swizzle1(b,1,2,0,1)));
  __m128 mask = _mm_set_ss(-0.f);
  c = _mm_xor_ps(c, mask);
  
  __m128 d = psub(padd(pmul(vec4f_swizzle1(a,2,0,1,1),vec4f_swizzle1(b,2,3,3,0)),
                        pmul(vec4f_swizzle1(a,0,2,0,2),b)),
                  c );
  c = _mm_andnot_ps(mask, d);
  d = pmul(d,d);
  d = _mm_sqrt_ss(_mm_add_ss(
      _mm_add_ss(vec4f_swizzle1(d,3,3,3,3),vec4f_swizzle1(d,2,2,2,2)),
      vec4f_swizzle1(d,1,1,1,1)));
  return 2. * std::atan2( pfirst(d), pfirst(c) );

Note You need to log in before you can comment on or make changes to this bug.