Bugzilla – Bug 642

Add vectorization of sqrt for doubles

Last modified: 2013-08-19 16:01:59 UTC

Currently SSE module has a packet sqrt function only for Packet4f, not for Packet2d, so any double-type coefficient-wise expression containing sqrt has to be calculated using single-data commands. I suggest you add template<> EIGEN_STRONG_INLINE Packet2d psqrt<Packet2d>(const Packet2d& a) { return _mm_sqrt_pd(a); } so that sqrt can be vectorized. Also, it is my understanding that the algorithm used to calculate sqrt in the current implementation of psqrt<Packet4f> (via inverse sqrt) has reduced accuracy, so maybe use it only under EIGEN_FAST_MATH flag, letting the user use the full accuracy of _mm_sqrt_ps otherwise?

I fully agree. I guess the main problem with the fast implementation are wrong results for negative values (should be NaN), +inf (should be inf) and denormalized values (rsqrt treats them as 0.0f, resulting in inf which leads to a NaN after the refinement).

Related discussion on the ML (old!): http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2009/03/msg00114.html And I agree with Christoph, the main pb is not on the accuracy but on the handling of negative, inf and denorms.

https://bitbucket.org/eigen/eigen/commits/97d9a7772106/ Changeset: 97d9a7772106 User: ggael Date: 2013-08-19 16:02:27 Summary: Fix bug 642: add vectorization of sqrt for doubles, and make sqrt really safe if EIGEN_FAST_MATH is disabled