Bugzilla – Bug 613

Bug in internal::psqrt SSE implementation

Last modified: 2013-06-14 09:52:16 UTC

Created attachment 346 [details] Patch to fix the problem. However, it has a lengthy comment that you probably can just remove from the real distribution. The internal::psqrt routine in Eigen/src/Core/arch/SSE/MathFunctions.h appears to have a bug whereby values in the range [1.17549e-038, 1.19209e-007] return a square root value of exactly 0.0, rather than a true square root. This is because the code that computes the non_zero_mask on the Packet4f entry by comparing to see which values are greater than std::numeric_limits<float>::epsilon() (i.e. 1.19209e-007), rather than std::numeric_limits<float>::min() (i.e. 1.17549e-038). The fix is to change this line: Packet4f non_zero_mask = _mm_cmpgt_ps(_x, pset1<Packet4f>(std::numeric_limits<float>::epsilon())); To: Packet4f non_zero_mask = _mm_cmpgt_ps(_x, pset1<Packet4f>(std::numeric_limits<float>::min())); A test program that exhaustively shows where values in this range compute sqrt results that differ by more than 0.00001 from the default sqrt(float) implementation is below. It might be worth incorporating the style of this test program into the library to test all the other specialized Packet implementations in MathFunctions.h in a similar manner (i.e. make sure that the Packet SSE implementations compute answers that are very similar to the default single-value implementations for a large number of randomly selected floating point values). Bug fix for Eigen3 psqrt() operation, where values that were between 0 and std::numeric_limits<float>::epsilon() (1.19209e-007) would be given a value of 0, rather than something closer to what the sqrt(x) function in math.h would return. Wrote simple test program (shown below) to illustate the problem: % g++ -o sqrt_issue -lm -I/google/src/cloud/jeff/eigen-sqrt/google3/third_party/eigen3/ sqrt_issue.cc Without this fix, the program generates the following output: % ./sqrt_issue Sampled 873254095 values out of 873254095 ... Mismatch sample: sqrt() 0 vs. SQRT 0.000345043 for 1.19055e-07 Mismatch sample: sqrt() 0 vs. SQRT 0.000345146 for 1.19126e-07 Mismatch sample: sqrt() 0 vs. SQRT 0.000345249 for 1.19197e-07 Mismatches: 86251777 of 873254095 With the fix, the program shows 0 mismatches: % ./sqrt_issue Sampled 873254095 values out of 873254095 Mismatches: 0 of 873254095 Where sqrt_issue.cc is: ------------------------ /* Program to illustrate an issue with the Eigen psqrt implementation for SSE. Author: Jeff Dean (firstname@google.com) */ #include <stdio.h> #include <math.h> #include <vector> #include "Eigen/Core" typedef unsigned int uint32; namespace { float SQRT(float x) { return sqrt(x); } } int main(int argc, char** argv) { std::vector<float> v; float max_value = 1.1 * std::numeric_limits<float>::epsilon(); int count = 0; for (uint32 i = 0; i < (1ull<<32)-1; i++) { float f = *(reinterpret_cast<float*>(&i)); if (f >= 0.0 && f <= max_value) { count++; // Randomly sample problematic values v.push_back(f); } } const int N = v.size(); fprintf(stderr, "Sampled %d values out of %d\n", N, count); std::vector<float> v2(N), v3(N); Eigen::Map<Eigen::ArrayXf> r1(&v2[0], N); Eigen::Map<Eigen::ArrayXf> r2(&v3[0], N); Eigen::Map<Eigen::ArrayXf> in(&v[0], N); r1 = in.sqrt(); r2 = in.unaryExpr(std::ptr_fun(SQRT)); int mismatches = 0; for (int i = 0; i < N; i++) { if (fabs(r1[i] - r2[i]) > 0.00001) { mismatches++; if ((mismatches % 10000) == 0) { fprintf(stdout, "Mismatch sample: sqrt() %g vs. SQRT %g for %g\n", r1[i], r2[i], in[i]); } } } fprintf(stderr, "Mismatches: %d of %d\n", mismatches, N); }

That's indeed a legitimate bug, as sqrt (like any power function with exponent strictly between 0 and 1) has the property of making small numbers less small. I agree with the proposed 1-line change, for what it's worth.

Thanks a lot for this very detailed report. Fixed with regression unit tests: https://bitbucket.org/eigen/eigen/commits/99f4e5961f47/ Changeset: 99f4e5961f47 User: ggael Date: 2013-06-13 18:12:58 Summary: Extend the magnitude range of tested numbers in packet math unit tests https://bitbucket.org/eigen/eigen/commits/34f92f66abdb/ Changeset: 34f92f66abdb User: Jeff Dean Date: 2013-06-13 18:17:27 Summary: Fix bug 613: psqrt was incorrect for small numbers

Wouldn't it be easier to just compare by zero? Or do I miss something here?

We cannot because then rsqrt will overflow for a value like (0.5 * std::numeric_limits<float>::min()). Test program: #include <Eigen/Core> #include <iostream> using namespace Eigen; int main() { float a = 0.5 * (std::numeric_limits<float>::min)(); Array<float,1,4> A(a,a,a,a), B; B = A.sqrt(); std::cout << a << " -> " << sqrt(a) << " == " << 1.f/std::sqrt(1.f/a) << " == " << B << "\n"; return 0.; } On the other hand, we could change _mm_cmpgt_ps to _mm_cmpge_ps.

(In reply to comment #3) > Wouldn't it be easier to just compare by zero? Or do I miss something here? Thanks for fixing. +Christoph Hertzberg: I initially tried comparing with 0.0, and got a few -Inf sqrt results when I implemented it that way. My test program will print out a few examples of failing values if you run it when the line in the bug fix is instead set to be: Packet4f non_zero_mask = _mm_cmpgt_ps(_x, pset1<Packet4f>(.0f));

@Jeff: Yes, I discussed this with Gael on IRC yesterday. The problem is that mm_rsqrt returns +Inf for denormalized numbers. I should have posted that here, to avoid confusion ...