This bugzilla service is closed. All entries have been migrated to https://gitlab.com/libeigen/eigen
Bug 1438 - Implement numerically stable summation methods
Implement numerically stable summation methods
 Status: ASSIGNED None Eigen Unclassified Tensor (show other bugs) 3.3 (current stable) All All Low Accuracy Problem Nobody accuracy, wrong-result

 Reported: 2017-06-20 04:54 UTC by Lakshay Garg 2019-12-04 17:03 UTC (History) 6 users (show) benoit.steiner.goog chtz gael.guennebaud jacob.benoit.1 lakshayg rmlarsen

Attachments
A proposed patch [Work in progress] (2.79 KB, patch)
2017-06-20 05:50 UTC, Lakshay Garg
no flags Details | Diff
Proof-of-concept tree reduction (2.63 KB, text/x-c++src)
2018-11-09 19:25 UTC, Christoph Hertzberg
no flags Details

 Lakshay Garg 2017-06-20 04:54:37 UTC ```Currently, when asked to sum a tensor, the library naively adds the numbers from left-to-right but this approach can lead to loss of precision in certain cases (example shown below). To prevent this loss of precision, I would like to propose the implementation of summation methods like Kahan or pairwise summation. #include #include int main() { using std::cout; Eigen::Matrix2f mat1, mat2; mat1 << 100000, 0.001, -100000, 0.001; cout << "sum1: " << mat1.sum() << '\n'; // expected: 0.002, output: 0 mat2 << 0.001, 0.001, -1, 1; cout << "sum2: " << mat2.sum() << '\n'; // expected: 0.002, output: 0.002 }``` Lakshay Garg 2017-06-20 05:50:33 UTC ```Created attachment 791 [details] A proposed patch [Work in progress] I tried to resolve the issue by implementing a slightly modified version of the Kahan summation algorithm (https://en.wikipedia.org/wiki/Kahan_summation_algorithm#Further_enhancements). The attached patch basically removes the inbuilt SumReducer and replaces it with the new sum reduce in order to check if it makes a difference. All the unit tests which pass on the latest bitbucket version also pass after this patch but the problem still remains. Maybe I did not make changes at the appropriate place or I am not testing properly. Since this is my first contribution to eigen, any hints and suggestions are appreciated.``` Gael Guennebaud 2017-06-25 06:45:55 UTC `Your path modifies the Tensor module only (so Eigen::Tensor objects only), whereas your test is checking Eigen::Matrix.` Lakshay Garg 2017-06-25 07:41:24 UTC `@gael Thanks for your reply. It would be very helpful if you could you point me to the files I need to modify in order to change the behaviour globally.` Christoph Hertzberg 2017-07-18 12:17:44 UTC ```@Lakshay: In case you did not find it meanwhile: The sum (and other) reductions for matrices is implemented in Eigen/src/Core/Redux.h. Namely, you need to specialize/re-implement `redux_impl`, or you need to implement an alternative to `scalar_sum_op` (which is implemented in Eigen/src/Core/functors/BinaryFunctors.h). However overall, I doubt that this is really worth the effort, except in special cases (and it is easy to construct cases, where even Kahan summation fails, e.g., some permutation of [1e20, 1, 1e-20, 1e-20, -1, -1e20]).``` Lakshay Garg 2017-07-18 12:30:07 UTC ```Thanks for the feedback Christopher. You are right in saying the Kahan summation does not work in all the cases but a better alternative is available which solves the problem. Please see https://en.wikipedia.org/wiki/Kahan_summation_algorithm#Further_enhancements for details. Also, since this method is likely to reduce the efficiency of summation, I think it would make sense to implement it as a separate function.``` Rasmus Munk Larsen 2018-09-12 21:20:17 UTC ```Proposed solution in: https://bitbucket.org/eigen/eigen/pull-requests/486/use-numerically-stable-tree- reduction-in/diff``` Christoph Hertzberg 2018-11-09 19:25:15 UTC ```Created attachment 891 [details] Proof-of-concept tree reduction I attached a proof-of-concept tree reduction. It misses handling of the last elements (currently it only works correctly for sizes which are multiples of 64), and just implements Packet4f-based addition. However, it is not only more accurate (at least for sufficiently large sizes), but also faster than the current implementation (as long as the input fits into the cache). Without properly handling the last elements, this comparison is of course slightly unfair. Examples (Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz): \$ ./a.out 128 5000 double: 10.0117, diff redux: 6.55651e-07, diff tree: 6.55651e-07 GFLOPS: redux_best: 6.74629, tree_best: 8.00961 double: 1.69471, diff redux: 3.57628e-07, diff tree: 0 GFLOPS: redux_best: 6.74629, tree_best: 8.00961 double: 3.73089, diff redux: -5.36442e-07, diff tree: 8.9407e-07 GFLOPS: redux_best: 6.74664, tree_best: 8.00961 ... \$ ./a.out 1280000 500 double: -66.3165, diff redux: 0.00162375, diff tree: -8.52346e-05 GFLOPS: redux_best: 5.42728, tree_best: 5.59063 double: -350.063, diff redux: -0.0029037, diff tree: 5.65052e-05 GFLOPS: redux_best: 5.43354, tree_best: 5.59152 ...``` Gael Guennebaud 2018-11-13 12:54:39 UTC `Note that your reduce<> is equivalent to the already implemented redux_vec_unroller.` Nobody 2019-12-04 17:03:19 UTC ```-- GitLab Migration Automatic Message -- This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/1438.```

 Note You need to log in before you can comment on or make changes to this bug.