Talk:Benchmark

From Eigen
Revision as of 14:00, 17 October 2010 by Bjacob (Talk | contribs)


Tvisitor

A few observations:

  • some (if not all?) of the benchmark calculations are done in single precision; ideally that should be changed to double
  • to calculate flops I would count only * and / operations, not + and -

so for the mat*mat operation we would get n*n*n flops rather than the 2*n*n*n used in the benchmark's source code:

 double nb_op_base( void ){
   return 2.0*_size*_size*_size;
 }
  • ATLAS outperforms Eigen on level-3 BLAS: on my system (Core i5 750) I get, for the mat*mat operation (n*n*n flops, double precision):
 n=2000: ATLAS 5250 mflops, Eigen 4840 mflops, non-optimised BLAS 860 mflops
  • Eigen is extremely slow at solving systems of equations using LU (2/3*n*n*n flops, double precision):
 n=2000: ATLAS threaded 12000 mflops, ATLAS 8870 mflops, Eigen 840 mflops, non-optimised BLAS 1960 mflops
 n=5000: ATLAS threaded 21100 mflops, ATLAS 9700 mflops, Eigen 800 mflops

So I'm very disappointed with the LU decomposition performance, but otherwise it's a brilliant package!

Bjacob

First of all, I don't check wiki updates very often so it's only by chance that I found this! Please use the mailing list or forum.

Are you talking about Eigen 2 or 3 (the development branch)? This benchmark refers to an old state of the development branch, so it's somewhere halfway between Eigen 2 and 3.

To answer your points:

  • Lots of people use single precision, so I'm not sure it should be changed to double. Ideally we'd benchmark all types, but that would take too long.
  • About the flop count: + and - are no cheaper than * on modern CPUs. / is more expensive, but there are far fewer / operations, so almost every algorithm spends most of its time in + and *.
  • About ATLAS outperforming Eigen on level-3 BLAS: please try the development branch (which this benchmark is about). These days both ATLAS and Eigen are very fast, i.e. very close to the fastest libraries (MKL, Goto).
  • About LU performance: are you talking about full-pivoting or partial-pivoting LU? Don't compare apples and oranges. If you want partial-pivoting LU (much faster, but less general/reliable), it's available in the development branch.

Please take this discussion to the mailing list if you want to continue it.