Difference between revisions of "Talk:Benchmark"

Revision as of 14:00, 17 October 2010

Tvisitor

A few observations:

  • some (if not all?) of the benchmark calculations are done in single precision; ideally that should be changed to double
  • to calculate flops I would not count + and - operations but only * and /

So for a mat*mat operation we would get n*n*n, not the 2*n*n*n used in the source code of the benchmark:

 double nb_op_base( void ){
   return 2.0*_size*_size*_size;
 }
  • ATLAS outperforms Eigen on level 3 BLAS: on my system (Core i5 750) I get, for the mat*mat operation (n*n*n flops, double precision):
 n=2000: ATLAS 5250 mflops, Eigen 4840 mflops, non-optimised BLAS 860 mflops
  • Eigen is extremely slow at solving equation systems using LU (2/3*n*n*n flops, double precision):
 n=2000: ATLAS threaded 12000 mflops, ATLAS 8870 mflops, Eigen 840 mflops, non-optimised BLAS 1960 mflops
 n=5000: ATLAS threaded 21100 mflops, ATLAS 9700 mflops, Eigen 800 mflops
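
To make the two flop conventions in the list above concrete, here is a minimal sketch of how the same timing turns into different mflops figures. The helper names `matmul_ops` and `mflops` are hypothetical and are not part of the benchmark source; only the 2*n*n*n and n*n*n counts come from the discussion.

```cpp
#include <cassert>

// Operation count for an n x n matrix product (hypothetical helper).
// count_additions = true  -> 2*n^3, multiplications plus additions,
//                            the count returned by nb_op_base in the benchmark
// count_additions = false -> n^3, multiplications only
double matmul_ops(double n, bool count_additions) {
    const double mults = n * n * n;
    return count_additions ? 2.0 * mults : mults;
}

// Convert an operation count and a wall-clock time in seconds to mflops.
double mflops(double ops, double seconds) {
    assert(seconds > 0.0);  // a zero or negative time is a measurement error
    return ops / seconds / 1.0e6;
}
```

The convention only rescales the reported figure by a factor of two: a run reported as 4840 mflops under n*n*n counting would read 9680 mflops under 2*n*n*n counting. Rankings between libraries are unaffected as long as every library is measured with the same count.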

So I'm very disappointed with the LU decomposition performance, but otherwise it's a brilliant package!

Bjacob

First of all, I don't check wiki updates very often so it's only by chance that I found this! Please use the mailing list or forum.

Are you talking about Eigen 2 or 3 (the development branch)? This benchmark refers to an old state of the development branch, so it's somewhere halfway between Eigen 2 and 3.

To answer your points:

  • Lots of people use single precision, so I don't know that it should be changed to double. Ideally we'd benchmark all types, but that would take a long time.
  • About the flop count: + and - are not any cheaper than * on modern CPUs. / is more expensive, but there are far fewer / operations, so almost every algorithm spends most of its time in + and *.
  • About ATLAS outperforming Eigen on level 3 BLAS: please try the development branch (which this benchmark is about). These days, both ATLAS and Eigen are very fast, i.e. very close to the fastest libs (MKL, Goto).
  • About LU performance: are you talking about full-pivoting or partial-pivoting LU? Don't compare apples and oranges. If you want partial-pivoting LU (it's much faster but less general/reliable), you have it in the development branch.
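
One reason the two pivoting strategies are not comparable is the cost of the pivot search alone. The sketch below counts how many matrix entries each strategy has to examine over a full n x n factorization. The function names are hypothetical, and this is only a count of the search work under the textbook formulation, not a model of Eigen's actual implementation.

```cpp
#include <cassert>
#include <cstdint>

// Partial pivoting: at step k, scan one column of the trailing block,
// i.e. n-k entries. Total is n(n+1)/2, so O(n^2).
std::uint64_t partial_pivot_search_cost(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t k = 0; k < n; ++k) total += n - k;
    return total;
}

// Full pivoting: at step k, scan the whole trailing block,
// i.e. (n-k)^2 entries. Total is n(n+1)(2n+1)/6, so O(n^3).
std::uint64_t full_pivot_search_cost(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t k = 0; k < n; ++k) total += (n - k) * (n - k);
    return total;
}

// How many times more entries full pivoting examines than partial pivoting.
double full_to_partial_ratio(std::uint64_t n) {
    assert(n > 0);
    return static_cast<double>(full_pivot_search_cost(n)) /
           static_cast<double>(partial_pivot_search_cost(n));
}
```

The full-pivoting search grows like n*n*n/3, the same order as the 2/3*n*n*n arithmetic of the factorization itself, whereas the partial-pivoting search is a lower-order term. That is why timings of a full-pivoting LU should not be set against a partial-pivoting LAPACK-style routine.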

Please take this discussion to the mailing list if you want to continue it.