Difference between revisions of "Benchmark"

From Eigen
(12 intermediate revisions by 2 users not shown)
Line 1: Line 1:
The following benchmark results have been generated using a (heavily) modified version of the Benchmark for Templated Libraries ([http://projects.opencascade.org/btl/ BTL]) from Laurent Plagne. Our modified version can be found in the svn repository in eigen2/bench/btl. We did our best to make the best use of each library, however, any hints on making a lib working better are welcome. All libs have been configured to use '''dynamic-size column-major matrices''' and only '''one thread'''.
+
 
 +
<div style="border:solid;border-color:red;padding:1em 1em 1em 1em">
 +
This page is rather out of date. Please help us to generate and maintain up-to-date benchmarks.
 +
 
 +
See also the [[performance monitoring]] page for benchmark of Eigen along time.
 +
</div>
 +
 
 +
The following benchmark results have been generated using a (heavily) modified version of the Benchmark for Templated Libraries ([http://projects.opencascade.org/btl/ BTL]) from Laurent Plagne. Our modified version can be found in the mercurial repository under eigen/bench/btl. We did our best to make the best use of each library, however, any hints on making a lib working better are welcome. All libs have been configured to use '''dynamic-size column-major matrices''' and only '''one thread'''. [[How to run the benchmark suite|Try it yourself]].
  
 
'''Higher is better.''' By MFLOPS we mean millions of (effective) arithmetic operations per second. The reason why the values are typically low for small sizes, is that in this benchmark we deal with dynamic-size matrices which are relatively inefficient for small sizes. The reason why some libraries/benchmarks show a decline for large sizes, is that for such large matrices issues of CPU cache friendliness become predominant.
 
'''Higher is better.''' By MFLOPS we mean millions of (effective) arithmetic operations per second. The reason why the values are typically low for small sizes, is that in this benchmark we deal with dynamic-size matrices which are relatively inefficient for small sizes. The reason why some libraries/benchmarks show a decline for large sizes, is that for such large matrices issues of CPU cache friendliness become predominant.
  
In this benchmark we only included the fastest BLAS libraries because all other C++ matrix libraries we have tested (ublas, mtl4, blitz, gmm) are consistently much slower. To get an idea, see this [[Benchmark-August2008|previous benchmark]].
+
Previous benchmarks:
 +
* [[Benchmark-August2008|August 2008]]: Eigen 2, includes Eigen w/o vectorization, MKL, Goto, Atlas, ublas, mtl4, blitz, and gmm++.
 +
* [[Benchmark-March2009|March 2009]]: Early version of eigen3, includes Eigen w/o vectorization, MKL, Goto, Atlas, and ACML.
  
 
Here is the list of the libraries included in the following benchmarks:
 
Here is the list of the libraries included in the following benchmarks:
* '''eigen2''': [[Main_Page|ourselves]], with the default options (SSE2 vectorization enabled).
+
* '''eigen3''': [[Main_Page|ourselves]], with the default options (SSE2 vectorization enabled).
* '''eigen2_novec''': [[Main_Page|ourselves]] but with Eigen's explicit vectorization disabled. However, gcc's auto-vectorization was enabled.
+
* '''eigen2''': the previous stable version of Eigen, with the default options (SSE2 vectorization enabled).
 
* '''INTEL_MKL''': The [http://www.intel.com/cd/software/products/asmo-na/eng/307757.htm Intel Math Kernel Library], which includes a BLAS/LAPACK (11.0). Closed-source.
 
* '''INTEL_MKL''': The [http://www.intel.com/cd/software/products/asmo-na/eng/307757.htm Intel Math Kernel Library], which includes a BLAS/LAPACK (11.0). Closed-source.
 
* '''ACML''': The [http://www.amd.com/acml AMD's core math library], which includes a BLAS/LAPACK (4.2.0). Closed-source.
 
* '''ACML''': The [http://www.amd.com/acml AMD's core math library], which includes a BLAS/LAPACK (4.2.0). Closed-source.
* '''GOTO''': The [http://www.csar.cfs.ac.uk/user_information/software/maths/goto.shtml GOTO BLAS] library (1.26). Non-free license. This library has been compiled by hand specifically for the core2 architecture.
+
* '''GOTO''': The [http://www.csar.cfs.ac.uk/user_information/software/maths/goto.shtml GOTO BLAS] library (2-1.13). This library have been compiled by hand specifically for the penryn architecture.
* '''ATLAS''': The [http://math-atlas.sourceforge.net/ math-atlas] BLAS library (3.8.3). This library has been compiled by hand specifically for the core2 architecture.
+
* '''ATLAS''': The [http://math-atlas.sourceforge.net/ math-atlas] BLAS library (3.8.3). This library has been compiled by hand specifically for the penryn architecture.
  
== 17 March 2009 ==
+
== 23 March 2011 ==
<include src="http://download.tuxfamily.org/eigen/btl-results/index.html" nopre noesc />
+
<include src="http://download.tuxfamily.org/eigen/btl-results-110323/index-110323.html" nopre noesc />

Revision as of 15:28, 7 December 2016

This page is rather out of date. Please help us to generate and maintain up-to-date benchmarks.

See also the performance monitoring page for benchmarks of Eigen over time.

The following benchmark results were generated using a (heavily) modified version of the Benchmark for Templated Libraries (BTL) from Laurent Plagne. Our modified version can be found in the mercurial repository under eigen/bench/btl. We did our best to get the most out of each library; however, any hints on making a library perform better are welcome. All libraries have been configured to use dynamic-size column-major matrices and only one thread. Try it yourself.

Higher is better. By MFLOPS we mean millions of (effective) arithmetic operations per second. Values are typically low for small sizes because this benchmark uses dynamic-size matrices, which are relatively inefficient at small sizes. Some libraries and benchmarks show a decline for large sizes because, for such large matrices, CPU cache friendliness becomes the dominant factor.
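To make the MFLOPS figure concrete, here is a minimal sketch of how such a number can be computed for a matrix-vector product on a dynamic-size column-major matrix. It is not the BTL code: the function names (mflops, matvec_colmajor, benchmark_matvec) are hypothetical, the matrix is a plain std::vector rather than a type from any of the benchmarked libraries, and the count of 2*n*n operations per product (one multiply and one add per matrix entry) is an assumption about what "effective" operations means here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Millions of arithmetic operations per second for a given operation count
// and elapsed time in seconds. (Hypothetical helper, not part of BTL.)
double mflops(double flop_count, double seconds) {
    assert(seconds > 0.0);
    return flop_count / seconds / 1e6;
}

// y = A * x for an n-by-n matrix stored column-major in a flat array:
// entry (i, j) lives at a[i + j * n]. The inner loop walks down one column,
// so memory is accessed contiguously.
void matvec_colmajor(const std::vector<double>& a,
                     const std::vector<double>& x,
                     std::vector<double>& y,
                     std::size_t n) {
    assert(a.size() == n * n && x.size() == n && y.size() == n);
    for (std::size_t i = 0; i < n; ++i) y[i] = 0.0;
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < n; ++i)
            y[i] += a[i + j * n] * x[j];
}

// Times `repeats` products of size n and returns the effective MFLOPS,
// assuming 2*n*n operations per product. Returns 0 if the clock measured
// no elapsed time.
double benchmark_matvec(std::size_t n, int repeats) {
    std::vector<double> a(n * n, 1.0), x(n, 1.0), y(n, 0.0);
    volatile double sink = 0.0;  // keeps the compiler from dropping the work
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        matvec_colmajor(a, x, y, n);
        sink = sink + y[0];
    }
    const auto stop = std::chrono::steady_clock::now();
    const double seconds = std::chrono::duration<double>(stop - start).count();
    if (seconds <= 0.0) return 0.0;
    return mflops(2.0 * static_cast<double>(n) * static_cast<double>(n) * repeats,
                  seconds);
}
```

Calling benchmark_matvec for a range of sizes (for example 8, 64, 512, 4096) gives one MFLOPS value per size, which is the kind of curve the plots on this page show. The absolute numbers from this naive loop are not comparable to those of the tuned libraries listed below.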

Previous benchmarks:

  • August 2008: Eigen 2, includes Eigen w/o vectorization, MKL, Goto, Atlas, ublas, mtl4, blitz, and gmm++.
  • March 2009: Early version of eigen3, includes Eigen w/o vectorization, MKL, Goto, Atlas, and ACML.

Here is the list of the libraries included in the following benchmarks:

  • eigen3: ourselves, with the default options (SSE2 vectorization enabled).
  • eigen2: the previous stable version of Eigen, with the default options (SSE2 vectorization enabled).
  • INTEL_MKL: The Intel Math Kernel Library, which includes a BLAS/LAPACK (11.0). Closed-source.
  • ACML: AMD's Core Math Library, which includes a BLAS/LAPACK (4.2.0). Closed-source.
  • GOTO: The GOTO BLAS library (2-1.13). This library has been compiled by hand specifically for the penryn architecture.
  • ATLAS: The math-atlas BLAS library (3.8.3). This library has been compiled by hand specifically for the penryn architecture.

23 March 2011

The results for this run were included from http://download.tuxfamily.org/eigen/btl-results-110323/index-110323.html, which could not be embedded here.