Difference between revisions of "Eigen2 benchmark Intel"

Revision as of 02:44, 17 March 2009

Out of curiosity, I ran the BTL tests with Eigen2 compiled with 4 different compilers on an Intel Pentium D CPU:

  • GCC 4.3.3: -O3 -march=native -DNDEBUG
  • GCC 4.1.3: -O3 -march=nocona -msse2 -msse3 -DNDEBUG
  • GCC 4.4.0: -O3 -march=native -DNDEBUG
  • Intel(R) C++ 11.0: -O3 -DNDEBUG -no-ipo -xHOST -ip -static -no-prec-div

Although in my experience the -ipo option (interprocedural optimization) provides good performance benefits, it was explicitly disabled for Intel C++ because the build did not work correctly with it enabled (the tests failed numerically).
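
To make the kind of measurement behind the plots more concrete, here is a minimal sketch of timing an axpy kernel (y <- a*x + y) and checking its result numerically. It is plain C++ and is not BTL's or Eigen's actual code; the names axpy, max_abs_error, run_axpy and BenchResult are made up for this illustration, and reporting throughput in MFLOPS is an assumption about what such a benchmark plots.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Plain-C++ axpy kernel: y <- a*x + y.
// Hypothetical stand-in for the operation an "axpy" benchmark exercises.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] += a * x[i];
    }
}

// Largest absolute deviation of the entries of y from one expected value.
double max_abs_error(const std::vector<double>& y, double expected) {
    double err = 0.0;
    for (double v : y) {
        err = std::max(err, std::fabs(v - expected));
    }
    return err;
}

// Result of one timing run (hypothetical type for this sketch).
struct BenchResult {
    double mflops;     // millions of floating-point operations per second
    double max_error;  // deviation from the analytically known result
};

// Runs `repeats` axpy calls on vectors of length n, then reports throughput
// and a numerical check. With a = 0.5, x = 1 and y starting at 0, every
// entry of y must end up equal to 0.5 * repeats.
BenchResult run_axpy(std::size_t n, int repeats) {
    std::vector<double> x(n, 1.0), y(n, 0.0);

    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) {
        axpy(0.5, x, y);
    }
    const auto stop = std::chrono::steady_clock::now();
    const double seconds = std::chrono::duration<double>(stop - start).count();

    // axpy costs 2 flops per element: one multiply and one add.
    const double flops = 2.0 * static_cast<double>(n) * repeats;

    BenchResult result;
    result.mflops = flops / seconds / 1e6;
    result.max_error = max_abs_error(y, 0.5 * repeats);
    return result;
}
```

Calling run_axpy(n, repeats) from a small main() for a range of sizes n, once per compiler and flag set, gives one curve per compiler. A nonzero max_error for a build would be the sort of numerical failure that led to disabling -ipo here, although the real cause of that failure is not known from this page.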


Rookie conclusions:

  1. The benefit of using newer GCC versions is pretty clear.
  2. In most cases GCC 4.4 is comparable with GCC 4.3, but in some it is almost twice as fast.
  3. Except for the (anomalous) LU decomposition, GCC 4.1 is nowhere near the newer versions of GCC.
  4. Intel C++ does not provide any performance benefit here. This is somewhat surprising, as I was expecting at least some advantage on this CPU. That could be due to IPO being disabled, though.



Axpy compare intel.png


Axpby compare intel.png


Atv compare intel.png


Matrix vector compare intel.png


Matrix matrix compare intel.png


Symv compare intel.png


Syr2 compare intel.png


Aat compare intel.png


Ata compare intel.png


Trisolve compare intel.png


Cholesky compare intel.png


Hessenberg compare intel.png


Tridiagonalization compare intel.png


Lu decomp compare intel.png