Difference between revisions of "FAQ"

From Eigen
(New question: How do I get good performance?)
(New question: Where in my program are temporary objects created?)

Revision as of 17:15, 2 January 2012

Licensing

The Licensing FAQ has moved there.

Eigen and other libraries

Should I use Eigen?

Probably, but check the pitfalls first.

Why another matrix library? What is the need for Eigen?

First of all, see the Overview. No other library provides all of the features and benefits listed there.

The Eigen project started when some hackers from the large KDE meta-project realized the need for a single unified matrix library.

Some other libraries satisfy certain specialized needs very well, but none is as versatile as Eigen, has such a nice API, etc.

The fact that so many projects are quickly adopting Eigen 2 shows that it fills a gap.

The state of existing matrix libraries before Eigen is that:

  • some are Free Software
  • some are fast
  • some have a decent API
  • some handle fixed-size matrices, some handle dynamic-size dense matrices, some handle sparse matrices
  • some provide linear algebra algorithms (LU, QR, ...)
  • some provide a geometry framework (quaternions, rotations...)

However, Eigen is the first library to satisfy all these criteria.

How does Eigen compare to BLAS/LAPACK?

Eigen covers many things that BLAS/LAPACK don't:

  • Eigen handles fixed-size matrices and vectors, which are very widely used.
  • Eigen has built-in support for sparse matrices and vectors.
  • Eigen provides a lot of convenience features (see Geometry module, Array module, etc), which are also very widely used.

Using only one thread, Eigen compares very well performance-wise against the existing BLAS implementations. See the benchmark. It shows that:

  • Eigen is faster than every Free BLAS, such as ATLAS or Boost::uBlas.
  • Eigen is overall of comparable speed (faster or slower depending on what you do) to the best BLAS, namely Intel MKL and GOTO, both of which are non-Free.

However, currently Eigen parallelizes only general matrix-matrix products (bench), so it doesn't by itself take much advantage of parallel hardware.

Eigen has an incomparably better API than BLAS and LAPACK.

  • See the API Showcase.
  • For operations involving complex expressions, Eigen is inherently faster than any BLAS implementation because it can handle and optimize a whole operation globally -- while BLAS forces the programmer to split complex operations into small steps that match the BLAS fixed-function API, which incurs inefficiency due to introduction of temporaries. See for instance the benchmark result of a Y = a*X + b*Y operation which involves two calls to BLAS level1 routines while Eigen automatically generates a single vectorized loop.
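The Y = a*X + b*Y case above can be sketched in plain C++. This is only an illustration: the hand-written loops below stand in for the two BLAS level 1 calls and for the single loop Eigen generates, and the function names `scale_then_axpy` and `fused_axpby` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two passes over y, mimicking a "scale" call followed by an "axpy"-style call.
// y is read and written twice.
void scale_then_axpy(double a, const std::vector<double>& x,
                     double b, std::vector<double>& y) {
  assert(x.size() == y.size());
  for (std::size_t i = 0; i < y.size(); ++i) y[i] *= b;         // y = b*y
  for (std::size_t i = 0; i < y.size(); ++i) y[i] += a * x[i];  // y = a*x + y
}

// One pass over y: the whole expression is evaluated per coefficient.
void fused_axpby(double a, const std::vector<double>& x,
                 double b, std::vector<double>& y) {
  assert(x.size() == y.size());
  for (std::size_t i = 0; i < y.size(); ++i) y[i] = a * x[i] + b * y[i];
}
```

Both functions compute the same result; the fused version simply touches each element of y once instead of twice.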

Miscellaneous advantages (not specifically against BLAS/LAPACK):

  • Eigen is only a compile-time dependency for your project. No need to redistribute, or ask your user to install, any library.
  • Eigen is small, so it is feasible to include a copy of it in your own source tree, if you want to.
  • Eigen is multi-platform, and is actually being used on a number of different operating systems, hardware platforms, and compilers.
  • Eigen, compared to certain other C++ template libraries, is relatively easy on the compiler. Compilation times stay reasonable -- we are very careful about that.

Compilation

I need help with compiler errors!

  • Did you forget to include a header? See Pit_Falls.
  • Did you check whether you triggered a static assertion? These are compile-time checks guarding against programming mistakes, and Eigen has many of them. So even if you got a kilometer of compiler output, you might still find useful information in the static assertion messages. Search for "static_assert". The static assertion messages themselves are UPPERCASE_SO_THEY_REALLY_STAND_OUT.
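To show what such a check looks like, here is a minimal sketch of a compile-time size check in standard C++11. It does not use Eigen: `TinyMatrix` and `add` are hypothetical names, and the message text only mimics the uppercase style described above.

```cpp
#include <cstddef>

// Hypothetical fixed-size matrix type, for illustration only.
template <std::size_t Rows, std::size_t Cols>
struct TinyMatrix {
  double data[Rows * Cols];
};

// Adding two matrices of different sizes fails at compile time, and the
// uppercase message appears somewhere in the compiler output.
template <std::size_t R1, std::size_t C1, std::size_t R2, std::size_t C2>
TinyMatrix<R1, C1> add(const TinyMatrix<R1, C1>& a, const TinyMatrix<R2, C2>& b) {
  static_assert(R1 == R2 && C1 == C2, "YOU_MIXED_MATRICES_OF_DIFFERENT_SIZES");
  TinyMatrix<R1, C1> out;
  for (std::size_t i = 0; i < R1 * C1; ++i) out.data[i] = a.data[i] + b.data[i];
  return out;
}
```

Calling `add` with a `TinyMatrix<2, 2>` and a `TinyMatrix<3, 3>` would stop the build, and searching the output for the uppercase message leads straight to the mistake.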

Known MSVC issues

  • MSVC 2010 sometimes crashes when the "enable browse" compiler option (/FR) is activated.

Runtime

I need help with Assert crashes!

The asserts are there to protect you from later unexplained crashes due to bad memory accesses.

When you hit such an assert, rerun your program in a debugger and obtain a backtrace. Make sure that you have compiled your program with enough debugging info. This way, you will quickly be able to trace back to the root cause of the problem :)

The most dreaded assert is the "Unaligned array" assert. As you can see, that page is there to help you fix it. If however you are desperate about it, you can always get rid of it.

Other assertions are typically triggered when you have accessed coefficients with out-of-range indices in a matrix; or when you have mixed matrices of mismatched sizes.

Optimization

How do I get good performance?

There are many aspects to this question, but here are some points to get you started:

  • Make sure you compile with optimization enabled. This can easily gain you a factor of ten or more.
  • Define the NDEBUG macro when compiling; this disables some run-time checks, which speeds up your program.
  • Enable vectorization, as described in How can I enable vectorization?.
  • If your matrices are very small (size between 2 and 4), then using fixed-size matrices instead of dynamic-size matrices can get you a substantial speed-up.
  • Sometimes, quite a lot of time is spent creating temporary objects to hold intermediate results. See the next question to track this down.
  • Matrix multiplications are costly. See Writing Efficient Matrix Product Expressions for some advice.
  • General programming advice also applies here. In particular, profile your code, find the bottleneck, and optimize that part.

Where in my program are temporary objects created?

The Eigen library sometimes creates temporary matrices to hold intermediate results. This usually happens silently and may slow down your program, so it is useful to track down where temporary objects are created.

One possibility is to run your program under a debugger and set a breakpoint which will be triggered when a temporary is created. For instance, you can set a breakpoint in check_that_malloc_is_allowed() in Eigen/src/Core/util/Memory.h (this function is probably inlined if you compile with optimizations enabled).

Another possibility is to define the macro EIGEN_NO_MALLOC when compiling. This causes your program to abort whenever Eigen allocates memory on the heap, which is what happens when a dynamic-size temporary is created. More fine-grained checks are possible with the EIGEN_RUNTIME_NO_MALLOC macro. A minimal usage example follows:

#define EIGEN_RUNTIME_NO_MALLOC // Define this symbol to enable runtime tests for allocations
#include <Eigen/Dense>
 
int main(int argc, char** argv)
{
  // It's OK to allocate here
  Eigen::MatrixXd A = Eigen::MatrixXd::Random(20, 20);
 
  Eigen::internal::set_is_malloc_allowed(false);
  // It's NOT OK to allocate here
  // An assertion will be triggered if an Eigen-related heap allocation takes place
 
  Eigen::internal::set_is_malloc_allowed(true);
  // It's OK to allocate again
}

Vectorization

Which SIMD instruction sets are supported by Eigen?

Eigen supports SSE, AltiVec and ARM NEON.

With SSE, at least SSE2 is required. SSE3, SSSE3 and SSE4 are optional, and will automatically be used if they are enabled.

Of course vectorization is not mandatory -- you can use Eigen on any old CPU.

How can I enable vectorization?

You just need to tell your compiler to enable the corresponding instruction set, and Eigen will then detect it. If it is enabled by default, then you don't need to do anything.

On the x86 architecture, SSE is not enabled by default by most compilers. You need to enable SSE2 (or newer) manually. For example, with GCC, you would pass the -msse2 command-line option.

On the x86-64 architecture, SSE2 is generally enabled by default.

On PowerPC, you have to use the following flags: -maltivec -mabi=altivec.

On ARM NEON, the following: -mfpu=neon -mfloat-abi=softfp.
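To see whether your flags actually enabled an instruction set, you can inspect the macros that the compiler predefines. The sketch below assumes the usual GCC/Clang macros (`__SSE2__`, `__ALTIVEC__`, `__ARM_NEON`) and the MSVC ones (`_M_X64`, `_M_IX86_FP`); it does not use Eigen, and the function name `enabled_simd_name` is made up for this example.

```cpp
#include <string>

// Reports which SIMD instruction set the compiler was told to target,
// based on compiler-predefined macros (not on Eigen's own detection).
std::string enabled_simd_name() {
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
  return "SSE2 or newer";
#elif defined(__ALTIVEC__)
  return "AltiVec";
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
  return "ARM NEON";
#else
  return "none";
#endif
}
```

If this returns "none" on a machine that supports SIMD, the corresponding compiler flag is most likely missing from your build.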

How can I disable vectorization?

You can disable Eigen's vectorization by defining the EIGEN_DONT_VECTORIZE preprocessor symbol.

If you also want to disable the "unaligned array" assertion or the 128-bit alignment code, see the next entry below.

Also notice that your compiler may still be auto-vectorizing.

I disabled vectorization, but I'm still getting annoyed about alignment issues!

For example, you're still getting the "unaligned array" assertion.

If you want to get rid of it, you have two possibilities:

  • Define EIGEN_DONT_ALIGN (this requires Eigen 2.0.6 or later). That disables all 128-bit alignment code, and in particular everything vectorization-related. But do note that this in particular breaks ABI compatibility with vectorized code.
  • Or define both EIGEN_DONT_VECTORIZE and EIGEN_DISABLE_UNALIGNED_ARRAY_ASSERT. This keeps the 128-bit alignment code and thus preserves ABI compatibility.

If you want to know why defining EIGEN_DONT_VECTORIZE doesn't by itself disable 128-bit alignment and the assertion, here's the explanation:

  • It doesn't disable the assertion, because otherwise code that runs fine without vectorization would suddenly crash when enabling vectorization.
  • It doesn't disable 128-bit alignment, because that would mean that vectorized and non-vectorized code are not mutually ABI-compatible. This ABI compatibility is very important, even for people who develop only an in-house application, as for instance one may want to have in the same application a vectorized path and a non-vectorized path.

How does vectorization depend on the compiler?

Eigen has its own vectorization system; it does not rely at all on the compiler to vectorize automatically. However, it still needs some support from the compiler, in the form of intrinsic functions representing a single SIMD instruction each.

Eigen will automatically enable its vectorization if a supported SIMD instruction set and a supported compiler are detected. Otherwise, Eigen will automatically disable its vectorization and go on.

Eigen vectorization supports the following compilers:

  • GCC 4.2 and newer,
  • MSVC 2008 and newer,
  • All other compilers (for example it works with ICC).

Of course the reason why we "support all other compilers" is that so far we haven't seen other examples of compilers on which we should disable Eigen vectorization. If you know some, please let us know.

What can't be vectorized?

SSE, AltiVec and NEON work with packets of 128 bits, or 16 bytes. This means 4 ints, or 4 floats, or 2 doubles. Moreover, it is often required that the packets themselves be 128-bit aligned. Eigen automatically takes care of all that for you, but there are a few cases where it really can't vectorize. It will then automatically fall back to non-vectorized code, so that again is transparent to you, except of course that the resulting code isn't as fast as if it were vectorized.

Now, there is a big difference between dynamic-size and fixed-size vectors and matrices.

The bad cases are fixed sizes that are not multiples of 16 bytes. For example, Vector2f, Vector3f, Vector3d, Matrix3f, Matrix3d.

However, you may be able to use the Vector4f class to perform Vector3f operations and make use of vectorization if you can carefully ensure that the last component is always zero.

All other cases are good cases and are successfully vectorized by Eigen:

  • fixed sizes that are multiples of 16 bytes. For example, Vector2d, Vector4f, Vector4d, Matrix2f, Matrix2d, Matrix4f, Matrix4d, Transform3f, Transform3d.
  • all dynamic sizes (typically larger). Here, the size is not even required to be a multiple of 16 bytes. For example, VectorXf, VectorXd, MatrixXf, MatrixXd.
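The fixed-size rule above is simple arithmetic on byte counts, which can be sketched as follows. This is an illustration only, not Eigen's actual logic; it assumes 4-byte floats and 8-byte doubles, and the names `kPacketBytes` and `fills_whole_packets` are made up for this example.

```cpp
#include <cstddef>

// Packet width shared by SSE, AltiVec and NEON: 128 bits.
constexpr std::size_t kPacketBytes = 16;

// True when a fixed-size object made of `count` scalars, each `scalar_bytes`
// wide, occupies a whole number of 16-byte packets.
constexpr bool fills_whole_packets(std::size_t scalar_bytes, std::size_t count) {
  return (scalar_bytes * count) % kPacketBytes == 0;
}

static_assert(fills_whole_packets(sizeof(float), 4), "Vector4f: 16 bytes");
static_assert(!fills_whole_packets(sizeof(float), 3), "Vector3f: 12 bytes");
static_assert(fills_whole_packets(sizeof(double), 2), "Vector2d: 16 bytes");
static_assert(!fills_whole_packets(sizeof(double), 9), "Matrix3d: 72 bytes");
```

The same check explains the Vector4f trick: padding a 3-float vector with a fourth component brings it from 12 to 16 bytes.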

Eigen also has some current limitations that can and will be overcome in the future. For instance, some advanced operations, such as visitors, aren't currently vectorized (this is on our to-do list).

How can I check that vectorization is actually being used?

First you can check that Eigen vectorization is enabled: the EIGEN_VECTORIZE preprocessor symbol is then defined.

Then, you may want to check the resulting assembly code. This is the best way to check that vectorization actually happened for a specific operation. Add some asm comments in your code around a line of code you're interested in, and tell your compiler to output assembly code. With GCC you could do:

Vector4f a, b;
asm("#it begins here!");
a += b;
asm("#it ends here!");

See Studying assembly output for more information.

Algorithms

Is there a method to compute the (Moore-Penrose) pseudoinverse?

There's no such method currently, but for plain matrices, you can do it easily using the SVD decomposition. For example, you can add a method to the SVD class (you could also do it from outside):

 void pinv(MatrixType& pinvmat)
 {
   ei_assert(m_isInitialized && "SVD is not initialized.");
   double pinvtoler = 1.e-6; // choose your tolerance wisely!
   SingularValuesType m_sigma_inv = m_sigma;
   for (long i = 0; i < m_workMatrix.cols(); ++i) {
     // invert only the singular values above the tolerance; zero out the rest
     if (m_sigma(i) > pinvtoler)
       m_sigma_inv(i) = 1.0 / m_sigma(i);
     else
       m_sigma_inv(i) = 0;
   }
   pinvmat = (m_matV * m_sigma_inv.asDiagonal() * m_matU.transpose());
 }
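The key step in the method above is the thresholded inversion of the singular values. As a sketch, here it is isolated in plain C++ without Eigen; the function name `invert_singular_values` is made up for this example, and the tolerance remains something you have to choose for your own problem.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns 1/sigma[i] for singular values above `tolerance` and 0 otherwise,
// so that near-zero singular values do not blow up the pseudoinverse.
std::vector<double> invert_singular_values(const std::vector<double>& sigma,
                                           double tolerance) {
  assert(tolerance >= 0.0);
  std::vector<double> inverted(sigma.size(), 0.0);
  for (std::size_t i = 0; i < sigma.size(); ++i) {
    if (sigma[i] > tolerance) inverted[i] = 1.0 / sigma[i];
  }
  return inverted;
}
```

The resulting values play the role of m_sigma_inv: placed on a diagonal between V and the transpose of U, they give the pseudoinverse.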

There's no pseudo inverse for sparse matrices (yet?) in Eigen.