Working notes - SMP support
Some development using OpenMP has started in this fork: http://bitbucket.org/ggael/eigen-smp
Underlying threading model
Possibilities:
- OpenMP
  - pro:
    - easy to implement
    - easy to enable for the user (compiler flag)
  - cons:
    - difficult to control the generated code => difficult to control the overhead
- pthreads
  - pro:
    - everything is possible
    - should not be too hard to use for our relatively simple use cases
  - cons:
    - a bit more work for us
- Intel's TBB
  - I see only cons:
    - Too high level for our purpose.
    - Not free
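To make the OpenMP "easy to implement / easy to enable" point concrete, here is a minimal sketch of a coefficient-wise kernel annotated for OpenMP. The function name `axpy` and the use of `std::vector` are assumptions made for illustration only; this is not code from Eigen or from the fork.

```cpp
#include <cstddef>
#include <vector>

// Illustrative kernel (hypothetical, not Eigen code): y += alpha * x.
// The only parallel-specific line is the pragma. Built without an OpenMP
// flag (e.g. -fopenmp for GCC/Clang), the pragma is ignored and the loop
// runs serially with the same result.
void axpy(float alpha, const std::vector<float>& x, std::vector<float>& y)
{
    // A signed index keeps the loop valid for older OpenMP versions.
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(x.size());
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}
```

The flip side is the "cons" entry above: how the iterations are split and how threads are started and synchronized is decided by the compiler and its runtime, so the per-call overhead is hard to control from our side.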
API
We need to figure out how to easily control the parallelization. Goals:
- Must be controllable at the expression level.
- The default is no parallelization, since we expect parallelization to occur at the highest level, i.e., outside Eigen. Indeed, parallelizing the outermost loop is usually the best strategy, though sometimes it might be complicated.
- Question: do we need compile-time control, or is the cost of one memory access to a global variable plus one `if` (or maybe one function call) always acceptable for dynamic sizes?
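A rough sketch of what the question above is weighing, assuming a single global setting. All names here (`sketch::g_max_threads`, `set_max_threads`, `threads_for`, the `SKETCH_NO_SMP` macro) and the 1024-coefficient threshold are hypothetical; nothing below is an existing Eigen API.

```cpp
#include <cstddef>

namespace sketch {

// Global run-time setting. The default of 1 means "no parallelization".
static int g_max_threads = 1;

// Values below 1 are clamped to 1 (serial).
inline void set_max_threads(int n) { g_max_threads = (n < 1) ? 1 : n; }

// Number of threads a kernel working on `size` coefficients should use.
// `min_size_per_thread` is an arbitrary placeholder threshold.
inline int threads_for(std::ptrdiff_t size,
                       std::ptrdiff_t min_size_per_thread = 1024)
{
#ifdef SKETCH_NO_SMP
    // Compile-time control: the run-time check is compiled out entirely.
    (void)size;
    (void)min_size_per_thread;
    return 1;
#else
    const int max_threads = g_max_threads;   // one global memory access
    if (max_threads == 1) return 1;          // one `if` on the serial path
    const std::ptrdiff_t useful = size / min_size_per_thread;
    if (useful <= 1) return 1;               // too small to be worth splitting
    return useful < max_threads ? static_cast<int>(useful) : max_threads;
#endif
}

} // namespace sketch
```

With the default setting, a dynamic-size kernel pays for one load and one branch before taking its usual serial path. Whether that is negligible for very small dynamic sizes is exactly the open question; the macro shows one possible way to remove even that cost at compile time.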
References
- A paper discussing the new threading model of ATLAS: http://math-atlas.sourceforge.net/timing/newThr395/index.html