Difference between revisions of "Todo for 3.0"

From Eigen
Jump to: navigation, search
 
(21 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
[[Image:Eigen_Owl_Todo.jpg|right]]
 
[[Image:Eigen_Owl_Todo.jpg|right]]
 +
 +
'''This page is deprecated. The new way of tracking the 3.0 release is using the [http://eigen.tuxfamily.org/bz/show_bug.cgi?id=25 tracking bug] on Bugzilla.'''
  
 
This Todo is to keep track of the things that must be done for Eigen 3.0. There is a separate [[Todo]] for long-term and generalist Todo items, not specifically for 3.0. There is also a separate page tracking the [[status of unsupported modules]].
 
This Todo is to keep track of the things that must be done for Eigen 3.0. There is a separate [[Todo]] for long-term and generalist Todo items, not specifically for 3.0. There is also a separate page tracking the [[status of unsupported modules]].
  
Also see the [[Release schedule for 3.0]].
+
 
  
 
Note that a ton of stuff has already been done, and isn't mentioned here, so this page isn't at all intend to give an idea of all of what there will be in 3.0.
 
Note that a ton of stuff has already been done, and isn't mentioned here, so this page isn't at all intend to give an idea of all of what there will be in 3.0.
Line 38: Line 40:
 
|-
 
|-
 
| 0
 
| 0
| Required for 3.0-beta1
+
| Required for 3.0-beta3
 
|-
 
|-
 
| 1
 
| 1
Line 58: Line 60:
 
! Status
 
! Status
 
! Priority
 
! Priority
 +
|-
 +
| Hessenberg decomposition
 +
| Reimplement, copying HouseholderQR. Now that we have HouseholderSequence with shift, it should be really just a matter of copying what HouseholderQR is doing, with a shift of 1.
 +
|
 +
| 0
 +
| 2
 +
|-
 +
| Householder complex conjugation
 +
| I seem to remember there's something awkward in our Housholder API, namely we force the user to complex-conjugate Householder coefficients where we should take care of that. Investigate.
 +
| Benoit
 +
| 0
 +
| 0
 +
|-
 +
| HouseholderSequence API cleanup
 +
| Is it really good to have these constructors taking all these integer parameters? Is "actualVectors" really a good name? Investigate
 +
| Benoit
 +
| 0
 +
| 0
 +
|-
 +
| Direct-access type unification
 +
| All direct access Block/Map types should be unified so as to reduce as much as possible the number of template instantiations. E.g. Block<Block<Matrix>> is just a Map. This could mean that the return type of block(...) becomes an internal thing that we must always hide behind a BlockReturnType helper.
 +
|
 +
| 1 - some discussions
 +
| 2
 +
|-
 +
| no malloc in fixed-size products
 +
| Fixed-size matrix products should never malloc. It's OK if this means that they never do any cache-friendliness/parallellization. Guaranteeing no malloc is more important.
 +
|
 +
| 2 - some discussions
 +
| 0
 +
|-
 +
| useless matrix-vector product instantiations
 +
| The cache-friendly matrix-vector product code is instantiated too many times because of a useless template parameter. This should be fixable now that we have strided map.
 +
|
 +
| 2 - some discussions
 +
| 1
 +
|-
 +
| Strides API
 +
| The current strides API is not very satisfactory. Can we improve it?
 +
|
 +
| 2 - some discussions
 +
| 0
 
|-
 
|-
 
| Core - xpr trees
 
| Core - xpr trees
 
| Investigate making nest-by-value the default for expressions and killing NestByValue. Refactoring of product handling, temporaries and nesting expressions.
 
| Investigate making nest-by-value the default for expressions and killing NestByValue. Refactoring of product handling, temporaries and nesting expressions.
 
| Gael, Hauke
 
| Gael, Hauke
| 4
+
| 5
 
| 0
 
| 0
 
|-
 
|-
Line 68: Line 112:
 
| Do in the QR module (ColPiv and FullPiv), in the SVD module, and in the self-adjoint eigensolver, the same rank-determining API as in FullPivLU. That involves making rank(), isInvertible() etc... methods, using a threshold controlled by a setThreshold method. That also involves adding kernel/image methods.
 
| Do in the QR module (ColPiv and FullPiv), in the SVD module, and in the self-adjoint eigensolver, the same rank-determining API as in FullPivLU. That involves making rank(), isInvertible() etc... methods, using a threshold controlled by a setThreshold method. That also involves adding kernel/image methods.
 
| Benoit
 
| Benoit
| 2 - done in ColPiv QR. Remains to do at least FullPivQR (easy) and SVD and perhaps selfadj eigensolver.
+
| 2 - done in ColPiv QR. Remains to do at least FullPivQR (easy) and perhaps selfadj eigensolver.
 
| 2
 
| 2
 
|-
 
|-
Line 91: Line 135:
 
| Tests : original LAPACK testsuite
 
| Tests : original LAPACK testsuite
 
| The idea is that once we have a LAPACK library implemented using Eigen, we should be able to run the original LAPACK testsuite directly against it.
 
| The idea is that once we have a LAPACK library implemented using Eigen, we should be able to run the original LAPACK testsuite directly against it.
|
+
|  
 
| 1 - thread "back from google"
 
| 1 - thread "back from google"
 
| 2 - however we need precision testing anyhow for 3.0-beta1
 
| 2 - however we need precision testing anyhow for 3.0-beta1
Line 114: Line 158:
 
|-
 
|-
 
| Eigenvalues for real general matrices
 
| Eigenvalues for real general matrices
| The implementation borrowed from JAMA should be rewritten in a clean way separating the Real Schur decomposition as for the complex case. Then the ComplexEigenSolver and EigenSolver should be merged.
+
| The implementation borrowed from JAMA should be rewritten in a clean way separating the Real Schur decomposition as for the complex case. <s>Then the ComplexEigenSolver and EigenSolver should be merged.</s>
|
+
| Jitse
| 1 - some threads
+
| 4
 
| 0, or consider moving part of Eigenvalues module to Unsupported
 
| 0, or consider moving part of Eigenvalues module to Unsupported
 
|-
 
|-
Line 127: Line 171:
 
| cache/block size at runtime
 
| cache/block size at runtime
 
| Currently we only allow to control the cache/block size at compile-time. Big users like Google need to control it at runtime, e.g. to have a single binary that runs on various hardware.
 
| Currently we only allow to control the cache/block size at compile-time. Big users like Google need to control it at runtime, e.g. to have a single binary that runs on various hardware.
| Benoit
+
| Gael
| 1 - thread "back from google"
+
| 4
 
| 0
 
| 0
 
|-
 
|-
Line 134: Line 178:
 
| Improve/simplify them according to discussion on the list
 
| Improve/simplify them according to discussion on the list
 
| Benoit
 
| Benoit
| 1 - thread on the list
+
| 5 - thread on the list
 
| 0
 
| 0
 
|-
 
|-
| SVD improvements
+
| New SVD
| SVD is still essentially C code with for loops. Need to make it use Householder module, need to support all rectangular sizes and complex numbers.
+
| The old SVD has been removed. Need to write a shiny new DivConqSVD implementing Jessup-Sorensen divide-and-conquer SVD, with an efficient (i.e. blocked) bidiagonalization step, and with a fancy API allowing to compute/apply U and V on demand after the decomposition has been done.
 
| Benoit
 
| Benoit
 
| 1 - been talking about it forever
 
| 1 - been talking about it forever
| 0
+
| 2
|-
+
| Polar decomposition API
+
| Currently some functions in SVD. The first thing is that the same thing should be done in JacobiSVD. The next question is whether it's worth putting that code in a base class. And finally, the API itself is open for improvements, at least the function names could be less clunky.
+
| Benoit
+
| 0
+
| 0, or consider moving to Unsupported
+
|-
+
| LDLT on triangles
+
| At the moment there's a BUG comment in LDLT that claims that it does not use only one triangle of the matrix. Investigate and fix.
+
|
+
| 0
+
| 1
+
 
|-
 
|-
 
| Blocked Householder
 
| Blocked Householder
 
| At the moment our Householder transformations are not blocked, doing this and adapting decompositions to use this would result in large performance improvements and is a priority because it seems as if it might reveal needs for API improvements.
 
| At the moment our Householder transformations are not blocked, doing this and adapting decompositions to use this would result in large performance improvements and is a priority because it seems as if it might reveal needs for API improvements.
 
|  
 
|  
| 1 - been discussed forever
+
| 3 - done for non-pivoting HouseholderQR, now do ColPiv...
 
| 2
 
| 2
 
|-
 
|-
Line 166: Line 198:
 
| 5
 
| 5
 
| 2
 
| 2
|-
 
| Permutation matrix
 
| We should add a special matrix class "PermutationMatrix" and use that in various pivoting decompositions to improve the API and factor out the code to apply such permutations.
 
| Benoit
 
| 5 - Can be further improved (return value proxys...), but it seems like compatibility can be kept
 
| 0
 
 
|-
 
|-
 
| SVD - bidiagonalization
 
| SVD - bidiagonalization
| Since SVD does a bidiagonalization anyway, we could expose it in the same way that we expose tridiagonalization.
+
| I think we should axe the UpperBidiagonalization class. I thought it would be useful to implement the new SVD, it's not.
 
| Benoit
 
| Benoit
 
| 4
 
| 4
| 2
 
|-
 
| Tridiagonalization
 
| The tridiagonalization API should leverage the BandMatrix class, right? Since that's API stuff and Tridiagonalization is already here, that's becoming a priority. Either that or make the Tridiagonalization internal for now. Also, if the "Bidiagonalization" item gets done, the same will apply to it.
 
|
 
| 0
 
 
| 0
 
| 0
 
|-
 
|-
Line 194: Line 214:
 
| Decide on API; use new ReturnByValue mechanism? Decide on whether it should move to supported.  
 
| Decide on API; use new ReturnByValue mechanism? Decide on whether it should move to supported.  
 
| Jitse
 
| Jitse
| 4 - usable, but issues remain (see email 2010-02-18)
+
| 5
 
| 3
 
| 3
 
|-
 
|-
Line 200: Line 220:
 
| Implement Schur-Parlett algorithm of Davies & Higham. Depends on: eigenvalues module (for complex Schur decomposition) and permutation matrix.
 
| Implement Schur-Parlett algorithm of Davies & Higham. Depends on: eigenvalues module (for complex Schur decomposition) and permutation matrix.
 
| Jitse
 
| Jitse
| 4 - as above
+
| 5
 
| 3
 
| 3
 
|-
 
|-
Line 216: Line 236:
 
|-
 
|-
 
| SSE4
 
| SSE4
| It seems to provide at least tiny dot products (even on data smaller than a packet) and integer multiplication. Since i have this shiny core i7, i should try using it.
+
| It seems to provide at least tiny dot products (even on data smaller than a packet) and integer multiplication.
| Benoit
+
|
| 2 - integer mul is done. Now play with dot products...
+
| 2 - integer mul is done. Now play with dot products and all the rest...
 
| 3
 
| 3
|-
 
| Refactoring of our expression template design
 
| [[Internal design documentation]]
 
| Gael
 
| 3 - merged
 
| 0
 
 
|-
 
|-
 
| Uniform external library support
 
| Uniform external library support
Line 231: Line 245:
 
| Gael
 
| Gael
 
| 2 -  
 
| 2 -  
| 1
+
| 1 if any module using that makes it into 3.0
 
|-
 
|-
 
| Enable expression template for Transform
 
| Enable expression template for Transform
Line 237: Line 251:
 
| Gael
 
| Gael
 
| 1 -  
 
| 1 -  
| 0
+
| 0 - or declare not doable because of all the operations that don't map well to xpr templates??
 
|-
 
|-
 
| Optimize diagonal products even more
 
| Optimize diagonal products even more
Line 244: Line 258:
 
| 1 -  
 
| 1 -  
 
| 3
 
| 3
 +
|-
 +
| Documentation
 +
| See [[Documentation for Eigen3]] for a planned outline.
 +
|
 +
| 3?
 +
| 0-2?
 +
|-
 +
| Const-correctness
 +
| Map allows to circumvent const-correctness.
 +
|
 +
| 1 - mailing list discussion
 +
| 0
 
|}
 
|}

Latest revision as of 13:54, 21 December 2016

Eigen Owl Todo.jpg

This page is deprecated. The new way of tracking the 3.0 release is using the tracking bug on Bugzilla.

This Todo is to keep track of the things that must be done for Eigen 3.0. There is a separate Todo for long-term and generalist Todo items, not specifically for 3.0. There is also a separate page tracking the status of unsupported modules.


Note that a ton of stuff has already been done, and isn't mentioned here, so this page isn't at all intend to give an idea of all of what there will be in 3.0.

Note that this is a sortable table, so try to make it sort nicely. For the "status" column, let's use this convention:

Status Meaning
0 nothing started
1 some discussion happened
2 some development started
3 sufficient features coded
4 sufficient features and mostly stable API
5 sufficient features, tested, with fully stable API

For the "priority" column, let's use this convention:

Priority Meaning
0 Required for 3.0-beta3
1 Required for 3.0
2 Very important feature, but should not block 3.0
3 Can I have a pony, too?

For the "who" column, put whoever started working on it or is planning to, multiple persons possible, don't hesitate to add yourself as more help is always welcome!

Topic Task Who Status Priority
Hessenberg decomposition Reimplement, copying HouseholderQR. Now that we have HouseholderSequence with shift, it should be really just a matter of copying what HouseholderQR is doing, with a shift of 1. 0 2
Householder complex conjugation I seem to remember there's something awkward in our Housholder API, namely we force the user to complex-conjugate Householder coefficients where we should take care of that. Investigate. Benoit 0 0
HouseholderSequence API cleanup Is it really good to have these constructors taking all these integer parameters? Is "actualVectors" really a good name? Investigate Benoit 0 0
Direct-access type unification All direct access Block/Map types should be unified so as to reduce as much as possible the number of template instantiations. E.g. Block<Block<Matrix>> is just a Map. This could mean that the return type of block(...) becomes an internal thing that we must always hide behind a BlockReturnType helper. 1 - some discussions 2
no malloc in fixed-size products Fixed-size matrix products should never malloc. It's OK if this means that they never do any cache-friendliness/parallellization. Guaranteeing no malloc is more important. 2 - some discussions 0
useless matrix-vector product instantiations The cache-friendly matrix-vector product code is instantiated too many times because of a useless template parameter. This should be fixable now that we have strided map. 2 - some discussions 1
Strides API The current strides API is not very satisfactory. Can we improve it? 2 - some discussions 0
Core - xpr trees Investigate making nest-by-value the default for expressions and killing NestByValue. Refactoring of product handling, temporaries and nesting expressions. Gael, Hauke 5 0
Decompositions - rank API Do in the QR module (ColPiv and FullPiv), in the SVD module, and in the self-adjoint eigensolver, the same rank-determining API as in FullPivLU. That involves making rank(), isInvertible() etc... methods, using a threshold controlled by a setThreshold method. That also involves adding kernel/image methods. Benoit 2 - done in ColPiv QR. Remains to do at least FullPivQR (easy) and perhaps selfadj eigensolver. 2
Decompositions - kernel for given dimension In all rank-revealing decompositions, it would be nice to have a function to construct the kernel matrix for a prescribed dimension of the kernel. At the very least for SVD decompositions, where this would just be taking the n singular vectors associated to the n smallest singular values. In real-world use cases, this is the most useful way to get the kernel. The same idea should then be applied to image() for good measure. Benoit 1 - discussed with Keir 2
Binary library: BLAS Implement a BLAS library using Eigen. Useful both in itself, and as a prerequisite for other Todo items (ultimately for using the LAPACK test-suite). Gael 2 - see blas/ directory 2 - Gael: I don't see it as a stopper since there exist a couple of other BLAS libraries.
Binary library: LAPACK Implement a LAPACK library using Eigen. Useful both in itself, and as a prerequisite for the Todo item about using the LAPACK test-suite. 1 - some threads 2
Tests : original LAPACK testsuite The idea is that once we have a LAPACK library implemented using Eigen, we should be able to run the original LAPACK testsuite directly against it. 1 - thread "back from google" 2 - however we need precision testing anyhow for 3.0-beta1
Tests : precision-oriented tests Most of our current tests only check for rough sanity of the result, we need precision-oriented tests too. Since it's impossible to predict what the precision of a computation should be in general, one should manually record precision at a time when the algorithm is known to be sane, and make that the benchmark to check against, to prevent future regressions. Another approach is to compare precision to other existing libraries, but then, probably only some LAPACK libs are reliable enough for that. Also notice that if the "LAPACK test suite" todo item is achieved, it will already guarantee precision quite nicely, making this todo item less of an emergency. 2 - there is inverse_4x4 2
Core - optimize partial reductions Partial reductions can be evaluated/vectorized in a more clever way with respect to the relative storage order Gael 1 - I know how to do it ;) 2
Sparse - API stability for basic features The goal is to have a stable sparse module for all basics features (matrix assembly, products, triangular solvers). The linear solvers will stay experimental. Gael 3 0, or move to Unsupported for 3.0
Eigenvalues for real general matrices The implementation borrowed from JAMA should be rewritten in a clean way separating the Real Schur decomposition as for the complex case. Then the ComplexEigenSolver and EigenSolver should be merged. Jitse 4 0, or consider moving part of Eigenvalues module to Unsupported
Fixing 'all' STL containers Some more STL containers (e.g. std::list) require the same handling as was required for std::vector. This is due resizing and passing by value. 1 - short discussion on IRC 2
cache/block size at runtime Currently we only allow to control the cache/block size at compile-time. Big users like Google need to control it at runtime, e.g. to have a single binary that runs on various hardware. Gael 4 0
JacobiSVD template parameters Improve/simplify them according to discussion on the list Benoit 5 - thread on the list 0
New SVD The old SVD has been removed. Need to write a shiny new DivConqSVD implementing Jessup-Sorensen divide-and-conquer SVD, with an efficient (i.e. blocked) bidiagonalization step, and with a fancy API allowing to compute/apply U and V on demand after the decomposition has been done. Benoit 1 - been talking about it forever 2
Blocked Householder At the moment our Householder transformations are not blocked, doing this and adapting decompositions to use this would result in large performance improvements and is a priority because it seems as if it might reveal needs for API improvements. 3 - done for non-pivoting HouseholderQR, now do ColPiv... 2
Core - linear traversal At the moment, in Assign.h, the LinearAccessBit is only used in the vectorized case. But non-vectorized code could benefit from it too! Benoit 5 2
SVD - bidiagonalization I think we should axe the UpperBidiagonalization class. I thought it would be useful to implement the new SVD, it's not. Benoit 4 0
Numerical differentiation Provide basic Numerical differentiation, and no more. Orzel 4 2 (or maybe even 3?) required for the "NonLinearOptimization" module.
Matrix functions - matrix exponential Decide on API; use new ReturnByValue mechanism? Decide on whether it should move to supported. Jitse 5 3
Matrix functions - other functions Implement Schur-Parlett algorithm of Davies & Higham. Depends on: eigenvalues module (for complex Schur decomposition) and permutation matrix. Jitse 5 3
Trapezoid matrices Currently triangular matrices are required to be square. Fix that. Benoit 4 - now must use it everywhere (LU, QR...) 0
HouseholderSequence methods to construct matrices Allow to easily construct the full or compact unitary matrices Benoit 1 - been discussed 2
SSE4 It seems to provide at least tiny dot products (even on data smaller than a packet) and integer multiplication. 2 - integer mul is done. Now play with dot products and all the rest... 3
Uniform external library support Via a preprocessor token? a dedicated (sub)module? automatically enable it if a header of the respective library has been included? Gael 2 - 1 if any module using that makes it into 3.0
Enable expression template for Transform Thanks to the new ET design, this should be easy, and this will make the Transform class much more appealing for large dimensions. Gael 1 - 0 - or declare not doable because of all the operations that don't map well to xpr templates??
Optimize diagonal products even more There are still room for better vectorization, e.g., v0.asDiagonal() * v1 is not vectorized while it is simply v0.cwiseProduct(v1); Gael 1 - 3
Documentation See Documentation for Eigen3 for a planned outline. 3? 0-2?
Const-correctness Map allows to circumvent const-correctness. 1 - mailing list discussion 0