Difference between revisions of "3.4"
From Eigen
Revision as of 22:17, 25 November 2019
Eigen 3.4-beta1 has been released on ????. It can be downloaded from the Download section on the Main Page. Since Eigen 3.3, the 3.4 development branch has received more than 1750 commits [1], representing numerous major changes.
Changes that might impact existing code
- Using float or double to index matrices, vectors, and arrays now fails to compile, e.g.:
    MatrixXd A(10,10);
    float one = 1;
    double a11 = A(one,1.); // compilation error here
New features
- New versatile API for sub-matrices, slices, and indexed views [doc]. It basically extends A(.,.) to let it accept anything that looks like a sequence of indices with random access. To make it usable, this new feature comes with new symbols: Eigen::all, Eigen::last, and functions generating arithmetic sequences: Eigen::seq(first,last[,incr]), Eigen::seqN(first,size[,incr]), Eigen::lastN(size[,incr]). Here is an example picking even rows but the first and last ones, and a subset of indexed columns:
    MatrixXd A = ...;
    std::vector<int> col_ind{7,3,4,3};
    MatrixXd B = A(seq(2,last-2,fix<2>), col_ind);
- Reshaped views through the new members reshaped() and reshaped(rows,cols). This feature also comes with new symbols: Eigen::AutoOrder, Eigen::AutoSize. [doc]
- A new helper Eigen::fix<N> to pass compile-time integer values to Eigen's functions [doc]. It can be used to pass compile-time sizes to .block(...), .segment(...), and all variants, as well as the first, size and increment parameters of the seq, seqN, and lastN functions introduced above. You can also pass "possibly compile-time values" through Eigen::fix<N>(n). Here is an example comparing the old and new way to call .block with fixed sizes:
    template<typename MatrixType,int N>
    void foo(const MatrixType &A, int i, int j, int n)
    {
      A.block(i,j,2,3);                       // runtime sizes
      // compile-time nb rows and columns:
      A.template block<2,3>(i,j);             // 3.3 way
      A.block(i,j,fix<2>,fix<3>);             // new 3.4 way
      // compile-time nb rows only:
      A.template block<2,Dynamic>(i,j,2,n);   // 3.3 way
      A.block(i,j,fix<2>,n);                  // new 3.4 way
      // possibly compile-time nb columns
      // (use n if N==Dynamic, otherwise we must have n==N):
      A.template block<2,N>(i,j,2,n);         // 3.3 way
      A.block(i,j,fix<2>,fix<N>(n));          // new 3.4 way
    }
- Add STL-compatible iterators for dense expressions [doc]. Some examples:
    VectorXd v = ...;
    MatrixXd A = ...;
    // range for loop over all entries of v then A
    for(auto x : v) { cout << x << " "; }
    for(auto x : A.reshaped()) { cout << x << " "; }
    // sort v then each column of A
    std::sort(v.begin(), v.end());
    for(auto c : A.colwise())
      std::sort(c.begin(), c.end());
- Add C++11 initializer_list constructors to Matrix and Array [doc]:
    MatrixXi a {      // construct a 2x3 matrix
      {1,2,3},        // first row
      {4,5,6}         // second row
    };
    VectorXd v{{1, 2, 3, 4, 5}};      // construct a dynamic-size vector with 5 elements
    Array<int,1,5> b{1, 2, 3, 4, 5};  // initialize a fixed-size 1D array of size 5 (renamed from a to avoid redeclaring it)
- Add C++11 template aliases for Matrix, Vector, and Array of common sizes, including generic Vector<Type,Size> and RowVector<Type,Size> aliases [doc].
- A new namespace indexing that lets you import only the subset of functions and symbols typically used within A(.,.), that is: all, seq, seqN, lastN, last, lastp1. [doc]
- All dense linear solvers (i.e., Cholesky, *LU, *QR, CompleteOrthogonalDecomposition, *SVD) now inherit SolverBase and thus support .transpose() and .adjoint() solving [API].
- Misc
  - Add templated subVector<Vertical/Horizontal>(Index) aliases to col/row(Index) methods, and subVectors<>() aliases to rows()/cols().
  - Add innerVector() and innerVectors() methods.
  - Add diagmat +/- diagmat operators (bug 520)
  - Add specializations for res ?= dense +/- sparse and res ?= sparse +/- dense. (see bug 632)
  - Add sparse_matrix =,+=,-= diagonal_matrix support with smart insertion strategies of missing diagonal coeffs. (see bug 1574)
  - Add conjugateIf<bool> members for conditional conjugation.
  - Add support for SuiteSparse's KLU sparse direct solver (LU-based solver tailored for problems coming from circuit simulation).
Alignment
Eigen now uses the C++11 alignas keyword for static alignment. Users targeting C++17 only and recent compilers (e.g., GCC>=7, clang>=5, MSVC>=19.12) will thus be able to completely forget about all issues related to static alignment, including EIGEN_MAKE_ALIGNED_OPERATOR_NEW.
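The mechanism can be illustrated without Eigen. In the sketch below, `Packet4d` and `Particle` are hypothetical stand-ins for a fixed-size vectorizable type and a user struct containing one; they are not Eigen types, and the 32-byte figure is only an assumption matching an AVX-sized packet.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a fixed-size vectorizable type.
struct alignas(32) Packet4d {
  double data[4];
};

// A user type holding such a member inherits the stricter alignment.
struct Particle {
  Packet4d position;
  int id;
};

static_assert(alignof(Packet4d) == 32, "alignas sets the type's alignment");
static_assert(alignof(Particle) == 32, "alignment propagates to the enclosing type");

// Returns true when p sits on an `alignment`-byte boundary.
bool is_aligned(const void* p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// With C++17, a plain new-expression honors over-aligned types, which is why
// a custom operator new is no longer needed there. Before C++17 this
// allocation is not guaranteed to be 32-byte aligned.
std::unique_ptr<Particle> make_particle() {
  return std::unique_ptr<Particle>(new Particle());
}
```

Objects with automatic storage, such as a local `Particle`, are aligned by the compiler in every standard version; the C++17 requirement concerns dynamic allocation only.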
Performance optimizations
- Vectorization of partial-reductions along outer-dimension, e.g.: colmajor.rowwise().mean()
- Speed up evaluation of HouseholderSequence to a dense matrix, e.g., MatrixXd Q = A.qr().householderQ();
- Various optimizations of matrix products for small and medium sizes when using large SIMD registers (e.g., AVX and AVX512).
- Optimize evaluation of small products of the form s*A*B by rewriting them as s*(A.lazyProduct(B)) to save a costly temporary. Measured speedup from 2x to 5x (see bug 1562).
- Improve multi-threading heuristic for matrix products with a small number of columns.
- 20% speedup of matrix products on ARM64
- Speed-up reductions of sub-matrices.
- Huge speedup for LU factorization of small fixed-size matrices.
- Optimize extraction of factor Q in SparseQR.
- SIMD implementations of math functions (exp,log,sin,cos) have been unified as a generic implementation compatible over all supported SIMD engines (SSE,AVX,AVX512,NEON,Altivec,VSX,MSA).
Hardware support
- AVX512 support is now complete (including complex scalars) and is used by default when enabled on the compiler side.
- Generalization of the CUDA support to CUDA/HIP for AMD GPUs.
- Add explicit SIMD support for MSA instruction set (MIPS).
Footnotes
[1] $ hg log -r "3.3.0:: and not merge() and not branch(3.2) and not branch(3.3)" | grep "changeset:" | wc -l