Difference between revisions of "3.4"
From Eigen
(→New Major Features in Core) |
|||
(24 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
− | Eigen 3.4 | + | Eigen 3.4 was released on August 18, 2021. It can be downloaded from the Download section on the
− | + | [https://eigen.tuxfamily.org/index.php?title=Main_Page Main Page] or from [https://gitlab.com/libeigen/eigen/-/releases/3.4.0 GitLab]. | |
− | + | '''Notice:''' 3.4.x will be the last major release series of Eigen to support C++03. The master branch will drop C++03 support after this release. | |
− | * | + | == Changes to supported modules == |
+ | |||
+ | === Changes that might break existing code === | ||
+ | |||
+ | * Using float or double for indexing matrices, vectors and arrays will now fail to compile, e.g.: | ||
<source lang="cpp"> | <source lang="cpp"> | ||
− | MatrixXd A | + | MatrixXd A(10,10); |
− | + | float one = 1; | |
− | + | double a11 = A(one,1.); // compilation error here | |
</source> | </source> | ||
− | + | === New Major Features in Core === | |
− | * | + | * Add c++11 '''initializer_list constructors''' to Matrix and Array [http://eigen.tuxfamily.org/dox-devel/group__TutorialMatrixClass.html#title3 [doc]]: |
<source lang="cpp"> | <source lang="cpp"> | ||
− | + | MatrixXi a { // construct a 2x3 matrix | |
− | + | {1,2,3}, // first row | |
− | + | {4,5,6} // second row | |
− | + | }; | |
− | + | VectorXd v{{1, 2, 3, 4, 5}}; // construct a dynamic-size vector with 5 elements | |
− | + | Array<int,1,5> b{1, 2, 3, 4, 5}; // initialize a fixed-size 1D array of size 5. | |
− | // | + | |
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
</source> | </source> | ||
Line 44: | Line 40: | ||
</source> | </source> | ||
− | * | + | * New versatile API for sub-matrices, '''slices''', and '''indexed views''' [http://eigen.tuxfamily.org/dox-devel/group__TutorialSlicingIndexing.html [doc]]. It extends <code>A(.,.)</code> to accept anything that looks like a random-access sequence of indices. To make this usable, the feature comes with new symbols, <code>Eigen::indexing::all</code> and <code>Eigen::indexing::last</code>, and with functions that generate arithmetic sequences: <code>Eigen::seq(first,last[,incr])</code>, <code>Eigen::seqN(first,size[,incr])</code>, <code>Eigen::lastN(size[,incr])</code>. Here is an example that picks the even rows, excluding the first and last ones, and a subset of indexed columns:
<source lang="cpp"> | <source lang="cpp"> | ||
− | + | MatrixXd A = ...; | |
− | {1,2,3}, | + | std::vector<int> col_ind{7,3,4,3}; |
− | { | + | MatrixXd B = A(seq(2,last-2,fix<2>), col_ind); |
+ | </source> | ||
+ | |||
+ | * Add C++11 '''template aliases''' for Matrix, Vector, and Array of common sizes, including generic <code>Vector<Type,Size></code> and <code>RowVector<Type,Size></code> aliases [http://eigen.tuxfamily.org/dox-devel/group__matrixtypedefs.html [doc]]. | ||
+ | <source lang="cpp"> | ||
+ | MatrixX<double> M; // Instead of MatrixXd or Matrix<double, Dynamic, Dynamic> | ||
+ | Vector4<MyType> V; // Instead of Vector<MyType, 4> | ||
+ | </source> | ||
+ | |||
+ | * New support for <code>bfloat16</code>. The 16-bit [https://en.wikipedia.org/wiki/Bfloat16_floating-point_format Brain floating point format] is now available as <code>Eigen::bfloat16</code>. The constructor must be called explicitly, but it can otherwise be used like any other scalar type. To convert to and from <code>uint16_t</code>, for example to extract the bit representation, use <code>Eigen::numext::bit_cast</code>. | ||
+ | <source lang="cpp"> | ||
+ | bfloat16 s(0.25); // explicit construction | ||
+ | uint16_t s_bits = numext::bit_cast<uint16_t>(s); // bit representation | ||
+ | |||
+ | using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>; | ||
+ | MatrixBf16 X = s * MatrixBf16::Random(3, 3); | ||
+ | </source> | ||
+ | |||
+ | === New backends === | ||
+ | |||
+ | * '''Arm SVE:''' Eigen now supports Arm's [https://developer.arm.com/documentation/101726/0300/Learn-about-the-Scalable-Vector-Extension--SVE-/What-is-the-Scalable-Vector-Extension- Scalable Vector Extension (SVE)]. Currently only fixed-length SVE vectors for <code>uint32_t</code> and <code>float</code> are available. | ||
+ | * '''MIPS MSA:''' Eigen now supports the [https://www.mips.com/products/architectures/ase/simd/ MIPS SIMD Architecture (MSA)]. | ||
+ | * '''AMD ROCm/HIP:''' Eigen now contains a generic GPU backend that unifies support for [https://developer.nvidia.com/cuda-toolkit NVIDIA/CUDA] and [https://rocmdocs.amd.com/en/latest/ AMD/HIP]. | ||
+ | * '''Power 10 MMA Backend:''' Eigen now has initial support for [https://arxiv.org/pdf/2104.03142.pdf Power 10 matrix multiplication assist instructions] for float32 and float64, real and complex. | ||
+ | |||
+ | === Improvements to Eigen Core === | ||
+ | * Eigen now uses the c++11 '''alignas''' keyword for static alignment. Users targeting C++17 only and recent compilers (e.g., GCC>=7, clang>=5, MSVC>=19.12) will thus be able to completely forget about all [http://eigen.tuxfamily.org/dox-devel/group__TopicUnalignedArrayAssert.html issues] related to static alignment, including <code>EIGEN_MAKE_ALIGNED_OPERATOR_NEW</code>. | ||
+ | * Various performance improvements for products and Eigen's GEBP and GEMV kernels have been implemented: | ||
+ | ** By using half- and quarter-packets, the performance of matrix multiplications of small to medium sized matrices has been improved. | ||
+ | ** Eigen's GEMM now falls back to GEMV if it detects that a matrix is a run-time vector | ||
+ | ** The performance of matrix products using Arm Neon has been drastically improved (up to 20%) | ||
+ | ** Performance of many special cases of matrix products has been improved | ||
+ | * Large speed up from blocked algorithm for <code>.transposeInPlace</code>. | ||
+ | * Speed up misc. operations by propagating compile-time sizes (col/row-wise reverse, PartialPivLU, and others) | ||
+ | * Faster specialized SIMD kernels for small fixed-size inverse, LU decomposition, and determinant. | ||
+ | * Improved or added vectorization of partial or slice reductions along the outer-dimension, for instance: <code>colmajor_mat.rowwise().mean()</code> | ||
+ | |||
+ | === Elementwise math functions === | ||
+ | * Many functions are now implemented and vectorized in generic (backend-agnostic) form. | ||
+ | * Many improvements to correctness, accuracy, and compatibility with c++ standard library. | ||
+ | ** Much improved implementation of <code>ldexp</code>. | ||
+ | ** Misc. fixes for corner cases, NaN/Inf inputs and singular points of many functions. | ||
+ | ** New implementation of the Payne-Hanek argument reduction algorithm for <code>sin</code> and <code>cos</code> with huge arguments. | ||
+ | ** New faithfully rounded algorithm for <code>pow(x,y)</code>. | ||
+ | * Speedups from (new or improved) vectorized versions of <code>pow, log, sin, cos, arg, log2</code>, complex <code>sqrt, erf, expm1, log1p, logistic, rint, gamma</code> and <code>bessel</code> functions, and more. | ||
+ | * Improved special function support (Bessel and gamma functions, <code>ndtri, erfc</code>, inverse hyperbolic functions and more) | ||
+ | * New elementwise functions for <code>absolute_difference</code>, <code>rint</code>. | ||
+ | |||
+ | === Dense matrix decompositions and solvers === | ||
+ | * All dense linear solvers (i.e., Cholesky, *LU, *QR, CompleteOrthogonalDecomposition, *SVD) now inherit from SolverBase and thus support the <code>.transpose()</code>, <code>.adjoint()</code> and <code>.solve()</code> APIs. | ||
+ | * SVD implementations now have an <code>info()</code> method for checking convergence. | ||
+ | <source lang="cpp"> | ||
+ | #include <Eigen/SVD> | ||
+ | MatrixXf m = MatrixXf::Random(3,2); | ||
+ | VectorXf b = VectorXf::Random(3); // right-hand side with as many rows as m | ||
+ | JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV); | ||
+ | if (svd.info() == ComputationInfo::Success) { | ||
+ | // SVD computation was successful. | ||
+ | VectorXf x = svd.solve(b); // least-squares solution of m * x = b | ||
+ | } | ||
+ | </source> | ||
+ | * Most decompositions now fail quickly when invalid inputs are detected. | ||
+ | * Optimized the product of a <code>HouseholderSequence</code> with the identity, as well as the evaluation of a <code>HouseholderSequence</code> to a dense matrix using faster blocked product. | ||
+ | * Fixed aliasing issues with in-place small matrix inversions. | ||
+ | * Fixed several edge-cases with empty or zero inputs. | ||
+ | |||
+ | === Sparse matrix support, decompositions and solvers === | ||
+ | * Enabled assignment and addition with diagonal matrix expressions. | ||
+ | <source lang="cpp"> | ||
+ | SparseMatrix<float> A(10, 10); | ||
+ | VectorXf x = VectorXf::Random(10); | ||
+ | A = x.asDiagonal(); | ||
+ | A += x.asDiagonal(); | ||
+ | </source> | ||
+ | * Support added for SuiteSparse KLU routines via the <code>KLUSupport</code> module. SuiteSparse must be installed to use this module. | ||
+ | <source lang="cpp"> | ||
+ | #include <Eigen/KLUSupport> | ||
+ | // A is assumed to be a SparseMatrix<double> and b a VectorXd. | ||
+ | A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers. | ||
+ | KLU<SparseMatrix<double> > klu(A); | ||
+ | if (klu.info() == ComputationInfo::Success) { | ||
+ | VectorXd x = klu.solve(b); | ||
+ | } | ||
+ | </source> | ||
+ | * <code>SparseCholesky</code> now works with row-major matrices. | ||
+ | * Various bug fixes and performance improvements. | ||
+ | |||
+ | === Type support === | ||
+ | * Improved support for <code>half</code> | ||
+ | ** Native support added for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, and <code>F16C</code> conversion intrinsics. | ||
+ | ** Better vectorization support added across all backends. | ||
+ | * Improved bool support | ||
+ | ** Partial vectorization support added for boolean operations. | ||
+ | ** Significantly improved performance (x25) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>. | ||
+ | * Improved support for custom types | ||
+ | ** More custom types work out-of-the-box (see [https://gitlab.com/libeigen/eigen/-/issues/2201 #2201]). | ||
+ | |||
+ | === Improved Geometry Module === | ||
+ | * '''Behavioral change:''' <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see [https://gitlab.com/libeigen/eigen/-/merge_requests/349 !349]). | ||
+ | * New partial vectorization support added for <code>Quaternion</code>. | ||
+ | * Generic vectorized 4x4 matrix inversion. | ||
+ | |||
+ | === Backend-specific improvements === | ||
+ | * '''Arm NEON''' | ||
+ | ** Now provides vectorization for <code>uint64_t</code>, <code>int64_t</code>, <code>uint32_t</code>, <code>int16_t</code>, <code>uint16_t</code>, <code>int8_t</code>, and <code>uint8_t</code>. | ||
+ | ** Emulates <code>bfloat16</code> support when using <code>Eigen::bfloat16</code> | ||
+ | ** Supports emulated and native <code>float16</code> when using <code>Eigen::half</code> | ||
+ | * '''SSE/AVX/AVX512''' | ||
+ | ** General performance improvements and bugfixes. | ||
+ | ** Enabled AVX512 instructions by default if available. | ||
+ | ** New <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code> vectorization support added. | ||
+ | ** Many missing packet functions added. | ||
+ | * '''Altivec/Power''' | ||
+ | ** General performance improvement and bugfixes. | ||
+ | ** Enhanced vectorization of real and complex scalars. | ||
+ | ** Changes to the <code>gebp_kernel</code> specific to Altivec, using VSX implementation of the MMA instructions that gain speed improvements up to 4x for matrix-matrix products. | ||
+ | ** Dynamic dispatch for GCC greater than 10 enabling selection of MMA or VSX instructions based on <code>__builtin_cpu_supports</code>. | ||
+ | * '''GPU (CUDA and HIP)''' | ||
+ | ** Several optimized math functions added, better support for <code>std::complex</code>. | ||
+ | ** Added option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>. | ||
+ | ** Many more functions can now be used in device code (e.g. comparisons, small matrix inversion). | ||
+ | * '''ZVector''' | ||
+ | ** Vectorized <code>float</code> and <code>std::complex<float></code> support added. | ||
+ | ** Added z14 support. | ||
+ | * '''SYCL''' | ||
+ | ** Redesigned SYCL implementation for use with the [https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html Tensor] module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>. | ||
+ | ** New generic memory model introduced for use by <code>TensorDeviceSycl</code>. | ||
+ | ** Better integration with OpenCL devices. | ||
+ | ** Added many math function specializations. | ||
+ | |||
+ | === Miscellaneous API Changes === | ||
+ | * New <code>setConstant(...)</code> methods for preserving one dimension of a matrix by passing in <code>NoChange</code>. | ||
+ | <source lang="cpp"> | ||
+ | MatrixXf A(10, 5); // 10x5 matrix. | ||
+ | A.setConstant(NoChange, 10, 2); // 10x10 matrix of 2s. | ||
+ | A.setConstant(5, NoChange, 3); // 5x10 matrix of 3s. | ||
+ | A.setZero(NoChange, 20); // 5x20 matrix of 0s. | ||
+ | A.setZero(20, NoChange); // 20x20 matrix of 0s. | ||
+ | A.setOnes(NoChange, 5); // 20x5 matrix of 1s. | ||
+ | A.setOnes(5, NoChange); // 5x5 matrix of 1s. | ||
+ | A.setRandom(NoChange, 10); // 5x10 random matrix. | ||
+ | A.setRandom(10, NoChange); // 10x10 random matrix. | ||
+ | </source> | ||
+ | * Added <code>setUnit(Index i)</code> for vectors, which sets the ''i''-th coefficient to one and all others to zero. | ||
+ | <source lang="cpp"> | ||
+ | VectorXf v(5); | ||
+ | v.setUnit(3); // { 0, 0, 0, 1, 0} | ||
+ | </source> | ||
+ | * Added <code>transpose()</code>, <code>adjoint()</code>, <code>conjugate()</code> methods to <code>SelfAdjointView</code>. | ||
+ | * Added <code>shiftLeft<N>()</code> and <code>shiftRight<N>()</code> coefficient-wise arithmetic shift functions to Arrays. | ||
+ | <source lang="cpp"> | ||
+ | ArrayXXi A = ArrayXXi::Random(2, 3); | ||
+ | ArrayXXi B = A.shiftRight<2>(); | ||
+ | ArrayXXi C = A.shiftLeft<6>(); | ||
+ | </source> | ||
+ | * Enabled adding and subtracting of diagonal expressions. | ||
+ | <source lang="cpp"> | ||
+ | VectorXf x = VectorXf::Random(5); | ||
+ | VectorXf y = VectorXf::Random(5); | ||
+ | MatrixXf A = MatrixXf::Identity(5, 5); | ||
+ | A += x.asDiagonal() - y.asDiagonal(); | ||
+ | </source> | ||
+ | * Allow user-defined default cache sizes via defining <code>EIGEN_DEFAULT_L1_CACHE_SIZE</code>, ..., <code>EIGEN_DEFAULT_L3_CACHE_SIZE</code>. | ||
+ | * Added <code>EIGEN_ALIGNOF(X)</code> macro for determining alignment of a provided variable. | ||
+ | * Allow plugins for <code>VectorwiseOp</code> by defining the macro <code>EIGEN_VECTORWISEOP_PLUGIN</code> to the name of a plugin header (e.g. <code>-DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h</code>). | ||
+ | * Allow disabling of IO operations by defining <code>EIGEN_NO_IO</code>. | ||
+ | |||
+ | === Improvement to NaN propagation === | ||
+ | |||
+ | * Improvements to NaN correctness for elementwise functions. | ||
+ | * New <code>NaNPropagation</code> template argument to control whether NaNs are propagated or suppressed in elementwise <code>min/max</code> and corresponding reductions on <code>Array</code>, <code>Matrix</code>, and <code>Tensor</code>. Example for max: | ||
+ | <source lang="cpp"> | ||
+ | // Elementwise maximum | ||
+ | Eigen::MatrixXf left, right, r0, r1, r2; | ||
+ | r0 = left.cwiseMax(right); // Implementation defined behavior. | ||
+ | // Propagate NaN if either argument is NaN. | ||
+ | r1 = left.template cwiseMax<PropagateNaN>(right); | ||
+ | // Suppress NaN if at least one argument is not a NaN. | ||
+ | r2 = left.template cwiseMax<PropagateNumbers>(right); | ||
+ | |||
+ | // Max reductions | ||
+ | Eigen::MatrixXf m; | ||
+ | float nan_or_max = m.maxCoeff(); // Implementation defined behavior. | ||
+ | float nan_if_any_or_max = m.template maxCoeff<PropagateNaN>(); | ||
+ | float nan_if_all_or_max = m.template maxCoeff<PropagateNumbers>(); | ||
+ | </source> | ||
+ | |||
+ | == Changes to unsupported modules == | ||
+ | === New low-latency non-blocking ThreadPool module === | ||
+ | * Originally a part of the Tensor module, <code>Eigen::ThreadPool</code> is now separate and more portable, and forms the basis for multi-threading in TensorFlow, for example. Example: | ||
+ | <source lang="cpp"> | ||
+ | #include <unsupported/Eigen/CXX11/ThreadPool> | ||
+ | |||
+ | const int num_threads = 42; | ||
+ | Eigen::ThreadPool tp(num_threads); | ||
+ | auto do_stuff = []() { ... }; | ||
+ | tp.Schedule(do_stuff); | ||
+ | </source> | ||
+ | |||
+ | === Changes to Tensor module === | ||
+ | * Support for c++03 was officially dropped in the Tensor module, since most of the code was written in c++11 anyway. This will prevent building the code for CUDA with older versions of <code>nvcc</code>. | ||
+ | * Performance optimizations of Tensor contraction | ||
+ | ** Speed up "outer-product-like" operations by parallelizing over the contraction dimension, using thread_local buffers and recursive work splitting. | ||
+ | ** Improved threading heuristics. | ||
+ | ** Support for fusing element-wise operations into contraction during evaluation. Example: | ||
+ | <source lang="cpp"> | ||
+ | // This example applies std::sqrt to all output elements from a tensor contraction. | ||
+ | // The optional OutputKernel argument to the contraction in this example is a functor over a | ||
+ | // 2-dimensional buffer. The functor is called once for each output block of the contraction | ||
+ | // result, to perform the elementwise sqrt operation while the block is hot in cache. | ||
+ | struct SqrtOutputKernel { | ||
+ | template <typename Index, typename Scalar> | ||
+ | EIGEN_ALWAYS_INLINE void operator()( | ||
+ | const internal::blas_data_mapper<Scalar, Index, ColMajor>& output_mapper, | ||
+ | const TensorContractionParams&, Index, Index, Index num_rows, | ||
+ | Index num_cols) const { | ||
+ | for (int i = 0; i < num_rows; ++i) { | ||
+ | for (int j = 0; j < num_cols; ++j) { | ||
+ | output_mapper(i, j) = std::sqrt(output_mapper(i, j)); | ||
+ | } | ||
+ | } | ||
+ | } | ||
}; | }; | ||
− | + | ||
− | + | Tensor<float, 4, DataLayout> left(30, 50, 8, 31); | |
+ | Tensor<float, 5, DataLayout> right(8, 31, 7, 20, 10); | ||
+ | Tensor<float, 5, DataLayout> result(30, 50, 7, 20, 10); | ||
+ | Eigen::array<DimPair, 2> dims({{DimPair(2, 0), DimPair(3, 1)}}); | ||
+ | |||
+ | result = left.contract(right, dims, SqrtOutputKernel()); | ||
</source> | </source> | ||
− | * | + | * Performance optimizations of other Tensor operators
+ | ** Speedups from improved vectorization, block evaluation, and multi-threading for most operators. | ||
+ | ** Significant speedup to broadcasting. | ||
+ | ** Reduction of index computation overhead, e.g. using fast divisors in TensorGenerator, squeezing dimensions in TensorPadding. | ||
+ | * A complete rewrite of the block (tiling) evaluation framework for tensor expressions led to significant speedups and a reduced number of memory allocations. | ||
+ | * Added new API for asynchronous evaluation of tensor expressions. Example: | ||
+ | <source lang="cpp"> | ||
+ | Tensor<float, 3> in1(200, 30, 70); | ||
+ | Tensor<float, 3> in2(200, 30, 70); | ||
+ | Tensor<float, 3> out(200, 30, 70); | ||
− | + | Eigen::ThreadPool tp(internal::random<int>(3, 11)); | |
+ | Eigen::ThreadPoolDevice thread_pool_device(&tp, internal::random<int>(3, 11)); | ||
− | + | Eigen::Barrier b(1); | |
+ | auto done = [&b]() { b.Notify(); }; | ||
+ | out.device(thread_pool_device, std::move(done)) = in1 + in2 * 3.14f; | ||
+ | b.Wait(); | ||
+ | </source> | ||
+ | * Misc. minor behavior changes & fixes: | ||
+ | ** Fix const correctness for TensorMap. | ||
+ | ** Modify tensor argmin/argmax to always return first occurrence. | ||
+ | ** More numerically stable tree reduction. | ||
+ | ** Improve randomness of the tensor random generator. | ||
+ | ** Update the padding computation for PADDING_SAME to be consistent with TensorFlow. | ||
+ | ** Support static dimensions (aka IndexList) in resizing/reshape/broadcast. | ||
+ | ** Improved accuracy of Tensor FFT. | ||
− | + | === Improvements to FFT module === | |
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | * Faster and more accurate twiddle factor computation. | |
− | + | === Improvements to EulerAngles === | |
− | + | * EulerAngles can now be directly constructed from 3D vectors | |
+ | * EulerAngles now provide <code>isApprox()</code> and <code>cast()</code> functions | ||
− | + | === Changes to sparse iterative solvers === | |
− | * | + | * Added new IDRS iterative linear solver. |
− | + | <source lang="cpp"> | |
+ | #include <unsupported/Eigen/IterativeSolvers> | ||
+ | A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers. | ||
+ | IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A); | ||
+ | VectorXf x = idrs.solve(b); | ||
+ | bool success = (idrs.info() == ComputationInfo::Success); | ||
</source> | </source> | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | === | + | === Improvements to Polynomials === |
+ | |||
+ | * PolynomialSolver can now be used with complex numbers | ||
+ | * The solver will automatically choose between <code>EigenSolver</code> and <code>ComplexEigenSolver</code> depending on the scalar type used | ||
+ | |||
+ | == Other relevant changes == | ||
+ | |||
+ | * Eigen now provides an option to test with an external BLAS library | ||
+ | * Eigen can now be used with the [https://en.wikipedia.org/wiki/The_Portland_Group PGI Compiler] | ||
+ | * Printing when using GDB has been improved | ||
+ | * Eigen can now detect if a platform supports <code>int128</code> intrinsics | ||
+ | |||
+ | == Testing == | ||
+ | The full Eigen test suite was built and run successfully (in c++03 and c++11 mode) with the following compiler/platform/OS combinations: | ||
+ | |||
+ | {| class="wikitable" | ||
+ | !Compiler !! Version !! Platform !! Operating system | ||
+ | |- | ||
+ | |Microsoft Visual Studio || 2015 Update 3 || x86-64 || Windows | ||
+ | |- | ||
+ | |Microsoft Visual Studio || Community 2017 - 15.9.38 || x86-64 || Windows | ||
+ | |- | ||
+ | |Microsoft Visual Studio || Community 2019 - 16.11 || x86-64 || Windows | ||
+ | |- | ||
+ | |GCC || 4.8 || x86-64 || Linux | ||
+ | |- | ||
+ | |GCC || 9 || x86-64 || Linux | ||
+ | |- | ||
+ | |GCC || 10 || x86-64 || Linux | ||
+ | |- | ||
+ | |Clang || 6.0 || x86-64 || Linux | ||
+ | |- | ||
+ | |Clang || 10 || x86-64 || Linux | ||
+ | |- | ||
+ | |Clang || 11 || x86-64 || Linux | ||
+ | |- | ||
+ | |GCC || 10 || armv8.2-a || Linux | ||
+ | |- | ||
+ | |Clang || 6 || armv8.2-a || Linux | ||
+ | |- | ||
+ | |Clang || 9 || armv8.2-a || Linux | ||
+ | |- | ||
+ | |Clang || 10 || armv8.2-a || Linux | ||
+ | |- | ||
+ | |Clang || 11 || armv8.2-a || Linux | ||
+ | |- | ||
+ | |AppleClang || 12.0.5 || x86-64 || macOS | ||
+ | |- | ||
+ | |GCC || 10 || ppc64le || Linux | ||
+ | |- | ||
+ | |Clang || 10 || ppc64le || Linux | ||
+ | |- | ||
+ | |} | ||
− | + | == List of issues fixed in Eigen 3.4 == | |
− | + | ||
− | + | ||
− | == | + | {| |
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2298 Issue #2298] | ||
+ | | List of dense linear decompositions lacks completeorthogonal decomposition | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2284 Issue #2284] | ||
+ | | JacobiSVD Outputs Invalid U (Reads Past End of Array) | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2267 Issue #2267] | ||
+ | | [3.4 bug] FixedInt<0> error with gcc 4.9.3 | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2263 Issue #2263] | ||
+ | | usage of signed zeros leads to wrong results with -ffast-math | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2251 Issue #2251] | ||
+ | | Method unaryExpr() does not support function pointers in Eigen 3.4rc1 | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2242 Issue #2242] | ||
+ | | No matching function for call to "..." in 'Complex.h' and 'GenericPacketMathFunctions.h' | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2229 Issue #2229] | ||
+ | | Copies (& potentially moves?) of Eigen object with large unused MaxRows/ColAtCompileTime are slow (Regression from Eigen 3.2) | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2213 Issue #2213] | ||
+ | | template maxCoeff<PropagateNaN> compilation error with Eigen 3.4. | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2209 Issue #2209] | ||
+ | | unaryExpr deduces wrong return type on MSVC | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2157 Issue #2157] | ||
+ | | forward_adolc test fails since PR !363 | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2119 Issue #2119] | ||
+ | | Move assignment swaps even for non-dynamic storage | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2112 Issue #2112] | ||
+ | | Build failure with boost::multiprecision type | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/2093 Issue #2093] | ||
+ | | Incorrect evaluation of Ref | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1906 Issue #1906] | ||
+ | | Eigen failed with error C2440 with MSVC on windows | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1850 Issue #1850] | ||
+ | | error C4996: 'std::result_of<T>': warning STL4014: std::result_of and std::result_of_t are deprecated in C++17. They are superseded by std::invoke_result and std::invoke_result_t | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1833 Issue #1833] | ||
+ | | c++20 compilation failure | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1826 Issue #1826] | ||
+ | | -Wdeprecated-anon-enum-enum-conversion warnings (c++20) | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1815 Issue #1815] | ||
+ | | IndexedView of a vector should allow linear access | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1805 Issue #1805] | ||
+ | | Uploaded doxygen documentation does not build LaTeX formulae | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1790 Issue #1790] | ||
+ | | packetmath_1 unit test fails | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1788 Issue #1788] | ||
+ | | Rule-of-three/rule-of-five violations | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1776 Issue #1776] | ||
+ | | subvector_stl_iterator::operator-> triggers 'taking address of rvalue' warning | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1774 Issue #1774] | ||
+ | | std::cbegin() returns non-const iterator | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1752 Issue #1752] | ||
+ | | A change to the C++ Standard will break some tests | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1741 Issue #1741] | ||
+ | | Map<>.noalias()=A*B gives wrong result | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1736 Issue #1736] | ||
+ | | Column access of some IndexedView won't compile | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1718 Issue #1718] | ||
+ | | Use of builtin vec_sel is ambiguous when compiling with Clang for PowerPC | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1695 Issue #1695] | ||
+ | | Stuck in loop for a certain input when using mpreal support | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1692 Issue #1692] | ||
+ | | pass enumeration argument to constructor of VectorXd | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1684 Issue #1684] | ||
+ | | array_reverse fails with clang >=6 + AVX + -O2 | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1674 Issue #1674] | ||
+ | | SIMD sin/cos gives wrong results with -ffast-math | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1669 Issue #1669] | ||
+ | | Zero-sized matrices generate assertion failures | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1664 Issue #1664] | ||
+ | | dot product with single column block fails with new static checks | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1652 Issue #1652] | ||
+ | | Corner cases in SIMD sin/cos | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1643 Issue #1643] | ||
+ | | Compilation failure | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1637 Issue #1637] | ||
+ | | Register spilling with recent gcc & clang | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1619 Issue #1619] | ||
+ | | const_iterator vs iterator compilation error | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1615 Issue #1615] | ||
+ | | Performance of (aliased) matrix multiplication with fixed size 3x3 matrices slow | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1611 Issue #1611] | ||
+ | | NEON: plog(+/-0) should return -inf and not NaN | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1585 Issue #1585] | ||
+ | | Matrix product is repeatedly evaluated when iterating over the product expression | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1557 Issue #1557] | ||
+ | | Fail to compute eigenvalues for a simple 3x3 companion matrix for root finding | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1544 Issue #1544] | ||
+ | | SparseQR generates incorrect Q matrix in complex case | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1543 Issue #1543] | ||
+ | | "Fix linear indexing in generic block evaluation" breaks Matrix*Diagonal*Vector product | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1493 Issue #1493] | ||
+ | | dense Q extraction and solve is sometimes erroneous for complex matrices | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1453 Issue #1453] | ||
+ | | Strange behavior for Matrix::Map, if only InnerStride is provided | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1409 Issue #1409] | ||
+ | | Add support for C++17 operator new alignment | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1340 Issue #1340] | ||
+ | | Add operator + to sparse matrix iterator | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1318 Issue #1318] | ||
+ | | More robust quaternion from matrix | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1306 Issue #1306] | ||
+ | | Add support for AVX512 to Eigen | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1305 Issue #1305] | ||
+ | | Implementation of additional component-wise unary functions | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1221 Issue #1221] | ||
+ | | I get tons of error since my distribution upgraded to GCC 6.1.1 | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1195 Issue #1195] | ||
+ | | vectorization_logic fails: Matrix3().cwiseQuotient(Matrix3()) expected CompleteUnrolling, got NoUnrolling | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1194 Issue #1194] | ||
+ | | Improve det4x4 | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1049 Issue #1049] | ||
+ | | std::make_shared fails to fulfill structure aliment | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1046 Issue #1046] | ||
+ | | fixed matrix types do not report correct alignment requirements | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1014 Issue #1014] | ||
+ | | Eigenvalues 3x3 matrix | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/1001 Issue #1001] | ||
+ | | infer dimensions of Dynamic-sized temporaries from the entire expression (if possible) | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/977 Issue #977] | ||
+ | | Add stable versions of normalize() and normalized() | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/899 Issue #899] | ||
+ | | SparseQR occasionally fails for under-determined systems | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/864 Issue #864] | ||
+ | | C++11 alias templates for commonly used types | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/751 Issue #751] | ||
+ | | Make AMD Ordering numerically more robust | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/747 Issue #747] | ||
+ | | Allow for negative stride | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/720 Issue #720] | ||
+ | | Gaussian NullaryExpr | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/663 Issue #663] | ||
+ | | Permit NoChange in setZero, setOnes, setConstant, setRandom | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/645 Issue #645] | ||
+ | | GeneralizedEigenSolver: missing computation of eigenvectors | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/632 Issue #632] | ||
+ | | Optimize addition/subtraction of sparse and dense matrices/vectors | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/631 Issue #631] | ||
+ | | (Optionally) throw an exception when using an unsuccessful decomposition | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/564 Issue #564] | ||
+ | | maxCoeff() returns -nan instead of max, while maxCoeff(&maxRow, &maxCol) works | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/556 Issue #556] | ||
+ | | Matrix multiplication crashes using mingw 4.7 | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/505 Issue #505] | ||
+ | | Assert if temporary objects that are still referred to get destructed (was: Misbehaving Product on C++11) | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/445 Issue #445] | ||
+ | | ParametrizedLine should have transform method | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/437 Issue #437] | ||
+ | | [feature request] Add Reshape Operation | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/426 Issue #426] | ||
+ | | Behavior of sum() for Matrix<bool> is unexpected and confusing | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/329 Issue #329] | ||
+ | | Feature request: Ability to get a "view" into a sub-matrix by indexing it with a vector or matrix of indices | |
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/231 Issue #231] | ||
+ | | STL compatible iterators | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/96 Issue #96] | ||
+ | | Clean internal::result_of | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/65 Issue #65] | ||
+ | | Core - optimize partial reductions | ||
+ | |- | ||
+ | | [https://gitlab.com/libeigen/eigen/-/issues/64 Issue #64] | ||
+ | | Tests : precision-oriented tests | ||
+ | |} | ||
− | [ | + | == Additional information == |
+ | * A curated list of commits, approximately organized by the same topics as the release notes above, and sorted in reverse chronological order can be found [https://docs.google.com/document/d/e/2PACX-1vSGvp4Kv9dJ-gKzJN4CBjppP46flDbe3pJtI9N3m3WkKSoLXmANXuK5gJlw1CPcpCfjAWhgXAtQNzm-/pub here]. |
Latest revision as of 15:26, 14 October 2021
Eigen 3.4 was released on August 18, 2021. It can be downloaded from the Download section on the Main Page or from Gitlab.
Notice: 3.4.x will be the last major release series of Eigen that supports c++03. The master branch will drop c++03 support after this release.
Contents
- 1 Changes to supported modules
- 1.1 Changes that might break existing code
- 1.2 New Major Features in Core
- 1.3 New backends
- 1.4 Improvements to Eigen Core
- 1.5 Elementwise math functions
- 1.6 Dense matrix decompositions and solvers
- 1.7 Sparse matrix support, decompositions and solvers
- 1.8 Type support
- 1.9 Improved Geometry Module
- 1.10 Backend-specific improvements
- 1.11 Miscellaneous API Changes
- 1.12 Improvement to NaN propagation
- 2 Changes to unsupported modules
- 3 Other relevant changes
- 4 Testing
- 5 List of issues fixed in Eigen 3.4
- 6 Additional information
Changes to supported modules
Changes that might break existing code
- Using float or double for indexing matrices, vectors and arrays will now fail to compile, ex.:
    MatrixXd A(10,10);
    float one = 1;
    double a11 = A(one,1.); // compilation error here
New Major Features in Core
- Add c++11 initializer_list constructors to Matrix and Array [doc]:
    MatrixXi a {      // construct a 2x3 matrix
      {1,2,3},        // first row
      {4,5,6}         // second row
    };
    VectorXd v{{1, 2, 3, 4, 5}};     // construct a dynamic-size vector with 5 elements
    Array<int,1,5> b{1,2, 3, 4, 5};  // initialize a fixed-size 1D array of size 5.
- Add STL-compatible iterators for dense expressions [doc]. Some examples:
    VectorXd v = ...;
    MatrixXd A = ...;
    // range for loop over all entries of v then A
    for(auto x : v) { cout << x << " "; }
    for(auto x : A.reshaped()) { cout << x << " "; }
    // sort v then each column of A
    std::sort(v.begin(), v.end());
    for(auto c : A.colwise())
      std::sort(c.begin(), c.end());
- New versatile API for sub-matrices, slices, and indexed views [doc]. It basically extends A(.,.) to let it accept anything that looks like a sequence of indices with random access. To make it usable, this new feature comes with new symbols: Eigen::indexing::all, Eigen::indexing::last, and functions generating arithmetic sequences: Eigen::seq(first,last[,incr]), Eigen::seqN(first,size[,incr]), Eigen::lastN(size[,incr]). Here is an example picking even rows but the first and last ones, and a subset of indexed columns:
    MatrixXd A = ...;
    std::vector<int> col_ind{7,3,4,3};
    MatrixXd B = A(seq(2,last-2,fix<2>), col_ind);
- Add C++11 template aliases for Matrix, Vector, and Array of common sizes, including generic Vector<Type,Size> and RowVector<Type,Size> aliases [doc].
    MatrixX<double> M; // Instead of MatrixXd or Matrix<double, Dynamic, Dynamic>
    Vector4<MyType> V; // Instead of Vector<MyType, 4>
- New support for bfloat16. The 16-bit Brain floating point format is now available as Eigen::bfloat16. The constructor must be called explicitly, but it can otherwise be used as any other scalar type. To convert to and from uint16_t in order to access the bit representation, use Eigen::numext::bit_cast.
    bfloat16 s(0.25); // explicit construction
    uint16_t s_bits = numext::bit_cast<uint16_t>(s); // bit representation
    using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
    MatrixBf16 X = s * MatrixBf16::Random(3, 3);
New backends
- Arm SVE: Eigen now supports Arm's Scalable Vector Extension (SVE). Currently only fixed-length SVE vectors for uint32_t and float are available.
- MIPS MSA: Eigen now supports the MIPS SIMD Architecture (MSA)
- AMD ROCm/HIP: Eigen now contains a generic GPU backend that unifies support for NVIDIA/CUDA and AMD/HIP.
- Power 10 MMA Backend: Eigen now has initial support for Power 10 matrix multiplication assist instructions for float32 and float64, real and complex.
Improvements to Eigen Core
- Eigen now uses the c++11 alignas keyword for static alignment. Users targeting C++17 only and recent compilers (e.g., GCC>=7, clang>=5, MSVC>=19.12) will thus be able to completely forget about all issues related to static alignment, including EIGEN_MAKE_ALIGNED_OPERATOR_NEW.
- Various performance improvements for products and Eigen's GEBP and GEMV kernels have been implemented:
  - By using half- and quarter-packets, the performance of matrix multiplications of small to medium sized matrices has been improved
  - Eigen's GEMM now falls back to GEMV if it detects that a matrix is a run-time vector
  - The performance of matrix products using Arm Neon has been drastically improved (up to 20%)
  - Performance of many special cases of matrix products has been improved
- Large speedup from blocked algorithm for .transposeInPlace.
- Speed up misc. operations by propagating compile-time sizes (col/row-wise reverse, PartialPivLU, and others)
- Faster specialized SIMD kernels for small fixed-size inverse, LU decomposition, and determinant.
- Improved or added vectorization of partial or slice reductions along the outer-dimension, for instance: colmajor_mat.rowwise().mean()
Elementwise math functions
- Many functions are now implemented and vectorized in generic (backend-agnostic) form.
- Many improvements to correctness, accuracy, and compatibility with c++ standard library.
  - Much improved implementation of ldexp.
  - Misc. fixes for corner cases, NaN/Inf inputs and singular points of many functions.
  - New implementation of the Payne-Hanek argument reduction algorithm for sin and cos with huge arguments.
  - New faithfully rounded algorithm for pow(x,y).
- Speedups from (new or improved) vectorized versions of pow, log, sin, cos, arg, log2, complex sqrt, erf, expm1, log1p, logistic, rint, gamma and bessel functions, and more.
- Improved special function support (Bessel and gamma functions, ndtri, erfc, inverse hyperbolic functions and more)
- New elementwise functions for absolute_difference, rint.
Dense matrix decompositions and solvers
- All dense linear solvers (i.e., Cholesky, *LU, *QR, CompleteOrthogonalDecomposition, *SVD) now inherit SolverBase and thus support .transpose(), .adjoint() and .solve() APIs.
- SVD implementations now have an info() method for checking convergence.
    #include <Eigen/SVD>
    MatrixXf m = MatrixXf::Random(3,2);
    VectorXf b = VectorXf::Random(3); // right-hand side
    JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
    if (svd.info() == ComputationInfo::Success) {
      // SVD computation was successful.
      VectorXf x = svd.solve(b);
    }
- Most decompositions now fail quickly when invalid inputs are detected.
- Optimized the product of a HouseholderSequence with the identity, as well as the evaluation of a HouseholderSequence to a dense matrix using faster blocked product.
- Fixed aliasing issues with in-place small matrix inversions.
- Fixed several edge-cases with empty or zero inputs.
Sparse matrix support, decompositions and solvers
- Enabled assignment and addition with diagonal matrix expressions.
    SparseMatrix<float> A(10, 10);
    VectorXf x = VectorXf::Random(10);
    A = x.asDiagonal();
    A += x.asDiagonal();
- Support added for SuiteSparse KLU routines via the KLUSupport module. SuiteSparse must be installed to use this module.
    #include <Eigen/KLUSupport>
    A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers.
    KLU<SparseMatrix<T> > klu(A);
    if (klu.info() == ComputationInfo::Success) {
      VectorXf x = klu.solve(b);
    }
- SparseCholesky now works with row-major matrices.
- Various bug fixes and performance improvements.
Type support
- Improved support for half
  - Native support added for ARM __fp16, CUDA/HIP __half, and F16C conversion intrinsics.
  - Better vectorization support added across all backends.
- Improved bool support
  - Partial vectorization support added for boolean operations.
  - Significantly improved performance (x25) for logical operations with Matrix or Tensor of bool.
- Improved support for custom types
  - More custom types work out-of-the-box (see #2201).
Improved Geometry Module
- Behavioral change: Transform::computeRotationScaling() and Transform::computeScalingRotation() are now more continuous across degeneracies (see !349).
- New partial vectorization support added for Quaternion.
- Generic vectorized 4x4 matrix inversion.
Backend-specific improvements
- Arm NEON
  - Now provides vectorization for uint64_t, int64_t, uint32_t, int16_t, uint16_t, int8_t, and uint8_t
  - Emulates bfloat16 support when using Eigen::bfloat16
  - Supports emulated and native float16 when using Eigen::half
- SSE/AVX/AVX512
  - General performance improvements and bugfixes.
  - Enabled AVX512 instructions by default if available.
  - New std::complex, half, and bfloat16 vectorization support added.
  - Many missing packet functions added.
- Altivec/Power
  - General performance improvement and bugfixes.
  - Enhanced vectorization of real and complex scalars.
  - Changes to the gebp_kernel specific to Altivec, using VSX implementation of the MMA instructions that gain speed improvements up to 4x for matrix-matrix products.
  - Dynamic dispatch for GCC greater than 10 enabling selection of MMA or VSX instructions based on __builtin_cpu_supports.
- GPU (CUDA and HIP)
  - Several optimized math functions added, better support for std::complex.
  - Added option to disable CUDA entirely by defining EIGEN_NO_CUDA.
  - Many more functions can now be used in device code (e.g. comparisons, small matrix inversion).
- ZVector
  - Vectorized float and std::complex<float> support added.
  - Added z14 support.
- SYCL
  - Redesigned SYCL implementation for use with the Tensor module, which can be enabled by defining EIGEN_USE_SYCL.
  - New generic memory model introduced used by TensorDeviceSycl.
  - Better integration with OpenCL devices.
  - Added many math function specializations.
Miscellaneous API Changes
- New setConstant(...) methods for preserving one dimension of a matrix by passing in NoChange.
    MatrixXf A(10, 5);               // 10x5 matrix.
    A.setConstant(NoChange, 10, 2);  // 10x10 matrix of 2s.
    A.setConstant(5, NoChange, 3);   // 5x10 matrix of 3s.
    A.setZero(NoChange, 20);         // 5x20 matrix of 0s.
    A.setZero(20, NoChange);         // 20x20 matrix of 0s.
    A.setOnes(NoChange, 5);          // 20x5 matrix of 1s.
    A.setOnes(5, NoChange);          // 5x5 matrix of 1s.
    A.setRandom(NoChange, 10);       // 5x10 random matrix.
    A.setRandom(10, NoChange);       // 10x10 random matrix.
- Added setUnit(Index i) for vectors that sets the i-th coefficient to one and all others to zero.
    VectorXf v(5);
    v.setUnit(3); // { 0, 0, 0, 1, 0}
- Added transpose(), adjoint(), conjugate() methods to SelfAdjointView.
- Added shiftLeft<N>() and shiftRight<N>() coefficient-wise arithmetic shift functions to Arrays.
    ArrayXXi A = ArrayXXi::Random(2, 3);
    ArrayXXi B = A.shiftRight<2>();
    ArrayXXi C = A.shiftLeft<6>();
- Enabled adding and subtracting of diagonal expressions.
    VectorXf x = VectorXf::Random(5);
    VectorXf y = VectorXf::Random(5);
    MatrixXf A = MatrixXf::Identity(5, 5);
    A += x.asDiagonal() - y.asDiagonal();
- Allow user-defined default cache sizes via defining EIGEN_DEFAULT_L1_CACHE_SIZE, ..., EIGEN_DEFAULT_L3_CACHE_SIZE.
- Added EIGEN_ALIGNOF(X) macro for determining alignment of a provided variable.
- Allow plugins for VectorwiseOp by defining a file EIGEN_VECTORWISEOP_PLUGIN (e.g. -DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h).
- Allow disabling of IO operations by defining EIGEN_NO_IO.
Improvement to NaN propagation
- Improvements to NaN correctness for elementwise functions.
- New NaNPropagation template argument to control whether NaNs are propagated or suppressed in elementwise min/max and corresponding reductions on Array, Matrix, and Tensor. Example for max:
    // Elementwise maximum
    Eigen::MatrixXf left, right, r0, r1, r2;
    r0 = left.cwiseMax(right); // Implementation defined behavior.
    // Propagate NaN if either argument is NaN.
    r1 = left.template cwiseMax<PropagateNaN>(right);
    // Suppress NaN if at least one argument is not a NaN.
    r2 = left.template cwiseMax<PropagateNumbers>(right);
    // Max reductions
    Eigen::MatrixXf m;
    float nan_or_max = m.maxCoeff(); // Implementation defined behavior.
    float nan_if_any_or_max = m.template maxCoeff<PropagateNaN>();
    float nan_if_all_or_max = m.template maxCoeff<PropagateNumbers>();
Changes to unsupported modules
New low-latency non-blocking ThreadPool module
- Originally a part of the Tensor module, Eigen::ThreadPool is now separate and more portable, and forms the basis for multi-threading in TensorFlow, for example. Example:
    #include <unsupported/Eigen/CXX11/ThreadPool>
    const int num_threads = 42;
    Eigen::ThreadPool tp(num_threads);
    auto do_stuff = []() { ... };
    tp.Schedule(do_stuff);
Changes to Tensor module
- Support for c++03 was officially dropped in the Tensor module, since most of the code was written in c++11 anyway. This will prevent building the code for CUDA with older versions of nvcc.
- Performance optimizations of Tensor contraction
  - Speed up "outer-product-like" operations by parallelizing over the contraction dimension, using thread_local buffers and recursive work splitting.
  - Improved threading heuristics.
  - Support for fusing element-wise operations into contraction during evaluation. Example:
    // This example applies std::sqrt to all output elements from a tensor contraction.
    // The optional OutputKernel argument to the contraction in this example is a functor over a
    // 2-dimensional buffer. The functor is called once for each output block of the contraction
    // result, to perform the elementwise sqrt operation while the block is hot in cache.
    struct SqrtOutputKernel {
      template <typename Index, typename Scalar>
      EIGEN_ALWAYS_INLINE void operator()(
          const internal::blas_data_mapper<Scalar, Index, ColMajor>& output_mapper,
          const TensorContractionParams&, Index, Index, Index num_rows,
          Index num_cols) const {
        for (int i = 0; i < num_rows; ++i) {
          for (int j = 0; j < num_cols; ++j) {
            output_mapper(i, j) = std::sqrt(output_mapper(i, j));
          }
        }
      }
    };
    Tensor<float, 4, DataLayout> left(30, 50, 8, 31);
    Tensor<float, 5, DataLayout> right(8, 31, 7, 20, 10);
    Tensor<float, 5, DataLayout> result(30, 50, 7, 20, 10);
    Eigen::array<DimPair, 2> dims({{DimPair(2, 0), DimPair(3, 1)}});
    result = left.contract(right, dims, SqrtOutputKernel());
- Performance optimizations of other Tensor operators
  - Speedups from improved vectorization, block evaluation, and multi-threading for most operators.
  - Significant speedup to broadcasting.
  - Reduction of index computation overhead, e.g. using fast divisors in TensorGenerator, squeezing dimensions in TensorPadding.
- Complete rewrite of the block (tiling) evaluation framework for tensor expressions led to significant speedups and a reduced number of memory allocations.
- Added new API for asynchronous evaluation of tensor expressions. Example:
    Tensor<float, 3> in1(200, 30, 70);
    Tensor<float, 3> in2(200, 30, 70);
    Tensor<float, 3> out(200, 30, 70);
    Eigen::ThreadPool tp(internal::random<int>(3, 11));
    Eigen::ThreadPoolDevice thread_pool_device(&tp, internal::random<int>(3, 11));
    Eigen::Barrier b(1);
    auto done = [&b]() { b.Notify(); };
    out.device(thread_pool_device, std::move(done)) = in1 + in2 * 3.14f;
    b.Wait();
- Misc. minor behavior changes & fixes:
  - Fix const correctness for TensorMap.
  - Modify tensor argmin/argmax to always return first occurrence.
  - More numerically stable tree reduction.
  - Improve randomness of the tensor random generator.
  - Update the padding computation for PADDING_SAME to be consistent with TensorFlow.
  - Support static dimensions (aka IndexList) in resizing/reshape/broadcast.
  - Improved accuracy of Tensor FFT.
Improvements to FFT module
- Faster and more accurate twiddle factor computation.
Improvements to EulerAngles
- EulerAngles can now be directly constructed from 3D vectors
- EulerAngles now provide isApprox() and cast() functions
Changes to sparse iterative solvers
- Added new IDRS iterative linear solver.
    #include <unsupported/Eigen/IterativeSolvers>
    A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers.
    IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);
    VectorXf x = idrs.solve(b);
    bool success = (idrs.info() == ComputationInfo::Success);
Improvements to Polynomials
- PolynomialSolver can now be used with complex numbers
- The solver will automatically choose between EigenSolver and ComplexEigenSolver depending on the scalar type used
Other relevant changes
- Eigen now provides an option to test with an external BLAS library
- Eigen can now be used with the PGI Compiler
- Printing when using GDB has been improved
- Eigen can now detect if a platform supports int128 intrinsics
Testing
The full Eigen test suite was built and run successfully (in c++03 and c++11 mode) with the following compiler/platform/OS combinations:
Compiler | Version | Platform | Operating system |
---|---|---|---|
Microsoft Visual Studio | 2015 Update 3 | x86-64 | Windows |
Microsoft Visual Studio | Community 2017 - 15.9.38 | x86-64 | Windows |
Microsoft Visual Studio | Community 2019 - 16.11 | x86-64 | Windows |
GCC | 4.8 | x86-64 | Linux |
GCC | 9 | x86-64 | Linux |
GCC | 10 | x86-64 | Linux |
Clang | 6.0 | x86-64 | Linux |
Clang | 10 | x86-64 | Linux |
Clang | 11 | x86-64 | Linux |
GCC | 10 | armv8.2-a | Linux |
Clang | 6 | armv8.2-a | Linux |
Clang | 9 | armv8.2-a | Linux |
Clang | 10 | armv8.2-a | Linux |
Clang | 11 | armv8.2-a | Linux |
AppleClang | 12.0.5 | x86-64 | macOS |
GCC | 10 | ppc64le | Linux |
Clang | 10 | ppc64le | Linux |
List of issues fixed in Eigen 3.4
Issue #2298 | List of dense linear decompositions lacks completeorthogonal decomposition |
Issue #2284 | JacobiSVD Outputs Invalid U (Reads Past End of Array) |
Issue #2267 | [3.4 bug] FixedInt<0> error with gcc 4.9.3 |
Issue #2263 | usage of signed zeros leads to wrong results with -ffast-math |
Issue #2251 | Method unaryExpr() does not support function pointers in Eigen 3.4rc1 |
Issue #2242 | No matching function for call to "..." in 'Complex.h' and 'GenericPacketMathFunctions.h' |
Issue #2229 | Copies (& potentially moves?) of Eigen object with large unused MaxRows/ColAtCompileTime are slow (Regression from Eigen 3.2) |
Issue #2213 | template maxCoeff<PropagateNaN> compilation error with Eigen 3.4. |
Issue #2209 | unaryExpr deduces wrong return type on MSVC |
Issue #2157 | forward_adolc test fails since PR !363 |
Issue #2119 | Move assignment swaps even for non-dynamic storage |
Issue #2112 | Build failure with boost::multiprecision type |
Issue #2093 | Incorrect evaluation of Ref |
Issue #1906 | Eigen failed with error C2440 with MSVC on windows |
Issue #1850 | error C4996: 'std::result_of<T>': warning STL4014: std::result_of and std::result_of_t are deprecated in C++17. They are superseded by std::invoke_result and std::invoke_result_t |
Issue #1833 | c++20 compilation failure |
Issue #1826 | -Wdeprecated-anon-enum-enum-conversion warnings (c++20) |
Issue #1815 | IndexedView of a vector should allow linear access |
Issue #1805 | Uploaded doxygen documentation does not build LaTeX formulae |
Issue #1790 | packetmath_1 unit test fails |
Issue #1788 | Rule-of-three/rule-of-five violations |
Issue #1776 | subvector_stl_iterator::operator-> triggers 'taking address of rvalue' warning |
Issue #1774 | std::cbegin() returns non-const iterator |
Issue #1752 | A change to the C++ Standard will break some tests |
Issue #1741 | Map<>.noalias()=A*B gives wrong result |
Issue #1736 | Column access of some IndexedView won't compile |
Issue #1718 | Use of builtin vec_sel is ambiguous when compiling with Clang for PowerPC |
Issue #1695 | Stuck in loop for a certain input when using mpreal support |
Issue #1692 | pass enumeration argument to constructor of VectorXd |
Issue #1684 | array_reverse fails with clang >=6 + AVX + -O2 |
Issue #1674 | SIMD sin/cos gives wrong results with -ffast-math |
Issue #1669 | Zero-sized matrices generate assertion failures |
Issue #1664 | dot product with single column block fails with new static checks |
Issue #1652 | Corner cases in SIMD sin/cos |
Issue #1643 | Compilation failure |
Issue #1637 | Register spilling with recent gcc & clang |
Issue #1619 | const_iterator vs iterator compilation error |
Issue #1615 | Performance of (aliased) matrix multiplication with fixed size 3x3 matrices slow |
Issue #1611 | NEON: plog(+/-0) should return -inf and not NaN |
Issue #1585 | Matrix product is repeatedly evaluated when iterating over the product expression |
Issue #1557 | Fail to compute eigenvalues for a simple 3x3 companion matrix for root finding |
Issue #1544 | SparseQR generates incorrect Q matrix in complex case |
Issue #1543 | "Fix linear indexing in generic block evaluation" breaks Matrix*Diagonal*Vector product |
Issue #1493 | dense Q extraction and solve is sometimes erroneous for complex matrices |
Issue #1453 | Strange behavior for Matrix::Map, if only InnerStride is provided |
Issue #1409 | Add support for C++17 operator new alignment |
Issue #1340 | Add operator + to sparse matrix iterator |
Issue #1318 | More robust quaternion from matrix |
Issue #1306 | Add support for AVX512 to Eigen |
Issue #1305 | Implementation of additional component-wise unary functions |
Issue #1221 | I get tons of error since my distribution upgraded to GCC 6.1.1 |
Issue #1195 | vectorization_logic fails: Matrix3().cwiseQuotient(Matrix3()) expected CompleteUnrolling, got NoUnrolling |
Issue #1194 | Improve det4x4 |
Issue #1049 | std::make_shared fails to fulfill structure aliment |
Issue #1046 | fixed matrix types do not report correct alignment requirements |
Issue #1014 | Eigenvalues 3x3 matrix |
Issue #1001 | infer dimensions of Dynamic-sized temporaries from the entire expression (if possible) |
Issue #977 | Add stable versions of normalize() and normalized() |
Issue #899 | SparseQR occasionally fails for under-determined systems |
Issue #864 | C++11 alias templates for commonly used types |
Issue #751 | Make AMD Ordering numerically more robust |
Issue #747 | Allow for negative stride |
Issue #720 | Gaussian NullaryExpr |
Issue #663 | Permit NoChange in setZero, setOnes, setConstant, setRandom |
Issue #645 | GeneralizedEigenSolver: missing computation of eigenvectors |
Issue #632 | Optimize addition/subtraction of sparse and dense matrices/vectors |
Issue #631 | (Optionally) throw an exception when using an unsuccessful decomposition |
Issue #564 | maxCoeff() returns -nan instead of max, while maxCoeff(&maxRow, &maxCol) works |
Issue #556 | Matrix multiplication crashes using mingw 4.7 |
Issue #505 | Assert if temporary objects that are still referred to get destructed (was: Misbehaving Product on C++11) |
Issue #445 | ParametrizedLine should have transform method |
Issue #437 | [feature request] Add Reshape Operation |
Issue #426 | Behavior of sum() for Matrix<bool> is unexpected and confusing |
Issue #329 | Feature request: Ability to get a "view" into a sub-matrix by indexing it with a vector or matrix of indices |
Issue #231 | STL compatible iterators |
Issue #96 | Clean internal::result_of |
Issue #65 | Core - optimize partial reductions |
Issue #64 | Tests : precision-oriented tests |
Additional information
- A curated list of commits, approximately organized by the same topics as the release notes above, and sorted in reverse chronological order can be found here.