Difference between revisions of "User:Rmlarsen/3.4"

From Eigen
(Dense matrix decompositions and solvers)
(Elementwise math functions)
 
(58 intermediate revisions by 3 users not shown)
Line 1: Line 1:
== Changes to Supported Modules ==
+
Eigen 3.4 was released on August 18, 2021. It can be downloaded from the Download section on the
 +
[https://eigen.tuxfamily.org/index.php?title=Main_Page Main Page].
  
=== Changes that might impact existing code ===
+
== Changes to supported modules ==
  
* Using float or double for indexing matrices, vectors and array will now fail to compile, ex.:
+
=== Changes that might break existing code ===
 +
 
 +
* Using float or double for indexing matrices, vectors and arrays will now fail to compile, ex.:
 
<source lang="cpp">
 
<source lang="cpp">
 
MatrixXd A(10,10);
 
MatrixXd A(10,10);
Line 41: Line 44:
 
</source>
 
</source>
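The compile-time rejection of floating-point indices described above can be illustrated with a <code>static_assert</code> on the index type. This is a standalone sketch using only the standard library, not Eigen's actual implementation; <code>Grid</code> and <code>is_valid_index</code> are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical trait: only integral types other than bool count as indices.
template <typename T>
struct is_valid_index
    : std::integral_constant<bool, std::is_integral<T>::value &&
                                       !std::is_same<T, bool>::value> {};

// Hypothetical minimal row-major 2-D container, for illustration only.
class Grid {
 public:
  Grid(std::size_t rows, std::size_t cols)
      : cols_(cols), data_(rows * cols, 0.0) {}

  // Any non-integral index type (float, double, ...) triggers the
  // static_assert, so the offending call fails at compile time.
  template <typename RowIndex, typename ColIndex>
  double& operator()(RowIndex i, ColIndex j) {
    static_assert(is_valid_index<RowIndex>::value &&
                      is_valid_index<ColIndex>::value,
                  "indices must be integral");
    return data_[static_cast<std::size_t>(i) * cols_ +
                 static_cast<std::size_t>(j)];
  }

 private:
  std::size_t cols_;
  std::vector<double> data_;
};
```

With this sketch, <code>g(1, 2)</code> compiles while <code>g(1.0, 2)</code> is rejected by the compiler instead of silently truncating the index.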
  
* New support for <code>bfloat16</code>.  The 16-bit Brain floating point format[https://en.wikipedia.org/wiki/Bfloat16_floating-point_format] is now available as <code>Eigen::bfloat16</code>.  The constructor must be called explicitly, but it can otherwise be used as any other scalar type.  To convert back-and-forth between <code>uint16_t</code> to extract the bit representation, use <code>Eigen::numext::bit_cast</code>.
+
* New support for <code>bfloat16</code>.  The 16-bit [https://en.wikipedia.org/wiki/Bfloat16_floating-point_format Brain floating point format] is now available as <code>Eigen::bfloat16</code>.  The constructor must be called explicitly, but it can otherwise be used like any other scalar type.  To convert to and from <code>uint16_t</code>, for example to extract the bit representation, use <code>Eigen::numext::bit_cast</code>.
 
<source lang="cpp">
 
<source lang="cpp">
 
   bfloat16 s(0.25);                                // explicit construction
 
   bfloat16 s(0.25);                                // explicit construction
Line 52: Line 55:
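The bit representation mentioned in the <code>bfloat16</code> item can be made concrete without Eigen: a bfloat16 value keeps the sign bit, the 8 exponent bits and the top 7 mantissa bits of an IEEE-754 binary32 number, i.e. its upper 16 bits. The following is a minimal standard-library sketch under two assumptions: <code>float</code> is IEEE-754 binary32, and the conversion truncates rather than rounds, so it may differ in the last bit from a rounding conversion. The helper names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: bfloat16 bit pattern of a float, obtained by
// truncation (keeps the upper 16 bits, drops the low 16 mantissa bits).
// Assumes float is IEEE-754 binary32.
inline std::uint16_t float_to_bfloat16_bits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));  // well-defined bit copy
  return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: widen a bfloat16 bit pattern back to float by
// placing it in the upper 16 bits and zero-filling the rest.
inline float bfloat16_bits_to_float(std::uint16_t b) {
  std::uint32_t bits = static_cast<std::uint32_t>(b) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}
```

Values such as 0.25 or 1.0, whose mantissa fits in 7 bits, survive the round trip exactly; other values lose their low mantissa bits.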
 
=== New backends ===
 
=== New backends ===
  
* '''Arm SVE:''' Eigen now supports Arm's [https://developer.arm.com/documentation/101726/0300/Learn-about-the-Scalable-Vector-Extension--SVE-/What-is-the-Scalable-Vector-Extension-  Scalable Vector Extension (SVE)]. Currently only fixed-lenght SVE vectors for <code>uint32_t</code> and <code>float</code> are available.
+
* '''Arm SVE:''' Eigen now supports Arm's [https://developer.arm.com/documentation/101726/0300/Learn-about-the-Scalable-Vector-Extension--SVE-/What-is-the-Scalable-Vector-Extension-  Scalable Vector Extension (SVE)]. Currently only fixed-length SVE vectors for <code>uint32_t</code> and <code>float</code> are available.
 
* '''MIPS MSA:''' Eigen now supports the [https://www.mips.com/products/architectures/ase/simd/ MIPS SIMD Architecture (MSA)]  
 
* '''MIPS MSA:''' Eigen now supports the [https://www.mips.com/products/architectures/ase/simd/ MIPS SIMD Architecture (MSA)]  
* '''AMD ROCm HIP:''' Eigen now contains a generic GPU backend for NVIDIA/AMD that unifies support for CUDA and HIP.
+
* '''AMD ROCm/HIP:''' Eigen now contains a generic GPU backend that unifies support for [https://developer.nvidia.com/cuda-toolkit NVIDIA/CUDA] and [https://rocmdocs.amd.com/en/latest/ AMD/HIP].
* '''Power 10 MMA Backend:''' Eigen now has initial support for Power 10 matrix multiplication assist instructions for float32, float64 real and complex.
+
* '''Power 10 MMA Backend:''' Eigen now has initial support for [https://arxiv.org/pdf/2104.03142.pdf Power 10 matrix multiplication assist instructions] for float32 and float64, real and complex.
  
 
=== Improvements to Eigen Core ===
 
=== Improvements to Eigen Core ===
* Eigen now uses c++11 the '''alignas''' keyword for static alignment. Users targeting C++17 only and recent compilers (e.g., GCC>=7, clang>=5, MSVC>=19.12) will thus be able to completely forget about all [http://eigen.tuxfamily.org/dox-devel/group__TopicUnalignedArrayAssert.html issues] related to static alignment, including <code>EIGEN_MAKE_ALIGNED_OPERATOR_NEW</code>.
+
* Eigen now uses the C++11 '''alignas''' keyword for static alignment. Users targeting only C++17 and recent compilers (e.g., GCC>=7, clang>=5, MSVC>=19.12) can thus completely forget about all [http://eigen.tuxfamily.org/dox-devel/group__TopicUnalignedArrayAssert.html issues] related to static alignment, including <code>EIGEN_MAKE_ALIGNED_OPERATOR_NEW</code>.
 
* Various performance improvements for products and Eigen's GEBP and GEMV kernels have been implemented:
 
* Various performance improvements for products and Eigen's GEBP and GEMV kernels have been implemented:
 
** By using half- and quarter-packets the performance of matrix multiplications of small- to medium-sized matrices has been improved

** By using half- and quarter-packets the performance of matrix multiplications of small- to medium-sized matrices has been improved
Line 64: Line 67:
 
** The performance of matrix products using Arm Neon has been drastically improved (up to 20%)
 
** The performance of matrix products using Arm Neon has been drastically improved (up to 20%)
 
** Performance of many special cases of matrix products has been improved
 
** Performance of many special cases of matrix products has been improved
* Large speed up from blocked algorithm for <code>transposeInPlace</code>.
+
* Large speedup from a blocked algorithm for <code>.transposeInPlace</code>.
 
* Speed up misc. operations by propagating compile-time sizes (col/row-wise reverse, PartialPivLU, and others)
 
* Speed up misc. operations by propagating compile-time sizes (col/row-wise reverse, PartialPivLU, and others)
 
* Faster specialized SIMD kernels for small fixed-size inverse, LU decomposition, and determinant.
 
* Faster specialized SIMD kernels for small fixed-size inverse, LU decomposition, and determinant.
* Improve or add vectorization of partial or slice reductions along the outer-dimension, for instance: <code>colmajor_mat.rowwise().mean()</code>
+
* Improved or added vectorization of partial or slice reductions along the outer-dimension, for instance: <code>colmajor_mat.rowwise().mean()</code>
  
 
=== Elementwise math functions ===
 
=== Elementwise math functions ===
Line 75: Line 78:
 
** Misc. fixes for corner cases, NaN/Inf inputs and singular points of many functions.
 
** Misc. fixes for corner cases, NaN/Inf inputs and singular points of many functions.
 
** New Payne-Hanek argument reduction algorithm for <code>sin</code> and <code>cos</code> with huge arguments.
 
** New Payne-Hanek argument reduction algorithm for <code>sin</code> and <code>cos</code> with huge arguments.
** New vectorized faithfully rounded algorithm for <code>pow(x,y)</code>.
+
** New faithfully rounded algorithm for <code>pow(x,y)</code>.
 
* Speedups from (new or improved) vectorized versions of <code>pow, log, sin, cos, arg, log2</code>, complex <code>sqrt, erf, expm1, log1p, logistic, rint, gamma</code> and <code>bessel</code> functions, and more.

* Speedups from (new or improved) vectorized versions of <code>pow, log, sin, cos, arg, log2</code>, complex <code>sqrt, erf, expm1, log1p, logistic, rint, gamma</code> and <code>bessel</code> functions, and more.
 
* Improved special function support (Bessel and gamma functions, <code>ndtri, erfc</code>, inverse hyperbolic functions and more)
 
* Improved special function support (Bessel and gamma functions, <code>ndtri, erfc</code>, inverse hyperbolic functions and more)
Line 81: Line 84:
  
 
=== Dense matrix decompositions and solvers ===
 
=== Dense matrix decompositions and solvers ===
 +
* All dense linear solvers (i.e., Cholesky, *LU, *QR, CompleteOrthogonalDecomposition, *SVD) now inherit from <code>SolverBase</code> and thus support the <code>.transpose()</code>, <code>.adjoint()</code> and <code>.solve()</code> APIs.
 
* SVD implementations now have an <code>info()</code> method for checking convergence.
 
* SVD implementations now have an <code>info()</code> method for checking convergence.
 
<source lang="cpp">
 
<source lang="cpp">
Line 91: Line 95:
 
   }
 
   }
 
</source>
 
</source>
* Decompositions now fail quickly when invalid inputs are detected.
+
* Most decompositions now fail quickly when invalid inputs are detected.
* Optimized the product of a householder-sequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using faster blocked product.
+
* Optimized the product of a <code>HouseholderSequence</code> with the identity, as well as the evaluation of a <code>HouseholderSequence</code> to a dense matrix using a faster blocked product.
 
* Fixed aliasing issues with in-place small matrix inversions.
 
* Fixed aliasing issues with in-place small matrix inversions.
 
* Fixed several edge-cases with empty or zero inputs.
 
* Fixed several edge-cases with empty or zero inputs.
  
 
=== Sparse matrix support, decompositions and solvers ===
 
=== Sparse matrix support, decompositions and solvers ===
* Enabled assignment and addition with diagonal matrices.
+
* Enabled assignment and addition with diagonal matrix expressions.
 
<source lang="cpp">
 
<source lang="cpp">
 
   SparseMatrix<float> A(10, 10);
 
   SparseMatrix<float> A(10, 10);
Line 104: Line 108:
 
   A += x.asDiagonal();
 
   A += x.asDiagonal();
 
</source>
 
</source>
* Support added for SuiteSparse KLU routines via the module <code>KLUSupport</code>.
+
* Support added for SuiteSparse KLU routines via the <code>KLUSupport</code> module.  SuiteSparse must be installed to use this module.
 
<source lang="cpp">
 
<source lang="cpp">
 
   #include <Eigen/KLUSupport>
 
   #include <Eigen/KLUSupport>
Line 123: Line 127:
 
** Partial vectorization support added for boolean operations.
 
** Partial vectorization support added for boolean operations.
 
** Significantly improved performance (x25) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>.
 
** Significantly improved performance (x25) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>.
* Improved support for custom types ===
+
* Improved support for custom types
** More custom types work out-of-the-box (see #2201[https://gitlab.com/libeigen/eigen/-/issues/2201]).
+
** More custom types work out-of-the-box (see [https://gitlab.com/libeigen/eigen/-/issues/2201 #2201]).
  
 
=== Improved Geometry Module ===
 
=== Improved Geometry Module ===
 
* '''Behavioral change:''' <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see [https://gitlab.com/libeigen/eigen/-/merge_requests/349 !349]).
 
* '''Behavioral change:''' <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see [https://gitlab.com/libeigen/eigen/-/merge_requests/349 !349]).
* New minimal vectorization support added for <code>Quaternion</code>.
+
* New partial vectorization support added for <code>Quaternion</code>.
* Vectorized 4x4 matrix inversion.
+
* Generic vectorized 4x4 matrix inversion.
  
 
=== Backend-specific improvements ===
 
=== Backend-specific improvements ===
* The '''Arm NEON''' backend has been significantly improved:
+
* '''Arm NEON'''
** It now provides vectorization for <code>uint64_t</code>, <code>int64_t</code>, <code>uint32_t</code>, <code>int16_t</code>, <code>uint16_t</code>, <code>int16_t</code>, <code>int8_t</code>, and <code>uint8_t</code>
+
** Now provides vectorization for <code>uint64_t</code>, <code>int64_t</code>, <code>uint32_t</code>, <code>int16_t</code>, <code>uint16_t</code>, <code>int8_t</code>, and <code>uint8_t</code>
** It now can emulate <code>bfloat16</code> support when using <code>Eigen::bfloat16</code>
+
** Emulates <code>bfloat16</code> support when using <code>Eigen::bfloat16</code>
** It now supports emulated and native `float16` when using <code>Eigen::float16</code>
+
** Supports emulated and native <code>float16</code> when using <code>Eigen::half</code>
 
* '''SSE/AVX/AVX512'''
 
* '''SSE/AVX/AVX512'''
 +
** General performance improvements and bugfixes.
 
** Enabled AVX512 instructions by default if available.
 
** Enabled AVX512 instructions by default if available.
 
** New <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code> vectorization support added.
 
** New <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code> vectorization support added.
 
** Many missing packet functions added.
 
** Many missing packet functions added.
* Altivec/Power
+
* '''Altivec/Power'''
 
** General performance improvement and bugfixes.
 
** General performance improvement and bugfixes.
 
** Enhanced vectorization of current real and complex scalars.
 
** Enhanced vectorization of current real and complex scalars.
** Changes to the gebp_kernel specific to Altivec, using VSX implementation of the MMA instructions that gain speed improvements up to 4x for matrix-matrix products.
+
** Changes to the <code>gebp_kernel</code> specific to Altivec, using VSX implementation of the MMA instructions that gain speed improvements up to 4x for matrix-matrix products.
** Dynamic dispatch for GCC greater than 10 enabling selection of MMA or VSX instructions based on __builtin_cpu_supports.
+
** Dynamic dispatch for GCC greater than 10 enabling selection of MMA or VSX instructions based on <code>__builtin_cpu_supports</code>.
* GPU (CUDA and HIP)
+
* '''GPU (CUDA and HIP)'''
 
** Several optimized math functions added, better support for <code>std::complex</code>.
 
** Several optimized math functions added, better support for <code>std::complex</code>.
 
** Added option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>.
 
** Added option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>.
 
** Many more functions can now be used in device code (e.g. comparisons, small matrix inversion).
 
** Many more functions can now be used in device code (e.g. comparisons, small matrix inversion).
* ZVector
+
* '''ZVector'''
 
** Vectorized <code>float</code> and <code>std::complex<float></code> support added.
 
** Vectorized <code>float</code> and <code>std::complex<float></code> support added.
 
** Added z14 support.
 
** Added z14 support.
* SYCL
+
* '''SYCL'''
 
** Redesigned SYCL implementation for use with the [https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html Tensor] module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>.
 
** Redesigned SYCL implementation for use with the [https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html Tensor] module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>.
 
** New generic memory model introduced, used by <code>TensorDeviceSycl</code>.

** New generic memory model introduced, used by <code>TensorDeviceSycl</code>.
Line 159: Line 164:
  
 
=== Miscellaneous API Changes ===
 
=== Miscellaneous API Changes ===
* New <code>setConstant(...)</code> methods for preserving one dimension of a matrix by passing in <code>NoChange_t</code>.
+
* New <code>setConstant(...)</code> methods for preserving one dimension of a matrix by passing in <code>NoChange</code>.
 
<source lang="cpp">
 
<source lang="cpp">
   MatrixXf A(10, 5);                   // 10x5  matrix.
+
   MatrixXf A(10, 5);               // 10x5  matrix.
   A.setConstant(NoChange_t(), 10, 2);  // 10x10 matrix of 2s.
+
   A.setConstant(NoChange, 10, 2);  // 10x10 matrix of 2s.
   A.setConstant(5, NoChange_t(), 3);  //  5x10 matrix of 3s.
+
   A.setConstant(5, NoChange, 3);  //  5x10 matrix of 3s.
   A.setZero(NoChange_t(), 20);        //  5x20 matrix of 0s.
+
   A.setZero(NoChange, 20);        //  5x20 matrix of 0s.
   A.setZero(20, NoChange_t());        // 20x20 matrix of 0s.
+
   A.setZero(20, NoChange);        // 20x20 matrix of 0s.
   A.setOnes(NoChange_t(), 5);          // 20x5  matrix of 1s.
+
   A.setOnes(NoChange, 5);          // 20x5  matrix of 1s.
   A.setOnes(5, NoChange_t());          //  5x5  matrix of 1s.
+
   A.setOnes(5, NoChange);          //  5x5  matrix of 1s.
   A.setRandom(NoChange_t(), 10);      //  5x10 random matrix.
+
   A.setRandom(NoChange, 10);      //  5x10 random matrix.
   A.setRandom(10, NoChange_t());      // 10x10 random matrix.
+
   A.setRandom(10, NoChange);      // 10x10 random matrix.
 
</source>
 
</source>
 
* Added <code>setUnit(Index i)</code> for vectors, which sets the ''i''-th coefficient to one and all others to zero.

* Added <code>setUnit(Index i)</code> for vectors, which sets the ''i''-th coefficient to one and all others to zero.
Line 177: Line 182:
 
</source>
 
</source>
 
* Added <code>transpose()</code>, <code>adjoint()</code>, <code>conjugate()</code> methods to <code>SelfAdjointView</code>.
 
* Added <code>transpose()</code>, <code>adjoint()</code>, <code>conjugate()</code> methods to <code>SelfAdjointView</code>.
* Added <code>shift_left<N>()</code> and <code>shift_right<N>()</code> coefficient-wise array functions.
+
* Added <code>shiftLeft<N>()</code> and <code>shiftRight<N>()</code> coefficient-wise arithmetic shift functions to Arrays.
 +
<source lang="cpp">
 +
  ArrayXXi A = ArrayXXi::Random(2, 3);
 +
  ArrayXXi B = A.shiftRight<2>();
 +
  ArrayXXi C = A.shiftLeft<6>();
 +
</source>
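For readers unfamiliar with coefficient-wise shifts, the semantics of the example above can be sketched with the standard library alone: each coefficient is shifted by the compile-time amount <code>N</code>. The function names below are hypothetical and the sketch does not use Eigen. It restricts itself to non-negative inputs, because shifting negative signed integers is implementation-defined (right shift) or undefined (left shift) before C++20.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: shift every coefficient right by N bits.
// For non-negative values this equals integer division by 2^N.
template <int N>
std::vector<int> shift_right_each(const std::vector<int>& a) {
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] >> N;
  return out;
}

// Hypothetical helper: shift every coefficient left by N bits.
// For small non-negative values this equals multiplication by 2^N.
template <int N>
std::vector<int> shift_left_each(const std::vector<int>& a) {
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] << N;
  return out;
}
```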
 
* Enabled adding and subtracting of diagonal expressions.
 
* Enabled adding and subtracting of diagonal expressions.
 
<source lang="cpp">
 
<source lang="cpp">
Line 196: Line 206:
 
<source lang="cpp">
 
<source lang="cpp">
 
// Elementwise maximum
 
// Elementwise maximum
Eigen::MatrixXf left, right, r1, r2;
+
Eigen::MatrixXf left, right, r0, r1, r2;
 +
r0 = left.cwiseMax(right); // NaN handling is implementation-defined.
 
// Propagate NaN if either argument is NaN.
 
// Propagate NaN if either argument is NaN.
r1 = left.cwiseMax<PropagateNaN>(right);
+
r1 = left.template cwiseMax<PropagateNaN>(right);
// Suppress NaN if at least one argument is non NaN.
+
// Suppress NaN if at least one argument is not a NaN.
r2 = left.cwiseMax<PropagateNumbers>(right);
+
r2 = left.template cwiseMax<PropagateNumbers>(right);
  
 
// Max reductions
 
// Max reductions
 
Eigen::MatrixXf m;
 
Eigen::MatrixXf m;
 +
float nan_or_max = m.maxCoeff(); // NaN handling is implementation-defined.
 
float nan_if_any_or_max = m.template maxCoeff<PropagateNaN>();
 
float nan_if_any_or_max = m.template maxCoeff<PropagateNaN>();
 
float nan_if_all_or_max = m.template maxCoeff<PropagateNumbers>();
 
float nan_if_all_or_max = m.template maxCoeff<PropagateNumbers>();
Line 221: Line 233:
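The two NaN policies named in the preceding example, <code>PropagateNaN</code> and <code>PropagateNumbers</code>, can be illustrated for a scalar maximum using only the standard library. This is a sketch of the documented semantics, not Eigen's implementation, and the function names are hypothetical.

```cpp
#include <cmath>

// Hypothetical helper: the result is NaN if either argument is NaN.
inline float max_propagate_nan(float a, float b) {
  if (std::isnan(a)) return a;
  if (std::isnan(b)) return b;
  return a > b ? a : b;
}

// Hypothetical helper: NaN is returned only if both arguments are NaN;
// otherwise the non-NaN argument (or the larger of the two) wins.
inline float max_propagate_numbers(float a, float b) {
  if (std::isnan(a)) return b;
  if (std::isnan(b)) return a;
  return a > b ? a : b;
}
```

For comparison, the naive expression <code>a > b ? a : b</code> returns NaN when only <code>b</code> is NaN but returns <code>b</code> when only <code>a</code> is NaN, so its result depends on argument order. That is one way an unspecified default can end up differing between code paths.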
  
 
=== Changes to Tensor module ===
 
=== Changes to Tensor module ===
 +
* Support for C++03 was officially dropped in the Tensor module, since most of the code was written in C++11 anyway. This will prevent building the code for CUDA with older versions of <code>nvcc</code>.
 
* Performance optimizations of Tensor contraction
 
* Performance optimizations of Tensor contraction
 
** Speed up "outer-product-like" operations by parallelizing over the contraction dimension, using thread_local buffers and recursive work splitting.
 
** Speed up "outer-product-like" operations by parallelizing over the contraction dimension, using thread_local buffers and recursive work splitting.
 
** Improved threading heuristics.
 
** Improved threading heuristics.
 
** Support for fusing element-wise operations into contraction during evaluation. Example:  
 
** Support for fusing element-wise operations into contraction during evaluation. Example:  
<source lang="cpp">// Apply Sqrt to all output elements. The optional OutputKernel argument to contraction in this example is a functor over 2-dimensional.  
+
<source lang="cpp">
The functor is called for each output block of the results, to perform the elementwise sqrt operation while the block is hot in cache.
+
// This example applies std::sqrt to all output elements from a tensor contraction.  
 +
// The optional OutputKernel argument to the contraction in this example is a functor over a
 +
// 2-dimensional buffer. The functor is called once for each output block of the contraction
 +
// result, to perform the elementwise sqrt operation while the block is hot in cache.
 
struct SqrtOutputKernel {
 
struct SqrtOutputKernel {
 
   template <typename Index, typename Scalar>
 
   template <typename Index, typename Scalar>
Line 250: Line 266:
  
 
* Performance optimizations of other Tensor operators

* Performance optimizations of other Tensor operators
** Added vectorization, block evaluation, and multi-threading for most operators.
+
** Speedups from improved vectorization, block evaluation, and multi-threading for most operators.
 
** Significant speedup to broadcasting.
 
** Significant speedup to broadcasting.
 
** Reduction of index computation overhead, e.g. using fast divisors in TensorGenerator, squeezing dimensions in TensorPadding.
 
** Reduction of index computation overhead, e.g. using fast divisors in TensorGenerator, squeezing dimensions in TensorPadding.
Line 268: Line 284:
 
   b.Wait();
 
   b.Wait();
 
</source>
 
</source>
* Support for c++03 was officially dropped in Tensor module, since most of the code was written in c++11 anyway.
 
 
* Misc. minor behavior changes & fixes:
 
* Misc. minor behavior changes & fixes:
 
** Fix const correctness for TensorMap.
 
** Fix const correctness for TensorMap.
Line 278: Line 293:
 
** Improved accuracy of Tensor FFT.
 
** Improved accuracy of Tensor FFT.
  
=== Changes to FFT module ===
+
=== Improvements to FFT module ===
  
 
* Faster and more accurate twiddle factor computation.
 
* Faster and more accurate twiddle factor computation.
Line 301: Line 316:
  
 
* PolynomialSolver can now be used with complex numbers
 
* PolynomialSolver can now be used with complex numbers
* The used solver will automatically choose between <code>EigenSolver</code> and <code>ComplexEigenSolver</code> depending on the scalar type used
+
* The solver will automatically choose between <code>EigenSolver</code> and <code>ComplexEigenSolver</code> depending on the scalar type used
  
 
== Other relevant changes ==
 
== Other relevant changes ==
Line 309: Line 324:
 
* Printing when using GDB has been improved
 
* Printing when using GDB has been improved
 
* Eigen can now detect if a platform supports <code>int128</code> intrinsics
 
* Eigen can now detect if a platform supports <code>int128</code> intrinsics
 +
 +
== Testing ==
 +
The full Eigen test suite was built and run successfully (in C++03 and C++11 mode) with the following compiler/platform/OS combinations:
 +
 +
{| class="wikitable"
 +
!Compiler  !! Version                            !! Platform !! Operating system
 +
|-
 +
|Microsoft Visual Studio || 2015 Update 3 || x86-64 || Windows
 +
|-
 +
|Microsoft Visual Studio || Community 2017 - 15.9.38 || x86-64  || Windows
 +
|-
 +
|Microsoft Visual Studio || Community 2019 - 16.11 || x86-64  || Windows
 +
|-
 +
|GCC || 4.8 || x86-64 || Linux
 +
|-
 +
|GCC || 9 || x86-64 || Linux
 +
|-
 +
|GCC || 10 ||  x86-64 || Linux
 +
|-
 +
|Clang || 6.0 ||  x86-64 || Linux
 +
|-
 +
|Clang || 10 ||  x86-64 || Linux
 +
|-
 +
|Clang || 11 || x86-64 || Linux
 +
|-
 +
|GCC || 10 ||  armv8.2-a || Linux
 +
|-
 +
|Clang || 6 ||  armv8.2-a || Linux
 +
|-
 +
|Clang || 9 ||  armv8.2-a || Linux
 +
|-
 +
|Clang || 10 ||  armv8.2-a || Linux
 +
|-
 +
|Clang || 11 ||  armv8.2-a || Linux
 +
|-
 +
|AppleClang || 12.0.5 ||  x86-64 || macOS
 +
|-
 +
|GCC || 10 ||  ppc64le || Linux
 +
|-
 +
|Clang || 10 || ppc64le || Linux
 +
|-
 +
|}
 +
 +
== List of issues fixed in Eigen 3.4 ==
 +
 +
{|
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2298 Issue #2298]
 +
| List of dense linear decompositions lacks completeorthogonal decomposition
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2284 Issue #2284]
 +
| JacobiSVD Outputs Invalid U (Reads Past End of Array)
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2267 Issue #2267]
 +
| [3.4 bug] FixedInt<0> error with gcc 4.9.3
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2263 Issue #2263]
 +
| usage of signed zeros leads to wrong results with -ffast-math
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2251 Issue #2251]
 +
| Method unaryExpr() does not support function pointers in Eigen 3.4rc1
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2242 Issue #2242]
 +
| No matching function for call to "..." in 'Complex.h' and 'GenericPacketMathFunctions.h'
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2229 Issue #2229]
 +
| Copies (& potentially moves?) of Eigen object with large unused MaxRows/ColAtCompileTime are slow (Regression from Eigen 3.2)
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2213 Issue #2213]
 +
| template maxCoeff<PropagateNaN> compilation error with Eigen 3.4.
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2209 Issue #2209]
 +
| unaryExpr deduces wrong return type on MSVC
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2157 Issue #2157]
 +
| forward_adolc test fails since PR !363
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2119 Issue #2119]
 +
| Move assignment swaps even for non-dynamic storage
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2112 Issue #2112]
 +
| Build failure with boost::multiprecision type
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/2093 Issue #2093]
 +
| Incorrect evaluation of Ref
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1906 Issue #1906]
 +
| Eigen failed with error C2440 with MSVC on windows
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1850 Issue #1850]
 +
| error C4996: 'std::result_of<T>': warning STL4014: std::result_of and std::result_of_t are deprecated in C++17. They are superseded by std::invoke_result and std::invoke_result_t
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1833 Issue #1833]
 +
| c++20 compilation failure
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1826 Issue #1826]
 +
| -Wdeprecated-anon-enum-enum-conversion warnings (c++20)
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1815 Issue #1815]
 +
| IndexedView of a vector should allow linear access
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1805 Issue #1805]
 +
| Uploaded doxygen documentation does not build LaTeX formulae
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1790 Issue #1790]
 +
| packetmath_1 unit test fails
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1788 Issue #1788]
 +
| Rule-of-three/rule-of-five violations
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1776 Issue #1776]
 +
| subvector_stl_iterator::operator-> triggers 'taking address of rvalue' warning
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1774 Issue #1774]
 +
| std::cbegin() returns non-const iterator
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1752 Issue #1752]
 +
| A change to the C++ Standard will break some tests
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1741 Issue #1741]
 +
| Map<>.noalias()=A*B gives wrong result
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1736 Issue #1736]
 +
| Column access of some IndexedView won't compile
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1718 Issue #1718]
 +
| Use of builtin vec_sel is ambiguous when compiling with Clang for PowerPC
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1695 Issue #1695]
 +
| Stuck in loop for a certain input when using mpreal support
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1692 Issue #1692]
 +
| pass enumeration argument to constructor of VectorXd
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1684 Issue #1684]
 +
| array_reverse fails with clang >=6 + AVX + -O2
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1674 Issue #1674]
 +
| SIMD sin/cos gives wrong results with -ffast-math
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1669 Issue #1669]
 +
| Zero-sized matrices generate assertion failures
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1664 Issue #1664]
 +
| dot product with single column block fails with new static checks
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1652 Issue #1652]
 +
| Corner cases in SIMD sin/cos
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1643 Issue #1643]
 +
| Compilation failure
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1637 Issue #1637]
 +
| Register spilling with recent gcc & clang
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1619 Issue #1619]
 +
| const_iterator vs iterator compilation error
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1615 Issue #1615]
 +
| Performance of (aliased) matrix multiplication with fixed size 3x3 matrices slow
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1611 Issue #1611]
 +
| NEON: plog(+/-0) should return -inf and not NaN
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1585 Issue #1585]
 +
| Matrix product is repeatedly evaluated when iterating over the product expression
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1557 Issue #1557]
 +
| Fail to compute eigenvalues for a simple 3x3 companion matrix for root finding
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1544 Issue #1544]
 +
| SparseQR generates incorrect Q matrix in complex case
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1543 Issue #1543]
 +
| "Fix linear indexing in generic block evaluation" breaks Matrix*Diagonal*Vector product
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1493 Issue #1493]
 +
| dense Q extraction and solve is sometimes erroneous for complex matrices
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1453 Issue #1453]
 +
| Strange behavior for Matrix::Map, if only InnerStride is provided
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1409 Issue #1409]
 +
| Add support for C++17 operator new alignment
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1340 Issue #1340]
 +
| Add operator + to sparse matrix iterator
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1318 Issue #1318]
 +
| More robust quaternion from matrix
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1306 Issue #1306]
 +
| Add support for AVX512 to Eigen
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1305 Issue #1305]
 +
| Implementation of additional component-wise unary functions
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1221 Issue #1221]
 +
| I get tons of error since my distribution upgraded to GCC 6.1.1
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1195 Issue #1195]
 +
| vectorization_logic fails: Matrix3().cwiseQuotient(Matrix3()) expected CompleteUnrolling, got NoUnrolling
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1194 Issue #1194]
 +
| Improve det4x4
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1049 Issue #1049]
 +
| std::make_shared fails to fulfill structure aliment
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1046 Issue #1046]
 +
| fixed matrix types do not report correct alignment requirements
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1014 Issue #1014]
 +
| Eigenvalues 3x3 matrix
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/1001 Issue #1001]
 +
| infer dimensions of Dynamic-sized temporaries from the entire expression (if possible)
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/977 Issue #977]
 +
| Add stable versions of normalize() and normalized()
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/899 Issue #899]
 +
| SparseQR occasionally fails for under-determined systems
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/864 Issue #864]
 +
| C++11 alias templates for commonly used types
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/751 Issue #751]
 +
| Make AMD Ordering numerically more robust
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/747 Issue #747]
 +
| Allow for negative stride
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/720 Issue #720]
 +
| Gaussian NullaryExpr
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/663 Issue #663]
 +
| Permit NoChange in setZero, setOnes, setConstant, setRandom
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/645 Issue #645]
 +
| GeneralizedEigenSolver: missing computation of eigenvectors
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/632 Issue #632]
 +
| Optimize addition/subtraction of sparse and dense matrices/vectors
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/631 Issue #631]
 +
| (Optionally) throw an exception when using an unsuccessful decomposition
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/564 Issue #564]
 +
| maxCoeff() returns -nan instead of max, while maxCoeff(&maxRow, &maxCol) works
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/556 Issue #556]
 +
| Matrix multiplication crashes using mingw 4.7
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/505 Issue #505]
 +
| Assert if temporary objects that are still referred to get destructed (was: Misbehaving Product on C++11)
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/445 Issue #445]
 +
| ParametrizedLine should have transform method
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/437 Issue #437]
 +
| [feature request] Add Reshape Operation
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/426 Issue #426]
 +
| Behavior of sum() for Matrix<bool> is unexpected and confusing
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/329 Issue #329]
 +
| Feature request: Ability to get a "view" into a sub-matrix by indexing it with a vector or matrix of indices
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/231 Issue #231]
 +
| STL compatible iterators
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/96 Issue #96]
 +
| Clean internal::result_of
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/65 Issue #65]
 +
| Core - optimize partial reductions
 +
|-
 +
| [https://gitlab.com/libeigen/eigen/-/issues/64 Issue #64]
 +
| Tests : precision-oriented tests
 +
|}
 +
 +
== Additional information ==
 +
* A curated list of commits, approximately organized by the same topics as the release notes above, and sorted in reverse chronological order can be found [https://docs.google.com/document/d/e/2PACX-1vSGvp4Kv9dJ-gKzJN4CBjppP46flDbe3pJtI9N3m3WkKSoLXmANXuK5gJlw1CPcpCfjAWhgXAtQNzm-/pub here].

Latest revision as of 20:24, 18 August 2021

Eigen 3.4 was released on August 18 2021. It can be downloaded from the Download section on the Main Page.

Changes to supported modules

Changes that might break existing code

  • Using float or double for indexing matrices, vectors and arrays will now fail to compile, ex.:
MatrixXd A(10,10);
float one = 1;
double a11 = A(one,1.); // compilation error here

New Major Features in Core

  • Add c++11 initializer_list constructors to Matrix and Array [doc]:
MatrixXi a {      // construct a 2x3 matrix
      {1,2,3},    // first row
      {4,5,6}     // second row
};
VectorXd v{{1, 2, 3, 4, 5}};    // construct a dynamic-size vector with 5 elements
Array<int,1,5> b{1, 2, 3, 4, 5}; // initialize a fixed-size 1D array of size 5 (named b because a is already declared above)
  • Add STL-compatible iterators for dense expressions [doc]. Some examples:
VectorXd v = ...;
MatrixXd A = ...;
// range for loop over all entries of v then A
for(auto x : v) { cout << x << " "; }
for(auto x : A.reshaped()) { cout << x << " "; }
// sort v then each column of A
std::sort(v.begin(), v.end());
for(auto c : A.colwise())
    std::sort(c.begin(), c.end());
  • Add C++11 template aliases for Matrix, Vector, and Array of common sizes, including generic Vector<Type,Size> and RowVector<Type,Size> aliases [doc].
MatrixX<double> M;  // Instead of MatrixXd or Matrix<double, Dynamic, Dynamic>
Vector4<MyType> V;  // Instead of Vector<MyType, 4>
  • New support for bfloat16. The 16-bit Brain floating point format is now available as Eigen::bfloat16. The constructor must be called explicitly, but it can otherwise be used like any other scalar type. To convert to and from uint16_t, for example to extract the bit representation, use Eigen::numext::bit_cast.
  bfloat16 s(0.25);                                 // explicit construction
  uint16_t s_bits = numext::bit_cast<uint16_t>(s);  // bit representation
 
  using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
  MatrixBf16 X = s * MatrixBf16::Random(3, 3);
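To make the bit representation concrete, here is a minimal standalone sketch of how a bfloat16 value relates to a 32-bit float: bfloat16 keeps the sign, the 8 exponent bits, and the top 7 mantissa bits. The helpers float_to_bf16_bits and bf16_bits_to_float are hypothetical names, not Eigen functions, and the sketch assumes IEEE 754 binary32 floats. It illustrates the format only and makes no claim about how Eigen implements the conversion.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an Eigen function): float -> bfloat16 bit pattern,
// rounding to nearest with ties to even. Assumes IEEE 754 binary32 floats.
inline std::uint16_t float_to_bf16_bits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);            // reinterpret the float's bits
  if ((u & 0x7FFFFFFFu) > 0x7F800000u) {    // NaN: keep it a quiet NaN
    return static_cast<std::uint16_t>((u >> 16) | 0x0040u);
  }
  u += 0x7FFFu + ((u >> 16) & 1u);          // round to nearest, ties to even
  return static_cast<std::uint16_t>(u >> 16);
}

// Hypothetical helper: bfloat16 bit pattern -> float. This direction is exact,
// because every bfloat16 value is also a float.
inline float bf16_bits_to_float(std::uint16_t bits) {
  const std::uint32_t u = static_cast<std::uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

inline void bf16_example() {
  const std::uint16_t bits = float_to_bf16_bits(0.25f);
  assert(bits == 0x3E80);                     // sign 0, exponent 0x7D, mantissa 0
  assert(bf16_bits_to_float(bits) == 0.25f);  // 0.25 is exactly representable
}
```

With Eigen itself, the equivalent round trip goes through the bfloat16 constructor and numext::bit_cast as shown in the snippet above.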

New backends

Improvements to Eigen Core

  • Eigen now uses the c++11 alignas keyword for static alignment. Users targeting C++17 only and recent compilers (e.g., GCC>=7, clang>=5, MSVC>=19.12) will thus be able to completely forget about all issues related to static alignment, including EIGEN_MAKE_ALIGNED_OPERATOR_NEW.
  • Various performance improvements for products and Eigen's GEBP and GEMV kernels have been implemented:
    • By using half- and quarter-packets the performance of matrix multiplications of small to medium sized matrices has been improved
    • Eigen's GEMM now falls back to GEMV if it detects that a matrix is a run-time vector
    • The performance of matrix products using Arm Neon has been drastically improved (up to 20%)
    • Performance of many special cases of matrix products has been improved
  • Large speed up from blocked algorithm for .transposeInPlace.
  • Speed up misc. operations by propagating compile-time sizes (col/row-wise reverse, PartialPivLU, and others)
  • Faster specialized SIMD kernels for small fixed-size inverse, LU decomposition, and determinant.
  • Improved or added vectorization of partial or slice reductions along the outer-dimension, for instance: colmajor_mat.rowwise().mean()
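The idea behind a blocked in-place transpose can be sketched in plain C++. This is a generic illustration of the blocking technique, assuming a square row-major matrix stored in a std::vector; the function name and the default tile size are hypothetical, and this is not Eigen's kernel. Working tile by tile keeps both the rows being read and the rows being written in cache.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: transpose an n x n row-major matrix in place, one tile
// at a time. `block` is the tile edge length and must be positive.
inline void blocked_transpose_in_place(std::vector<double>& a, std::size_t n,
                                       std::size_t block = 32) {
  for (std::size_t ib = 0; ib < n; ib += block) {
    const std::size_t i_end = std::min(ib + block, n);
    // Only visit tiles on or above the diagonal; each swap also fixes the
    // mirrored element below the diagonal.
    for (std::size_t jb = ib; jb < n; jb += block) {
      const std::size_t j_end = std::min(jb + block, n);
      for (std::size_t i = ib; i < i_end; ++i) {
        // Inside a diagonal tile, start at j = i + 1 so every pair is
        // swapped exactly once.
        const std::size_t j_start = (jb == ib) ? i + 1 : jb;
        for (std::size_t j = j_start; j < j_end; ++j) {
          std::swap(a[i * n + j], a[j * n + i]);
        }
      }
    }
  }
}

inline void transpose_example() {
  std::vector<double> a = {0, 1, 2,
                           3, 4, 5,
                           6, 7, 8};
  blocked_transpose_in_place(a, 3, 2);  // tile size 2 forces a partial tile
  assert(a[1] == 3 && a[3] == 1);
  assert(a[2] == 6 && a[6] == 2);
}
```

In Eigen the same operation is simply A.transposeInPlace(); the sketch only shows why tiling helps.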

Elementwise math functions

  • Many functions are now implemented and vectorized in generic (backend-agnostic) form.
  • Many improvements to correctness, accuracy, and compatibility with the C++ standard library.
    • Much improved implementation of ldexp.
    • Misc. fixes for corner cases, NaN/Inf inputs and singular points of many functions.
    • New Payne-Hanek argument reduction algorithm for sin and cos with huge arguments.
    • New faithfully rounded algorithm for pow(x,y).
  • Speedups from (new or improved) vectorized versions of pow, log, sin, cos, arg, log2, complex sqrt, erf, expm1, log1p, logistic, rint, gamma and Bessel functions, and more.
  • Improved special function support (Bessel and gamma functions, ndtri, erfc, inverse hyperbolic functions and more)
  • New elementwise functions for absolute_difference, rint.

Dense matrix decompositions and solvers

  • All dense linear solvers (i.e., Cholesky, *LU, *QR, CompleteOrthogonalDecomposition, *SVD) now inherit from SolverBase and thus support the .transpose(), .adjoint() and .solve() APIs.
  • SVD implementations now have an info() method for checking convergence.
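  #include <Eigen/SVD>
  MatrixXf m = MatrixXf::Random(3,2);
  VectorXf b = VectorXf::Random(3);   // right-hand side, one entry per row of m
  JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
  if (svd.info() == ComputationInfo::Success) {
    // SVD computation was successful; solve m * x = b in the least-squares sense.
    VectorXf x = svd.solve(b);
  }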
  • Most decompositions now fail quickly when invalid inputs are detected.
  • Optimized the product of a HouseholderSequence with the identity, as well as the evaluation of a HouseholderSequence to a dense matrix using faster blocked product.
  • Fixed aliasing issues with in-place small matrix inversions.
  • Fixed several edge-cases with empty or zero inputs.

Sparse matrix support, decompositions and solvers

  • Enabled assignment and addition with diagonal matrix expressions.
  SparseMatrix<float> A(10, 10);
  VectorXf x = VectorXf::Random(10);
  A = x.asDiagonal();
  A += x.asDiagonal();
  • Support added for SuiteSparse KLU routines via the KLUSupport module. SuiteSparse must be installed to use this module.
  #include <Eigen/KLUSupport>
  // Here A is a SparseMatrix<T> and b is a dense vector with the same scalar type T.
  A.makeCompressed();   // Recommendation is to compress input before calling sparse solvers.
  KLU<SparseMatrix<T> > klu(A);
  if (klu.info() == ComputationInfo::Success) {
    Matrix<T, Dynamic, 1> x = klu.solve(b);
  }
  • SparseCholesky now works with row-major matrices.
  • Various bug fixes and performance improvements.

Type support

  • Improved support for half
    • Native support added for ARM __fp16, CUDA/HIP __half, F16C.
    • Better vectorization support added across all backends.
  • Improved bool support
    • Partial vectorization support added for boolean operations.
    • Significantly improved performance (x25) for logical operations with Matrix or Tensor of bool.
  • Improved support for custom types
    • More custom types work out-of-the-box (see #2201).

Improved Geometry Module

  • Behavioral change: Transform::computeRotationScaling() and Transform::computeScalingRotation() are now more continuous across degeneracies (see !349).
  • New partial vectorization support added for Quaternion.
  • Generic vectorized 4x4 matrix inversion.

Backend-specific improvements

  • Arm NEON
    • Now provides vectorization for uint64_t, int64_t, uint32_t, int16_t, uint16_t, int8_t, and uint8_t
    • Emulates bfloat16 support when using Eigen::bfloat16
    • Supports emulated and native float16 when using Eigen::half
  • SSE/AVX/AVX512
    • General performance improvements and bugfixes.
    • Enabled AVX512 instructions by default if available.
    • New std::complex, half, and bfloat16 vectorization support added.
    • Many missing packet functions added.
  • Altivec/Power
    • General performance improvement and bugfixes.
    • Enhanced vectorization of current real and complex scalars.
    • Altivec-specific changes to the gebp_kernel that use the VSX implementation of the MMA instructions, yielding speedups of up to 4x for matrix-matrix products.
    • Dynamic dispatch for GCC versions greater than 10, enabling selection of MMA or VSX instructions based on __builtin_cpu_supports.
  • GPU (CUDA and HIP)
    • Several optimized math functions added, better support for std::complex.
    • Added option to disable CUDA entirely by defining EIGEN_NO_CUDA.
    • Many more functions can now be used in device code (e.g. comparisons, small matrix inversion).
  • ZVector
    • Vectorized float and std::complex<float> support added.
    • Added z14 support.
  • SYCL
    • Redesigned SYCL implementation for use with the Tensor module, which can be enabled by defining EIGEN_USE_SYCL.
    • Introduced a new generic memory model, used by TensorDeviceSycl.
    • Better integration with OpenCL devices.
    • Added many math function specializations.

Miscellaneous API Changes

  • New setConstant(...), setZero(...), setOnes(...) and setRandom(...) overloads that preserve one dimension of a matrix when NoChange is passed in its place.
  MatrixXf A(10, 5);               // 10x5  matrix.
  A.setConstant(NoChange, 10, 2);  // 10x10 matrix of 2s.
  A.setConstant(5, NoChange, 3);   //  5x10 matrix of 3s.
  A.setZero(NoChange, 20);         //  5x20 matrix of 0s.
  A.setZero(20, NoChange);         // 20x20 matrix of 0s.
  A.setOnes(NoChange, 5);          // 20x5  matrix of 1s.
  A.setOnes(5, NoChange);          //  5x5  matrix of 1s.
  A.setRandom(NoChange, 10);       //  5x10 random matrix.
  A.setRandom(10, NoChange);       // 10x10 random matrix.
  • Added setUnit(Index i) for vectors, which sets the i-th coefficient to one and all others to zero.
  VectorXf v(5);
  v.setUnit(3);   // { 0, 0, 0, 1, 0}
  • Added transpose(), adjoint(), conjugate() methods to SelfAdjointView.
  • Added shiftLeft<N>() and shiftRight<N>() coefficient-wise arithmetic shift functions to Arrays.
  ArrayXXi A = ArrayXXi::Random(2, 3);
  ArrayXXi B = A.shiftRight<2>();
  ArrayXXi C = A.shiftLeft<6>();
  • Enabled adding and subtracting of diagonal expressions.
  VectorXf x = VectorXf::Random(5);
  VectorXf y = VectorXf::Random(5);
  MatrixXf A = MatrixXf::Identity(5, 5);
  A += x.asDiagonal() - y.asDiagonal();
  • Allow user-defined default cache sizes via defining EIGEN_DEFAULT_L1_CACHE_SIZE, ..., EIGEN_DEFAULT_L3_CACHE_SIZE.
  • Added EIGEN_ALIGNOF(X) macro for determining alignment of a provided variable.
  • Allow plugins for VectorwiseOp by defining a file EIGEN_VECTORWISEOP_PLUGIN (e.g. -DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h).
  • Allow disabling of IO operations by defining EIGEN_NO_IO.

Improvement to NaN propagation

  • Improvements to NaN correctness for elementwise functions.
  • New NaNPropagation template argument to control whether NaNs are propagated or suppressed in elementwise min/max and corresponding reductions on Array, Matrix, and Tensor. Example for max:
// Elementwise maximum
Eigen::MatrixXf left, right, r0, r1, r2;
r0 = left.cwiseMax(right); // Implementation defined behavior.
// Propagate NaN if either argument is NaN.
r1 = left.template cwiseMax<PropagateNaN>(right);
// Suppress NaN if at least one argument is not a NaN.
r2 = left.template cwiseMax<PropagateNumbers>(right);
 
// Max reductions
Eigen::MatrixXf m;
float nan_or_max = m.maxCoeff(); // Implementation defined behavior.
float nan_if_any_or_max = m.template maxCoeff<PropagateNaN>();
float nan_if_all_or_max = m.template maxCoeff<PropagateNumbers>();
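The two policies are easiest to compare on scalars. The following standalone sketch models the semantics described in the comments above for a two-argument max; the function names are hypothetical and this is not Eigen code, only a scalar model of PropagateNaN and PropagateNumbers.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical scalar model of PropagateNaN: the result is NaN as soon as
// either argument is NaN.
inline float max_propagate_nan(float a, float b) {
  if (std::isnan(a)) return a;
  if (std::isnan(b)) return b;
  return a < b ? b : a;
}

// Hypothetical scalar model of PropagateNumbers: a NaN argument is ignored
// when the other argument is a number. The result is NaN only if both are NaN.
inline float max_propagate_numbers(float a, float b) {
  if (std::isnan(a)) return b;  // still NaN when b is NaN too
  if (std::isnan(b)) return a;
  return a < b ? b : a;
}

inline void nan_policy_example() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  assert(std::isnan(max_propagate_nan(1.0f, nan)));
  assert(max_propagate_numbers(1.0f, nan) == 1.0f);
  assert(std::isnan(max_propagate_numbers(nan, nan)));
}
```

A reduction such as maxCoeff<PropagateNaN>() can be read as folding the first function over all coefficients, and maxCoeff<PropagateNumbers>() as folding the second.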

Changes to unsupported modules

New low-latency non-blocking ThreadPool module

  • Originally a part of the Tensor module, Eigen::ThreadPool is now separate and more portable, and forms the basis for multi-threading in TensorFlow, for example. Example:
  #include <unsupported/Eigen/CXX11/ThreadPool>
 
  const int num_threads = 42;
  Eigen::ThreadPool tp(num_threads);
  auto do_stuff = []() { ... };
  tp.Schedule(do_stuff);

Changes to Tensor module

  • Support for c++03 was officially dropped in the Tensor module, since most of the code was written in c++11 anyway. This will prevent building the code for CUDA with older versions of nvcc.
  • Performance optimizations of Tensor contraction
    • Speed up "outer-product-like" operations by parallelizing over the contraction dimension, using thread_local buffers and recursive work splitting.
    • Improved threading heuristics.
    • Support for fusing element-wise operations into contraction during evaluation. Example:
// This example applies std::sqrt to all output elements from a tensor contraction. 
// The optional OutputKernel argument to the contraction in this example is a functor over a 
// 2-dimensional buffer. The functor is called once for each output block of the contraction 
// result, to perform the elementwise sqrt operation while the block is hot in cache.
struct SqrtOutputKernel {
  template <typename Index, typename Scalar>
  EIGEN_ALWAYS_INLINE void operator()(
      const internal::blas_data_mapper<Scalar, Index, ColMajor>& output_mapper,
      const TensorContractionParams&, Index, Index, Index num_rows,
      Index num_cols) const {
    for (Index i = 0; i < num_rows; ++i) {
      for (Index j = 0; j < num_cols; ++j) {
        output_mapper(i, j) = std::sqrt(output_mapper(i, j));
      }
    }
  }
};
 
Tensor<float, 4, DataLayout> left(30, 50, 8, 31);
Tensor<float, 5, DataLayout> right(8, 31, 7, 20, 10);
Tensor<float, 5, DataLayout> result(30, 50, 7, 20, 10);
Eigen::array<DimPair, 2> dims({{DimPair(2, 0), DimPair(3, 1)}});
 
result = left.contract(right, dims, SqrtOutputKernel());
  • Performance optimizations of other Tensor operators
    • Speedups from improved vectorization, block evaluation, and multi-threading for most operators.
    • Significant speedup to broadcasting.
    • Reduction of index computation overhead, e.g. using fast divisors in TensorGenerator, squeezing dimensions in TensorPadding.
  • A complete rewrite of the block (tiling) evaluation framework for tensor expressions led to significant speedups and fewer memory allocations.
  • Added new API for asynchronous evaluation of tensor expressions. Example:
  Tensor<float, 3> in1(200, 30, 70);
  Tensor<float, 3> in2(200, 30, 70);
  Tensor<float, 3> out(200, 30, 70);
 
  Eigen::ThreadPool tp(internal::random<int>(3, 11));
  Eigen::ThreadPoolDevice thread_pool_device(&tp, internal::random<int>(3, 11));
 
  Eigen::Barrier b(1);
  auto done = [&b]() { b.Notify(); };
  out.device(thread_pool_device, std::move(done)) = in1 + in2 * 3.14f;
  b.Wait();
  • Misc. minor behavior changes & fixes:
    • Fix const correctness for TensorMap.
    • Modify tensor argmin/argmax to always return first occurrence.
    • More numerically stable tree reduction.
    • Improve randomness of the tensor random generator.
    • Update the padding computation for PADDING_SAME to be consistent with TensorFlow.
    • Support static dimensions (aka IndexList) in resizing/reshape/broadcast.
    • Improved accuracy of Tensor FFT.
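For reference, the SAME padding convention documented by TensorFlow can be written down in a few lines. The sketch below is a standalone illustration for one spatial dimension. The struct and function names are hypothetical, and that Eigen's Tensor code performs exactly this arithmetic is an assumption here, not something these notes state.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical result type: output size and padding on each side of one dimension.
struct SamePadding {
  int out_size;
  int pad_before;
  int pad_after;
};

// SAME padding as documented by TensorFlow: the output has ceil(in / stride)
// elements, and any odd padding element goes after the data.
inline SamePadding same_padding(int in_size, int kernel, int stride) {
  const int out_size = (in_size + stride - 1) / stride;  // ceil(in / stride)
  const int total =
      std::max((out_size - 1) * stride + kernel - in_size, 0);
  const int before = total / 2;
  return SamePadding{out_size, before, total - before};
}

inline void same_padding_example() {
  // 5 inputs, kernel 3, stride 2: 3 outputs, one padding element per side.
  const SamePadding p = same_padding(5, 3, 2);
  assert(p.out_size == 3 && p.pad_before == 1 && p.pad_after == 1);
}
```

When the total padding is odd, for example 4 inputs with kernel 3 and stride 2, the extra element lands in pad_after.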

Improvements to FFT module

  • Faster and more accurate twiddle factor computation.

Improvements to EulerAngles

  • EulerAngles can now be directly constructed from 3D vectors
  • EulerAngles now provide isApprox() and cast() functions

Changes to sparse iterative solvers

  • Added new IDRS iterative linear solver.
  #include <unsupported/Eigen/IterativeSolvers>
  A.makeCompressed();   // Recommendation is to compress input before calling sparse solvers.
  IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);
  if (idrs.info() == ComputationInfo::Success) {
    VectorXf x = idrs.solve(b);
  }

Improvements to Polynomials

  • PolynomialSolver can now be used with complex numbers
  • The solver will automatically choose between EigenSolver and ComplexEigenSolver depending on the scalar type used

Other relevant changes

  • Eigen now provides an option to test with an external BLAS library
  • Eigen can now be used with the PGI Compiler
  • Printing when using GDB has been improved
  • Eigen can now detect if a platform supports int128 intrinsics

Testing

The full Eigen test suite was built and run successfully (in c++03 and c++11 mode) with the following compiler/platform/OS combinations:

Compiler                          | Version         | Platform  | Operating system
Microsoft Visual Studio           | 2015 Update 3   | x86-64    | Windows
Microsoft Visual Studio Community | 2017 - 15.9.38  | x86-64    | Windows
Microsoft Visual Studio Community | 2019 - 16.11    | x86-64    | Windows
GCC                               | 4.8             | x86-64    | Linux
GCC                               | 9               | x86-64    | Linux
GCC                               | 10              | x86-64    | Linux
Clang                             | 6.0             | x86-64    | Linux
Clang                             | 10              | x86-64    | Linux
Clang                             | 11              | x86-64    | Linux
GCC                               | 10              | armv8.2-a | Linux
Clang                             | 6               | armv8.2-a | Linux
Clang                             | 9               | armv8.2-a | Linux
Clang                             | 10              | armv8.2-a | Linux
Clang                             | 11              | armv8.2-a | Linux
AppleClang                        | 12.0.5          | x86-64    | macOS
GCC                               | 10              | ppc64le   | Linux
Clang                             | 10              | ppc64le   | Linux

List of issues fixed in Eigen 3.4

Issue #2298 List of dense linear decompositions lacks completeorthogonal decomposition
Issue #2284 JacobiSVD Outputs Invalid U (Reads Past End of Array)
Issue #2267 [3.4 bug] FixedInt<0> error with gcc 4.9.3
Issue #2263 usage of signed zeros leads to wrong results with -ffast-math
Issue #2251 Method unaryExpr() does not support function pointers in Eigen 3.4rc1
Issue #2242 No matching function for call to "..." in 'Complex.h' and 'GenericPacketMathFunctions.h'
Issue #2229 Copies (& potentially moves?) of Eigen object with large unused MaxRows/ColAtCompileTime are slow (Regression from Eigen 3.2)
Issue #2213 template maxCoeff<PropagateNaN> compilation error with Eigen 3.4.
Issue #2209 unaryExpr deduces wrong return type on MSVC
Issue #2157 forward_adolc test fails since PR !363
Issue #2119 Move assignment swaps even for non-dynamic storage
Issue #2112 Build failure with boost::multiprecision type
Issue #2093 Incorrect evaluation of Ref
Issue #1906 Eigen failed with error C2440 with MSVC on windows
Issue #1850 error C4996: 'std::result_of<T>': warning STL4014: std::result_of and std::result_of_t are deprecated in C++17. They are superseded by std::invoke_result and std::invoke_result_t
Issue #1833 c++20 compilation failure
Issue #1826 -Wdeprecated-anon-enum-enum-conversion warnings (c++20)
Issue #1815 IndexedView of a vector should allow linear access
Issue #1805 Uploaded doxygen documentation does not build LaTeX formulae
Issue #1790 packetmath_1 unit test fails
Issue #1788 Rule-of-three/rule-of-five violations
Issue #1776 subvector_stl_iterator::operator-> triggers 'taking address of rvalue' warning
Issue #1774 std::cbegin() returns non-const iterator
Issue #1752 A change to the C++ Standard will break some tests
Issue #1741 Map<>.noalias()=A*B gives wrong result
Issue #1736 Column access of some IndexedView won't compile
Issue #1718 Use of builtin vec_sel is ambiguous when compiling with Clang for PowerPC
Issue #1695 Stuck in loop for a certain input when using mpreal support
Issue #1692 pass enumeration argument to constructor of VectorXd
Issue #1684 array_reverse fails with clang >=6 + AVX + -O2
Issue #1674 SIMD sin/cos gives wrong results with -ffast-math
Issue #1669 Zero-sized matrices generate assertion failures
Issue #1664 dot product with single column block fails with new static checks
Issue #1652 Corner cases in SIMD sin/cos
Issue #1643 Compilation failure
Issue #1637 Register spilling with recent gcc & clang
Issue #1619 const_iterator vs iterator compilation error
Issue #1615 Performance of (aliased) matrix multiplication with fixed size 3x3 matrices slow
Issue #1611 NEON: plog(+/-0) should return -inf and not NaN
Issue #1585 Matrix product is repeatedly evaluated when iterating over the product expression
Issue #1557 Fail to compute eigenvalues for a simple 3x3 companion matrix for root finding
Issue #1544 SparseQR generates incorrect Q matrix in complex case
Issue #1543 "Fix linear indexing in generic block evaluation" breaks Matrix*Diagonal*Vector product
Issue #1493 dense Q extraction and solve is sometimes erroneous for complex matrices
Issue #1453 Strange behavior for Matrix::Map, if only InnerStride is provided
Issue #1409 Add support for C++17 operator new alignment
Issue #1340 Add operator + to sparse matrix iterator
Issue #1318 More robust quaternion from matrix
Issue #1306 Add support for AVX512 to Eigen
Issue #1305 Implementation of additional component-wise unary functions
Issue #1221 I get tons of error since my distribution upgraded to GCC 6.1.1
Issue #1195 vectorization_logic fails: Matrix3().cwiseQuotient(Matrix3()) expected CompleteUnrolling, got NoUnrolling
Issue #1194 Improve det4x4
Issue #1049 std::make_shared fails to fulfill structure aliment
Issue #1046 fixed matrix types do not report correct alignment requirements
Issue #1014 Eigenvalues 3x3 matrix
Issue #1001 infer dimensions of Dynamic-sized temporaries from the entire expression (if possible)
Issue #977 Add stable versions of normalize() and normalized()
Issue #899 SparseQR occasionally fails for under-determined systems
Issue #864 C++11 alias templates for commonly used types
Issue #751 Make AMD Ordering numerically more robust
Issue #747 Allow for negative stride
Issue #720 Gaussian NullaryExpr
Issue #663 Permit NoChange in setZero, setOnes, setConstant, setRandom
Issue #645 GeneralizedEigenSolver: missing computation of eigenvectors
Issue #632 Optimize addition/subtraction of sparse and dense matrices/vectors
Issue #631 (Optionally) throw an exception when using an unsuccessful decomposition
Issue #564 maxCoeff() returns -nan instead of max, while maxCoeff(&maxRow, &maxCol) works
Issue #556 Matrix multiplication crashes using mingw 4.7
Issue #505 Assert if temporary objects that are still referred to get destructed (was: Misbehaving Product on C++11)
Issue #445 ParametrizedLine should have transform method
Issue #437 [feature request] Add Reshape Operation
Issue #426 Behavior of sum() for Matrix<bool> is unexpected and confusing
Issue #329 Feature request: Ability to get a "view" into a sub-matrix by indexing it with a vector or matrix of indices
Issue #231 STL compatible iterators
Issue #96 Clean internal::result_of
Issue #65 Core - optimize partial reductions
Issue #64 Tests : precision-oriented tests

Additional information

  • A curated list of commits, approximately organized by the same topics as the release notes above, and sorted in reverse chronological order can be found here.