Difference between revisions of "3.4"
From Eigen
−  +  Eigen 3.4 was released on August 18, 2021. It can be downloaded from the Download section on the
+  [https://eigen.tuxfamily.org/index.php?title=Main_Page Main Page] or from [https://gitlab.com/libeigen/eigen/-/releases/3.4.0 GitLab].
−  +  '''Notice:''' 3.4.x will be the last major release series of Eigen to support c++03. The master branch will drop c++03 support after this release.
−  *  +  == Changes to supported modules == 
+  
+  === Changes that might break existing code ===  
+  
+  * Using float or double for indexing matrices, vectors and arrays will now fail to compile, e.g.:
<source lang="cpp">  <source lang="cpp">  
−  MatrixXd A  +  MatrixXd A(10,10); 
−  +  float one = 1;  
−  +  double a11 = A(one,1.); // compilation error here  
</source>  </source>  
−  +  === New Major Features in Core ===  
−  *  +  * Add c++11 '''initializer_list constructors''' to Matrix and Array [http://eigen.tuxfamily.org/dox-devel/group__TutorialMatrixClass.html#title3 [doc]]:
<source lang="cpp">  <source lang="cpp">  
−  +  MatrixXi a { // construct a 2x3 matrix  
−  +  {1,2,3}, // first row  
−  +  {4,5,6} // second row  
−  +  };  
−  +  VectorXd v{{1, 2, 3, 4, 5}}; // construct a dynamic-size vector with 5 elements
−  +  Array<int,1,5> b{1, 2, 3, 4, 5}; // initialize a fixed-size 1D array of size 5
</source>  </source>  
Line 43:  Line 40:  
</source>  </source>  
−  *  +  * New versatile API for submatrices, '''slices''', and '''indexed views''' [http://eigen.tuxfamily.org/dox-devel/group__TutorialSlicingIndexing.html [doc]]. It extends <code>A(.,.)</code> to accept anything that looks like a sequence of indices with random access. To make it usable, this new feature comes with new symbols: <code>Eigen::indexing::all</code>, <code>Eigen::indexing::last</code>, and functions generating arithmetic sequences: <code>Eigen::seq(first,last[,incr])</code>, <code>Eigen::seqN(first,size[,incr])</code>, <code>Eigen::lastN(size[,incr])</code>. Here is an example picking the even rows except the first and last ones, and a subset of indexed columns:
+  <source lang="cpp">  
+  MatrixXd A = ...;  
+  std::vector<int> col_ind{7,3,4,3};  
+  MatrixXd B = A(seq(2,last-2,fix<2>), col_ind);
+  </source>  
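To make the index arithmetic concrete, here is a minimal plain C++ sketch (no Eigen required) of the indices that the two sequence generators are described as enumerating. The helper names <code>seq_indices</code> and <code>seqN_indices</code> are hypothetical and exist only for this illustration; they are not part of Eigen. The symbolic <code>last</code> is replaced by an explicit <code>rows - 1</code>, and a positive increment is assumed.

```cpp
#include <vector>

// Hypothetical helper (not an Eigen function): the indices
// first, first+incr, ... that do not exceed last, mirroring the
// description of seq(first, last, incr). Assumes incr > 0.
std::vector<int> seq_indices(int first, int last, int incr = 1) {
  std::vector<int> out;
  for (int i = first; i <= last; i += incr) out.push_back(i);
  return out;
}

// Hypothetical helper (not an Eigen function): exactly `size` indices
// starting at first with step incr, mirroring seqN(first, size, incr).
std::vector<int> seqN_indices(int first, int size, int incr = 1) {
  std::vector<int> out;
  for (int k = 0; k < size; ++k) out.push_back(first + k * incr);
  return out;
}
```

For a matrix with 11 rows, <code>last</code> is 10, so <code>seq_indices(2, 10 - 2, 2)</code> yields 2, 4, 6, 8: the even rows without row 0 and row 10.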
−  +  * Add C++11 '''template aliases''' for Matrix, Vector, and Array of common sizes, including generic <code>Vector<Type,Size></code> and <code>RowVector<Type,Size></code> aliases [http://eigen.tuxfamily.org/dox-devel/group__matrixtypedefs.html [doc]].
−  +  <source lang="cpp">  
−  +  MatrixX<double> M; // Instead of MatrixXd or Matrix<double, Dynamic, Dynamic>
−  +  Vector4<MyType> V; // Instead of Vector<MyType, 4>
−  +  </source>  
−  +  
−  ==  +  * New support for <code>bfloat16</code>. The 16-bit [https://en.wikipedia.org/wiki/Bfloat16_floating-point_format Brain floating point format] is now available as <code>Eigen::bfloat16</code>. The constructor must be called explicitly, but it can otherwise be used like any other scalar type. To convert to and from <code>uint16_t</code>, for example to extract the bit representation, use <code>Eigen::numext::bit_cast</code>.
+  <source lang="cpp">  
+  bfloat16 s(0.25); // explicit construction  
+  uint16_t s_bits = numext::bit_cast<uint16_t>(s); // bit representation  
+  
+  using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;  
+  MatrixBf16 X = s * MatrixBf16::Random(3, 3);  
+  </source>  
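As background for the bit representation mentioned above: bfloat16 keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE-754 32-bit float, so its bit pattern is the upper half of the float's pattern. The following plain C++ sketch (no Eigen required) shows that relationship. The function names are hypothetical, and the sketch truncates the low bits; this is an assumption made for brevity and is not a description of how Eigen converts, since library conversions typically round rather than truncate.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: keep the upper 16 bits of the binary32 pattern.
// Assumption: truncation only; real conversions usually round.
std::uint16_t float_to_bf16_bits_truncate(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);  // well-defined way to read the bits
  return static_cast<std::uint16_t>(u >> 16);
}

// Hypothetical helper: widen 16 bfloat16 bits back to a float by
// zero-filling the 16 low mantissa bits.
float bf16_bits_to_float(std::uint16_t bits) {
  const std::uint32_t u = static_cast<std::uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}
```

Values such as 0.25 or 1.0 have no low mantissa bits set, so truncation and rounding agree for them and the round trip is exact.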
−  *  +  === New backends === 
−  *  +  
+  * '''Arm SVE:''' Eigen now supports Arm's [https://developer.arm.com/documentation/101726/0300/LearnabouttheScalableVectorExtensionSVE/WhatistheScalableVectorExtension Scalable Vector Extension (SVE)]. Currently only fixed-length SVE vectors for <code>uint32_t</code> and <code>float</code> are available.
+  * '''MIPS MSA:''' Eigen now supports the [https://www.mips.com/products/architectures/ase/simd/ MIPS SIMD Architecture (MSA)].
+  * '''AMD ROCm/HIP:''' Eigen now contains a generic GPU backend that unifies support for [https://developer.nvidia.com/cuda-toolkit NVIDIA/CUDA] and [https://rocmdocs.amd.com/en/latest/ AMD/HIP].
+  * '''Power 10 MMA Backend:''' Eigen now has initial support for [https://arxiv.org/pdf/2104.03142.pdf Power 10 matrix multiplication assist instructions] for float32 and float64, real and complex.  
+  
+  === Improvements to Eigen Core ===  
+  * Eigen now uses the c++11 '''alignas''' keyword for static alignment. Users targeting C++17 only and recent compilers (e.g., GCC>=7, clang>=5, MSVC>=19.12) will thus be able to completely forget about all [http://eigen.tuxfamily.org/dox-devel/group__TopicUnalignedArrayAssert.html issues] related to static alignment, including <code>EIGEN_MAKE_ALIGNED_OPERATOR_NEW</code>.
+  * Various performance improvements for products and Eigen's GEBP and GEMV kernels have been implemented:  
+  ** By using half- and quarter-packets the performance of matrix multiplications of small to medium sized matrices has been improved
+  ** Eigen's GEMM now falls back to GEMV if it detects that a matrix is a runtime vector  
+  ** The performance of matrix products using Arm Neon has been drastically improved (up to 20%)  
+  ** Performance of many special cases of matrix products has been improved  
+  * Large speed up from blocked algorithm for <code>.transposeInPlace</code>.  
+  * Speed up misc. operations by propagating compile-time sizes (col/rowwise reverse, PartialPivLU, and others)
+  * Faster specialized SIMD kernels for small fixed-size inverse, LU decomposition, and determinant.
+  * Improved or added vectorization of partial or slice reductions along the outer-dimension, for instance: <code>colmajor_mat.rowwise().mean()</code>
+  
+  === Elementwise math functions ===  
+  * Many functions are now implemented and vectorized in generic (backend-agnostic) form.
+  * Many improvements to correctness, accuracy, and compatibility with the c++ standard library.
+  ** Much improved implementation of <code>ldexp</code>.
+  ** Misc. fixes for corner cases, NaN/Inf inputs and singular points of many functions.
+  ** New implementation of the Payne-Hanek argument reduction algorithm for <code>sin</code> and <code>cos</code> with huge arguments.
+  ** New faithfully rounded algorithm for <code>pow(x,y)</code>.
+  * Speedups from (new or improved) vectorized versions of <code>pow, log, sin, cos, arg, log2</code>, complex <code>sqrt, erf, expm1, log1p, logistic, rint, gamma</code> and <code>bessel</code> functions, and more.
+  * Improved special function support (Bessel and gamma functions, <code>ndtri, erfc</code>, inverse hyperbolic functions and more)  
+  * New elementwise functions for <code>absolute_difference</code>, <code>rint</code>.  
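One reason a dedicated absolute-difference operation is handy (this motivation is an assumption, not something the release notes state) is that the obvious <code>abs(a - b)</code> wraps around for unsigned types when <code>b > a</code>. The plain C++ sketch below shows a comparison-based formulation that avoids the wrap. The name <code>abs_diff</code> is hypothetical and is not Eigen's functor.

```cpp
// Hypothetical helper (not Eigen's implementation): |a - b| computed by
// always subtracting the smaller value from the larger one, so no
// negative intermediate is formed and unsigned types do not wrap.
template <typename T>
T abs_diff(T a, T b) {
  return a > b ? a - b : b - a;
}
```

With 32-bit unsigned operands, <code>3u - 5u</code> evaluates to 4294967294, whereas <code>abs_diff(3u, 5u)</code> gives 2.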
+  
+  === Dense matrix decompositions and solvers ===  
+  * All dense linear solvers (i.e., Cholesky, *LU, *QR, CompleteOrthogonalDecomposition, *SVD) now inherit SolverBase and thus support <code>.transpose()</code>, <code>.adjoint()</code> and <code>.solve()</code> APIs.  
+  * SVD implementations now have an <code>info()</code> method for checking convergence.  
+  <source lang="cpp">  
+  #include <Eigen/SVD>
+  MatrixXf m = MatrixXf::Random(3,2);
+  VectorXf b = VectorXf::Random(3); // right-hand side to solve for
+  JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
+  if (svd.info() == ComputationInfo::Success) {
+    // SVD computation was successful.
+    VectorXf x = svd.solve(b);
+  }
</source>  </source>  
−  *  +  * Most decompositions now fail quickly when invalid inputs are detected. 
−  *  +  * Optimized the product of a <code>HouseholderSequence</code> with the identity, as well as the evaluation of a <code>HouseholderSequence</code> to a dense matrix using a faster blocked product.
−  *  +  * Fixed aliasing issues with in-place small matrix inversions.
−  *  +  * Fixed several edge cases with empty or zero inputs.
−  * Speed  +  
−  *  +  === Sparse matrix support, decompositions and solvers === 
−  *  +  * Enabled assignment and addition with diagonal matrix expressions. 
+  <source lang="cpp">  
+  SparseMatrix<float> A(10, 10);  
+  VectorXf x = VectorXf::Random(10);  
+  A = x.asDiagonal();  
+  A += x.asDiagonal();  
+  </source>  
+  * Support added for SuiteSparse KLU routines via the <code>KLUSupport</code> module. SuiteSparse must be installed to use this module.  
+  <source lang="cpp">  
+  #include <Eigen/KLUSupport>  
+  A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers.  
+  KLU<SparseMatrix<T> > klu(A);  
+  if (klu.info() == ComputationInfo::Success) {  
+  VectorXf x = klu.solve(b);  
+  }  
+  </source>  
+  * <code>SparseCholesky</code> now works with row-major matrices.
+  * Various bug fixes and performance improvements.  
+  
+  === Type support ===  
+  * Improved support for <code>half</code>  
+  ** Native support added for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, and <code>F16C</code> conversion intrinsics.  
+  ** Better vectorization support added across all backends.  
+  * Improved bool support  
+  ** Partial vectorization support added for boolean operations.  
+  ** Significantly improved performance (x25) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>.  
+  * Improved support for custom types  
+  ** More custom types work out of the box (see [https://gitlab.com/libeigen/eigen/-/issues/2201 #2201]).
+  
+  === Improved Geometry Module ===  
+  * '''Behavioral change:''' <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see [https://gitlab.com/libeigen/eigen/-/merge_requests/349 !349]).
+  * New partial vectorization support added for <code>Quaternion</code>.  
+  * Generic vectorized 4x4 matrix inversion.  
+  
+  === Backend-specific improvements ===
+  * '''Arm NEON'''
+  ** Now provides vectorization for <code>uint64_t</code>, <code>int64_t</code>, <code>uint32_t</code>, <code>int16_t</code>, <code>uint16_t</code>, <code>int8_t</code>, and <code>uint8_t</code>.
+  ** Emulates <code>bfloat16</code> support when using <code>Eigen::bfloat16</code>  
+  ** Supports emulated and native <code>float16</code> when using <code>Eigen::half</code>  
+  * '''SSE/AVX/AVX512'''  
+  ** General performance improvements and bugfixes.  
+  ** Enabled AVX512 instructions by default if available.  
+  ** New <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code> vectorization support added.  
+  ** Many missing packet functions added.  
+  * '''Altivec/Power'''  
+  ** General performance improvement and bugfixes.  
+  ** Enhanced vectorization of real and complex scalars.  
+  ** Changes to the <code>gebp_kernel</code> specific to Altivec, using a VSX implementation of the MMA instructions, that yield speedups of up to 4x for matrix-matrix products.
+  ** Dynamic dispatch for GCC greater than 10 enabling selection of MMA or VSX instructions based on <code>__builtin_cpu_supports</code>.  
+  * '''GPU (CUDA and HIP)'''  
+  ** Several optimized math functions added, better support for <code>std::complex</code>.  
+  ** Added option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>.  
+  ** Many more functions can now be used in device code (e.g. comparisons, small matrix inversion).  
+  * '''ZVector'''  
+  ** Vectorized <code>float</code> and <code>std::complex<float></code> support added.  
+  ** Added z14 support.  
+  * '''SYCL'''  
+  ** Redesigned SYCL implementation for use with the [https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html Tensor] module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>.  
+  ** New generic memory model introduced, used by <code>TensorDeviceSycl</code>.
+  ** Better integration with OpenCL devices.  
+  ** Added many math function specializations.  
+  
+  === Miscellaneous API Changes ===  
+  * New <code>setConstant(...)</code> methods for preserving one dimension of a matrix by passing in <code>NoChange</code>.  
+  <source lang="cpp">  
+  MatrixXf A(10, 5); // 10x5 matrix.  
+  A.setConstant(NoChange, 10, 2); // 10x10 matrix of 2s.  
+  A.setConstant(5, NoChange, 3); // 5x10 matrix of 3s.  
+  A.setZero(NoChange, 20); // 5x20 matrix of 0s.  
+  A.setZero(20, NoChange); // 20x20 matrix of 0s.  
+  A.setOnes(NoChange, 5); // 20x5 matrix of 1s.  
+  A.setOnes(5, NoChange); // 5x5 matrix of 1s.  
+  A.setRandom(NoChange, 10); // 5x10 random matrix.  
+  A.setRandom(10, NoChange); // 10x10 random matrix.  
+  </source>  
+  * Added <code>setUnit(Index i)</code> for vectors that sets the ''i''-th coefficient to one and all others to zero.
+  <source lang="cpp">  
+  VectorXf v(5);  
+  v.setUnit(3); // { 0, 0, 0, 1, 0}  
+  </source>  
+  * Added <code>transpose()</code>, <code>adjoint()</code>, <code>conjugate()</code> methods to <code>SelfAdjointView</code>.  
+  * Added <code>shiftLeft<N>()</code> and <code>shiftRight<N>()</code> coefficient-wise arithmetic shift functions to Arrays.
+  <source lang="cpp">  
+  ArrayXXi A = ArrayXXi::Random(2, 3);  
+  ArrayXXi B = A.shiftRight<2>();  
+  ArrayXXi C = A.shiftLeft<6>();  
+  </source>  
+  * Enabled adding and subtracting of diagonal expressions.  
+  <source lang="cpp">  
+  VectorXf x = VectorXf::Random(5);  
+  VectorXf y = VectorXf::Random(5);  
+  MatrixXf A = MatrixXf::Identity(5, 5);  
+  A += x.asDiagonal() - y.asDiagonal();
+  </source>  
+  * Allow user-defined default cache sizes via defining <code>EIGEN_DEFAULT_L1_CACHE_SIZE</code>, ..., <code>EIGEN_DEFAULT_L3_CACHE_SIZE</code>.
+  * Added <code>EIGEN_ALIGNOF(X)</code> macro for determining alignment of a provided variable.
+  * Allow plugins for <code>VectorwiseOp</code> by defining a file <code>EIGEN_VECTORWISEOP_PLUGIN</code> (e.g. <code>-DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h</code>).
+  * Allow disabling of IO operations by defining <code>EIGEN_NO_IO</code>.  
+  
+  === Improvement to NaN propagation ===  
+  
+  * Improvements to NaN correctness for elementwise functions.  
+  * New <code>NaNPropagation</code> template argument to control whether NaNs are propagated or suppressed in elementwise <code>min/max</code> and corresponding reductions on <code>Array</code>, <code>Matrix</code>, and <code>Tensor</code>. Example for max:  
+  <source lang="cpp">  
+  // Elementwise maximum  
+  Eigen::MatrixXf left, right, r0, r1, r2;  
+  r0 = left.cwiseMax(right); // Implementation defined behavior.  
+  // Propagate NaN if either argument is NaN.  
+  r1 = left.template cwiseMax<PropagateNaN>(right);  
+  // Suppress NaN if at least one argument is not a NaN.  
+  r2 = left.template cwiseMax<PropagateNumbers>(right);  
+  
+  // Max reductions  
+  Eigen::MatrixXf m;  
+  float nan_or_max = m.maxCoeff(); // Implementation defined behavior.  
+  float nan_if_any_or_max = m.template maxCoeff<PropagateNaN>();  
+  float nan_if_all_or_max = m.template maxCoeff<PropagateNumbers>();  
+  </source>  
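The two policies can be spelled out for a single pair of scalars. The plain C++ sketch below (no Eigen required) only illustrates the semantics described in the comments of the example above; the function names are hypothetical and this is not Eigen's implementation.

```cpp
#include <cmath>

// PropagateNaN-style maximum: the result is NaN if either input is NaN,
// otherwise the larger value.
float max_propagate_nan(float a, float b) {
  if (std::isnan(a)) return a;
  if (std::isnan(b)) return b;
  return a > b ? a : b;
}

// PropagateNumbers-style maximum: a NaN input is ignored when the other
// input is a number; the result is NaN only if both inputs are NaN.
float max_propagate_numbers(float a, float b) {
  if (std::isnan(a)) return b;
  if (std::isnan(b)) return a;
  return a > b ? a : b;
}
```

Applied as a reduction, the first policy yields NaN as soon as any coefficient is NaN, while the second yields NaN only when every coefficient is NaN.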
+  
+  == Changes to unsupported modules ==  
+  === New low-latency non-blocking ThreadPool module ===
+  * Originally a part of the Tensor module, <code>Eigen::ThreadPool</code> is now separate and more portable, and forms the basis for multithreading in TensorFlow, for example. Example:  
+  <source lang="cpp">  
+  #include <unsupported/Eigen/CXX11/ThreadPool>
+  
+  const int num_threads = 42;  
+  Eigen::ThreadPool tp(num_threads);  
+  auto do_stuff = []() { ... };  
+  tp.Schedule(do_stuff);  
+  </source>  
+  
+  === Changes to Tensor module ===  
+  * Support for c++03 was officially dropped in the Tensor module, since most of the code was written in c++11 anyway. This will prevent building the code for CUDA with older versions of <code>nvcc</code>.
+  * Performance optimizations of Tensor contraction
+  ** Speed up "outer-product-like" operations by parallelizing over the contraction dimension, using thread_local buffers and recursive work splitting.
+  ** Improved threading heuristics.  
+  ** Support for fusing elementwise operations into contraction during evaluation. Example:  
+  <source lang="cpp">  
+  // This example applies std::sqrt to all output elements from a tensor contraction.  
+  // The optional OutputKernel argument to the contraction in this example is a functor over a  
+  // 2-dimensional buffer. The functor is called once for each output block of the contraction
+  // result, to perform the elementwise sqrt operation while the block is hot in cache.  
+  struct SqrtOutputKernel {  
+  template <typename Index, typename Scalar>  
+  EIGEN_ALWAYS_INLINE void operator()(  
+  const internal::blas_data_mapper<Scalar, Index, ColMajor>& output_mapper,  
+  const TensorContractionParams&, Index, Index, Index num_rows,  
+  Index num_cols) const {  
+  for (int i = 0; i < num_rows; ++i) {  
+  for (int j = 0; j < num_cols; ++j) {  
+  output_mapper(i, j) = std::sqrt(output_mapper(i, j));  
+  }  
+  }  
+  }  
+  };  
+  
+  Tensor<float, 4, DataLayout> left(30, 50, 8, 31);  
+  Tensor<float, 5, DataLayout> right(8, 31, 7, 20, 10);  
+  Tensor<float, 5, DataLayout> result(30, 50, 7, 20, 10);  
+  Eigen::array<DimPair, 2> dims({{DimPair(2, 0), DimPair(3, 1)}});  
+  
+  result = left.contract(right, dims, SqrtOutputKernel());  
+  </source>  
+  
+  * Performance optimizations of other Tensor operators
+  ** Speedups from improved vectorization, block evaluation, and multithreading for most operators.  
+  ** Significant speedup to broadcasting.  
+  ** Reduction of index computation overhead, e.g. using fast divisors in TensorGenerator, squeezing dimensions in TensorPadding.  
+  * Complete rewrite of the block (tiling) evaluation framework for tensor expressions led to significant speedups and a reduced number of memory allocations.
+  * Added new API for asynchronous evaluation of tensor expressions. Example:  
+  <source lang="cpp">  
+  Tensor<float, 3> in1(200, 30, 70);  
+  Tensor<float, 3> in2(200, 30, 70);  
+  Tensor<float, 3> out(200, 30, 70);  
+  
+  Eigen::ThreadPool tp(internal::random<int>(3, 11));  
+  Eigen::ThreadPoolDevice thread_pool_device(&tp, internal::random<int>(3, 11));  
+  
+  Eigen::Barrier b(1);  
+  auto done = [&b]() { b.Notify(); };  
+  out.device(thread_pool_device, std::move(done)) = in1 + in2 * 3.14f;  
+  b.Wait();  
+  </source>  
+  * Misc. minor behavior changes & fixes:  
+  ** Fix const correctness for TensorMap.  
+  ** Modify tensor argmin/argmax to always return first occurrence.  
+  ** More numerically stable tree reduction.  
+  ** Improve randomness of the tensor random generator.  
+  ** Update the padding computation for PADDING_SAME to be consistent with TensorFlow.  
+  ** Support static dimensions (aka IndexList) in resizing/reshape/broadcast.  
+  ** Improved accuracy of Tensor FFT.  
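To illustrate why a tree-shaped reduction tends to be more numerically stable than a left-to-right loop, here is a minimal plain C++ sketch of pairwise summation. It shows the general technique only; the function names are hypothetical and this is not the Tensor module's implementation, whose details are not given in these notes.

```cpp
#include <cstddef>
#include <vector>

// Left-to-right accumulation: every element is added to one running
// total, so rounding error can grow with the number of elements.
float naive_sum(const std::vector<float>& v) {
  float s = 0.0f;
  for (float x : v) s += x;
  return s;
}

// Pairwise (tree) summation over the half-open range [begin, end):
// split the range in half, sum each half recursively, then add the two
// partial sums. Partial sums of similar magnitude are combined, and the
// recursion depth is only about log2(n).
float tree_sum(const float* begin, const float* end) {
  const std::ptrdiff_t n = end - begin;
  if (n == 0) return 0.0f;
  if (n == 1) return *begin;
  const float* mid = begin + n / 2;
  return tree_sum(begin, mid) + tree_sum(mid, end);
}
```

For 2^20 copies of <code>0.1f</code>, each addition in the tree combines two equal partial sums, which is exact in binary floating point, so <code>tree_sum</code> returns exactly 2^20 times <code>0.1f</code>; the left-to-right loop has to round whenever the small addend does not fit the precision of the growing total.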
+  
+  === Improvements to FFT module ===  
+  
+  * Faster and more accurate twiddle factor computation.  
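The note above does not describe the new method, so the following plain C++ sketch only shows what twiddle factors are and one standard way to compute them without accumulating error: each factor is evaluated directly from its own angle instead of by repeated multiplication. The function name and the forward-transform sign convention exp(-2*pi*i*k/n) are assumptions made for this illustration; this is not Eigen's algorithm.

```cpp
#include <cmath>
#include <complex>
#include <vector>

// Hypothetical helper: w[k] = exp(-2*pi*i*k/n) for k = 0..n-1.
// Each factor is computed from its own angle, so rounding errors do not
// carry over from one k to the next.
std::vector<std::complex<double>> twiddle_factors(int n) {
  const double two_pi = 2.0 * std::acos(-1.0);
  std::vector<std::complex<double>> w(n);
  for (int k = 0; k < n; ++k) {
    w[k] = std::polar(1.0, -two_pi * k / n);
  }
  return w;
}
```

For <code>n = 8</code> this gives <code>w[0]</code> close to 1, <code>w[2]</code> close to -i, and <code>w[4]</code> close to -1, up to floating-point rounding.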
+  
+  === Improvements to EulerAngles ===  
+  
+  * EulerAngles can now be directly constructed from 3D vectors  
+  * EulerAngles now provide <code>isApprox()</code> and <code>cast()</code> functions  
+  
+  === Changes to sparse iterative solvers ===  
+  * Added new IDRS iterative linear solver.  
+  <source lang="cpp">  
+  #include <unsupported/Eigen/IterativeSolvers>  
+  A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers.  
+  IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);  
+  VectorXf x = idrs.solve(b);  
+  bool success = (idrs.info() == ComputationInfo::Success);  
+  </source>  
+  
+  === Improvements to Polynomials ===  
+  
+  * PolynomialSolver can now be used with complex numbers  
+  * The solver will automatically choose between <code>EigenSolver</code> and <code>ComplexEigenSolver</code> depending on the scalar type used  
+  
+  == Other relevant changes ==  
+  
+  * Eigen now provides an option to test with an external BLAS library  
+  * Eigen can now be used with the [https://en.wikipedia.org/wiki/The_Portland_Group PGI Compiler]  
+  * Printing when using GDB has been improved  
+  * Eigen can now detect if a platform supports <code>int128</code> intrinsics  
+  
+  == Testing ==  
+  The full Eigen test suite was built and run successfully (in c++03 and c++11 mode) with the following compiler/platform/OS combinations:  
+  
+  {| class="wikitable"
+  !Compiler !! Version !! Platform !! Operating system
+  |-
+  | Microsoft Visual Studio || 2015 Update 3 || x86-64 || Windows
+  |-
+  | Microsoft Visual Studio || Community 2017 - 15.9.38 || x86-64 || Windows
+  |-
+  | Microsoft Visual Studio || Community 2019 - 16.11 || x86-64 || Windows
+  |-
+  | GCC || 4.8 || x86-64 || Linux
+  |-
+  | GCC || 9 || x86-64 || Linux
+  |-
+  | GCC || 10 || x86-64 || Linux
+  |-
+  | Clang || 6.0 || x86-64 || Linux
+  |-
+  | Clang || 10 || x86-64 || Linux
+  |-
+  | Clang || 11 || x86-64 || Linux
+  |-
+  | GCC || 10 || armv8.2-a || Linux
+  |-
+  | Clang || 6 || armv8.2-a || Linux
+  |-
+  | Clang || 9 || armv8.2-a || Linux
+  |-
+  | Clang || 10 || armv8.2-a || Linux
+  |-
+  | Clang || 11 || armv8.2-a || Linux
+  |-
+  | AppleClang || 12.0.5 || x86-64 || macOS
+  |-
+  | GCC || 10 || ppc64le || Linux
+  |-
+  | Clang || 10 || ppc64le || Linux
+  |-
+  |}
+  
+  == List of issues fixed in Eigen 3.4 ==  
−  ==  +  {|
+  | [https://gitlab.com/libeigen/eigen/-/issues/2298 Issue #2298]
+  | List of dense linear decompositions lacks complete-orthogonal decomposition
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2284 Issue #2284]
+  | JacobiSVD Outputs Invalid U (Reads Past End of Array)
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2267 Issue #2267]
+  | [3.4 bug] FixedInt<0> error with gcc 4.9.3
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2263 Issue #2263]
+  | usage of signed zeros leads to wrong results with -ffast-math
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2251 Issue #2251]
+  | Method unaryExpr() does not support function pointers in Eigen 3.4-rc1
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2242 Issue #2242]
+  | No matching function for call to "..." in 'Complex.h' and 'GenericPacketMathFunctions.h'
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2229 Issue #2229]
+  | Copies (& potentially moves?) of Eigen object with large unused MaxRows/ColAtCompileTime are slow (Regression from Eigen 3.2)
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2213 Issue #2213]
+  | template maxCoeff<PropagateNaN> compilation error with Eigen 3.4.
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2209 Issue #2209]
+  | unaryExpr deduces wrong return type on MSVC
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2157 Issue #2157]
+  | forward_adolc test fails since PR !363
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2119 Issue #2119]
+  | Move assignment swaps even for non-dynamic storage
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2112 Issue #2112]
+  | Build failure with boost::multiprecision type
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/2093 Issue #2093]
+  | Incorrect evaluation of Ref
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1906 Issue #1906]
+  | Eigen failed with error C2440 with MSVC on windows
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1850 Issue #1850]
+  | error C4996: 'std::result_of<T>': warning STL4014: std::result_of and std::result_of_t are deprecated in C++17. They are superseded by std::invoke_result and std::invoke_result_t
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1833 Issue #1833]
+  | c++20 compilation failure
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1826 Issue #1826]
+  | -Wdeprecated-anon-enum-enum-conversion warnings (c++20)
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1815 Issue #1815]
+  | IndexedView of a vector should allow linear access
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1805 Issue #1805]
+  | Uploaded doxygen documentation does not build LaTeX formulae
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1790 Issue #1790]
+  | packetmath_1 unit test fails
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1788 Issue #1788]
+  | Rule-of-three/rule-of-five violations
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1776 Issue #1776]
+  | subvector_stl_iterator::operator-> triggers 'taking address of rvalue' warning
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1774 Issue #1774]
+  | std::cbegin() returns non-const iterator
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1752 Issue #1752]
+  | A change to the C++ Standard will break some tests
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1741 Issue #1741]
+  | Map<>.noalias()=A*B gives wrong result
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1736 Issue #1736]
+  | Column access of some IndexedView won't compile
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1718 Issue #1718]
+  | Use of builtin vec_sel is ambiguous when compiling with Clang for PowerPC
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1695 Issue #1695]
+  | Stuck in loop for a certain input when using mpreal support
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1692 Issue #1692]
+  | pass enumeration argument to constructor of VectorXd
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1684 Issue #1684]
+  | array_reverse fails with clang >=6 + AVX + -O2
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1674 Issue #1674]
+  | SIMD sin/cos gives wrong results with -ffast-math
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1669 Issue #1669]
+  | Zero-sized matrices generate assertion failures
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1664 Issue #1664]
+  | dot product with single column block fails with new static checks
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1652 Issue #1652]
+  | Corner cases in SIMD sin/cos
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1643 Issue #1643]
+  | Compilation failure
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1637 Issue #1637]
+  | Register spilling with recent gcc & clang
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1619 Issue #1619]
+  | const_iterator vs iterator compilation error
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1615 Issue #1615]
+  | Performance of (aliased) matrix multiplication with fixed size 3x3 matrices slow
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1611 Issue #1611]
+  | NEON: plog(+/-0) should return -inf and not NaN
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1585 Issue #1585]
+  | Matrix product is repeatedly evaluated when iterating over the product expression
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1557 Issue #1557]
+  | Fail to compute eigenvalues for a simple 3x3 companion matrix for root finding
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1544 Issue #1544]
+  | SparseQR generates incorrect Q matrix in complex case
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1543 Issue #1543]
+  | "Fix linear indexing in generic block evaluation" breaks Matrix*Diagonal*Vector product
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1493 Issue #1493]
+  | dense Q extraction and solve is sometimes erroneous for complex matrices
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1453 Issue #1453]
+  | Strange behavior for Matrix::Map, if only InnerStride is provided
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1409 Issue #1409]
+  | Add support for C++17 operator new alignment
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1340 Issue #1340]
+  | Add operator + to sparse matrix iterator
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1318 Issue #1318]
+  | More robust quaternion from matrix
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1306 Issue #1306]
+  | Add support for AVX512 to Eigen
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1305 Issue #1305]
+  | Implementation of additional componentwise unary functions
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1221 Issue #1221]
+  | I get tons of error since my distribution upgraded to GCC 6.1.1
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1195 Issue #1195]
+  | vectorization_logic fails: Matrix3().cwiseQuotient(Matrix3()) expected CompleteUnrolling, got NoUnrolling
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1194 Issue #1194]
+  | Improve det4x4
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1049 Issue #1049]
+  | std::make_shared fails to fulfill structure aliment
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1046 Issue #1046]
+  | fixed matrix types do not report correct alignment requirements
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1014 Issue #1014]
+  | Eigenvalues 3x3 matrix
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/1001 Issue #1001]
+  | infer dimensions of Dynamic-sized temporaries from the entire expression (if possible)
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/977 Issue #977]
+  | Add stable versions of normalize() and normalized()
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/899 Issue #899]
+  | SparseQR occasionally fails for underdetermined systems
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/864 Issue #864]
+  | C++11 alias templates for commonly used types
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/751 Issue #751]
+  | Make AMD Ordering numerically more robust
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/747 Issue #747]
+  | Allow for negative stride
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/720 Issue #720]
+  | Gaussian NullaryExpr
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/663 Issue #663]
+  | Permit NoChange in setZero, setOnes, setConstant, setRandom
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/645 Issue #645]
+  | GeneralizedEigenSolver: missing computation of eigenvectors
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/632 Issue #632]
+  | Optimize addition/subtraction of sparse and dense matrices/vectors
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/631 Issue #631]
+  | (Optionally) throw an exception when using an unsuccessful decomposition
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/564 Issue #564]
+  | maxCoeff() returns nan instead of max, while maxCoeff(&maxRow, &maxCol) works
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/556 Issue #556]
+  | Matrix multiplication crashes using mingw 4.7
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/505 Issue #505]
+  | Assert if temporary objects that are still referred to get destructed (was: Misbehaving Product on C++11)
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/445 Issue #445]
+  | ParametrizedLine should have transform method
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/437 Issue #437]
+  | [feature request] Add Reshape Operation
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/426 Issue #426]
+  | Behavior of sum() for Matrix<bool> is unexpected and confusing
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/329 Issue #329]
+  | Feature request: Ability to get a "view" into a submatrix by indexing it with a vector or matrix of indices
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/231 Issue #231]
+  | STL compatible iterators
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/96 Issue #96]
+  | Clean internal::result_of
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/65 Issue #65]
+  | Core - optimize partial reductions
+  |-
+  | [https://gitlab.com/libeigen/eigen/-/issues/64 Issue #64]
+  | Tests : precision-oriented tests
+  |}
−  +  == Additional information ==  
−  *  +  * A curated list of commits, approximately organized by the same topics as the release notes above, and sorted in reverse chronological order can be found [https://docs.google.com/document/d/e/2PACX1vSGvp4Kv9dJgKzJN4CBjppP46flDbe3pJtI9N3m3WkKSoLXmANXuK5gJlw1CPcpCfjAWhgXAtQNzm/pub here]. 
−  +  
−  + 
Latest revision as of 15:26, 14 October 2021
Eigen 3.4 was released on August 18, 2021. It can be downloaded from the Download section on the Main Page or from Gitlab.
Notice: 3.4.x will be the last major release series of Eigen to support C++03. The master branch will drop C++03 support after this release.
Contents
 1 Changes to supported modules
 1.1 Changes that might break existing code
 1.2 New Major Features in Core
 1.3 New backends
 1.4 Improvements to Eigen Core
 1.5 Elementwise math functions
 1.6 Dense matrix decompositions and solvers
 1.7 Sparse matrix support, decompositions and solvers
 1.8 Type support
 1.9 Improved Geometry Module
 1.10 Backend-specific improvements
 1.11 Miscellaneous API Changes
 1.12 Improvement to NaN propagation
 2 Changes to unsupported modules
 3 Other relevant changes
 4 Testing
 5 List of issues fixed in Eigen 3.4
 6 Additional information
Changes to supported modules
Changes that might break existing code
 Using float or double for indexing matrices, vectors and arrays will now fail to compile, e.g.:
MatrixXd A(10,10);
float one = 1;
double a11 = A(one,1.); // compilation error here
New Major Features in Core
 Add C++11 initializer_list constructors to Matrix and Array [doc]:
MatrixXi a { // construct a 2x3 matrix
  {1,2,3}, // first row
  {4,5,6}  // second row
};
VectorXd v{{1, 2, 3, 4, 5}}; // construct a dynamic-size vector with 5 elements
Array<int,1,5> b{1, 2, 3, 4, 5}; // initialize a fixed-size 1D array of size 5
 Add STL-compatible iterators for dense expressions [doc]. Some examples:
VectorXd v = ...;
MatrixXd A = ...;
// range-based for loop over all entries of v, then of A
for(auto x : v) { cout << x << " "; }
for(auto x : A.reshaped()) { cout << x << " "; }
// sort v, then each column of A
std::sort(v.begin(), v.end());
for(auto c : A.colwise())
  std::sort(c.begin(), c.end());
 New versatile API for submatrices, slices, and indexed views [doc]. It extends A(.,.) to accept anything that looks like a sequence of indices with random access. To make it usable, this feature comes with new symbols, Eigen::indexing::all and Eigen::indexing::last, and functions generating arithmetic sequences: Eigen::seq(first,last[,incr]), Eigen::seqN(first,size[,incr]), and Eigen::lastN(size[,incr]). Here is an example picking the even rows except the first and last ones, and a subset of indexed columns:
MatrixXd A = ...;
std::vector<int> col_ind{7,3,4,3};
MatrixXd B = A(seq(2,last-2,fix<2>), col_ind);
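To make the index arithmetic concrete, here is a minimal sketch in plain standard C++ of which indices the three sequence generators enumerate. It does not use Eigen; the helper names seq_indices, seqN_indices and lastN_indices are hypothetical and only mirror the documented meaning (last is inclusive in seq, and last stands for n - 1 in a dimension of length n).

```cpp
#include <vector>

// Indices first, first+incr, ... that do not exceed last (last is inclusive).
// Assumes incr > 0.
std::vector<int> seq_indices(int first, int last, int incr = 1) {
  std::vector<int> out;
  for (int i = first; i <= last; i += incr) out.push_back(i);
  return out;
}

// Exactly `size` indices, starting at first and advancing by incr.
std::vector<int> seqN_indices(int first, int size, int incr = 1) {
  std::vector<int> out;
  for (int k = 0; k < size; ++k) out.push_back(first + k * incr);
  return out;
}

// The last `size` indices of a dimension of length n (where last == n - 1).
std::vector<int> lastN_indices(int n, int size) {
  return seqN_indices(n - size, size);
}
```

For a matrix with 10 rows, last is 9, so the row selection in the example above corresponds to seq_indices(2, 7, 2), i.e. rows 2, 4 and 6.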
 Add C++11 template aliases for Matrix, Vector, and Array of common sizes, including generic Vector<Type,Size> and RowVector<Type,Size> aliases [doc].
MatrixX<double> M; // Instead of MatrixXd or Matrix<double, Dynamic, Dynamic>
Vector4<MyType> V; // Instead of Vector<MyType, 4>
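The mechanism behind such aliases is the C++11 alias template. The sketch below shows the idea with a stand-in class; ToyMatrix, ToyDynamic and the Toy* aliases are hypothetical names chosen for illustration and are not Eigen's definitions.

```cpp
#include <type_traits>

// Hypothetical stand-in for a matrix class template (not Eigen's Matrix).
template <typename Scalar, int Rows, int Cols>
struct ToyMatrix {
  static constexpr int rows = Rows;
  static constexpr int cols = Cols;
};

constexpr int ToyDynamic = -1;  // hypothetical marker for a run-time size

// Alias templates fix some parameters and leave the others open.
template <typename Type>
using ToyMatrixX = ToyMatrix<Type, ToyDynamic, ToyDynamic>;
template <typename Type, int Size>
using ToyVector = ToyMatrix<Type, Size, 1>;
template <typename Type, int Size>
using ToyRowVector = ToyMatrix<Type, 1, Size>;
template <typename Type>
using ToyVector4 = ToyVector<Type, 4>;

// An alias names exactly the same type as its expansion.
static_assert(std::is_same<ToyVector4<float>, ToyMatrix<float, 4, 1>>::value,
              "alias and expansion are the same type");
```

Because an alias is only another name for the underlying type, code written against the long form keeps working unchanged with the short form.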
 New support for bfloat16. The 16-bit Brain floating point format is now available as Eigen::bfloat16. The constructor must be called explicitly, but it can otherwise be used like any other scalar type. To convert to and from uint16_t, for example to extract the bit representation, use Eigen::numext::bit_cast.
bfloat16 s(0.25); // explicit construction
uint16_t s_bits = numext::bit_cast<uint16_t>(s); // bit representation
using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
MatrixBf16 X = s * MatrixBf16::Random(3, 3);
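As background on the bit representation: a bfloat16 value occupies the same bits as the upper half of an IEEE-754 binary32 float (1 sign bit, 8 exponent bits, 7 mantissa bits). The following plain C++ sketch illustrates that layout. It is not Eigen's conversion code; the function names are hypothetical, and it truncates the low bits, whereas a production conversion would typically round.

```cpp
#include <cstdint>
#include <cstring>

// Keep the upper 16 bits of the binary32 pattern.
// Assumption: plain truncation, chosen here only to keep the sketch short.
inline std::uint16_t float_to_bfloat16_bits_truncate(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));  // well-defined bit copy
  return static_cast<std::uint16_t>(bits >> 16);
}

// Widen a bfloat16 bit pattern back to float by zero-filling the low 16 bits.
inline float bfloat16_bits_to_float(std::uint16_t b) {
  const std::uint32_t bits = static_cast<std::uint32_t>(b) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}
```

Under this layout 0.25f (bit pattern 0x3E800000) maps to the 16-bit pattern 0x3E80, and widening 0x3E80 gives back exactly 0.25f.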
New backends
 Arm SVE: Eigen now supports Arm's Scalable Vector Extension (SVE). Currently only fixed-length SVE vectors for uint32_t and float are available.
 MIPS MSA: Eigen now supports the MIPS SIMD Architecture (MSA).
 AMD ROCm/HIP: Eigen now contains a generic GPU backend that unifies support for NVIDIA/CUDA and AMD/HIP.
 Power 10 MMA Backend: Eigen now has initial support for Power 10 matrix multiplication assist instructions for float32 and float64, real and complex.
Improvements to Eigen Core
 Eigen now uses the C++11 alignas keyword for static alignment. Users targeting C++17 only and recent compilers (e.g., GCC>=7, clang>=5, MSVC>=19.12) will thus be able to completely forget about all issues related to static alignment, including EIGEN_MAKE_ALIGNED_OPERATOR_NEW.
 Various performance improvements for products and Eigen's GEBP and GEMV kernels have been implemented:
   By using half- and quarter-packets, the performance of matrix multiplications of small to medium sized matrices has been improved.
   Eigen's GEMM now falls back to GEMV if it detects that a matrix is a run-time vector.
   The performance of matrix products using Arm Neon has been drastically improved (up to 20%).
   Performance of many special cases of matrix products has been improved.
 Large speed up from blocked algorithm for .transposeInPlace.
 Speed up misc. operations by propagating compile-time sizes (col/row-wise reverse, PartialPivLU, and others).
 Faster specialized SIMD kernels for small fixed-size inverse, LU decomposition, and determinant.
 Improved or added vectorization of partial or slice reductions along the outer dimension, for instance: colmajor_mat.rowwise().mean()
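To see why a row-wise reduction of a column-major matrix lends itself to vectorization, the scalar sketch below walks the matrix one column at a time so that the inner loop touches contiguous memory. It is plain standard C++, not Eigen's kernel; the function name and the flat std::vector storage are assumptions made for illustration, and cols is assumed to be positive.

```cpp
#include <cstddef>
#include <vector>

// Row-wise mean of a rows x cols matrix stored column by column.
std::vector<double> rowwise_mean_colmajor(const std::vector<double>& data,
                                          std::size_t rows, std::size_t cols) {
  std::vector<double> acc(rows, 0.0);
  for (std::size_t j = 0; j < cols; ++j) {       // walk the outer dimension
    const double* col = data.data() + j * rows;  // one contiguous column
    for (std::size_t i = 0; i < rows; ++i) {
      acc[i] += col[i];                          // unit-stride accumulation
    }
  }
  for (std::size_t i = 0; i < rows; ++i) acc[i] /= static_cast<double>(cols);
  return acc;
}
```

The inner loop adds one contiguous column to the accumulator vector, which is the access pattern that SIMD packets can process several rows at a time.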
Elementwise math functions
 Many functions are now implemented and vectorized in generic (backend-agnostic) form.
 Many improvements to correctness, accuracy, and compatibility with the C++ standard library.
   Much improved implementation of ldexp.
   Misc. fixes for corner cases, NaN/Inf inputs and singular points of many functions.
   New implementation of the Payne-Hanek argument reduction algorithm for sin and cos with huge arguments.
   New faithfully rounded algorithm for pow(x,y).
 Speedups from (new or improved) vectorized versions of pow, log, sin, cos, arg, log2, complex sqrt, erf, expm1, log1p, logistic, rint, gamma and bessel functions, and more.
 Improved special function support (Bessel and gamma functions, ndtri, erfc, inverse hyperbolic functions and more).
 New elementwise functions for absolute_difference, rint.
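For reference, this is the scalar behavior of ldexp in the C++ standard library, which is the kind of behavior the compatibility work above refers to. The sketch uses only std::ldexp and std::frexp; the wrapper names are hypothetical.

```cpp
#include <cmath>

// x * 2^e, computed by adjusting the exponent rather than by a power function.
inline double scale_by_power_of_two(double x, int e) {
  return std::ldexp(x, e);
}

// frexp splits x into a mantissa in [0.5, 1) and an exponent;
// ldexp reassembles the original value from the two parts.
inline double split_and_rebuild(double x) {
  int e = 0;
  const double m = std::frexp(x, &e);
  return std::ldexp(m, e);
}
```

For example, scale_by_power_of_two(0.75, 4) is exactly 12.0, since 0.75 * 2^4 is representable without rounding.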
Dense matrix decompositions and solvers
 All dense linear solvers (i.e., Cholesky, *LU, *QR, CompleteOrthogonalDecomposition, *SVD) now inherit SolverBase and thus support .transpose(), .adjoint() and .solve() APIs.
 SVD implementations now have an info() method for checking convergence.
#include <Eigen/SVD>
MatrixXf m = MatrixXf::Random(3,2);
VectorXf b = VectorXf::Random(3); // right-hand side
JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
if (svd.info() == ComputationInfo::Success) {
  // SVD computation was successful.
  VectorXf x = svd.solve(b);
}
 Most decompositions now fail quickly when invalid inputs are detected.
 Optimized the product of a HouseholderSequence with the identity, as well as the evaluation of a HouseholderSequence to a dense matrix using faster blocked product.
 Fixed aliasing issues with in-place small matrix inversions.
 Fixed several edge-cases with empty or zero inputs.
Sparse matrix support, decompositions and solvers
 Enabled assignment and addition with diagonal matrix expressions.
SparseMatrix<float> A(10, 10);
VectorXf x = VectorXf::Random(10);
A = x.asDiagonal();
A += x.asDiagonal();
 Support added for SuiteSparse KLU routines via the KLUSupport module. SuiteSparse must be installed to use this module.
#include <Eigen/KLUSupport>
A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers.
KLU<SparseMatrix<T> > klu(A);
if (klu.info() == ComputationInfo::Success) {
  VectorXf x = klu.solve(b);
}
 SparseCholesky now works with row-major matrices.
 Various bug fixes and performance improvements.
Type support
 Improved support for half
   Native support added for ARM __fp16, CUDA/HIP __half, and F16C conversion intrinsics.
   Better vectorization support added across all backends.
 Improved bool support
   Partial vectorization support added for boolean operations.
   Significantly improved performance (x25) for logical operations with Matrix or Tensor of bool.
 Improved support for custom types
   More custom types work out-of-the-box (see #2201).
Improved Geometry Module
 Behavioral change: Transform::computeRotationScaling() and Transform::computeScalingRotation() are now more continuous across degeneracies (see !349).
 New partial vectorization support added for Quaternion.
 Generic vectorized 4x4 matrix inversion.
Backend-specific improvements
 Arm NEON
   Now provides vectorization for uint64_t, int64_t, uint32_t, int16_t, uint16_t, int8_t, and uint8_t.
   Emulates bfloat16 support when using Eigen::bfloat16.
   Supports emulated and native float16 when using Eigen::half.
 SSE/AVX/AVX512
   General performance improvements and bugfixes.
   Enabled AVX512 instructions by default if available.
   New std::complex, half, and bfloat16 vectorization support added.
   Many missing packet functions added.
 Altivec/Power
   General performance improvement and bugfixes.
   Enhanced vectorization of real and complex scalars.
   Changes to the gebp_kernel specific to Altivec, using VSX implementation of the MMA instructions that gain speed improvements up to 4x for matrix-matrix products.
   Dynamic dispatch for GCC greater than 10 enabling selection of MMA or VSX instructions based on __builtin_cpu_supports.
 GPU (CUDA and HIP)
   Several optimized math functions added, better support for std::complex.
   Added option to disable CUDA entirely by defining EIGEN_NO_CUDA.
   Many more functions can now be used in device code (e.g. comparisons, small matrix inversion).
 ZVector
   Vectorized float and std::complex<float> support added.
   Added z14 support.
 SYCL
   Redesigned SYCL implementation for use with the Tensor module, which can be enabled by defining EIGEN_USE_SYCL.
   New generic memory model introduced used by TensorDeviceSycl.
   Better integration with OpenCL devices.
   Added many math function specializations.
Miscellaneous API Changes
 New setConstant(...) methods for preserving one dimension of a matrix by passing in NoChange.
MatrixXf A(10, 5); // 10x5 matrix.
A.setConstant(NoChange, 10, 2); // 10x10 matrix of 2s.
A.setConstant(5, NoChange, 3); // 5x10 matrix of 3s.
A.setZero(NoChange, 20); // 5x20 matrix of 0s.
A.setZero(20, NoChange); // 20x20 matrix of 0s.
A.setOnes(NoChange, 5); // 20x5 matrix of 1s.
A.setOnes(5, NoChange); // 5x5 matrix of 1s.
A.setRandom(NoChange, 10); // 5x10 random matrix.
A.setRandom(10, NoChange); // 10x10 random matrix.
 Added setUnit(Index i) for vectors that sets the i-th coefficient to one and all others to zero.
VectorXf v(5);
v.setUnit(3); // { 0, 0, 0, 1, 0}
 Added transpose(), adjoint(), conjugate() methods to SelfAdjointView.
 Added shiftLeft<N>() and shiftRight<N>() coefficient-wise arithmetic shift functions to Arrays.
ArrayXXi A = ArrayXXi::Random(2, 3);
ArrayXXi B = A.shiftRight<2>();
ArrayXXi C = A.shiftLeft<6>();
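For readers unfamiliar with the term, an arithmetic right shift keeps the sign of the value (it behaves like a floor division by 2^N), and a left shift multiplies by 2^N. The plain C++ sketch below applies both coefficient-wise to a std::vector<int>. The function names are hypothetical and this is not Eigen's implementation. It assumes two's-complement int, with >> on negative values being arithmetic, which is guaranteed from C++20 and is what mainstream compilers do for earlier standards.

```cpp
#include <vector>

// Arithmetic right shift of every coefficient by N bits.
// Assumption: >> on a negative int replicates the sign bit.
template <int N>
std::vector<int> shift_right_all(std::vector<int> v) {
  for (int& x : v) x >>= N;
  return v;
}

// Left shift of every coefficient by N bits. The shift is carried out on the
// unsigned representation so that negative inputs do not trigger undefined
// behavior before C++20.
template <int N>
std::vector<int> shift_left_all(std::vector<int> v) {
  for (int& x : v) x = static_cast<int>(static_cast<unsigned>(x) << N);
  return v;
}
```

With these definitions, shifting {-8, 7, 16} right by 2 gives {-2, 1, 4}, and shifting {1, -1, 3} left by 6 gives {64, -64, 192}.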
 Enabled adding and subtracting of diagonal expressions.
VectorXf x = VectorXf::Random(5);
VectorXf y = VectorXf::Random(5);
MatrixXf A = MatrixXf::Identity(5, 5);
A += x.asDiagonal() - y.asDiagonal();
 Allow user-defined default cache sizes via defining EIGEN_DEFAULT_L1_CACHE_SIZE, ..., EIGEN_DEFAULT_L3_CACHE_SIZE.
 Added EIGEN_ALIGNOF(X) macro for determining alignment of a provided variable.
 Allow plugins for VectorwiseOp by defining a file EIGEN_VECTORWISEOP_PLUGIN (e.g. -DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h).
 Allow disabling of IO operations by defining EIGEN_NO_IO.
Improvement to NaN propagation
 Improvements to NaN correctness for elementwise functions.
 New NaNPropagation template argument to control whether NaNs are propagated or suppressed in elementwise min/max and corresponding reductions on Array, Matrix, and Tensor. Example for max:
// Elementwise maximum
Eigen::MatrixXf left, right, r0, r1, r2;
r0 = left.cwiseMax(right); // Implementation-defined behavior.
// Propagate NaN if either argument is NaN.
r1 = left.template cwiseMax<PropagateNaN>(right);
// Suppress NaN if at least one argument is not a NaN.
r2 = left.template cwiseMax<PropagateNumbers>(right);
// Max reductions
Eigen::MatrixXf m;
float nan_or_max = m.maxCoeff(); // Implementation-defined behavior.
float nan_if_any_or_max = m.template maxCoeff<PropagateNaN>();
float nan_if_all_or_max = m.template maxCoeff<PropagateNumbers>();
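The two policies can be spelled out for a single pair of scalars. The sketch below is plain standard C++ with hypothetical function names. It illustrates the semantics described in the comments above and is not Eigen's implementation.

```cpp
#include <cmath>

// "Propagate NaN": the result is NaN as soon as either argument is NaN.
inline float max_propagate_nan(float a, float b) {
  if (std::isnan(a)) return a;
  if (std::isnan(b)) return b;
  return a > b ? a : b;
}

// "Propagate numbers": a NaN argument is ignored when the other argument is
// a number; the result is NaN only if both arguments are NaN.
inline float max_propagate_numbers(float a, float b) {
  if (std::isnan(a)) return b;
  if (std::isnan(b)) return a;
  return a > b ? a : b;
}
```

A bare `a > b ? a : b` implements neither policy: every comparison with NaN is false, so it returns b whenever either argument is NaN, and the result depends on argument order.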
Changes to unsupported modules
New low-latency non-blocking ThreadPool module
 Originally a part of the Tensor module, Eigen::ThreadPool is now separate and more portable, and forms the basis for multithreading in TensorFlow, for example. Example:
#include <unsupported/Eigen/CXX11/ThreadPool>
const int num_threads = 42;
Eigen::ThreadPool tp(num_threads);
auto do_stuff = []() { ... };
tp.Schedule(do_stuff);
Changes to Tensor module
 Support for C++03 was officially dropped in the Tensor module, since most of the code was written in C++11 anyway. This will prevent building the code for CUDA with older versions of nvcc.
 Performance optimizations of Tensor contraction
   Speed up "outer-product-like" operations by parallelizing over the contraction dimension, using thread_local buffers and recursive work splitting.
   Improved threading heuristics.
   Support for fusing elementwise operations into contraction during evaluation. Example:
// This example applies std::sqrt to all output elements from a tensor contraction.
// The optional OutputKernel argument to the contraction in this example is a functor over a
// 2-dimensional buffer. The functor is called once for each output block of the contraction
// result, to perform the elementwise sqrt operation while the block is hot in cache.
struct SqrtOutputKernel {
  template <typename Index, typename Scalar>
  EIGEN_ALWAYS_INLINE void operator()(
      const internal::blas_data_mapper<Scalar, Index, ColMajor>& output_mapper,
      const TensorContractionParams&, Index, Index, Index num_rows, Index num_cols) const {
    for (int i = 0; i < num_rows; ++i) {
      for (int j = 0; j < num_cols; ++j) {
        output_mapper(i, j) = std::sqrt(output_mapper(i, j));
      }
    }
  }
};
Tensor<float, 4, DataLayout> left(30, 50, 8, 31);
Tensor<float, 5, DataLayout> right(8, 31, 7, 20, 10);
Tensor<float, 5, DataLayout> result(30, 50, 7, 20, 10);
Eigen::array<DimPair, 2> dims({{DimPair(2, 0), DimPair(3, 1)}});
result = left.contract(right, dims, SqrtOutputKernel());
 Performance optimizations of other Tensor operators
   Speedups from improved vectorization, block evaluation, and multithreading for most operators.
   Significant speedup to broadcasting.
   Reduction of index computation overhead, e.g. using fast divisors in TensorGenerator, squeezing dimensions in TensorPadding.
 A complete rewrite of the block (tiling) evaluation framework for tensor expressions led to significant speedups and a reduced number of memory allocations.
 Added new API for asynchronous evaluation of tensor expressions. Example:
Tensor<float, 3> in1(200, 30, 70);
Tensor<float, 3> in2(200, 30, 70);
Tensor<float, 3> out(200, 30, 70);
Eigen::ThreadPool tp(internal::random<int>(3, 11));
Eigen::ThreadPoolDevice thread_pool_device(&tp, internal::random<int>(3, 11));
Eigen::Barrier b(1);
auto done = [&b]() { b.Notify(); };
out.device(thread_pool_device, std::move(done)) = in1 + in2 * 3.14f;
b.Wait();
 Misc. minor behavior changes & fixes:
   Fix const correctness for TensorMap.
   Modify tensor argmin/argmax to always return first occurrence.
   More numerically stable tree reduction.
   Improve randomness of the tensor random generator.
   Update the padding computation for PADDING_SAME to be consistent with TensorFlow.
   Support static dimensions (aka IndexList) in resizing/reshape/broadcast.
   Improved accuracy of Tensor FFT.
Improvements to FFT module
 Faster and more accurate twiddle factor computation.
Improvements to EulerAngles
 EulerAngles can now be directly constructed from 3D vectors.
 EulerAngles now provide isApprox() and cast() functions.
Changes to sparse iterative solvers
 Added new IDRS iterative linear solver.
#include <unsupported/Eigen/IterativeSolvers>
A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers.
IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);
VectorXf x = idrs.solve(b);
bool success = (idrs.info() == ComputationInfo::Success);
Improvements to Polynomials
 PolynomialSolver can now be used with complex numbers.
 The solver will automatically choose between EigenSolver and ComplexEigenSolver depending on the scalar type used.
Other relevant changes
 Eigen now provides an option to test with an external BLAS library
 Eigen can now be used with the PGI Compiler
 Printing when using GDB has been improved
 Eigen can now detect if a platform supports int128 intrinsics.
Testing
The full Eigen test suite was built and run successfully (in c++03 and c++11 mode) with the following compiler/platform/OS combinations:
Compiler | Version | Platform | Operating system
Microsoft Visual Studio | 2015 Update 3 | x86-64 | Windows
Microsoft Visual Studio | Community 2017 - 15.9.38 | x86-64 | Windows
Microsoft Visual Studio | Community 2019 - 16.11 | x86-64 | Windows
GCC | 4.8 | x86-64 | Linux
GCC | 9 | x86-64 | Linux
GCC | 10 | x86-64 | Linux
Clang | 6.0 | x86-64 | Linux
Clang | 10 | x86-64 | Linux
Clang | 11 | x86-64 | Linux
GCC | 10 | armv8.2-a | Linux
Clang | 6 | armv8.2-a | Linux
Clang | 9 | armv8.2-a | Linux
Clang | 10 | armv8.2-a | Linux
Clang | 11 | armv8.2-a | Linux
AppleClang | 12.0.5 | x86-64 | macOS
GCC | 10 | ppc64le | Linux
Clang | 10 | ppc64le | Linux
List of issues fixed in Eigen 3.4
Issue #2298  List of dense linear decompositions lacks complete-orthogonal decomposition 
Issue #2284  JacobiSVD Outputs Invalid U (Reads Past End of Array) 
Issue #2267  [3.4 bug] FixedInt<0> error with gcc 4.9.3 
Issue #2263  usage of signed zeros leads to wrong results with -ffast-math 
Issue #2251  Method unaryExpr() does not support function pointers in Eigen 3.4-rc1 
Issue #2242  No matching function for call to "..." in 'Complex.h' and 'GenericPacketMathFunctions.h' 
Issue #2229  Copies (& potentially moves?) of Eigen object with large unused MaxRows/ColAtCompileTime are slow (Regression from Eigen 3.2) 
Issue #2213  template maxCoeff<PropagateNaN> compilation error with Eigen 3.4. 
Issue #2209  unaryExpr deduces wrong return type on MSVC 
Issue #2157  forward_adolc test fails since PR !363 
Issue #2119  Move assignment swaps even for non-dynamic storage 
Issue #2112  Build failure with boost::multiprecision type 
Issue #2093  Incorrect evaluation of Ref 
Issue #1906  Eigen failed with error C2440 with MSVC on windows 
Issue #1850  error C4996: 'std::result_of<T>': warning STL4014: std::result_of and std::result_of_t are deprecated in C++17. They are superseded by std::invoke_result and std::invoke_result_t 
Issue #1833  c++20 compilation failure 
Issue #1826  -Wdeprecated-anon-enum-enum-conversion warnings (c++20) 
Issue #1815  IndexedView of a vector should allow linear access 
Issue #1805  Uploaded doxygen documentation does not build LaTeX formulae 
Issue #1790  packetmath_1 unit test fails 
Issue #1788  Rule-of-three/rule-of-five violations 
Issue #1776  subvector_stl_iterator::operator-> triggers 'taking address of rvalue' warning 
Issue #1774  std::cbegin() returns non-const iterator 
Issue #1752  A change to the C++ Standard will break some tests 
Issue #1741  Map<>.noalias()=A*B gives wrong result 
Issue #1736  Column access of some IndexedView won't compile 
Issue #1718  Use of builtin vec_sel is ambiguous when compiling with Clang for PowerPC 
Issue #1695  Stuck in loop for a certain input when using mpreal support 
Issue #1692  pass enumeration argument to constructor of VectorXd 
Issue #1684  array_reverse fails with clang >=6 + AVX + -O2 
Issue #1674  SIMD sin/cos gives wrong results with -ffast-math 
Issue #1669  Zero-sized matrices generate assertion failures 
Issue #1664  dot product with single column block fails with new static checks 
Issue #1652  Corner cases in SIMD sin/cos 
Issue #1643  Compilation failure 
Issue #1637  Register spilling with recent gcc & clang 
Issue #1619  const_iterator vs iterator compilation error 
Issue #1615  Performance of (aliased) matrix multiplication with fixed size 3x3 matrices slow 
Issue #1611  NEON: plog(+/-0) should return -inf and not NaN 
Issue #1585  Matrix product is repeatedly evaluated when iterating over the product expression 
Issue #1557  Fail to compute eigenvalues for a simple 3x3 companion matrix for root finding 
Issue #1544  SparseQR generates incorrect Q matrix in complex case 
Issue #1543  "Fix linear indexing in generic block evaluation" breaks Matrix*Diagonal*Vector product 
Issue #1493  dense Q extraction and solve is sometimes erroneous for complex matrices 
Issue #1453  Strange behavior for Matrix::Map, if only InnerStride is provided 
Issue #1409  Add support for C++17 operator new alignment 
Issue #1340  Add operator + to sparse matrix iterator 
Issue #1318  More robust quaternion from matrix 
Issue #1306  Add support for AVX512 to Eigen 
Issue #1305  Implementation of additional componentwise unary functions 
Issue #1221  I get tons of error since my distribution upgraded to GCC 6.1.1 
Issue #1195  vectorization_logic fails: Matrix3().cwiseQuotient(Matrix3()) expected CompleteUnrolling, got NoUnrolling 
Issue #1194  Improve det4x4 
Issue #1049  std::make_shared fails to fulfill structure aliment 
Issue #1046  fixed matrix types do not report correct alignment requirements 
Issue #1014  Eigenvalues 3x3 matrix 
Issue #1001  infer dimensions of Dynamic-sized temporaries from the entire expression (if possible) 
Issue #977  Add stable versions of normalize() and normalized() 
Issue #899  SparseQR occasionally fails for underdetermined systems 
Issue #864  C++11 alias templates for commonly used types 
Issue #751  Make AMD Ordering numerically more robust 
Issue #747  Allow for negative stride 
Issue #720  Gaussian NullaryExpr 
Issue #663  Permit NoChange in setZero, setOnes, setConstant, setRandom 
Issue #645  GeneralizedEigenSolver: missing computation of eigenvectors 
Issue #632  Optimize addition/subtraction of sparse and dense matrices/vectors 
Issue #631  (Optionally) throw an exception when using an unsuccessful decomposition 
Issue #564  maxCoeff() returns nan instead of max, while maxCoeff(&maxRow, &maxCol) works 
Issue #556  Matrix multiplication crashes using mingw 4.7 
Issue #505  Assert if temporary objects that are still referred to get destructed (was: Misbehaving Product on C++11) 
Issue #445  ParametrizedLine should have transform method 
Issue #437  [feature request] Add Reshape Operation 
Issue #426  Behavior of sum() for Matrix<bool> is unexpected and confusing 
Issue #329  Feature request: Ability to get a "view" into a submatrix by indexing it with a vector or matrix of indices 
Issue #231  STL compatible iterators 
Issue #96  Clean internal::result_of 
Issue #65  Core - optimize partial reductions 
Issue #64  Tests : precision-oriented tests 
Additional information
 A curated list of commits, approximately organized by the same topics as the release notes above, and sorted in reverse chronological order can be found here.