Eigen 3.4 was released on August 18, 2021. It can be downloaded from the Download section on the [https://eigen.tuxfamily.org/index.php?title=Main_Page Main Page] or from [https://gitlab.com/libeigen/eigen/-/releases/3.4.0 Gitlab].

Since Eigen 3.3, the 3.4 development branch received more than 1750 commits [1], representing numerous major changes.
  
'''Notice:''' 3.4.x will be the last major release series of Eigen that supports c++03. The master branch will drop c++03 support after this release.

== Changes that might impact existing code ==

* Using float or double for indexing matrices, vectors and arrays will now fail to compile, e.g.:

<source lang="cpp">
MatrixXd A(10,10);
float one = 1;
double a11 = A(one,1.); // compilation error here
</source>

== Changes to supported modules ==
  
=== New Major Features in Core ===

* New versatile API for sub-matrices, '''slices''', and '''indexed views''' [http://eigen.tuxfamily.org/dox-devel/group__TutorialSlicingIndexing.html [doc]]. It basically extends <code>A(.,.)</code> to let it accept anything that looks like a sequence of indices with random access. To make this usable, the feature comes with new symbols: <code>Eigen::all</code>, <code>Eigen::last</code>, and functions generating arithmetic sequences: <code>Eigen::seq(first,last[,incr])</code>, <code>Eigen::seqN(first,size[,incr])</code>, <code>Eigen::lastN(size[,incr])</code>. Here is an example picking the even rows except the first and the last one, and a subset of indexed columns:

<source lang="cpp">
MatrixXd A = ...;
std::vector<int> col_ind{7,3,4,3};
MatrixXd B = A(seq(2,last-2,fix<2>), col_ind);
</source>
* '''Reshaped''' views through the new members <code>reshaped()</code> and <code>reshaped(rows,cols)</code>. This feature also comes with new symbols: <code>Eigen::AutoOrder</code>, <code>Eigen::AutoSize</code>.  [http://eigen.tuxfamily.org/dox-devel/group__TutorialReshape.html [doc]]
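A minimal sketch of how the new members can be used (illustrative only, not taken from the official documentation; assumes <code>using namespace Eigen</code>):

<source lang="cpp">
MatrixXd M(2, 6);
M << 1, 2, 3, 4,  5,  6,
     7, 8, 9, 10, 11, 12;
MatrixXd R  = M.reshaped(3, 4);        // view the same 12 coefficients as a 3x4 matrix (column-major order)
VectorXd v  = M.reshaped();            // flattened (linear) view with 12 entries
VectorXd vr = M.reshaped<RowMajor>();  // flattened view traversing M row by row
</source>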
* A new helper <code>Eigen::fix<N></code> to pass compile-time integer values to Eigen's functions [http://eigen.tuxfamily.org/dox-devel/group__Core__Module.html#title6 [doc]]. It can be used to pass compile-time sizes to <code>.block(...)</code>, <code>.segment(...)</code>, and all variants, as well as the first, size and increment parameters of the seq, seqN, and lastN functions introduced above. You can also pass "possibly compile-time values" through <code>Eigen::fix<N>(n)</code>. Here is an example comparing the old and new way to call <code>.block</code> with fixed sizes:

<source lang="cpp">
template<typename MatrixType,int N>
void foo(const MatrixType &A, int i, int j, int n) {
    A.block(i,j,2,3);                         // runtime sizes
    // compile-time nb rows and columns:
    A.template block<2,3>(i,j);               // 3.3 way
    A.block(i,j,fix<2>,fix<3>);               // new 3.4 way
    // compile-time nb rows only:
    A.template block<2,Dynamic>(i,j,2,n);     // 3.3 way
    A.block(i,j,fix<2>,n);                    // new 3.4 way
    // possibly compile-time nb columns
    // (use n if N==Dynamic, otherwise we must have n==N):
    A.template block<2,N>(i,j,2,n);           // 3.3 way
    A.block(i,j,fix<2>,fix<N>(n));            // new 3.4 way
}
</source>
  
* Add C++11 '''template aliases''' for Matrix, Vector, and Array of common sizes, including generic <code>Vector<Type,Size></code> and <code>RowVector<Type,Size></code> aliases [http://eigen.tuxfamily.org/dox-devel/group__matrixtypedefs.html [doc]].

<source lang="cpp">
MatrixX<double> M;  // Instead of MatrixXd or Matrix<double, Dynamic, Dynamic>
Vector4<MyType> V;  // Instead of Vector<MyType, 4>
</source>

* Add c++11 '''initializer_list constructors''' to Matrix and Array [http://eigen.tuxfamily.org/dox-devel/group__TutorialMatrixClass.html#title3 [doc]]:

<source lang="cpp">
MatrixXi a {      // construct a 2x3 matrix
      {1,2,3},    // first row
      {4,5,6}     // second row
};
VectorXd v{{1, 2, 3, 4, 5}};     // construct a dynamic-size vector with 5 elements
Array<int,1,5> b{1, 2, 3, 4, 5}; // initialize a fixed-size 1D array of size 5
</source>

* A new '''namespace indexing''' allowing to exclusively import the subset of functions and symbols that are typically used within <code>A(.,.)</code>, that is: <code>all</code>, <code>seq</code>, <code>seqN</code>, <code>lastN</code>, <code>last</code>, <code>lastp1</code>. [http://eigen.tuxfamily.org/dox-devel/namespaceEigen_1_1indexing.html [doc]]

* Add '''STL-compatible iterators''' for dense expressions, so that vectors, and the rows or columns of matrices, can be used directly in range-based for loops and with STL algorithms such as <code>std::sort</code>.
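A minimal sketch illustrating the new iterators (assuming <code>v</code> and <code>A</code> are an already-initialized <code>VectorXd</code> and <code>MatrixXd</code>):

<source lang="cpp">
// range-based for loops over all entries of v, then over all entries of A
for(auto x : v) { std::cout << x << " "; }
for(auto x : A.reshaped()) { std::cout << x << " "; }
// sort v, then sort each column of A
std::sort(v.begin(), v.end());
for(auto c : A.colwise())
    std::sort(c.begin(), c.end());
</source>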
 
* New support for <code>bfloat16</code>. The 16-bit [https://en.wikipedia.org/wiki/Bfloat16_floating-point_format Brain floating point format] is now available as <code>Eigen::bfloat16</code>. The constructor must be called explicitly, but it can otherwise be used as any other scalar type. To convert to and from <code>uint16_t</code> and extract the bit representation, use <code>Eigen::numext::bit_cast</code>:

<source lang="cpp">
bfloat16 s(0.25);                                 // explicit construction
uint16_t s_bits = numext::bit_cast<uint16_t>(s);  // bit representation

using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
MatrixBf16 X = s * MatrixBf16::Random(3, 3);
</source>

=== New backends ===

* '''Arm SVE:''' Eigen now supports Arm's [https://developer.arm.com/documentation/101726/0300/Learn-about-the-Scalable-Vector-Extension--SVE-/What-is-the-Scalable-Vector-Extension- Scalable Vector Extension (SVE)]. Currently only fixed-length SVE vectors for <code>uint32_t</code> and <code>float</code> are available.
* '''MIPS MSA:''' Eigen now supports the [https://www.mips.com/products/architectures/ase/simd/ MIPS SIMD Architecture (MSA)].
* '''AMD ROCm/HIP:''' Eigen now contains a generic GPU backend that unifies support for [https://developer.nvidia.com/cuda-toolkit NVIDIA/CUDA] and [https://rocmdocs.amd.com/en/latest/ AMD/HIP].
* '''Power 10 MMA backend:''' Eigen now has initial support for [https://arxiv.org/pdf/2104.03142.pdf Power 10 matrix multiplication assist instructions] for float32 and float64, real and complex.

=== Improvements to Eigen Core ===

* Eigen now uses the c++11 '''alignas''' keyword for static alignment. Users targeting C++17 and recent compilers (e.g., GCC>=7, clang>=5, MSVC>=19.12) can thus completely forget about all [http://eigen.tuxfamily.org/dox-devel/group__TopicUnalignedArrayAssert.html issues] related to static alignment, including <code>EIGEN_MAKE_ALIGNED_OPERATOR_NEW</code>.
* Various performance improvements for products and Eigen's GEBP and GEMV kernels have been implemented:
** By using half- and quarter-packets, the performance of matrix multiplications of small to medium sized matrices has been improved.
** Eigen's GEMM now falls back to GEMV if it detects that a matrix is a run-time vector.
** The performance of matrix products using Arm Neon has been drastically improved (up to 20%).
** The performance of many special cases of matrix products has been improved.
* Optimized the evaluation of small products of the form <code>s*A*B</code> by rewriting them as <code>s*(A.lazyProduct(B))</code> to save a costly temporary; measured speedups range from 2x to 5x (see bug 1562).
* Improved the multi-threading heuristic for matrix products with a small number of columns.
* Large speed up from a blocked algorithm for <code>.transposeInPlace()</code>.
* Speed up of miscellaneous operations by propagating compile-time sizes (col/row-wise reverse, PartialPivLU, and others).
* Speed up of reductions of sub-matrices.
* Faster specialized SIMD kernels for small fixed-size inverse, LU decomposition, and determinant.
* Improved or added vectorization of partial and slice reductions along the outer dimension, for instance <code>colmajor_mat.rowwise().mean()</code>.

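For illustration (not from the original notes), two of the operations whose performance improved:

<source lang="cpp">
MatrixXd A = MatrixXd::Random(512, 512);
A.transposeInPlace();                     // now uses a blocked algorithm
VectorXd row_means = A.rowwise().mean();  // partial reduction along the outer dimension of a col-major matrix
</source>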
=== Elementwise math functions ===

* Many functions are now implemented and vectorized in generic (backend-agnostic) form.
* Many improvements to correctness, accuracy, and compatibility with the c++ standard library:
** Much improved implementation of <code>ldexp</code>.
** Misc. fixes for corner cases, NaN/Inf inputs and singular points of many functions.
** New implementation of the Payne-Hanek argument-reduction algorithm for <code>sin</code> and <code>cos</code> with huge arguments.
** New faithfully rounded algorithm for <code>pow(x,y)</code>.
* Speedups from (new or improved) vectorized versions of <code>pow</code>, <code>log</code>, <code>log2</code>, <code>sin</code>, <code>cos</code>, <code>arg</code>, complex <code>sqrt</code>, <code>erf</code>, <code>expm1</code>, <code>log1p</code>, <code>logistic</code>, <code>rint</code>, <code>gamma</code> and <code>bessel</code> functions, and more.
* Improved special function support (Bessel and gamma functions, <code>ndtri</code>, <code>erfc</code>, inverse hyperbolic functions and more).
* New elementwise functions for <code>absolute_difference</code> and <code>rint</code>.

=== Dense matrix decompositions and solvers ===

* All dense linear solvers (i.e., Cholesky, *LU, *QR, CompleteOrthogonalDecomposition, *SVD) now inherit from <code>SolverBase</code> and thus support the <code>.transpose()</code>, <code>.adjoint()</code> and <code>.solve()</code> APIs ([https://eigen.tuxfamily.org/dox/classEigen_1_1SolverBase.html API]).
* SVD implementations now have an <code>info()</code> method for checking convergence:
<source lang="cpp">
  #include <Eigen/SVD>
  MatrixXf m = MatrixXf::Random(3, 2);
  VectorXf b = VectorXf::Random(3);
  JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
  if (svd.info() == ComputationInfo::Success) {
    // SVD computation was successful.
    VectorXf x = svd.solve(b);
  }
</source>
* Most decompositions now fail quickly when invalid inputs are detected.
* Optimized the product of a <code>HouseholderSequence</code> with the identity, as well as the evaluation of a <code>HouseholderSequence</code> to a dense matrix (e.g., <code>MatrixXd Q = A.householderQr().householderQ();</code>) using a faster blocked product.
* Fixed aliasing issues with in-place small matrix inversions.
* Fixed several edge-cases with empty or zero inputs.

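A minimal sketch (not from the original notes) of what the new <code>SolverBase</code> API enables, namely solving with the transpose or adjoint of a decomposition without forming it explicitly:

<source lang="cpp">
  #include <Eigen/Dense>
  MatrixXd A = MatrixXd::Random(4, 4);
  VectorXd b = VectorXd::Random(4);
  PartialPivLU<MatrixXd> lu(A);
  VectorXd x1 = lu.solve(b);              // solves A   x = b
  VectorXd x2 = lu.transpose().solve(b);  // solves A^T x = b
  VectorXd x3 = lu.adjoint().solve(b);    // solves A^H x = b (same as transpose for real scalars)
</source>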
=== Sparse matrix support, decompositions and solvers ===

* Enabled assignment and addition with diagonal matrix expressions, with smart insertion strategies for missing diagonal coefficients (see bug 1574):
<source lang="cpp">
  SparseMatrix<float> A(10, 10);
  VectorXf x = VectorXf::Random(10);
  A = x.asDiagonal();
  A += x.asDiagonal();
</source>
* Added specializations for <code>res ?= dense +/- sparse</code> and <code>res ?= sparse +/- dense</code> (see bug 632).
* Support added for SuiteSparse KLU routines via the <code>KLUSupport</code> module (an LU-based direct solver tailored for problems coming from circuit simulation). SuiteSparse must be installed to use this module:
<source lang="cpp">
  #include <Eigen/KLUSupport>
  A.makeCompressed();  // Recommendation is to compress input before calling sparse solvers.
  KLU<SparseMatrix<float> > klu(A);
  if (klu.info() == ComputationInfo::Success) {
    VectorXf x = klu.solve(b);
  }
</source>
* <code>SparseCholesky</code> now works with row-major matrices.
* Optimized the extraction of the factor Q in SparseQR.
* Various bug fixes and performance improvements.

=== Type support ===

* Improved support for <code>half</code>:
** Added native support for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, and the <code>F16C</code> conversion intrinsics.
** Added better vectorization support across all backends.
* Improved <code>bool</code> support:
** Added partial vectorization support for boolean operations.
** Significantly improved performance (up to 25x) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>.
* Improved support for custom scalar types:
** More custom types work out-of-the-box (see [https://gitlab.com/libeigen/eigen/-/issues/2201 #2201]).

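A short illustrative sketch (not from the original notes) of using <code>Eigen::half</code> like any other scalar type:

<source lang="cpp">
  #include <Eigen/Core>
  Matrix<half, 2, 2> H;
  H << half(1.0f), half(2.0f),
       half(3.0f), half(4.0f);
  half s(0.5f);
  Matrix<half, 2, 2> R = s * H;           // arithmetic works as for float/double
  float r = static_cast<float>(R(1, 1));  // convert back to float
</source>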
=== Improved Geometry Module ===

* '''Behavioral change:''' <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see [https://gitlab.com/libeigen/eigen/-/merge_requests/349 !349]).
* New partial vectorization support added for <code>Quaternion</code>.
* Generic vectorized 4x4 matrix inversion.

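An illustrative sketch (not from the original notes; the transform is assembled from an arbitrary translation, rotation and uniform scaling) showing the affected API:

<source lang="cpp">
  #include <Eigen/Geometry>
  Affine3d T = Translation3d(1.0, 2.0, 3.0)
             * AngleAxisd(0.3, Vector3d::UnitZ())
             * Scaling(2.0);
  Matrix3d R, S;
  T.computeRotationScaling(&R, &S);  // factors T.linear() into R * S with R a rotation matrix
</source>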
=== Backend-specific improvements ===

* '''Arm NEON'''
** Now provides vectorization for <code>uint64_t</code>, <code>int64_t</code>, <code>uint32_t</code>, <code>uint16_t</code>, <code>int16_t</code>, <code>uint8_t</code>, and <code>int8_t</code>.
** Emulates <code>bfloat16</code> support when using <code>Eigen::bfloat16</code>.
** Supports emulated and native <code>float16</code> when using <code>Eigen::half</code>.
* '''SSE/AVX/AVX512'''
** General performance improvements and bugfixes.
** Enabled AVX512 instructions by default if available.
** Added new vectorization support for <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code>.
** Added many missing packet functions.
* '''Altivec/Power'''
** General performance improvements and bugfixes.
** Enhanced vectorization of real and complex scalars.
** Changes to the <code>gebp_kernel</code> specific to Altivec, using the VSX implementation of the MMA instructions, yielding speed improvements of up to 4x for matrix-matrix products.
** Dynamic dispatch for GCC greater than 10, enabling selection of MMA or VSX instructions based on <code>__builtin_cpu_supports</code>.
* '''GPU (CUDA and HIP)'''
** Several optimized math functions added, better support for <code>std::complex</code>.
** Added an option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>.
** Many more functions can now be used in device code (e.g. comparisons, small matrix inversion).
* '''ZVector'''
** Added vectorized <code>float</code> and <code>std::complex<float></code> support.
** Added z14 support.
* '''SYCL'''
** Redesigned SYCL implementation for use with the [https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html Tensor] module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>.
** Introduced a new generic memory model used by <code>TensorDeviceSycl</code>.
** Better integration with OpenCL devices.
** Added many math function specializations.

=== Miscellaneous API Changes ===

* New <code>setConstant(...)</code> methods for resizing while preserving one dimension of a matrix by passing <code>NoChange</code>:
<source lang="cpp">
  MatrixXf A(10, 5);               // 10x5  matrix.
  A.setConstant(NoChange, 10, 2);  // 10x10 matrix of 2s.
  A.setConstant(5, NoChange, 3);   //  5x10 matrix of 3s.
  A.setZero(NoChange, 20);         //  5x20 matrix of 0s.
  A.setZero(20, NoChange);         // 20x20 matrix of 0s.
  A.setOnes(NoChange, 5);          // 20x5  matrix of 1s.
  A.setOnes(5, NoChange);          //  5x5  matrix of 1s.
  A.setRandom(NoChange, 10);       //  5x10 random matrix.
  A.setRandom(10, NoChange);       // 10x10 random matrix.
</source>
* Added <code>setUnit(Index i)</code> for vectors, which sets the ''i''-th coefficient to one and all others to zero:
<source lang="cpp">
  VectorXf v(5);
  v.setUnit(3);  // {0, 0, 0, 1, 0}
</source>
* Added <code>transpose()</code>, <code>adjoint()</code>, and <code>conjugate()</code> methods to <code>SelfAdjointView</code>.
* Added <code>conjugateIf<bool></code> members for conditional conjugation.
* Added templated <code>subVector<Vertical/Horizontal>(Index)</code> aliases to the <code>col/row(Index)</code> methods, and <code>subVectors<>()</code> aliases to <code>rows()/cols()</code>.
* Added <code>innerVector()</code> and <code>innerVectors()</code> methods.
* Added <code>shiftLeft<N>()</code> and <code>shiftRight<N>()</code> coefficient-wise arithmetic shift functions to Arrays:
<source lang="cpp">
  ArrayXXi A = ArrayXXi::Random(2, 3);
  ArrayXXi B = A.shiftRight<2>();
  ArrayXXi C = A.shiftLeft<6>();
</source>
* Enabled adding and subtracting of diagonal expressions:
<source lang="cpp">
  VectorXf x = VectorXf::Random(5);
  VectorXf y = VectorXf::Random(5);
  MatrixXf A = MatrixXf::Identity(5, 5);
  A += x.asDiagonal() - y.asDiagonal();
</source>
* Allow user-defined default cache sizes by defining <code>EIGEN_DEFAULT_L1_CACHE_SIZE</code>, ..., <code>EIGEN_DEFAULT_L3_CACHE_SIZE</code>.
* Added the <code>EIGEN_ALIGNOF(X)</code> macro for determining the alignment of a provided variable.
* Allow plugins for <code>VectorwiseOp</code> by defining a file <code>EIGEN_VECTORWISEOP_PLUGIN</code> (e.g. <code>-DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h</code>).
* Allow disabling of IO operations by defining <code>EIGEN_NO_IO</code>.

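A small sketch (illustrative only; the calls follow the member names listed above) combining two of the new helpers:

<source lang="cpp">
  MatrixXcf A = MatrixXcf::Random(3, 3);
  // subVector<Vertical>(i) is equivalent to col(i), subVector<Horizontal>(i) to row(i).
  VectorXcf    c = A.subVector<Vertical>(0);
  RowVectorXcf r = A.subVector<Horizontal>(1);
  // conjugateIf<Cond>() conjugates the coefficients only if Cond is true.
  MatrixXcf B = A.conjugateIf<true>();   // same as A.conjugate()
  MatrixXcf C = A.conjugateIf<false>();  // same as A
</source>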
=== Improvement to NaN propagation ===

* Improved NaN correctness for elementwise functions.
* New <code>NaNPropagation</code> template argument to control whether NaN values are propagated or suppressed in elementwise <code>min/max</code> and the corresponding reductions on <code>Array</code>, <code>Matrix</code>, and <code>Tensor</code>. Example for max:
<source lang="cpp">
// Elementwise maximum
Eigen::MatrixXf left, right, r0, r1, r2;
r0 = left.cwiseMax(right);                         // Implementation-defined behavior.
// Propagate NaN if either argument is NaN.
r1 = left.template cwiseMax<PropagateNaN>(right);
// Suppress NaN if at least one argument is not a NaN.
r2 = left.template cwiseMax<PropagateNumbers>(right);

// Max reductions
Eigen::MatrixXf m;
float nan_or_max = m.maxCoeff();                   // Implementation-defined behavior.
float nan_if_any_or_max = m.template maxCoeff<PropagateNaN>();
float nan_if_all_or_max = m.template maxCoeff<PropagateNumbers>();
</source>

== Changes to unsupported modules ==

=== New low-latency non-blocking ThreadPool module ===
* Originally part of the Tensor module, <code>Eigen::ThreadPool</code> is now separate and more portable, and forms the basis for multi-threading in TensorFlow, for example:
<source lang="cpp">
  #include <Eigen/CXX11/ThreadPool>

  const int num_threads = 42;
  Eigen::ThreadPool tp(num_threads);
  auto do_stuff = []() { /* ... */ };
  tp.Schedule(do_stuff);
</source>

=== Changes to Tensor module ===

* Support for c++03 was officially dropped in the Tensor module, since most of the code was written in c++11 anyway. This prevents building the code for CUDA with older versions of <code>nvcc</code>.
* Performance optimizations of Tensor contraction:
** Speed up "outer-product-like" operations by parallelizing over the contraction dimension, using thread_local buffers and recursive work splitting.
** Improved threading heuristics.
** Support for fusing element-wise operations into the contraction during evaluation. Example:
<source lang="cpp">
// This example applies std::sqrt to all output elements of a tensor contraction.
// The optional OutputKernel argument to the contraction is a functor over a
// 2-dimensional buffer. The functor is called once for each output block of the
// contraction result, to perform the elementwise sqrt while the block is hot in cache.
struct SqrtOutputKernel {
  template <typename Index, typename Scalar>
  EIGEN_ALWAYS_INLINE void operator()(
      const internal::blas_data_mapper<Scalar, Index, ColMajor>& output_mapper,
      const TensorContractionParams&, Index, Index, Index num_rows,
      Index num_cols) const {
    for (int i = 0; i < num_rows; ++i) {
      for (int j = 0; j < num_cols; ++j) {
        output_mapper(i, j) = std::sqrt(output_mapper(i, j));
      }
    }
  }
};

// DataLayout stands for ColMajor or RowMajor.
typedef Eigen::IndexPair<int> DimPair;  // pairs of contracted dimensions
Tensor<float, 4, DataLayout> left(30, 50, 8, 31);
Tensor<float, 5, DataLayout> right(8, 31, 7, 20, 10);
Tensor<float, 5, DataLayout> result(30, 50, 7, 20, 10);
Eigen::array<DimPair, 2> dims({{DimPair(2, 0), DimPair(3, 1)}});

result = left.contract(right, dims, SqrtOutputKernel());
</source>
* Performance optimizations of other Tensor operators:
** Speedups from improved vectorization, block evaluation, and multi-threading for most operators.
** Significant speedup of broadcasting.
** Reduced index computation overhead, e.g. by using fast divisors in TensorGenerator and squeezing dimensions in TensorPadding.
* A complete rewrite of the block (tiling) evaluation framework for tensor expressions led to significant speedups and a reduced number of memory allocations.
* Added a new API for asynchronous evaluation of tensor expressions. Example:
<source lang="cpp">
  Tensor<float, 3> in1(200, 30, 70);
  Tensor<float, 3> in2(200, 30, 70);
  Tensor<float, 3> out(200, 30, 70);

  Eigen::ThreadPool tp(internal::random<int>(3, 11));
  Eigen::ThreadPoolDevice thread_pool_device(&tp, internal::random<int>(3, 11));

  Eigen::Barrier b(1);
  auto done = [&b]() { b.Notify(); };
  out.device(thread_pool_device, std::move(done)) = in1 + in2 * 3.14f;
  b.Wait();
</source>
* Misc. minor behavior changes and fixes:
** Fixed const correctness for TensorMap.
** Modified tensor argmin/argmax to always return the first occurrence.
** More numerically stable tree reduction.
** Improved randomness of the tensor random generator.
** Updated the padding computation for PADDING_SAME to be consistent with TensorFlow.
** Support static dimensions (aka IndexList) in resizing/reshape/broadcast.
** Improved accuracy of the Tensor FFT.
  
=== Improvements to FFT module ===

* Faster and more accurate twiddle factor computation.
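The twiddle-factor change is internal; for reference, a small usage sketch of the unsupported FFT module (illustrative only):

<source lang="cpp">
  #include <unsupported/Eigen/FFT>
  #include <vector>
  #include <complex>
  Eigen::FFT<float> fft;
  std::vector<float> timevec(8, 1.0f);          // a constant signal
  std::vector<std::complex<float> > freqvec;
  fft.fwd(freqvec, timevec);                    // forward transform
  std::vector<float> back;
  fft.inv(back, freqvec);                       // inverse transform recovers the signal
</source>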
  
=== Improvements to EulerAngles ===

* EulerAngles can now be directly constructed from 3D vectors.
* EulerAngles now provide <code>isApprox()</code> and <code>cast()</code> functions.
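A small sketch of the new conveniences (illustrative; assumes the <code>EulerAnglesXYZd</code>/<code>EulerAnglesXYZf</code> typedefs of the unsupported EulerAngles module, with arbitrary angle values):

<source lang="cpp">
  #include <unsupported/Eigen/EulerAngles>
  EulerAnglesXYZd ea(Vector3d(0.1, 0.2, 0.3));  // construct directly from a 3D angle vector
  EulerAnglesXYZf ea_f = ea.cast<float>();      // new cast() member
  bool close = ea.isApprox(ea);                 // new isApprox() member
</source>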
  
=== Changes to sparse iterative solvers ===

* Added the new IDRS iterative linear solver:

<source lang="cpp">
  #include <unsupported/Eigen/IterativeSolvers>
  A.makeCompressed();  // Recommendation is to compress input before calling sparse solvers.
  IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);
  VectorXf x = idrs.solve(b);
  bool success = (idrs.info() == ComputationInfo::Success);
</source>
  
=== Improvements to Polynomials ===
* PolynomialSolver can now be used with complex numbers.
* The solver will automatically choose between <code>EigenSolver</code> and <code>ComplexEigenSolver</code> depending on the scalar type used.

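An illustrative sketch (assuming the unsupported Polynomials module) of solving a polynomial with complex coefficients:

<source lang="cpp">
  #include <unsupported/Eigen/Polynomials>
  #include <complex>
  typedef std::complex<double> Cplx;
  Matrix<Cplx, 3, 1> coeffs;                 // p(x) = (2+0i) + (0+1i)*x + (1+0i)*x^2, constant term first
  coeffs << Cplx(2, 0), Cplx(0, 1), Cplx(1, 0);
  PolynomialSolver<Cplx, 2> solver(coeffs);  // uses ComplexEigenSolver internally for complex scalars
  PolynomialSolver<Cplx, 2>::RootsType roots = solver.roots();
</source>
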
== Other relevant changes ==

* Eigen now provides an option to test with an external BLAS library.
* Eigen can now be used with the [https://en.wikipedia.org/wiki/The_Portland_Group PGI Compiler].
* Printing when using GDB has been improved.
* Eigen can now detect if a platform supports <code>int128</code> intrinsics.

== Testing ==

The full Eigen test suite was built and run successfully (in c++03 and c++11 mode) with the following compiler/platform/OS combinations:

{| class="wikitable"
!Compiler !! Version !! Platform !! Operating system
|-
|Microsoft Visual Studio || 2015 Update 3 || x86-64 || Windows
|-
|Microsoft Visual Studio || Community 2017 - 15.9.38 || x86-64 || Windows
|-
|Microsoft Visual Studio || Community 2019 - 16.11 || x86-64 || Windows
|-
|GCC || 4.8 || x86-64 || Linux
|-
|GCC || 9 || x86-64 || Linux
|-
|GCC || 10 || x86-64 || Linux
|-
|Clang || 6.0 || x86-64 || Linux
|-
|Clang || 10 || x86-64 || Linux
|-
|Clang || 11 || x86-64 || Linux
|-
|GCC || 10 || armv8.2-a || Linux
|-
|Clang || 6 || armv8.2-a || Linux
|-
|Clang || 9 || armv8.2-a || Linux
|-
|Clang || 10 || armv8.2-a || Linux
|-
|Clang || 11 || armv8.2-a || Linux
|-
|AppleClang || 12.0.5 || x86-64 || macOS
|-
|GCC || 10 || ppc64le || Linux
|-
|Clang || 10 || ppc64le || Linux
|}
  
== List of issues fixed in Eigen 3.4 ==

* [https://gitlab.com/libeigen/eigen/-/issues/2298 Issue #2298]: List of dense linear decompositions lacks completeorthogonal decomposition
* [https://gitlab.com/libeigen/eigen/-/issues/2284 Issue #2284]: JacobiSVD Outputs Invalid U (Reads Past End of Array)
* [https://gitlab.com/libeigen/eigen/-/issues/2267 Issue #2267]: [3.4 bug] FixedInt<0> error with gcc 4.9.3
* [https://gitlab.com/libeigen/eigen/-/issues/2263 Issue #2263]: usage of signed zeros leads to wrong results with -ffast-math
* [https://gitlab.com/libeigen/eigen/-/issues/2251 Issue #2251]: Method unaryExpr() does not support function pointers in Eigen 3.4rc1
* [https://gitlab.com/libeigen/eigen/-/issues/2242 Issue #2242]: No matching function for call to "..." in 'Complex.h' and 'GenericPacketMathFunctions.h'
* [https://gitlab.com/libeigen/eigen/-/issues/2229 Issue #2229]: Copies (& potentially moves?) of Eigen object with large unused MaxRows/ColAtCompileTime are slow (Regression from Eigen 3.2)
* [https://gitlab.com/libeigen/eigen/-/issues/2213 Issue #2213]: template maxCoeff<PropagateNaN> compilation error with Eigen 3.4
* [https://gitlab.com/libeigen/eigen/-/issues/2209 Issue #2209]: unaryExpr deduces wrong return type on MSVC
* [https://gitlab.com/libeigen/eigen/-/issues/2157 Issue #2157]: forward_adolc test fails since PR !363
* [https://gitlab.com/libeigen/eigen/-/issues/2119 Issue #2119]: Move assignment swaps even for non-dynamic storage
* [https://gitlab.com/libeigen/eigen/-/issues/2112 Issue #2112]: Build failure with boost::multiprecision type
* [https://gitlab.com/libeigen/eigen/-/issues/2093 Issue #2093]: Incorrect evaluation of Ref
* [https://gitlab.com/libeigen/eigen/-/issues/1906 Issue #1906]: Eigen failed with error C2440 with MSVC on windows
* [https://gitlab.com/libeigen/eigen/-/issues/1850 Issue #1850]: error C4996: 'std::result_of<T>': warning STL4014: std::result_of and std::result_of_t are deprecated in C++17. They are superseded by std::invoke_result and std::invoke_result_t
* [https://gitlab.com/libeigen/eigen/-/issues/1833 Issue #1833]: c++20 compilation failure
* [https://gitlab.com/libeigen/eigen/-/issues/1826 Issue #1826]: -Wdeprecated-anon-enum-enum-conversion warnings (c++20)
* [https://gitlab.com/libeigen/eigen/-/issues/1815 Issue #1815]: IndexedView of a vector should allow linear access
* [https://gitlab.com/libeigen/eigen/-/issues/1805 Issue #1805]: Uploaded doxygen documentation does not build LaTeX formulae
* [https://gitlab.com/libeigen/eigen/-/issues/1790 Issue #1790]: packetmath_1 unit test fails
* [https://gitlab.com/libeigen/eigen/-/issues/1788 Issue #1788]: Rule-of-three/rule-of-five violations
* [https://gitlab.com/libeigen/eigen/-/issues/1776 Issue #1776]: subvector_stl_iterator::operator-> triggers 'taking address of rvalue' warning
* [https://gitlab.com/libeigen/eigen/-/issues/1774 Issue #1774]: std::cbegin() returns non-const iterator
* [https://gitlab.com/libeigen/eigen/-/issues/1752 Issue #1752]: A change to the C++ Standard will break some tests
* [https://gitlab.com/libeigen/eigen/-/issues/1741 Issue #1741]: Map<>.noalias()=A*B gives wrong result
* [https://gitlab.com/libeigen/eigen/-/issues/1736 Issue #1736]: Column access of some IndexedView won't compile
* [https://gitlab.com/libeigen/eigen/-/issues/1718 Issue #1718]: Use of builtin vec_sel is ambiguous when compiling with Clang for PowerPC
* [https://gitlab.com/libeigen/eigen/-/issues/1695 Issue #1695]: Stuck in loop for a certain input when using mpreal support
* [https://gitlab.com/libeigen/eigen/-/issues/1692 Issue #1692]: pass enumeration argument to constructor of VectorXd
* [https://gitlab.com/libeigen/eigen/-/issues/1684 Issue #1684]: array_reverse fails with clang >=6 + AVX + -O2
* [https://gitlab.com/libeigen/eigen/-/issues/1674 Issue #1674]: SIMD sin/cos gives wrong results with -ffast-math
* [https://gitlab.com/libeigen/eigen/-/issues/1669 Issue #1669]: Zero-sized matrices generate assertion failures
* [https://gitlab.com/libeigen/eigen/-/issues/1664 Issue #1664]: dot product with single column block fails with new static checks
* [https://gitlab.com/libeigen/eigen/-/issues/1652 Issue #1652]: Corner cases in SIMD sin/cos
* [https://gitlab.com/libeigen/eigen/-/issues/1643 Issue #1643]: Compilation failure
* [https://gitlab.com/libeigen/eigen/-/issues/1637 Issue #1637]: Register spilling with recent gcc & clang
* [https://gitlab.com/libeigen/eigen/-/issues/1619 Issue #1619]: const_iterator vs iterator compilation error
* [https://gitlab.com/libeigen/eigen/-/issues/1615 Issue #1615]: Performance of (aliased) matrix multiplication with fixed size 3x3 matrices slow
* [https://gitlab.com/libeigen/eigen/-/issues/1611 Issue #1611]: NEON: plog(+/-0) should return -inf and not NaN
* [https://gitlab.com/libeigen/eigen/-/issues/1585 Issue #1585]: Matrix product is repeatedly evaluated when iterating over the product expression
* [https://gitlab.com/libeigen/eigen/-/issues/1557 Issue #1557]: Fail to compute eigenvalues for a simple 3x3 companion matrix for root finding
* [https://gitlab.com/libeigen/eigen/-/issues/1544 Issue #1544]: SparseQR generates incorrect Q matrix in complex case
* [https://gitlab.com/libeigen/eigen/-/issues/1543 Issue #1543]: "Fix linear indexing in generic block evaluation" breaks Matrix*Diagonal*Vector product
* [https://gitlab.com/libeigen/eigen/-/issues/1493 Issue #1493]: dense Q extraction and solve is sometimes erroneous for complex matrices
* [https://gitlab.com/libeigen/eigen/-/issues/1453 Issue #1453]: Strange behavior for Matrix::Map, if only InnerStride is provided
* [https://gitlab.com/libeigen/eigen/-/issues/1409 Issue #1409]: Add support for C++17 operator new alignment
* [https://gitlab.com/libeigen/eigen/-/issues/1340 Issue #1340]: Add operator + to sparse matrix iterator
* [https://gitlab.com/libeigen/eigen/-/issues/1318 Issue #1318]: More robust quaternion from matrix
* [https://gitlab.com/libeigen/eigen/-/issues/1306 Issue #1306]: Add support for AVX512 to Eigen
* [https://gitlab.com/libeigen/eigen/-/issues/1305 Issue #1305]: Implementation of additional component-wise unary functions
* [https://gitlab.com/libeigen/eigen/-/issues/1221 Issue #1221]: I get tons of error since my distribution upgraded to GCC 6.1.1
* [https://gitlab.com/libeigen/eigen/-/issues/1195 Issue #1195]: vectorization_logic fails: Matrix3().cwiseQuotient(Matrix3()) expected CompleteUnrolling, got NoUnrolling
* [https://gitlab.com/libeigen/eigen/-/issues/1194 Issue #1194]: Improve det4x4
* [https://gitlab.com/libeigen/eigen/-/issues/1049 Issue #1049]: std::make_shared fails to fulfill structure aliment
* [https://gitlab.com/libeigen/eigen/-/issues/1046 Issue #1046]: fixed matrix types do not report correct alignment requirements
* [https://gitlab.com/libeigen/eigen/-/issues/1014 Issue #1014]: Eigenvalues 3x3 matrix
* [https://gitlab.com/libeigen/eigen/-/issues/1001 Issue #1001]: infer dimensions of Dynamic-sized temporaries from the entire expression (if possible)
* [https://gitlab.com/libeigen/eigen/-/issues/977 Issue #977]: Add stable versions of normalize() and normalized()
* [https://gitlab.com/libeigen/eigen/-/issues/899 Issue #899]: SparseQR occasionally fails for under-determined systems
* [https://gitlab.com/libeigen/eigen/-/issues/864 Issue #864]: C++11 alias templates for commonly used types
* [https://gitlab.com/libeigen/eigen/-/issues/751 Issue #751]: Make AMD Ordering numerically more robust
* [https://gitlab.com/libeigen/eigen/-/issues/747 Issue #747]: Allow for negative stride
* [https://gitlab.com/libeigen/eigen/-/issues/720 Issue #720]: Gaussian NullaryExpr
* [https://gitlab.com/libeigen/eigen/-/issues/663 Issue #663]: Permit NoChange in setZero, setOnes, setConstant, setRandom
* [https://gitlab.com/libeigen/eigen/-/issues/645 Issue #645]: GeneralizedEigenSolver: missing computation of eigenvectors
* [https://gitlab.com/libeigen/eigen/-/issues/632 Issue #632]: Optimize addition/subtraction of sparse and dense matrices/vectors
* [https://gitlab.com/libeigen/eigen/-/issues/631 Issue #631]: (Optionally) throw an exception when using an unsuccessful decomposition
* [https://gitlab.com/libeigen/eigen/-/issues/564 Issue #564]: maxCoeff() returns -nan instead of max, while maxCoeff(&maxRow, &maxCol) works
* [https://gitlab.com/libeigen/eigen/-/issues/556 Issue #556]: Matrix multiplication crashes using mingw 4.7
* [https://gitlab.com/libeigen/eigen/-/issues/505 Issue #505]: Assert if temporary objects that are still referred to get destructed (was: Misbehaving Product on C++11)
* [https://gitlab.com/libeigen/eigen/-/issues/445 Issue #445]: ParametrizedLine should have transform method
* [https://gitlab.com/libeigen/eigen/-/issues/437 Issue #437]: [feature request] Add Reshape Operation
* [https://gitlab.com/libeigen/eigen/-/issues/426 Issue #426]: Behavior of sum() for Matrix<bool> is unexpected and confusing
* [https://gitlab.com/libeigen/eigen/-/issues/329 Issue #329]: Feature request: Ability to get a "view" into a sub-matrix by indexing it with a vector or matrix of indices
* [https://gitlab.com/libeigen/eigen/-/issues/231 Issue #231]: STL compatible iterators
* [https://gitlab.com/libeigen/eigen/-/issues/96 Issue #96]: Clean internal::result_of
* [https://gitlab.com/libeigen/eigen/-/issues/65 Issue #65]: Core - optimize partial reductions
* [https://gitlab.com/libeigen/eigen/-/issues/64 Issue #64]: Tests : precision-oriented tests
  
== Additional information ==

[1] Commit count obtained with: <code>$ hg log -r "3.3.0:: and not merge() and not branch(3.2) and not branch(3.3)" | grep "changeset:" | wc -l</code>

* A curated list of commits, approximately organized by the same topics as the release notes above and sorted in reverse chronological order, can be found [https://docs.google.com/document/d/e/2PACX-1vSGvp4Kv9dJ-gKzJN4CBjppP46flDbe3pJtI9N3m3WkKSoLXmANXuK5gJlw1CPcpCfjAWhgXAtQNzm- here].
