== New Major Features in Core ==

=== New support for <code>bfloat16</code> ===
The 16-bit Brain floating point format[https://en.wikipedia.org/wiki/Bfloat16_floating-point_format] is now available as <code>Eigen::bfloat16</code>. The constructor must be called explicitly, but it can otherwise be used as any other scalar type. To convert back and forth to <code>uint16_t</code>, e.g. to extract the bit representation, use <code>Eigen::numext::bit_cast</code>.
<source lang="cpp">
bfloat16 s(0.25);                                 // explicit construction
uint16_t s_bits = numext::bit_cast<uint16_t>(s);  // bit representation
using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
MatrixBf16 X = s * MatrixBf16::Random(3, 3);
</source>
=== New backends ===

* AMD ROCm HIP:
** Unified with CUDA to create a generic GPU backend for NVIDIA/AMD.
=== Improvements/Cleanups to Core modules ===
* Dense matrix decompositions and solvers
** SVD implementations now have an <code>info()</code> method for checking convergence.
<source lang="cpp">
MatrixXf m = MatrixXf::Random(3, 2);
JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
if (svd.info() == ComputationInfo::Success) {
  // SVD computation was successful.
}
</source>
** Decompositions now fail quickly when invalid inputs are detected.
** Fixed aliasing issues with in-place small-matrix inversions (see the sketch below).
** Fixed several edge cases with empty or zero inputs.
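For example, in-place inversion via self-assignment (a minimal illustration, not taken from the release notes):
<source lang="cpp">
Matrix3f m = Matrix3f::Random();
m = m.inverse();  // in-place inversion; this aliasing case is now handled correctly
</source>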
* Sparse matrix support, decompositions and solvers
** Enabled assignment and addition with diagonal matrices.
<source lang="cpp">
SparseMatrix<float> A(10, 10);
VectorXf x = VectorXf::Random(10);
A = x.asDiagonal();
A += x.asDiagonal();
</source>
** Added the new <code>IDRS</code> (Induced Dimension Reduction) iterative linear solver.
<source lang="cpp">
// Compressing the input before calling sparse solvers is recommended.
A.makeCompressed();
IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);
VectorXf b = VectorXf::Random(10);
if (idrs.info() == ComputationInfo::Success) {
  VectorXf x = idrs.solve(b);
}
</source>
** Added support for SuiteSparse KLU routines.
<source lang="cpp">
// Compressing the input before calling sparse solvers is recommended.
A.makeCompressed();
KLU<SparseMatrix<float> > klu(A);
if (klu.info() == ComputationInfo::Success) {
  VectorXf x = klu.solve(b);
}
</source>
** Various bug fixes and performance improvements in <code>SparseLU</code>, <code>SparseQR</code>, and <code>SimplicialLDLT</code>.
* Improved support for <code>half</code>
** Native support for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, and Clang <code>F16C</code>.
** Better vectorization support across backends.
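A small sketch mirroring the <code>bfloat16</code> example above (illustrative; the names here are our own):
<source lang="cpp">
half h(0.25f);                                    // explicit construction
uint16_t h_bits = numext::bit_cast<uint16_t>(h);  // bit representation

using MatrixXh = Matrix<half, Dynamic, Dynamic>;
MatrixXh H = h * MatrixXh::Random(3, 3);
</source>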
* Improved <code>bool</code> support
** Partial vectorization support for boolean operations.
** Significantly improved performance (25x) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>, as in the sketch below.
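A minimal illustration of such logical operations (array names and sizes are our own, not from the release notes):
<source lang="cpp">
using ArrayXXb = Array<bool, Dynamic, Dynamic>;
ArrayXXb a = ArrayXXb::Constant(1000, 1000, true);
ArrayXXb b = ArrayXXb::Constant(1000, 1000, false);

ArrayXXb c = a && b;     // element-wise AND, now partially vectorized
ArrayXXb d = a || b;     // element-wise OR
bool any_set = c.any();  // reductions over bool also benefit
</source>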
* Improved support for custom types
** More custom types work out of the box (see #2201[https://gitlab.com/libeigen/eigen/-/issues/2201]), as in the sketch below.
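A minimal sketch of the usual pattern for plugging a custom scalar into Eigen (hedged: <code>MyReal</code> and its operators are purely illustrative; the <code>NumTraits</code> specialization follows Eigen's documented approach for custom scalar types):
<source lang="cpp">
#include <Eigen/Core>

// A toy wrapper around double, standing in for a user-defined number type.
struct MyReal {
  double v;
  MyReal() : v(0) {}
  explicit MyReal(double x) : v(x) {}
};
inline MyReal operator+(MyReal a, MyReal b) { return MyReal(a.v + b.v); }
inline MyReal operator-(MyReal a, MyReal b) { return MyReal(a.v - b.v); }
inline MyReal operator-(MyReal a)           { return MyReal(-a.v); }
inline MyReal operator*(MyReal a, MyReal b) { return MyReal(a.v * b.v); }
inline MyReal operator/(MyReal a, MyReal b) { return MyReal(a.v / b.v); }
inline MyReal& operator+=(MyReal& a, MyReal b) { a.v += b.v; return a; }
inline bool operator==(MyReal a, MyReal b)  { return a.v == b.v; }
inline bool operator<(MyReal a, MyReal b)   { return a.v < b.v; }

namespace Eigen {
// Describe the type's numeric properties so Eigen can use it as a scalar.
template<> struct NumTraits<MyReal> : GenericNumTraits<MyReal> {
  typedef MyReal Real;
  typedef MyReal NonInteger;
  typedef MyReal Nested;
  enum {
    IsComplex = 0, IsInteger = 0, IsSigned = 1,
    RequireInitialization = 1, ReadCost = 1, AddCost = 3, MulCost = 3
  };
};
}  // namespace Eigen

// Matrices over MyReal now work like any other:
//   Eigen::Matrix<MyReal, 2, 2> m;
//   m.setConstant(MyReal(1.0));
//   m = m + m;
</source>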
* Improved Geometry Module
** <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see !349[https://gitlab.com/libeigen/eigen/-/merge_requests/349]), as sketched below.
** New minimal vectorization support.
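A brief usage sketch (the transform and matrix names are illustrative):
<source lang="cpp">
Affine3d T = Affine3d::Identity();
T.linear() = Matrix3d::Random();  // some linear part mixing rotation and scaling

Matrix3d R, S;
T.computeRotationScaling(&R, &S);  // T.linear() == R * S, with R a rotation
</source>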

=== Backend-specific improvements ===

* SSE/AVX/AVX512
** Enable AVX512 instructions by default if available.
** New vectorization support for <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code>.
** Better accuracy for several vectorized math functions, including <code>exp</code>, <code>log</code>, <code>pow</code>, and <code>sqrt</code>.
** Many missing packet functions added.
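A quick compile-time check of which SIMD paths are active (a sketch; <code>EIGEN_VECTORIZE_AVX512</code> and <code>SimdInstructionSetsInUse()</code> are Eigen's standard vectorization macro and helper):
<source lang="cpp">
#include <Eigen/Core>
#include <iostream>

int main() {
#ifdef EIGEN_VECTORIZE_AVX512
  std::cout << "AVX512 kernels enabled\n";
#endif
  // Prints the SIMD instruction sets Eigen was compiled to use.
  std::cout << Eigen::SimdInstructionSetsInUse() << "\n";
}
</source>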
* GPU (CUDA and HIP)
** Several optimized math functions added; better support for <code>std::complex</code>.
** Option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>.
** Many more functions can now be used in device code (e.g. comparisons, matrix inversion); see the sketch below.
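A hedged sketch of calling such functions from device code (the kernel below is our own illustration, not from the release notes):
<source lang="cpp">
#include <Eigen/Dense>

// Fixed-size Eigen types, comparisons, and small-matrix inversion
// can be used directly inside a CUDA kernel.
__global__ void invert3x3(const float* in, float* out) {
  Eigen::Map<const Eigen::Matrix3f> M(in);
  Eigen::Map<Eigen::Matrix3f> R(out);
  R = M.inverse();
}
</source>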
* SYCL
** Redesigned SYCL implementation for use with the Tensor module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>.
** New generic memory model used by <code>TensorDeviceSycl</code>.
** Better integration with OpenCL devices.
** Added many math function specializations.