Difference between revisions of "User:Cantonios/3.4"
From Eigen
(7 intermediate revisions by the same user not shown)  
Line 1:  Line 1:  
+  === New Major Features in Core ===  
+  
* New support for <code>bfloat16</code>
The 16-bit Brain floating point format[https://en.wikipedia.org/wiki/Bfloat16_floating-point_format] is now available as <code>Eigen::bfloat16</code>. The constructor must be called explicitly, but it can otherwise be used like any other scalar type. To convert to and from <code>uint16_t</code> in order to extract the bit representation, use <code>Eigen::numext::bit_cast</code>.
−  +  <source lang="cpp">  
bfloat16 s(0.25); // explicit construction
uint16_t s_bits = numext::bit_cast<uint16_t>(s); // bit representation
Line 8:  Line 10:  
using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
MatrixBf16 X = s * MatrixBf16::Random(3, 3);
+  </source>  
+  === New backends ===  
+  
+  * AMD ROCm HIP:  
+  ** Unified with CUDA to create a generic GPU backend for NVIDIA/AMD.  
+  
+  === Improvements/Cleanups to Core modules ===  
+  
+  * Dense matrix decompositions and solvers  
+  ** SVD implementations now have an <code>info()</code> method for checking convergence.  
+  <source lang="cpp">  
+  MatrixXf m = MatrixXf::Random(3,2);  
+  JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
+  if (svd.info() == ComputationInfo::Success) {  
+  // SVD computation was successful.  
+  VectorXf x = svd.solve(b);  
+  }  
+  </source>  
+  ** Decompositions now fail quickly when invalid inputs are detected.  
+  ** Fixed aliasing issues with in-place small matrix inversions.
+  ** Fixed several edge cases with empty or zero inputs.
+  * Sparse matrix support, decompositions and solvers  
+  ** Enabled assignment and addition with diagonal matrices.  
+  <source lang="cpp">  
+  SparseMatrix<float> A(10, 10);  
+  VectorXf x = VectorXf::Random(10);  
+  A = x.asDiagonal();  
+  A += x.asDiagonal();  
+  </source>  
+  ** Added new IDR(s) iterative linear solver.
+  <source lang="cpp">  
+  A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers.  
+  IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);  
+  if (idrs.info() == ComputationInfo::Success) {  
+  VectorXf x = idrs.solve(b);  
+  }  
+  </source>  
+  ** Support added for SuiteSparse KLU routines.  
+  <source lang="cpp">  
+  A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers.  
+  KLU<SparseMatrix<T> > klu(A);  
+  if (klu.info() == ComputationInfo::Success) {  
+  VectorXf x = klu.solve(b);  
+  }  
+  </source>  
+  ** <code>SparseCholesky</code> now works with row-major matrices.
+  ** Various bug fixes and performance improvements.  
−  *  +  * Improved support for <code>half</code> 
−  **  +  ** Native support added for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, Clang <code>F16C</code>. 
+  ** Better vectorization support added across all backends.  
+  * Improved bool support  
+  ** Partial vectorization support added for boolean operations.  
+  ** Significantly improved performance (x25) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>.  
+  * Improved support for custom types  
+  ** More custom types work out-of-the-box (see #2201[https://gitlab.com/libeigen/eigen/-/issues/2201]).
+  * Improved Geometry Module  
+  ** <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see !349[https://gitlab.com/libeigen/eigen/-/merge_requests/349]).
+  ** New minimal vectorization support added for <code>Quaternion</code>.  
−  *  +  === Backend-specific improvements ===
−  **  +  * SSE/AVX/AVX512 
−  **  +  ** Enabled AVX512 instructions by default if available. 
−  **  +  ** New <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code> vectorization support added. 
−  **  +  ** Better accuracy for several vectorized math functions including <code>exp</code>, <code>log</code>, <code>pow</code>, <code>sqrt</code>. 
−  ***  +  ** Many missing packet functions added. 
−  *  +  * GPU (CUDA and HIP) 
−  **  +  ** Several optimized math functions added, better support for <code>std::complex</code>. 
−  ***  +  ** Added option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>. 
+  ** Many more functions can now be used in device code (e.g. comparisons, matrix inversion).  
+  * ZVector  
+  ** Vectorized <code>float</code> and <code>std::complex<float></code> support added.  
+  ** Added z14 support.  
+  * SYCL  
+  ** Redesigned SYCL implementation for use with the Tensor[https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html] module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>.  
+  ** Introduced a new generic memory model, used by <code>TensorDeviceSycl</code>.
+  ** Better integration with OpenCL devices.  
+  ** Added many math function specializations.  
−  *  +  === Miscellaneous API Changes === 
−  *  +  * New <code>setOnes()</code> method for filling a dense matrix with ones. 
−  *  +  * New <code>setConstant(NoChange_t, Index, T)</code> methods for preserving one dimension of a matrix. 
−  *  +  <source lang="cpp"> 
−  *  +  MatrixXf A(10, 5); 
−  *  +  A.setOnes(); // 10x5 matrix of 1s 
−  *  +  A.setConstant(NoChange_t(), 10, 2); // 10x10 matrix of 2s.
−  +  A.setConstant(5, NoChange_t(), 3); // 5x10 matrix of 3s.
−  *  +  </source> 
−  *  +  * Added <code>setUnit(Index i)</code> for vectors, which sets the ''i''-th coefficient to one and all others to zero.
−  +  * Added <code>transpose()</code>, <code>adjoint()</code>, <code>conjugate()</code> methods to <code>SelfAdjointView</code>.  
−  *  +  * Added <code>shift_left<N>()</code> and <code>shift_right<N>()</code> coefficient-wise array functions.
−  +  * Enabled adding and subtracting of diagonal matrices.  
−  +  * Allow user-defined default cache sizes by defining <code>EIGEN_DEFAULT_L1_CACHE_SIZE</code>, ..., <code>EIGEN_DEFAULT_L3_CACHE_SIZE</code>.
+  * Added <code>EIGEN_ALIGNOF(X)</code> macro for determining alignment of a provided variable.  
+  * Allow plugins for <code>VectorwiseOp</code> by defining <code>EIGEN_VECTORWISEOP_PLUGIN</code> as the name of a plugin file (e.g. <code>-DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h</code>).
+  * Allow disabling of IO operations by defining <code>EIGEN_NO_IO</code>. 
Latest revision as of 21:51, 17 August 2021
=== New Major Features in Core ===

* New support for <code>bfloat16</code>
The 16-bit Brain floating point format[1] is now available as <code>Eigen::bfloat16</code>. The constructor must be called explicitly, but it can otherwise be used like any other scalar type. To convert to and from <code>uint16_t</code> in order to extract the bit representation, use <code>Eigen::numext::bit_cast</code>.
<source lang="cpp">
bfloat16 s(0.25); // explicit construction
uint16_t s_bits = numext::bit_cast<uint16_t>(s); // bit representation

using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
MatrixBf16 X = s * MatrixBf16::Random(3, 3);
</source>
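To make the "bit representation" above concrete, here is a minimal standalone sketch of the bfloat16 layout: the upper 16 bits of an IEEE-754 single-precision value. The helper names <code>float_to_bf16_bits</code> and <code>bf16_bits_to_float</code> are hypothetical and are not part of Eigen. The round-to-nearest-even step is an assumption made for illustration, and Eigen's own conversion may differ in details.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only (not Eigen API).
// bfloat16 keeps the sign, the 8 exponent bits and the top 7 mantissa bits
// of a 32-bit float, i.e. its upper 16 bits.
inline std::uint16_t float_to_bf16_bits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));  // well-defined bit reinterpretation
  if ((u & 0x7FFFFFFFu) > 0x7F800000u) {
    // NaN: keep the sign and force a quiet-NaN bit so the result stays a NaN.
    return static_cast<std::uint16_t>((u >> 16) | 0x0040u);
  }
  // Round to nearest, ties to even, on the 16 bits that are dropped.
  u += 0x7FFFu + ((u >> 16) & 1u);
  return static_cast<std::uint16_t>(u >> 16);
}

inline float bf16_bits_to_float(std::uint16_t bits) {
  // Widening is exact: the dropped mantissa bits are simply zero.
  const std::uint32_t u = static_cast<std::uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &u, sizeof(f));
  return f;
}
```

Under this layout, <code>0.25f</code> (float bits <code>0x3E800000</code>) maps to the 16-bit pattern <code>0x3E80</code>, and widening that pattern gives back exactly <code>0.25f</code>.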
=== New backends ===

* AMD ROCm HIP:
** Unified with CUDA to create a generic GPU backend for NVIDIA/AMD.
=== Improvements/Cleanups to Core modules ===

* Dense matrix decompositions and solvers
** SVD implementations now have an <code>info()</code> method for checking convergence.
<source lang="cpp">
MatrixXf m = MatrixXf::Random(3,2);
JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
if (svd.info() == ComputationInfo::Success) {
  // SVD computation was successful.
  VectorXf x = svd.solve(b);
}
</source>
** Decompositions now fail quickly when invalid inputs are detected.
** Fixed aliasing issues with in-place small matrix inversions.
** Fixed several edge cases with empty or zero inputs.
* Sparse matrix support, decompositions and solvers
** Enabled assignment and addition with diagonal matrices.
<source lang="cpp">
SparseMatrix<float> A(10, 10);
VectorXf x = VectorXf::Random(10);
A = x.asDiagonal();
A += x.asDiagonal();
</source>
** Added new IDR(s) iterative linear solver.
<source lang="cpp">
A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers.
IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);
if (idrs.info() == ComputationInfo::Success) {
  VectorXf x = idrs.solve(b);
}
</source>
** Support added for SuiteSparse KLU routines.
<source lang="cpp">
A.makeCompressed(); // Recommendation is to compress input before calling sparse solvers.
KLU<SparseMatrix<T> > klu(A);
if (klu.info() == ComputationInfo::Success) {
  VectorXf x = klu.solve(b);
}
</source>
** <code>SparseCholesky</code> now works with row-major matrices.
** Various bug fixes and performance improvements.

* Improved support for <code>half</code>
** Native support added for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, Clang <code>F16C</code>.
** Better vectorization support added across all backends.
* Improved bool support
** Partial vectorization support added for boolean operations.
** Significantly improved performance (x25) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>.
* Improved support for custom types
** More custom types work out-of-the-box (see #2201[2]).
* Improved Geometry Module
** <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see !349[3]).
** New minimal vectorization support added for <code>Quaternion</code>.

=== Backend-specific improvements ===

* SSE/AVX/AVX512
** Enabled AVX512 instructions by default if available.
** New <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code> vectorization support added.
** Better accuracy for several vectorized math functions including <code>exp</code>, <code>log</code>, <code>pow</code>, <code>sqrt</code>.
** Many missing packet functions added.
* GPU (CUDA and HIP)
** Several optimized math functions added, better support for <code>std::complex</code>.
** Added option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>.
** Many more functions can now be used in device code (e.g. comparisons, matrix inversion).
* ZVector
** Vectorized <code>float</code> and <code>std::complex<float></code> support added.
** Added z14 support.
* SYCL
** Redesigned SYCL implementation for use with the Tensor[4] module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>.
** Introduced a new generic memory model, used by <code>TensorDeviceSycl</code>.
** Better integration with OpenCL devices.
** Added many math function specializations.
=== Miscellaneous API Changes ===

* New <code>setOnes()</code> method for filling a dense matrix with ones.
* New <code>setConstant(NoChange_t, Index, T)</code> methods for preserving one dimension of a matrix.
<source lang="cpp">
MatrixXf A(10, 5);
A.setOnes(); // 10x5 matrix of 1s
A.setConstant(NoChange_t(), 10, 2); // 10x10 matrix of 2s.
A.setConstant(5, NoChange_t(), 3); // 5x10 matrix of 3s.
</source>
* Added <code>setUnit(Index i)</code> for vectors, which sets the ''i''-th coefficient to one and all others to zero.
* Added <code>transpose()</code>, <code>adjoint()</code>, <code>conjugate()</code> methods to <code>SelfAdjointView</code>.
* Added <code>shift_left<N>()</code> and <code>shift_right<N>()</code> coefficient-wise array functions.
* Enabled adding and subtracting of diagonal matrices.
* Allow user-defined default cache sizes by defining <code>EIGEN_DEFAULT_L1_CACHE_SIZE</code>, ..., <code>EIGEN_DEFAULT_L3_CACHE_SIZE</code>.
* Added <code>EIGEN_ALIGNOF(X)</code> macro for determining alignment of a provided variable.
* Allow plugins for <code>VectorwiseOp</code> by defining <code>EIGEN_VECTORWISEOP_PLUGIN</code> as the name of a plugin file (e.g. <code>-DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h</code>).
* Allow disabling of IO operations by defining <code>EIGEN_NO_IO</code>.
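As a rough illustration of what the coefficient-wise shifts and <code>setUnit(Index i)</code> listed above compute, the following standalone sketch mimics their described effect on plain <code>std::vector</code>s. The functions <code>shift_left_each</code>, <code>shift_right_each</code> and <code>set_unit</code> are hypothetical stand-ins, not Eigen functions. The sketch uses unsigned integers only and does no range checking on the index.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only (not Eigen API).

// Shift every coefficient left by the compile-time amount N.
template <int N>
std::vector<unsigned> shift_left_each(std::vector<unsigned> v) {
  for (unsigned& x : v) x <<= N;
  return v;
}

// Shift every coefficient right by the compile-time amount N.
template <int N>
std::vector<unsigned> shift_right_each(std::vector<unsigned> v) {
  for (unsigned& x : v) x >>= N;
  return v;
}

// Set coefficient i to one and all others to zero, as described for setUnit(i).
// No range check: if i is out of range, every coefficient becomes zero.
inline void set_unit(std::vector<double>& v, std::size_t i) {
  for (std::size_t k = 0; k < v.size(); ++k) {
    v[k] = (k == i) ? 1.0 : 0.0;
  }
}
```

For example, <code>shift_left_each<2></code> applied to {1, 2, 3} gives {4, 8, 12}, and <code>set_unit(v, 2)</code> on a vector of size 4 leaves {0, 0, 1, 0}.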