Difference between revisions of "User:Cantonios/3.4"
From Eigen
Latest revision as of 21:51, 17 August 2021
== New Major Features in Core ==
* New support for <code>bfloat16</code>
The 16-bit Brain floating point format[1] is now available as <code>Eigen::bfloat16</code>. The constructor must be called explicitly, but it can otherwise be used like any other scalar type. To convert to and from <code>uint16_t</code>, for example to extract the bit representation, use <code>Eigen::numext::bit_cast</code>.
<source lang="cpp">
bfloat16 s(0.25);                                 // explicit construction
uint16_t s_bits = numext::bit_cast<uint16_t>(s);  // bit representation

using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
MatrixBf16 X = s * MatrixBf16::Random(3, 3);
</source>
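To make the "bit representation" concrete, here is a minimal standard-library sketch of how a 32-bit float maps to a bfloat16 bit pattern: the upper 16 bits are kept after rounding. The helper names <code>float_to_bf16_bits</code> and <code>bf16_bits_to_float</code> are hypothetical and not part of Eigen. The sketch assumes IEEE-754 binary32 floats and round-to-nearest-even, and it ignores NaN handling, so it illustrates the format rather than Eigen's actual implementation.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an Eigen function): convert a float to a bfloat16
// bit pattern by keeping its upper 16 bits, rounding to nearest with ties to
// even. Assumes IEEE-754 binary32 floats; NaN inputs are not treated specially.
inline std::uint16_t float_to_bf16_bits(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));         // portable bit cast
  const std::uint32_t lsb = (bits >> 16) & 1u;  // lowest bit that survives
  bits += 0x7FFFu + lsb;                        // round to nearest, ties to even
  return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: widen a bfloat16 bit pattern back to a float by
// placing it in the upper 16 bits and zero-filling the lower 16 bits.
inline float bf16_bits_to_float(std::uint16_t h) {
  const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}
```

Under these assumptions, <code>0.25f</code> has the float bit pattern <code>0x3E800000</code>, so its bfloat16 bit pattern is <code>0x3E80</code>, and widening <code>0x3E80</code> gives back exactly <code>0.25f</code>.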
== New backends ==
* AMD ROCm HIP:
** Unified with CUDA to create a generic GPU backend for NVIDIA/AMD.
== Improvements/Cleanups to Core modules ==
* Dense matrix decompositions and solvers
** SVD implementations now have an <code>info()</code> method for checking convergence.
<source lang="cpp">
MatrixXf m = MatrixXf::Random(3, 2);
VectorXf b = VectorXf::Random(3);  // right-hand side to solve for
JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
if (svd.info() == ComputationInfo::Success) {
  // SVD computation was successful.
  VectorXf x = svd.solve(b);
}
</source>
** Decompositions now fail quickly when invalid inputs are detected.
** Fixed aliasing issues with in-place small matrix inversions.
** Fixed several edge-cases with empty or zero inputs.
* Sparse matrix support, decompositions and solvers
** Enabled assignment and addition with diagonal matrices.
<source lang="cpp">
SparseMatrix<float> A(10, 10);
VectorXf x = VectorXf::Random(10);
A = x.asDiagonal();
A += x.asDiagonal();
</source>
** Added new IDRS iterative linear solver.
<source lang="cpp">
A.makeCompressed();  // Recommendation is to compress input before calling sparse solvers.
IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);
if (idrs.info() == ComputationInfo::Success) {
  VectorXf x = idrs.solve(b);
}
</source>
** Support added for SuiteSparse KLU routines.
<source lang="cpp">
A.makeCompressed();  // Recommendation is to compress input before calling sparse solvers.
KLU<SparseMatrix<T> > klu(A);
if (klu.info() == ComputationInfo::Success) {
  Matrix<T, Dynamic, 1> x = klu.solve(b);  // vector scalar type matches the matrix scalar type T
}
</source>
** <code>SparseCholesky</code> now works with row-major matrices.
** Various bug fixes and performance improvements.
* Improved support for <code>half</code>
** Native support added for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, and Clang <code>F16C</code>.
** Better vectorization support added across all backends.
* Improved bool support
** Partial vectorization support added for boolean operations.
** Significantly improved performance (25x) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>.
* Improved support for custom types
** More custom types work out-of-the-box (see #2201[2]).
* Improved Geometry Module
** <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see !349[https://gitlab.com/libeigen/eigen/-/merge_requests/349]).
** New minimal vectorization support added for <code>Quaternion</code>.
=== Backend-specific improvements ===
* SSE/AVX/AVX512
** Enabled AVX512 instructions by default if available.
** New <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code> vectorization support added.
** Better accuracy for several vectorized math functions including <code>exp</code>, <code>log</code>, <code>pow</code>, <code>sqrt</code>.
** Many missing packet functions added.
* GPU (CUDA and HIP)
** Several optimized math functions added, better support for <code>std::complex</code>.
** Added option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>.
** Many more functions can now be used in device code (e.g. comparisons, matrix inversion).
* ZVector
** Vectorized <code>float</code> and <code>std::complex<float></code> support added.
** Added z14 support.
* SYCL
** Redesigned SYCL implementation for use with the Tensor[https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html] module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>.
** New generic memory model introduced, used by <code>TensorDeviceSycl</code>.
** Better integration with OpenCL devices.
** Added many math function specializations.
=== Miscellaneous API Changes ===
* New <code>setOnes()</code> method for filling a dense matrix with ones.
* New <code>setConstant(NoChange_t, Index, T)</code> methods for preserving one dimension of a matrix.
<source lang="cpp">
MatrixXf A(10, 5);
A.setOnes();                         // 10x5 matrix of 1s.
A.setConstant(NoChange_t(), 10, 2);  // 10x10 matrix of 2s.
A.setConstant(5, NoChange_t(), 3);   // 5x10 matrix of 3s.
</source>
* Added <code>setUnit(Index i)</code> for vectors that sets the ''i''-th coefficient to one and all others to zero.
* Added <code>transpose()</code>, <code>adjoint()</code>, <code>conjugate()</code> methods to <code>SelfAdjointView</code>.
* Added <code>shift_left<N>()</code> and <code>shift_right<N>()</code> coefficient-wise array functions.
* Enabled adding and subtracting of diagonal matrices.
* Allow user-defined default cache sizes via defining <code>EIGEN_DEFAULT_L1_CACHE_SIZE</code>, ..., <code>EIGEN_DEFAULT_L3_CACHE_SIZE</code>.
* Added <code>EIGEN_ALIGNOF(X)</code> macro for determining alignment of a provided variable.
* Allow plugins for <code>VectorwiseOp</code> by defining a file <code>EIGEN_VECTORWISEOP_PLUGIN</code> (e.g. <code>-DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h</code>).
* Allow disabling of IO operations by defining <code>EIGEN_NO_IO</code>.
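The semantics of <code>setUnit(Index i)</code> and of the coefficient-wise shifts listed above can be illustrated without Eigen. The following is only a sketch using <code>std::vector</code>: the names <code>set_unit</code>, <code>shift_left_each</code> and <code>shift_right_each</code> are hypothetical stand-ins, not Eigen's API, and the shift helpers assume non-negative inputs and results that fit in an <code>int</code>.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for what setUnit(i) computes: the i-th coefficient
// becomes one and every other coefficient becomes zero.
inline void set_unit(std::vector<float>& v, std::size_t i) {
  for (std::size_t k = 0; k < v.size(); ++k) {
    v[k] = (k == i) ? 1.0f : 0.0f;
  }
}

// Hypothetical stand-in for a coefficient-wise left shift by a compile-time
// amount N. Assumes non-negative coefficients and no overflow.
template <int N>
std::vector<int> shift_left_each(const std::vector<int>& a) {
  std::vector<int> out(a.size());
  for (std::size_t k = 0; k < a.size(); ++k) {
    out[k] = a[k] << N;
  }
  return out;
}

// Hypothetical stand-in for a coefficient-wise right shift by a compile-time
// amount N. Assumes non-negative coefficients.
template <int N>
std::vector<int> shift_right_each(const std::vector<int>& a) {
  std::vector<int> out(a.size());
  for (std::size_t k = 0; k < a.size(); ++k) {
    out[k] = a[k] >> N;
  }
  return out;
}
```

With these helpers, calling <code>set_unit(v, 2)</code> on a four-element vector leaves <code>{0, 0, 1, 0}</code>, and <code>shift_left_each<2>({1, 2, 3})</code> yields <code>{4, 8, 12}</code>. The shift amount is a template parameter here to mirror the <code><N></code> in the names above.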