User:Cantonios/3.4
New Major Features in Core
 New support for bfloat16
The 16-bit Brain floating point format[1] is now available as Eigen::bfloat16. The constructor must be called explicitly, but it can otherwise be used as any other scalar type. To convert to and from uint16_t in order to extract the bit representation, use Eigen::numext::bit_cast.
bfloat16 s(0.25);                                 // explicit construction
uint16_t s_bits = numext::bit_cast<uint16_t>(s);  // bit representation
using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
MatrixBf16 X = s * MatrixBf16::Random(3, 3);
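For readers unfamiliar with the format, the following standalone sketch illustrates what the 16-bit representation contains: a bfloat16 value occupies the same bit positions as the upper half of an IEEE 754 float32 (1 sign bit, 8 exponent bits, 7 mantissa bits). The function names are hypothetical and are not part of Eigen, and the narrowing here simply truncates, whereas a production conversion would normally round.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not an Eigen function: keeps the upper 16 bits of a
// float32 (sign, 8 exponent bits, 7 mantissa bits) and drops the rest.
// This truncates instead of rounding, so it is an illustration only.
std::uint16_t float_to_bfloat16_bits_truncate(float x) {
  assert(!std::isnan(x));  // truncating a NaN payload could yield infinity
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));  // portable bit cast
  return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: widens bfloat16 bits back to float32 by zero-filling
// the 16 low mantissa bits.
float bfloat16_bits_to_float(std::uint16_t b) {
  std::uint32_t bits = static_cast<std::uint32_t>(b) << 16;
  float x;
  std::memcpy(&x, &bits, sizeof(x));
  return x;
}
```

Values such as 0.25 that need at most 7 mantissa bits survive the round trip exactly; other values lose their low-order bits.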
New backends
 AMD ROCm HIP:
 Unified with CUDA to create a generic GPU backend for NVIDIA/AMD.
Improvements/Cleanups to Core modules
 Dense matrix decompositions and solvers
 SVD implementations now have an info() method for checking convergence.
MatrixXf m = MatrixXf::Random(3, 2);
VectorXf b = VectorXf::Random(3);
JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
if (svd.info() == ComputationInfo::Success) {
  // SVD computation was successful.
  VectorXf x = svd.solve(b);
}
 Decompositions now fail quickly when invalid inputs are detected.
 Fixed aliasing issues with in-place small matrix inversions.
 Fixed several edge cases with empty or zero inputs.
 Sparse matrix support, decompositions and solvers
 Enabled assignment and addition with diagonal matrices.
SparseMatrix<float> A(10, 10);
VectorXf x = VectorXf::Random(10);
A = x.asDiagonal();
A += x.asDiagonal();
 Added new IDRS iterative linear solver.
A.makeCompressed();  // Recommendation is to compress input before calling sparse solvers.
IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);
if (idrs.info() == ComputationInfo::Success) {
  VectorXf x = idrs.solve(b);
}
 Support added for SuiteSparse KLU routines.
A.makeCompressed();  // Recommendation is to compress input before calling sparse solvers.
KLU<SparseMatrix<T> > klu(A);
if (klu.info() == ComputationInfo::Success) {
  VectorXf x = klu.solve(b);
}

 SparseCholesky now works with row-major matrices.
 Various bug fixes and performance improvements.

 Improved support for half
 Native support added for ARM __fp16, CUDA/HIP __half, Clang F16C.
 Better vectorization support added across all backends.
 Improved bool support
 Partial vectorization support added for boolean operations.
 Significantly improved performance (x25) for logical operations with Matrix or Tensor of bool.
 Improved support for custom types
 More custom types work out-of-the-box (see #2201[2]).
 Improved Geometry Module

 Transform::computeRotationScaling() and Transform::computeScalingRotation() are now more continuous across degeneracies (see !349[3]).
 New minimal vectorization support added for Quaternion.

Backend-specific improvements
 SSE/AVX/AVX512
 Enabled AVX512 instructions by default if available.
 New std::complex, half, and bfloat16 vectorization support added.
 Better accuracy for several vectorized math functions including exp, log, pow, sqrt.
 Many missing packet functions added.
 GPU (CUDA and HIP)
 Several optimized math functions added, better support for std::complex.
 Added option to disable CUDA entirely by defining EIGEN_NO_CUDA.
 Many more functions can now be used in device code (e.g. comparisons, matrix inversion).
 ZVector
 Vectorized float and std::complex<float> support added.
 Added z14 support.
 SYCL
 Redesigned SYCL implementation for use with the Tensor[4] module, which can be enabled by defining EIGEN_USE_SYCL.
 New generic memory model introduced, used by TensorDeviceSycl.
 Better integration with OpenCL devices.
 Added many math function specializations.
Miscellaneous API Changes
 New setOnes() method for filling a dense matrix with ones.
 New setConstant(NoChange_t, Index, T) methods for preserving one dimension of a matrix.
MatrixXf A(10, 5);
A.setOnes();                         // 10x5 matrix of 1s.
A.setConstant(NoChange_t(), 10, 2);  // 10x10 matrix of 2s.
A.setConstant(5, NoChange_t(), 3);   // 5x10 matrix of 3s.
 Added setUnit(Index i) for vectors that sets the i-th coefficient to one and all others to zero.
 Added transpose(), adjoint(), conjugate() methods to SelfAdjointView.
 Added shift_left<N>() and shift_right<N>() coefficient-wise array functions.
 Enabled adding and subtracting of diagonal matrices.
 Allow user-defined default cache sizes via defining EIGEN_DEFAULT_L1_CACHE_SIZE, ..., EIGEN_DEFAULT_L3_CACHE_SIZE.
 Added EIGEN_ALIGNOF(X) macro for determining alignment of a provided variable.
 Allow plugins for VectorwiseOp by defining a file EIGEN_VECTORWISEOP_PLUGIN (e.g. -DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h).
 Allow disabling of IO operations by defining EIGEN_NO_IO.
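The configuration macros above take effect only if they are defined before the first Eigen header is included, either on the compiler command line or at the top of the translation unit. The fragment below is a sketch of the second approach, assuming Eigen 3.4 or later; the cache sizes are placeholder values chosen for illustration, not recommendations, and trace_of_product is a hypothetical function that only shows ordinary code compiling with these settings.

```cpp
// Placeholder sizes in bytes; choose values that match the target CPU.
#define EIGEN_DEFAULT_L1_CACHE_SIZE (32 * 1024)
#define EIGEN_DEFAULT_L2_CACHE_SIZE (512 * 1024)
#define EIGEN_DEFAULT_L3_CACHE_SIZE (4 * 1024 * 1024)

// Compile Eigen without its stream (operator<<) support.
#define EIGEN_NO_IO

#include <Eigen/Dense>

// Hypothetical function: regular Eigen code is unaffected by the macros,
// as long as it does not print matrices to a stream.
float trace_of_product(const Eigen::Matrix2f& a, const Eigen::Matrix2f& b) {
  return (a * b).trace();
}
```

Defining the macros through build flags keeps them consistent across translation units, which matters because mixing different settings in one program can lead to inconsistent definitions.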