User:Cantonios/3.4
From Eigen
Revision as of 21:05, 17 August 2021
=== New Major Features in Core ===

* New support for <code>bfloat16</code>

The 16-bit Brain floating point format[https://en.wikipedia.org/wiki/Bfloat16_floating-point_format] is now available as <code>Eigen::bfloat16</code>. The constructor must be called explicitly, but it can otherwise be used like any other scalar type. To convert back and forth with <code>uint16_t</code> (e.g. to extract the bit representation), use <code>Eigen::numext::bit_cast</code>.

<source lang="cpp">
bfloat16 s(0.25);                                 // explicit construction
uint16_t s_bits = numext::bit_cast<uint16_t>(s);  // bit representation

using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
MatrixBf16 X = s * MatrixBf16::Random(3, 3);
</source>
=== New backends ===

* AMD ROCm HIP:
** Unified with CUDA to create a generic GPU backend for NVIDIA/AMD.
=== Improvements/Cleanups to Core modules ===

* Dense matrix decompositions and solvers
** SVD implementations now have an <code>info()</code> method for checking convergence.
<source lang="cpp">
MatrixXf m = MatrixXf::Random(3, 2);
JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
if (svd.info() == ComputationInfo::Success) {
  // SVD computation was successful.
}
</source>
** Decompositions now fail quickly for detected invalid inputs.
** Fixed aliasing issues with in-place small matrix inversions.
** Fixed several edge cases with empty or zero inputs.
* Sparse matrix support, decompositions and solvers
** Enabled assignment and addition with diagonal matrices.
<source lang="cpp">
SparseMatrix<float> A(10, 10);
VectorXf x = VectorXf::Random(10);
A = x.asDiagonal();
A += x.asDiagonal();
</source>
** Added the new IDRS iterative linear solver.
<source lang="cpp">
A.makeCompressed();  // It is recommended to compress the input before calling sparse solvers.
IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);
if (idrs.info() == ComputationInfo::Success) {
  VectorXf x = idrs.solve(b);
}
</source>
** Added support for SuiteSparse KLU routines.
<source lang="cpp">
A.makeCompressed();  // It is recommended to compress the input before calling sparse solvers.
KLU<SparseMatrix<T> > klu(A);
if (klu.info() == ComputationInfo::Success) {
  VectorXf x = klu.solve(b);
}
</source>
** Various bug fixes and performance improvements in <code>SparseLU</code>, <code>SparseQR</code>, and <code>SimplicialLDLT</code>.
* Improved support for <code>half</code>
** Native support for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, and Clang <code>F16C</code>.
** Better vectorization support across backends.
* Improved <code>bool</code> support
** Partial vectorization support for boolean operations.
** Significantly improved performance (up to 25x) for logical operations on a <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>.
* Improved support for custom types
** More custom types work out of the box (see #2201[https://gitlab.com/libeigen/eigen/-/issues/2201]).
* Improved Geometry Module
** <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see !349).
** New minimal vectorization support.
=== Backend-specific improvements ===

* SSE/AVX/AVX512
** Enable AVX512 instructions by default if available.
** New <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code> vectorization support.
** Better accuracy for several vectorized math functions, including <code>exp</code>, <code>log</code>, <code>pow</code>, and <code>sqrt</code>.
** Many missing packet functions added.
* GPU (CUDA and HIP)
** Several optimized math functions added; better support for <code>std::complex</code>.
** Option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>.
** Many more functions can now be used in device code (e.g. comparisons, matrix inversion).
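Since <code>EIGEN_NO_CUDA</code> is checked at preprocessing time, it must be defined before any Eigen header is included (or passed on the compiler command line); a minimal configuration sketch:

```cpp
// Disable all CUDA code paths in Eigen, even when this translation unit
// is compiled with nvcc. The define must come before the first Eigen
// include (equivalently: pass -DEIGEN_NO_CUDA to the compiler).
#define EIGEN_NO_CUDA
#include <Eigen/Core>
```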
* SYCL
** Redesigned SYCL implementation for use with the Tensor module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>.
** New generic memory model used by <code>TensorDeviceSycl</code>.
** Better integration with OpenCL devices.
** Added many math function specializations.
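The SYCL backend is likewise opt-in at preprocessing time; a minimal configuration sketch, assuming a SYCL-capable toolchain (e.g. DPC++ or ComputeCpp) is used to compile the translation unit:

```cpp
// Enable the SYCL backend for the (unsupported) Tensor module.
// The define must come before the Tensor include
// (equivalently: pass -DEIGEN_USE_SYCL to the compiler).
#define EIGEN_USE_SYCL
#include <unsupported/Eigen/CXX11/Tensor>
```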