Difference between revisions of "User:Cantonios/3.4"
From Eigen
(2 intermediate revisions by the same user not shown) | |||
Line 28: | Line 28: | ||
} | } | ||
</source> | </source> | ||
− | ** Decompositions now fail quickly | + | ** Decompositions now fail quickly when invalid inputs are detected. |
** Fixed aliasing issues with in-place small matrix inversions. | ** Fixed aliasing issues with in-place small matrix inversions. | ||
** Fixed several edge-cases with empty or zero inputs. | ** Fixed several edge-cases with empty or zero inputs. | ||
* Sparse matrix support, decompositions and solvers | * Sparse matrix support, decompositions and solvers | ||
− | ** | + | ** Enabled assignment and addition with diagonal matrices. |
<source lang="cpp"> | <source lang="cpp"> | ||
SparseMatrix<float> A(10, 10); | SparseMatrix<float> A(10, 10); | ||
Line 59: | Line 59: | ||
* Improved support for <code>half</code> | * Improved support for <code>half</code> | ||
− | ** Native support for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, Clang <code>F16C</code>. | + | ** Native support added for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, Clang <code>F16C</code>. |
− | ** Better vectorization support across backends. | + | ** Better vectorization support added across all backends. |
* Improved bool support | * Improved bool support | ||
− | ** Partial vectorization support for boolean operations. | + | ** Partial vectorization support added for boolean operations. |
** Significantly improved performance (x25) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>. | ** Significantly improved performance (x25) for logical operations with <code>Matrix</code> or <code>Tensor</code> of <code>bool</code>. | ||
* Improved support for custom types | * Improved support for custom types | ||
Line 68: | Line 68: | ||
* Improved Geometry Module | * Improved Geometry Module | ||
** <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see !349[https://gitlab.com/libeigen/eigen/-/merge_requests/349]). | ** <code>Transform::computeRotationScaling()</code> and <code>Transform::computeScalingRotation()</code> are now more continuous across degeneracies (see !349[https://gitlab.com/libeigen/eigen/-/merge_requests/349]). | ||
− | ** New minimal vectorization support. | + | ** New minimal vectorization support added for <code>Quaternion</code>. |
=== Backend-specific improvements === | === Backend-specific improvements === | ||
* SSE/AVX/AVX512 | * SSE/AVX/AVX512 | ||
− | ** | + | ** Enabled AVX512 instructions by default if available. |
− | ** New <code>std::complex</code>, <code>half</code>, <code>bfloat16</code> vectorization support. | + | ** New <code>std::complex</code>, <code>half</code>, and <code>bfloat16</code> vectorization support added. |
** Better accuracy for several vectorized math functions including <code>exp</code>, <code>log</code>, <code>pow</code>, <code>sqrt</code>. | ** Better accuracy for several vectorized math functions including <code>exp</code>, <code>log</code>, <code>pow</code>, <code>sqrt</code>. | ||
** Many missing packet functions added. | ** Many missing packet functions added. | ||
* GPU (CUDA and HIP) | * GPU (CUDA and HIP) | ||
** Several optimized math functions added, better support for <code>std::complex</code>. | ** Several optimized math functions added, better support for <code>std::complex</code>. | ||
− | ** | + | ** Added option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>. |
** Many more functions can now be used in device code (e.g. comparisons, matrix inversion). | ** Many more functions can now be used in device code (e.g. comparisons, matrix inversion). | ||
* ZVector | * ZVector | ||
− | ** Vectorized <code>float</code> and <code>std::complex<float></code> support. | + | ** Vectorized <code>float</code> and <code>std::complex<float></code> support added. |
** Added z14 support. | ** Added z14 support. | ||
* SYCL | * SYCL | ||
** Redesigned SYCL implementation for use with the Tensor[https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html] module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>. | ** Redesigned SYCL implementation for use with the Tensor[https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html] module, which can be enabled by defining <code>EIGEN_USE_SYCL</code>. | ||
− | ** New generic memory model used by <code>TensorDeviceSycl</code>. | + | ** New generic memory model introduced for use by <code>TensorDeviceSycl</code>. | ||
** Better integration with OpenCL devices. | ** Better integration with OpenCL devices. | ||
** Added many math function specializations. | ** Added many math function specializations. | ||
+ | |||
+ | === Miscellaneous API Changes === | ||
+ | * New <code>setOnes()</code> method for filling a dense matrix with ones. | ||
+ | * New <code>setConstant(NoChange_t, Index, T)</code> methods for preserving one dimension of a matrix. | ||
+ | <source lang="cpp"> | ||
+ | MatrixXf A(10, 5); | ||
+ | A.setOnes(); // 10x5 matrix of 1s | ||
+ | A.setConstant(NoChange_t(), 10, 2); // 10x10 matrix of 2s. | ||
+ | A.setConstant(5, NoChange_t(), 3); // 5x10 matrix of 3s. | ||
+ | </source> | ||
+ | * Added <code>setUnit(Index i)</code> for vectors that sets the ''i'' th coefficient to one and all others to zero. | ||
+ | * Added <code>transpose()</code>, <code>adjoint()</code>, <code>conjugate()</code> methods to <code>SelfAdjointView</code>. | ||
+ | * Added <code>shift_left<N>()</code> and <code>shift_right<N>()</code> coefficient-wise array functions. | ||
+ | * Enabled adding and subtracting of diagonal matrices. | ||
+ | * Allow user-defined default cache sizes via defining <code>EIGEN_DEFAULT_L1_CACHE_SIZE</code>, ..., <code>EIGEN_DEFAULT_L3_CACHE_SIZE</code>. | ||
+ | * Added <code>EIGEN_ALIGNOF(X)</code> macro for determining alignment of a provided variable. | ||
+ | * Allow plugins for <code>VectorwiseOp</code> by defining a file <code>EIGEN_VECTORWISEOP_PLUGIN</code> (e.g. <code>-DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h</code>). | ||
+ | * Allow disabling of IO operations by defining <code>EIGEN_NO_IO</code>. |
Latest revision as of 21:51, 17 August 2021
New Major Features in Core
- New support for bfloat16
  The 16-bit Brain floating point format[1] is now available as Eigen::bfloat16. The constructor must be called explicitly, but the type can otherwise be used like any other scalar type. To convert to and from uint16_t (for example, to extract the bit representation), use Eigen::numext::bit_cast.
    bfloat16 s(0.25);                                 // explicit construction
    uint16_t s_bits = numext::bit_cast<uint16_t>(s);  // bit representation
    using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
    MatrixBf16 X = s * MatrixBf16::Random(3, 3);
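To make the "bit representation" mentioned above concrete, here is a minimal sketch in plain C++ of how a 32-bit float maps onto the 16-bit bfloat16 layout (sign, 8 exponent bits, 7 mantissa bits). The helpers `float_to_bf16_bits` and `bf16_bits_to_float` are hypothetical names for illustration only; they are not Eigen functions and are not claimed to match Eigen's implementation. The sketch assumes `float` is an IEEE-754 binary32 value and uses round-to-nearest-even.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of Eigen): returns the bfloat16 bit pattern
// for a float, rounding to nearest even. Assumes IEEE-754 binary32 floats.
inline std::uint16_t float_to_bf16_bits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));  // portable bit cast of the float
  if ((u & 0x7FFFFFFFu) > 0x7F800000u) {
    // NaN: keep the upper bits and force a quiet-NaN mantissa bit.
    return static_cast<std::uint16_t>((u >> 16) | 0x0040u);
  }
  // Add 0x7FFF plus the lowest kept bit so that ties round to even.
  const std::uint32_t rounding_bias = 0x7FFFu + ((u >> 16) & 1u);
  return static_cast<std::uint16_t>((u + rounding_bias) >> 16);
}

// Hypothetical helper (not part of Eigen): widens a bfloat16 bit pattern
// back to float by placing it in the upper 16 bits of a binary32 value.
inline float bf16_bits_to_float(std::uint16_t bits) {
  const std::uint32_t u = static_cast<std::uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &u, sizeof(f));
  return f;
}
```

Under these assumptions, `float_to_bf16_bits(0.25f)` yields `0x3E80`, the upper half of the binary32 pattern `0x3E800000`, and widening it again recovers `0.25f` exactly because 0.25 needs no mantissa bits.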
New backends
- AMD ROCm HIP:
  - Unified with CUDA to create a generic GPU backend for NVIDIA/AMD.
Improvements/Cleanups to Core modules
- Dense matrix decompositions and solvers
  - SVD implementations now have an info() method for checking convergence.
      MatrixXf m = MatrixXf::Random(3, 2);
      VectorXf b = VectorXf::Random(3);  // right-hand side
      JacobiSVD<MatrixXf> svd(m, ComputeThinU | ComputeThinV);
      if (svd.info() == ComputationInfo::Success) {
        // SVD computation was successful.
        VectorXf x = svd.solve(b);
      }
  - Decompositions now fail quickly when invalid inputs are detected.
  - Fixed aliasing issues with in-place small matrix inversions.
  - Fixed several edge cases with empty or zero inputs.
- Sparse matrix support, decompositions and solvers
  - Enabled assignment and addition with diagonal matrices.
      SparseMatrix<float> A(10, 10);
      VectorXf x = VectorXf::Random(10);
      A = x.asDiagonal();
      A += x.asDiagonal();
  - Added new IDRS iterative linear solver.
      A.makeCompressed();  // Recommendation is to compress input before calling sparse solvers.
      IDRS<SparseMatrix<float>, DiagonalPreconditioner<float> > idrs(A);
      if (idrs.info() == ComputationInfo::Success) {
        VectorXf x = idrs.solve(b);
      }
  - Support added for SuiteSparse KLU routines.
      A.makeCompressed();  // Recommendation is to compress input before calling sparse solvers.
      KLU<SparseMatrix<T> > klu(A);
      if (klu.info() == ComputationInfo::Success) {
        VectorXf x = klu.solve(b);
      }
  - SparseCholesky now works with row-major matrices.
  - Various bug fixes and performance improvements.
- Improved support for half
  - Native support added for ARM __fp16, CUDA/HIP __half, Clang F16C.
  - Better vectorization support added across all backends.
- Improved bool support
  - Partial vectorization support added for boolean operations.
  - Significantly improved performance (x25) for logical operations with Matrix or Tensor of bool.
- Improved support for custom types
  - More custom types work out-of-the-box (see #2201[2]).
- Improved Geometry Module
  - Transform::computeRotationScaling() and Transform::computeScalingRotation() are now more continuous across degeneracies (see !349[3]).
  - New minimal vectorization support added for Quaternion.
Backend-specific improvements
- SSE/AVX/AVX512
  - Enabled AVX512 instructions by default if available.
  - New std::complex, half, and bfloat16 vectorization support added.
  - Better accuracy for several vectorized math functions including exp, log, pow, sqrt.
  - Many missing packet functions added.
- GPU (CUDA and HIP)
  - Several optimized math functions added, better support for std::complex.
  - Added option to disable CUDA entirely by defining EIGEN_NO_CUDA.
  - Many more functions can now be used in device code (e.g. comparisons, matrix inversion).
- ZVector
  - Vectorized float and std::complex<float> support added.
  - Added z14 support.
- SYCL
  - Redesigned SYCL implementation for use with the Tensor[4] module, which can be enabled by defining EIGEN_USE_SYCL.
  - New generic memory model introduced for use by TensorDeviceSycl.
  - Better integration with OpenCL devices.
  - Added many math function specializations.
Miscellaneous API Changes
- New setOnes() method for filling a dense matrix with ones.
- New setConstant(NoChange_t, Index, T) methods for preserving one dimension of a matrix.
    MatrixXf A(10, 5);
    A.setOnes();                         // 10x5 matrix of 1s.
    A.setConstant(NoChange_t(), 10, 2);  // 10x10 matrix of 2s.
    A.setConstant(5, NoChange_t(), 3);   // 5x10 matrix of 3s.
- Added setUnit(Index i) for vectors, which sets the i-th coefficient to one and all others to zero.
- Added transpose(), adjoint(), conjugate() methods to SelfAdjointView.
- Added shift_left<N>() and shift_right<N>() coefficient-wise array functions.
- Enabled adding and subtracting of diagonal matrices.
- Allow user-defined default cache sizes by defining EIGEN_DEFAULT_L1_CACHE_SIZE, ..., EIGEN_DEFAULT_L3_CACHE_SIZE.
- Added EIGEN_ALIGNOF(X) macro for determining alignment of a provided variable.
- Allow plugins for VectorwiseOp by defining EIGEN_VECTORWISEOP_PLUGIN as the plugin file (e.g. -DEIGEN_VECTORWISEOP_PLUGIN=my_vectorwise_op_plugins.h).
- Allow disabling of IO operations by defining EIGEN_NO_IO.
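The semantics described for setUnit(Index i) and for the coefficient-wise shifts can be sketched without Eigen. The following is only an illustration in standard C++: `set_unit`, `shift_left_each`, and `shift_right_each` are hypothetical names introduced here, operating on `std::vector` and `std::array` rather than on Eigen types, and unsigned integers are assumed so that the shifts are well defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the setUnit(i) semantics: coefficient i becomes
// one, every other coefficient becomes zero. The size is left unchanged.
inline void set_unit(std::vector<float>& v, std::size_t i) {
  for (std::size_t k = 0; k < v.size(); ++k) {
    v[k] = (k == i) ? 1.0f : 0.0f;
  }
}

// Hypothetical stand-in for a coefficient-wise left shift by a compile-time
// amount N: every element is shifted independently.
template <int N, std::size_t Size>
std::array<unsigned, Size> shift_left_each(std::array<unsigned, Size> a) {
  for (unsigned& x : a) x <<= N;
  return a;
}

// Same idea for a coefficient-wise right shift by N bits.
template <int N, std::size_t Size>
std::array<unsigned, Size> shift_right_each(std::array<unsigned, Size> a) {
  for (unsigned& x : a) x >>= N;
  return a;
}
```

For example, calling `set_unit(v, 2)` on a four-element vector leaves `{0, 0, 1, 0}`, and `shift_left_each<2>` applied to `{1, 2, 3}` gives `{4, 8, 12}`. The shift amount is a template parameter here only to mirror the `<N>` in the names listed above.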