Difference between revisions of "User:Cantonios/3.4"

From Eigen
Line 11:
  * Improved support for <code>half</code>
  ** Native support for ARM <code>__fp16</code>, CUDA/HIP <code>__half</code>, Clang <code>F16C</code>.
- ** Better vectorization support, various bug fixes.
+ ** Better vectorization support across backends.
  
  * Improved support for custom types
- ** More custom types work out-of-the-box (see #2201[https://gitlab.com/libeigen/eigen/-/issues/2201])
+ ** More custom types work out-of-the-box (see #2201[https://gitlab.com/libeigen/eigen/-/issues/2201]).
  
  * Improved Geometry Module
Line 22:
  * Backend-specific improvements
  ** SSE/AVX/AVX512
- *** Enable AVX512 instructions by default if available
+ *** Enable AVX512 instructions by default if available.
- *** <code>std::complex</code>, <code>half</code>, <code>bfloat16</code> vectorization support.
+ *** New <code>std::complex</code>, <code>half</code>, <code>bfloat16</code> vectorization support.
  *** Many missing packet functions added.
  ** GPU (CUDA and HIP)
- *** Several optimized math functions, better support for `std::complex`.
+ *** Several optimized math functions added, better support for <code>std::complex</code>.
  *** Option to disable CUDA entirely by defining <code>EIGEN_NO_CUDA</code>.
- *** Many more functions can now be used in device code.
+ *** Many more functions can now be used in device code (e.g. comparisons, matrix inversion).

Revision as of 19:42, 17 August 2021

  • New support for bfloat16

The 16-bit Brain floating point format[1] is now available as Eigen::bfloat16. The constructor must be called explicitly, but otherwise it can be used like any other scalar type. To convert to and from uint16_t, for example to extract the bit representation, use Eigen::numext::bit_cast.

 #include <cstdint>
 #include <Eigen/Core>
 using namespace Eigen;
 
 bfloat16 s(0.25);                                 // explicit construction
 uint16_t s_bits = numext::bit_cast<uint16_t>(s);  // bit representation
 
 using MatrixBf16 = Matrix<bfloat16, Dynamic, Dynamic>;
 MatrixBf16 X = s * MatrixBf16::Random(3, 3);
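
To make the "bit representation" concrete: bfloat16 keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE 754 binary32 value, so its 16 bits correspond to the upper half of the matching float. The following is a minimal standard-library sketch of that relationship. It is not Eigen's implementation: the helper names float_to_bf16_bits, bf16_bits_to_float, and check_examples are hypothetical, and round-to-nearest-even on conversion is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Hypothetical helper (not part of Eigen): convert a float to the 16-bit
// bfloat16 pattern, rounding to nearest with ties to even (an assumption).
std::uint16_t float_to_bf16_bits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof(u));  // reinterpret the float's bits
  if ((u & 0x7FFFFFFFu) > 0x7F800000u) {
    // NaN: keep the sign and exponent, force a quiet-NaN mantissa bit.
    return static_cast<std::uint16_t>((u >> 16) | 0x0040u);
  }
  const std::uint32_t lsb = (u >> 16) & 1u;  // lowest bit that is kept
  u += 0x7FFFu + lsb;                        // round to nearest, ties to even
  return static_cast<std::uint16_t>(u >> 16);
}

// Hypothetical helper (not part of Eigen): widen a bfloat16 bit pattern
// back to float by placing it in the upper 16 bits.
float bf16_bits_to_float(std::uint16_t b) {
  const std::uint32_t u = static_cast<std::uint32_t>(b) << 16;
  float f;
  std::memcpy(&f, &u, sizeof(f));
  return f;
}

// A few values whose bit patterns can be checked by hand.
void check_examples() {
  assert(float_to_bf16_bits(0.25f) == 0x3E80);        // 0.25f is 0x3E800000
  assert(float_to_bf16_bits(1.0f) == 0x3F80);         // 1.0f  is 0x3F800000
  assert(bf16_bits_to_float(0x3E80) == 0.25f);        // exact round trip
  assert(float_to_bf16_bits(1.00390625f) == 0x3F80);  // tie rounds to even
}
```

Under these assumptions, the value 0.25 from the snippet above corresponds to the bit pattern 0x3E80. Values that need more than 7 mantissa bits lose precision when narrowed, which is why the round trip is exact only for representable inputs.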
  • Improved support for half
    • Native support for ARM __fp16, CUDA/HIP __half, Clang F16C.
    • Better vectorization support across backends.
  • Improved support for custom types
    • More custom types work out-of-the-box (see #2201[2]).
  • Improved Geometry Module
    • Transform::computeRotationScaling() and Transform::computeScalingRotation() are now more continuous across degeneracies (see !349[3]).
    • New minimal vectorization support.
  • Backend-specific improvements
    • SSE/AVX/AVX512
      • Enable AVX512 instructions by default if available.
      • New std::complex, half, bfloat16 vectorization support.
      • Many missing packet functions added.
    • GPU (CUDA and HIP)
      • Several optimized math functions added, better support for std::complex.
      • Option to disable CUDA entirely by defining EIGEN_NO_CUDA.
      • Many more functions can now be used in device code (e.g. comparisons, matrix inversion).