Difference between revisions of "User:Everton"

From Eigen

Latest revision as of 22:29, 17 August 2021

New Power 10 MMA Backend

  • Initial support for the Power 10 Matrix-Multiply Assist (MMA) instructions for real and complex float32 and float64 scalars.

Altivec/Power improvements

  • General performance improvements and bug fixes.
  • Enhanced vectorization of the existing real and complex scalar types.
  • Altivec-specific changes to the gebp_kernel, using a VSX implementation of the MMA instructions, giving speedups of up to 4x for matrix-matrix products.
  • Dynamic dispatch for GCC versions greater than 10, selecting MMA or VSX instructions at run time based on __builtin_cpu_supports.
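
The dynamic dispatch described in the last item can be illustrated with a minimal sketch. This is not Eigen's actual code: the Kernel enum and the select_kernel and kernel_name functions are hypothetical names introduced only for illustration. The sketch assumes GCC on 64-bit Power, where __builtin_cpu_supports accepts the feature strings "arch_3_1", "mma" and "vsx"; on any other compiler or architecture it falls back to a generic path so that it still compiles.

```cpp
#include <cstring>

// Hypothetical kernel identifiers, not part of Eigen's API.
enum class Kernel { MMA, VSX, Generic };

// Choose a kernel at run time from the CPU features reported by the system.
inline Kernel select_kernel() {
#if defined(__powerpc64__) && defined(__GNUC__) && !defined(__clang__) && (__GNUC__ > 10)
  // Assumption: this GCC/glibc combination exposes the Power hardware
  // capability bits through __builtin_cpu_supports.
  if (__builtin_cpu_supports("arch_3_1") && __builtin_cpu_supports("mma")) {
    return Kernel::MMA;   // Power 10 with MMA available
  }
  if (__builtin_cpu_supports("vsx")) {
    return Kernel::VSX;   // older Power CPU with VSX only
  }
  return Kernel::Generic;
#else
  // Other compilers or architectures: no run-time selection in this sketch.
  return Kernel::Generic;
#endif
}

// Human-readable name of a kernel, useful for logging the selected path.
inline const char* kernel_name(Kernel k) {
  switch (k) {
    case Kernel::MMA: return "MMA";
    case Kernel::VSX: return "VSX";
    default:          return "generic";
  }
}
```

A caller would typically evaluate select_kernel() once, cache the result, and branch to the matching matrix-product routine, so that a single binary can run on both Power 10 and earlier Power CPUs.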