Difference between revisions of "User:Everton"
From Eigen
Line 5:
  * General performance improvement and bugfixes.
  * Enhanced vectorization of current real and complex scalars.
− * Changes to the gebp_kernel specific to Altivec, using VSX implementation of the MMA instructions that gain speed improvements up to 4x.
+ * Changes to the gebp_kernel specific to Altivec, using VSX implementation of the MMA instructions that gain speed improvements up to 4x for matrix-matrix products.
  * Dynamic dispatch for GCC greater than 10 enabling selection of MMA or VSX instructions based on __builtin_cpu_supports.
Latest revision as of 22:29, 17 August 2021
New Power 10 MMA Backend
- Initial support for the Power 10 Matrix-Multiply Assist (MMA) instructions for real and complex float32 and float64 scalars.
Altivec/Power improvements
- General performance improvements and bug fixes.
- Enhanced vectorization of the currently supported real and complex scalar types.
- Changes to the Altivec-specific gebp_kernel, using a VSX implementation of the MMA instructions, that yield speedups of up to 4x for matrix-matrix products.
- Dynamic dispatch for GCC versions greater than 10, selecting MMA or VSX instructions at run time based on __builtin_cpu_supports.
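
The run-time selection described in the last item can be sketched roughly as follows. This is a minimal illustration, not Eigen's actual dispatch code: the names `GemmPath`, `select_gemm_path` and `path_name` are hypothetical, and the sketch assumes that the `"mma"` feature string is accepted by `__builtin_cpu_supports` on PowerPC targets with GCC newer than 10 (and a C library that exposes the hardware capability bits). On any other compiler or architecture it falls back to a generic path at compile time.

```cpp
// Hypothetical sketch of MMA/VSX run-time dispatch; not taken from Eigen.

// Which kernel family a matrix-matrix product would use (illustrative names).
enum class GemmPath { MMA, VSX, Generic };

// Choose a code path. The CPU query is compiled only where it is assumed to
// be valid: 64-bit PowerPC with GCC newer than 10. Everywhere else the
// preprocessor picks the generic path, so the file still builds.
inline GemmPath select_gemm_path() {
#if defined(__powerpc64__) && defined(__GNUC__) && !defined(__clang__) && (__GNUC__ > 10)
  // Assumption: "mma" is a supported feature name on this target.
  return __builtin_cpu_supports("mma") ? GemmPath::MMA : GemmPath::VSX;
#else
  return GemmPath::Generic;
#endif
}

// Human-readable name for logging or debugging.
inline const char* path_name(GemmPath p) {
  switch (p) {
    case GemmPath::MMA:     return "MMA";
    case GemmPath::VSX:     return "VSX";
    case GemmPath::Generic: return "generic";
  }
  return "unknown";
}
```

A caller would typically evaluate `select_gemm_path()` once, cache the result, and branch to the matching kernel, so that the CPU query is not repeated for every product.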