Make use of more SSE4 instructions, especially for dot products.
Some progress for min/max:
https://bitbucket.org/eigen/eigen/commits/761df4c9b3b7/
changeset: 761df4c9b3b7
user: ggael
date: 2013-03-20 18:28:40
summary: Add SSE4 min/max for integers

Then:
- ceil/floor could be easily added
- the dpps instruction is not easy to exploit in Eigen because dot products are all decomposed into a product and a sum reduction. We should bench it first to see the potential.

Move to 3.3
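For reference, a minimal sketch of the two approaches being compared, written with raw intrinsics rather than Eigen's packet primitives. The function names `dot_mul_add` and `dot_dpps` are hypothetical and not part of Eigen; the sketch assumes the length is a multiple of 4 and uses unaligned loads. The dpps variant is only compiled when SSE4.1 is enabled (e.g. with -msse4.1).

```cpp
#include <cassert>
#include <cstddef>
#include <emmintrin.h>   // SSE/SSE2 intrinsics
#ifdef __SSE4_1__
#include <smmintrin.h>   // SSE4.1: _mm_dp_ps
#endif

// Product + sum reduction: mulps/addps in the loop, one horizontal
// reduction at the end. Assumes n is a multiple of 4.
float dot_mul_add(const float* a, const float* b, std::size_t n) {
    assert(n % 4 == 0);
    __m128 acc = _mm_setzero_ps();
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));   // mulps + addps
    }
    // Horizontal sum of the four partial sums in acc.
    __m128 hi   = _mm_movehl_ps(acc, acc);           // [acc2, acc3, acc2, acc3]
    __m128 sum2 = _mm_add_ps(acc, hi);               // [acc0+acc2, acc1+acc3, ...]
    __m128 odd  = _mm_shuffle_ps(sum2, sum2, 0x55);  // broadcast lane 1
    return _mm_cvtss_f32(_mm_add_ss(sum2, odd));
}

#ifdef __SSE4_1__
// dpps variant: each iteration multiplies four lanes and reduces them
// immediately, so the reduction cost is paid once per packet.
float dot_dpps(const float* a, const float* b, std::size_t n) {
    assert(n % 4 == 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        // 0xF1: use all four lanes as inputs, write the sum to lane 0.
        total += _mm_cvtss_f32(_mm_dp_ps(va, vb, 0xF1));
    }
    return total;
}
#endif
```

Which variant is faster for a given size would need to be benchmarked, as noted above.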
I agree with Gael on dpps -- it might only be useful for small vectors. Otherwise, I guess mulps and addps with a final reduction at the end should be faster. ceil/floor ==> JuniorJob. This could also be vectorized with SSE2 using conversion to int and some handling of special cases.
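A rough sketch of what the SSE2 fallback for floor could look like, using conversion to int plus special-case handling. The name `floor_sse2` is hypothetical and this is not the code that was eventually committed to Eigen. The assumption is that inputs with magnitude of at least 2^23, NaN, and infinities can be returned unchanged, since such floats are already integral or special.

```cpp
#include <emmintrin.h>   // SSE2 intrinsics

// Floor of four packed floats using only SSE2 (hypothetical helper).
__m128 floor_sse2(__m128 x) {
    const __m128 sign_mask = _mm_set1_ps(-0.0f);      // only the sign bit set
    const __m128 limit     = _mm_set1_ps(8388608.0f); // 2^23

    // Truncate toward zero through an int32 round trip.
    __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(x));

    // Truncation rounds negative non-integers up, so subtract 1 where t > x.
    __m128 adjust = _mm_and_ps(_mm_cmpgt_ps(t, x), _mm_set1_ps(1.0f));
    __m128 r = _mm_sub_ps(t, adjust);

    // Preserve the sign of the input so that floor(-0.0f) stays -0.0f.
    r = _mm_or_ps(r, _mm_and_ps(sign_mask, x));

    // Special cases: |x| >= 2^23, NaN and infinities pass through unchanged.
    // The comparison is false for NaN, so NaN takes the pass-through path.
    __m128 absx  = _mm_andnot_ps(sign_mask, x);
    __m128 small = _mm_cmplt_ps(absx, limit);
    return _mm_or_ps(_mm_and_ps(small, r), _mm_andnot_ps(small, x));
}
```

ceil could be written the same way by adding 1 where the truncated value is smaller than the input.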
https://bitbucket.org/eigen/eigen/commits/bbee0451d351/ Add round, ceil and floor for SSE4.1/AVX (Bug #70)
-- GitLab Migration Automatic Message -- This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/70.