Make use of more SSE4 instructions, especially for dot products.
Some progress for min/max:
date: 2013-03-20 18:28:40
summary: Add SSE4 min/max for integers
- ceil/floor could be easily added
- The dpps instruction is not easy to exploit in Eigen because dot products are all decomposed into a coefficient-wise product followed by a sum reduction. We should benchmark it first to see the potential.
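
For reference, a minimal sketch of what a direct use of dpps could look like for a single 4-float dot product. The helper name `dot4` is hypothetical and not part of Eigen. The intrinsic path is only compiled when SSE4.1 is enabled (for example with `-msse4.1`), otherwise a scalar fallback is used.

```cpp
#if defined(__SSE4_1__)
#include <smmintrin.h>  // SSE4.1: _mm_dp_ps
#endif

// Hypothetical helper (not an Eigen function): dot product of two 4-float vectors.
inline float dot4(const float* a, const float* b) {
#if defined(__SSE4_1__)
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    // Mask 0xF1: the high nibble (0xF) multiplies all four lanes,
    // the low nibble (0x1) writes the sum to lane 0 only.
    return _mm_cvtss_f32(_mm_dp_ps(va, vb, 0xF1));
#else
    // Scalar fallback when SSE4.1 is not enabled at compile time.
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
#endif
}
```

A single dpps covers at most four floats, so it would have to replace the whole product-plus-reduction pair for a fixed-size vector rather than slot into either step.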
Move to 3.3
I agree with Gael on dpps -- it might only be useful for small vectors. Otherwise, I guess mulps and addps with a final reduction at the end should be faster.
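
A minimal sketch of the mulps/addps variant with one reduction at the end, to make the comparison concrete. The function name `dot_mul_add` and the macro `DOT_USE_SSE` are hypothetical, and this is not Eigen's actual implementation. It uses unaligned loads and a scalar tail for the leftover elements.

```cpp
#include <cstddef>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define DOT_USE_SSE 1  // hypothetical macro, only for this sketch
#endif

// Hypothetical helper: dot product of two float arrays of length n.
inline float dot_mul_add(const float* a, const float* b, std::size_t n) {
    std::size_t i = 0;
    float result = 0.0f;
#ifdef DOT_USE_SSE
    __m128 acc = _mm_setzero_ps();
    // Main loop: one mulps and one addps per packet of four floats.
    for (; i + 4 <= n; i += 4) {
        __m128 prod = _mm_mul_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i));
        acc = _mm_add_ps(acc, prod);
    }
    // Final horizontal reduction of the four partial sums.
    __m128 hi = _mm_movehl_ps(acc, acc);   // lanes 2,3 moved to lanes 0,1
    __m128 sum2 = _mm_add_ps(acc, hi);     // lane0 = s0+s2, lane1 = s1+s3
    __m128 lane1 = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));
    result = _mm_cvtss_f32(_mm_add_ss(sum2, lane1));
#endif
    // Scalar tail (or the whole loop when SSE is unavailable).
    for (; i < n; ++i) result += a[i] * b[i];
    return result;
}
```

The horizontal reduction is paid once per dot product here, whereas dpps performs a reduction on every packet, which is why it is expected to help only for very small vectors.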
ceil/floor ==> JuniorJob.
This could also be vectorized with SSE2 using conversion to int and some handling of special cases.
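
A rough sketch of the SSE2 idea for floor, under the assumption that "special cases" means values too large for int32, NaN and infinity. The name `floor4` and the macro `FLOOR_USE_SSE2` are hypothetical, and this is not the code that was eventually committed. Note that this version returns +0.0 for an input of -0.0.

```cpp
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define FLOOR_USE_SSE2 1  // hypothetical macro, only for this sketch
#else
#include <cmath>
#endif

// Hypothetical helper: floor of four floats read from in, written to out.
inline void floor4(const float* in, float* out) {
#ifdef FLOOR_USE_SSE2
    __m128 x = _mm_loadu_ps(in);
    // Round toward zero through an int32 round trip.
    __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(x));
    // Truncation rounds negative non-integers up, so subtract 1 where t > x.
    __m128 adjust = _mm_and_ps(_mm_cmpgt_ps(t, x), _mm_set1_ps(1.0f));
    __m128 floored = _mm_sub_ps(t, adjust);
    // Special cases: |x| >= 2^23 is already integral (and may overflow int32),
    // and NaN fails the comparison, so both pass x through unchanged.
    __m128 abs_mask = _mm_castsi128_ps(_mm_set1_epi32(0x7fffffff));
    __m128 is_small = _mm_cmplt_ps(_mm_and_ps(x, abs_mask),
                                   _mm_set1_ps(8388608.0f));  // 2^23
    __m128 r = _mm_or_ps(_mm_and_ps(is_small, floored),
                         _mm_andnot_ps(is_small, x));
    _mm_storeu_ps(out, r);
#else
    // Scalar fallback when SSE2 is unavailable.
    for (int i = 0; i < 4; ++i) out[i] = std::floor(in[i]);
#endif
}
```

ceil could follow the same pattern by adding 1 where the truncated value is smaller than the input.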
Add round, ceil and floor for SSE4.1/AVX (Bug #70)
-- GitLab Migration Automatic Message --
This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/70.