As we found out in bug 195, GCC (at least up to 4.4) targeting i386 (i.e. with -m32) miscompiles the _mm_load_sd intrinsic: it adds redundant x87 fldl/fstpl instructions, which should hurt performance (in bug 195 it even produced a wrong result, but that's a different story).
Our ploaddup function still uses _mm_load_sd, so it would be nice to have a work-around for GCC on i386 that avoids this intrinsic.
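A possible shape for such a work-around, as a rough sketch only: the helper name load_dup_sketch is hypothetical and this is not Eigen's actual ploaddup. It assumes that replacing _mm_load_sd with _mm_loadl_pd on GCC/i386 is enough to keep the load out of the x87 unit, which would have to be confirmed by inspecting the assembly (e.g. g++ -m32 -msse2 -O2 -S) for each affected GCC version.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper, not Eigen's ploaddup: returns the packet {from[0], from[0]}.
static inline __m128d load_dup_sketch(const double* from)
{
#if defined(__GNUC__) && defined(__i386__)
  // Assumed work-around path: fill the low lane with _mm_loadl_pd
  // instead of _mm_load_sd. The high lane is zero here and is
  // discarded by the unpack below.
  __m128d lo = _mm_loadl_pd(_mm_setzero_pd(), from);
#else
  // Everywhere else, keep the plain scalar load.
  __m128d lo = _mm_load_sd(from);
#endif
  // Duplicate the low lane into both lanes.
  return _mm_unpacklo_pd(lo, lo);
}
```

Both branches produce the same packet, so the only thing to compare between them is the generated code, not the result.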
-- GitLab Migration Automatic Message --
This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity.
You can subscribe and participate further in the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/200.