Bug 200 - ploaddup using _mm_load_sd, which is generally miscompiled on gcc/i386
Status: NEW
Alias: None
Product: Eigen
Classification: Unclassified
Component: Core - vectorization
Version: unspecified
Hardware: All All
: --- Unknown
Assignee: Gael Guennebaud
Depends on:
Reported: 2011-02-28 02:24 UTC by Benoit Jacob
Modified: 2011-02-28 02:24 UTC

Description Benoit Jacob 2011-02-28 02:24:52 UTC
As we found out in bug 195, GCC (at least up to 4.4) targeting i386 (i.e. with -m32) miscompiles the _mm_load_sd intrinsic: it emits redundant x87 fldl/fstpl instructions, which should result in poor performance. (In bug 195 it even produced a wrong result, but that's a different story.)

Our ploaddup function still uses _mm_load_sd, so it would be nice to have a workaround for gcc/i386 that avoids this intrinsic.
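
One possible shape for such a workaround, as a minimal sketch only. It assumes the float variant of ploaddup loads two adjacent floats and returns each one duplicated, and it swaps _mm_load_sd for a 64-bit load through the integer domain (_mm_loadl_epi64). The name ploaddup_sketch is hypothetical and is not Eigen's code, and whether affected GCC versions really avoid the x87 round trip on this path has not been checked here.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper, not Eigen's actual ploaddup.
// Assumed semantics: returns {from[0], from[0], from[1], from[1]}.
static inline __m128 ploaddup_sketch(const float* from)
{
  // Load 64 bits (two floats) into the low half via the integer domain,
  // so that _mm_load_sd is never used. No alignment is required.
  __m128i lo = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(from));
  __m128 tmp = _mm_castsi128_ps(lo);
  // Interleave the low half with itself: (a0, a0, a1, a1).
  return _mm_unpacklo_ps(tmp, tmp);
}
```

The change could be restricted to GCC on 32-bit x86 with a preprocessor guard, leaving the _mm_load_sd path untouched for other compilers.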
