Bug 684 - Scalar minus Array operation is implemented suboptimally
Summary: Scalar minus Array operation is implemented suboptimally
Product: Eigen
Component: Core - vectorization
Version: 3.2
Blocks: 3.3
Reported: 2013-10-18 12:56 UTC by Christoph Hertzberg
Modified: 2019-12-04 12:43 UTC
Description Christoph Hertzberg 2013-10-18 12:56:33 UTC
Currently expressions like
  void foo(Array2d &a, const double s) {
    a = s - a;
  }

lead to this inefficient code (only excerpt shown, comments added):
	movapd	.LC0, %xmm0          # .LC0 holds {-0.0, -0.0} (sign-bit mask)
	xorpd	(%eax), %xmm0        # negate elements of a
	movddup	8(%esp), %xmm1       # load s
	addpd	%xmm1, %xmm0         # add s + (-a)
	movapd	%xmm0, (%eax)        # store result

This could be implemented more efficiently using the subpd instruction:
	movddup	8(%esp), %xmm0       # load s
	subpd	(%eax), %xmm0        # subtract s - a
	movapd	%xmm0, (%eax)        # store result
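
For illustration, the two instruction sequences above can be sketched with SSE2 intrinsics. This is a minimal standalone sketch, not Eigen's implementation: the function names scalar_minus_negate_add and scalar_minus_direct are hypothetical, unaligned loads and stores are used so it works on any double[2], and a scalar fallback is assumed for targets without SSE2.

```cpp
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical helper mirroring the reported code path:
// flip the sign bits of a (xorpd), then add the broadcast scalar (addpd).
inline void scalar_minus_negate_add(double* a, double s) {
#if defined(__SSE2__)
    const __m128d sign_mask = _mm_set1_pd(-0.0);            // {-0.0, -0.0}
    const __m128d neg_a = _mm_xor_pd(_mm_loadu_pd(a), sign_mask);
    _mm_storeu_pd(a, _mm_add_pd(_mm_set1_pd(s), neg_a));    // s + (-a)
#else
    // Scalar fallback (assumption: no SSE2 available on this target).
    a[0] = s + (-a[0]);
    a[1] = s + (-a[1]);
#endif
}

// Hypothetical helper mirroring the proposed code path:
// a single packed subtraction (subpd) of a from the broadcast scalar.
inline void scalar_minus_direct(double* a, double s) {
#if defined(__SSE2__)
    _mm_storeu_pd(a, _mm_sub_pd(_mm_set1_pd(s), _mm_loadu_pd(a)));  // s - a
#else
    a[0] = s - a[0];
    a[1] = s - a[1];
#endif
}
```

Both helpers store the same values for ordinary inputs; the second one avoids loading the sign mask and the extra xorpd.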
Comment 1 Gael Guennebaud 2013-10-18 14:51:03 UTC
I hoped the compiler would optimize this by itself, but that's not the case. Here is the fix:
Changeset:   0720fd0684a9
User:        ggael
Date:        2013-10-18 14:56:36
Summary:     Fix bug 684: optimize vectorization of array-scalar and scalar-array
