Bug 347 - Compilation for ARM NEON broken with iOS SDK 5.0
: Compilation for ARM NEON broken with iOS SDK 5.0
Status: RESOLVED FIXED
Product: Eigen
Classification: Unclassified
Component: Core - vectorization
: 3.0
: ARM - NEON Mac OS
: --- normal
Assigned To: Marton Danoczy
:
:
:
:
:
  Show dependency treegraph
 
Reported: 2011-09-08 14:51 UTC by Marton Danoczy
Modified: 2011-11-07 17:02 UTC (History)
4 users (show)



Attachments
Error messages when compiling with Clang 3.0 (7.05 KB, text/plain)
2011-09-08 14:51 UTC, Marton Danoczy
no flags Details
Fixes errors with both Clang and LLVM-GCC (3.63 KB, patch)
2011-09-12 12:51 UTC, Marton Danoczy
jacob.benoit.1: review-
Details | Diff | Splinter Review
Revised version of the patches (4.04 KB, patch)
2011-11-04 16:52 UTC, Marton Danoczy
jacob.benoit.1: review+
Details | Diff | Splinter Review

Description Marton Danoczy 2011-09-08 14:51:02 UTC
Created attachment 207 [details]
Error messages when compiling with Clang 3.0

Hi,

Since the newest version of the iOS SDK, compilation for ARM NEON is broken,
both with 3.0 and the default branch.

In iOS SDK 5.0, supported compilers are either LLVM-GCC or Clang (gcc is a
symlink to
llvm-gcc).

Compiling the file bug.cpp (which does nothing but including Eigen) breaks:

bug.cpp:
#include <Eigen/Core>

$ llvm-gcc -arch armv7 -isysroot
/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS5.0.sdk -pipe
-Wall -O0 -I../eigen_orig -c bug.cpp

In file included from ../eigen_orig/Eigen/Core:269,
                 from ei.cpp:1:
../eigen_orig/Eigen/src/Core/arch/NEON/PacketMath.h:89: error: expected
unqualified-id before '__extension__'
In file included from ../eigen_orig/Eigen/Core:352,
                 from ei.cpp:1:
../eigen_orig/Eigen/src/Core/GlobalFunctions.h:91: error: expected `}' at end
of input
../eigen_orig/Eigen/src/Core/GlobalFunctions.h:91: error: expected `}' at end
of input


same command, but with clang++ instead of llvm-gcc:
(totally different errors, see attached file)

Who can help? In the current state, the only workaround is
-DEIGEN_DONT_VECTORIZE.

Thanks,
Marton
Comment 1 Marton Danoczy 2011-09-12 12:51:54 UTC
Created attachment 209 [details] [review]
Fixes errors with both Clang and LLVM-GCC

This patch fixes all errors described above.

The problems were:
- the workaround for the compilation issue with NEON in gcc 4.2 and 4.3 from
changeset 434947f5e29d had to be disabled for both llvm-gcc and clang
- llvm-gcc defines uint32x2_t and uint32x4_t as unions, therefore a second set
of braces had to be added when initialising, i.e. uint32x2_t x = {{1,2}};
- clang does compile-time range checking on integer constants, so that e.g.
vextq_s32(first, second, Offset); fails in the template function palign_impl,
because it can't be guaranteed that Offset is between 0 and 3. My solution is
kind of hackish and relies on macros that explicitly define the template for
Offset=0..3, maybe there is a better one.
- clang complains about the function set_bool from (see bug 89) not doing
anything, thus I enabled EIGEN_SAFE_TO_USE_STANDARD_ASSERT_MACRO when using
clang. don't know how to test it though.

The presence of clang and llvm are tested with #ifs in place, I don't know
about the standard policy in Eigen. Caution: both pretend to be GCC 4.2.1 for
compatibility reasons!

Marton
Comment 2 Ben Gamari 2011-10-18 19:56:08 UTC
What is keeping this from being merged?
Comment 3 Marton Danoczy 2011-10-20 11:38:22 UTC
(In reply to comment #2)
> What is keeping this from being merged?

No idea, I asked already in the mailing list, but to no avail. I guess both
Gael and Benoit are quite busy at the moment. Maybe Jitse could do it? He is
usually quite active.

Jitse, all that has to be done is to transplant
https://bitbucket.org/marton78/eigen/changeset/b8010783daf7

Thanks,
Marton
Comment 4 Benoit Jacob 2011-10-20 16:16:51 UTC
Sorry... yes we're a bit busy at the moment. Will do ASAP.
Comment 5 Benoit Jacob 2011-10-24 15:37:13 UTC
Comment on attachment 209 [details] [review]
Fixes errors with both Clang and LLVM-GCC

I think this patch is very useful but can be significantly improved before we
check it in.


>-static uint32x4_t p4ui_CONJ_XOR = { 0x00000000, 0x80000000, 0x00000000, 0x80000000 };
>-static uint32x2_t p2ui_CONJ_XOR = { 0x00000000, 0x80000000 };
>+static uint32x4_t p4ui_CONJ_XOR =
>+#if __llvm__ && !__clang__

use defined(__llvm__) instead of __llvm__ as it is relying on undefined
behavior to assume that undefined preprocessor identifiers evaluate to 0. same
for __clang__.


>+    {{ 0x00000000, 0x80000000, 0x00000000, 0x80000000 }};

Need a little comment there to explain that this is a union for clang.

But, have you tried instead:

static uint32x2_t p2ui_CONJ_XOR({ 0x00000000, 0x80000000 });

or

static uint32x2_t p2ui_CONJ_XOR = uint32x2_t({ 0x00000000, 0x80000000 });

If either works, that would be nice as that would remove the need for #ifs
here. If that doesn't work, how about making a macro once and for all.

>+#else
>+    { 0x00000000, 0x80000000, 0x00000000, 0x80000000 };
>+#endif
>+static uint32x2_t p2ui_CONJ_XOR =
>+#if __llvm__ && !__clang__
>+    {{ 0x00000000, 0x80000000 }};
>+#else
>+    { 0x00000000, 0x80000000 };
>+#endif

Same.

>-#if EIGEN_GNUC_AT_MOST(4,4)
>+#if EIGEN_GNUC_AT_MOST(4,4) && !__llvm__

same as above.

>-  Packet4f countdown = { 0, 1, 2, 3 };
>+  Packet4f countdown = 
>+#if __llvm__ && !__clang__
>+    {{ 0, 1, 2, 3 }};
>+#else
>+    { 0, 1, 2, 3 };
>+#endif

same.


>   return vaddq_f32(pset1<Packet4f>(a), countdown);
> }
> template<> EIGEN_STRONG_INLINE Packet4i plset<int>(const int& a)
> {
>-  Packet4i countdown = { 0, 1, 2, 3 };
>+  Packet4i countdown = 
>+#if __llvm__ && !__clang__
>+    {{ 0, 1, 2, 3 }};
>+#else
>+    { 0, 1, 2, 3 };
>+#endif

same.

>-template<int Offset>
>-struct palign_impl<Offset,Packet4f>
>-{
>-  EIGEN_STRONG_INLINE static void run(Packet4f& first, const Packet4f& second)
>-  {
>-    if (Offset!=0)
>-      first = vextq_f32(first, second, Offset);
>-  }
>-};
>+#define PALIGN_NEON(Offset,Type,Command) \
>+template<>\
>+struct palign_impl<Offset,Type>\
>+{\
>+    EIGEN_STRONG_INLINE static void run(Type& first, const Type& second)\
>+    {\
>+        if (Offset!=0)\
>+            first = Command(first, second, Offset);\
>+    }\
>+};\
> 
>-template<int Offset>
>-struct palign_impl<Offset,Packet4i>
>-{
>-  EIGEN_STRONG_INLINE static void run(Packet4i& first, const Packet4i& second)
>-  {
>-    if (Offset!=0)
>-      first = vextq_s32(first, second, Offset);
>-  }
>-};
>+PALIGN_NEON(0,Packet4f,vextq_f32)
>+PALIGN_NEON(1,Packet4f,vextq_f32)
>+PALIGN_NEON(2,Packet4f,vextq_f32)
>+PALIGN_NEON(3,Packet4f,vextq_f32)
>+PALIGN_NEON(0,Packet4i,vextq_s32)
>+PALIGN_NEON(1,Packet4i,vextq_s32)
>+PALIGN_NEON(2,Packet4i,vextq_s32)
>+PALIGN_NEON(3,Packet4i,vextq_s32)

If that pleases Clang, that's a clang bug. If it's doing compile-time checking
of integer constants, it should do it after template parameters have been
resolved, so that change shouldn't make any difference. Have you reported this
bug to clang?

Alternatively, have you checked if clang offers a built-in to let you tell it
that the integer is in range?

Alternatively, how about using (Offset % 4) or some such, to explicitly mark it
as being in range?

>-#if EIGEN_GNUC_AT_MOST(4,3)
>+#if EIGEN_GNUC_AT_MOST(4,3) && !__clang__

same as above.
Comment 6 Benoit Jacob 2011-10-31 03:29:10 UTC
Just a quick note to say that long review delays hopefully will get better now
that we have a good patch review tool ;-)
Comment 7 Marton Danoczy 2011-11-04 16:52:11 UTC
Created attachment 224 [details] [review]
Revised version of the patches
Comment 8 Marton Danoczy 2011-11-04 16:52:23 UTC
(In reply to comment #5)

Thanks Benoit for the patch review and sorry it took me so long to come back to
you.

> use defined(__llvm__) instead of __llvm__ as it is relying on undefined
> behavior to assume that undefined preprocessor identifiers evaluate to 0. same
> for __clang__.
Done.

> But, have you tried instead:
> static uint32x2_t p2ui_CONJ_XOR({ 0x00000000, 0x80000000 });
> or
> static uint32x2_t p2ui_CONJ_XOR = uint32x2_t({ 0x00000000, 0x80000000 });
Unfortunately, both don't work.

> If either works, that would be nice as that would remove the need for #ifs
> here. If that doesn't work, how about making a macro once and for all.
Done.


> >+PALIGN_NEON(0,Packet4i,vextq_s32)
> >+PALIGN_NEON(1,Packet4i,vextq_s32)
> >+PALIGN_NEON(2,Packet4i,vextq_s32)
> >+PALIGN_NEON(3,Packet4i,vextq_s32)
> 
> If that pleases Clang, that's a clang bug. If it's doing compile-time checking
> of integer constants, it should do it after template parameters have been
> resolved, so that change shouldn't make any difference. Have you reported this
> bug to clang?
I agree, this should not happen. The commands vextq_* are macros defined in
include/clang/3.0/arm_neon.h that resolve to builtins of the form:
__builtin_neon_vextq_v(first, second, Offset, some_magic_number)

Unfortunately, clang checks the range of Offset even if the template is never
instantiated! I will report the bug to the LLVM project.

> Alternatively, have you checked if clang offers a built-in to let you tell it
> that the integer is in range?
Didn't find anything.

> Alternatively, how about using (Offset % 4) or some such, to explicitly mark it
> as being in range?
Doesn't help.
Comment 9 Marton Danoczy 2011-11-04 17:08:38 UTC
FYI: Here's the bug @ LLVM:

http://llvm.org/bugs/show_bug.cgi?id=11311
Comment 10 Benoit Jacob 2011-11-06 23:04:29 UTC
Comment on attachment 224 [details] [review]
Revised version of the patches

Review of attachment 224 [details] [review]:
-----------------------------------------------------------------

Thanks, looks good. Pushing. Thanks for reporting the LLVM bug.

::: Eigen/src/Core/arch/NEON/PacketMath.h
@@ +413,5 @@
> +    {\
> +        if (Offset!=0)\
> +            first = Command(first, second, Offset);\
> +    }\
> +};\

I'm going to add a comment here to say it's for this bug.
Comment 11 Benoit Jacob 2011-11-06 23:12:56 UTC
pushed to default branch (22b98c36023a) and 3.0 branch (6dbc070bef47).

Please verify that this works as intended on both branches. Thanks for the
patches!
Comment 12 Marton Danoczy 2011-11-07 17:02:05 UTC
It works for my project. Well, at least as far as I can tell. Additionally, I'd
like to run the Eigen test suite on iOS, but I'm not sure how to cross-compile
it with CMake.

I found these config iOS-CMake scripts: http://www.ltengsoft.com/node/20

However, CMake gives me the following error:

Setting up iOS toolchain
iOS toolchain loaded
CMake Error at CMakeLists.txt:71 (message):
  Can't link to the standard math library.  Please report to the Eigen
  developers, telling them about your platform.

How can I fix it?

Note You need to log in before you can comment on or make changes to this bug.