Bug 347 - Compilation for ARM NEON broken with iOS SDK 5.0
Compilation for ARM NEON broken with iOS SDK 5.0
Status: RESOLVED FIXED
Product: Eigen
Classification: Unclassified
Component: Core - vectorization
3.0
ARM - NEON Mac OS
: --- Unknown
Assigned To: Marton Danoczy
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-09-08 14:51 UTC by Marton Danoczy
Modified: 2011-11-07 17:02 UTC (History)
4 users (show)



Attachments
Error messages when compiling with Clang 3.0 (7.05 KB, text/plain)
2011-09-08 14:51 UTC, Marton Danoczy
no flags Details
Fixes errors with both Clang and LLVM-GCC (3.63 KB, patch)
2011-09-12 12:51 UTC, Marton Danoczy
jacob.benoit.1: review-
Details | Diff
Revised version of the patches (4.04 KB, patch)
2011-11-04 16:52 UTC, Marton Danoczy
jacob.benoit.1: review+
Details | Diff

Description Marton Danoczy 2011-09-08 14:51:02 UTC
Created attachment 207 [details]
Error messages when compiling with Clang 3.0

Hi,

Since the newest version of the iOS SDK, compilation for ARM NEON is broken, both with 3.0 and the default branch.

In iOS SDK 5.0, supported compilers are either LLVM-GCC or Clang (gcc is a symlink to
llvm-gcc).

Compiling the file bug.cpp (which does nothing but including Eigen) breaks:

bug.cpp:
#include <Eigen/Core>

$ llvm-gcc -arch armv7 -isysroot /Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS5.0.sdk -pipe -Wall -O0 -I../eigen_orig -c bug.cpp

In file included from ../eigen_orig/Eigen/Core:269,
                 from ei.cpp:1:
../eigen_orig/Eigen/src/Core/arch/NEON/PacketMath.h:89: error: expected
unqualified-id before '__extension__'
In file included from ../eigen_orig/Eigen/Core:352,
                 from ei.cpp:1:
../eigen_orig/Eigen/src/Core/GlobalFunctions.h:91: error: expected `}' at end
of input
../eigen_orig/Eigen/src/Core/GlobalFunctions.h:91: error: expected `}' at end
of input


same command, but with clang++ instead of llvm-gcc:
(totally different errors, see attached file)

Who can help? In the current state, the only workaround is -DEIGEN_DONT_VECTORIZE.

Thanks,
Marton
Comment 1 Marton Danoczy 2011-09-12 12:51:54 UTC
Created attachment 209 [details]
Fixes errors with both Clang and LLVM-GCC

This patch fixes all errors described above.

The problems were:
- the workaround for the compilation issue with NEON in gcc 4.2 and 4.3 from changeset 434947f5e29d had to be disabled for both llvm-gcc and clang
- llvm-gcc defines uint32x2_t and uint32x4_t as unions, therefore a second set of braces had to be added when initialising, i.e. uint32x2_t x = {{1,2}};
- clang does compile-time range checking on integer constants, so that e.g. vextq_s32(first, second, Offset); fails in the template function palign_impl, because it can't be guaranteed that Offset is between 0 and 3. My solution is kind of hackish and relies on macros that explicitly define the template for Offset=0..3, maybe there is a better one.
- clang complains about the function set_bool from (see bug 89) not doing anything, thus I enabled EIGEN_SAFE_TO_USE_STANDARD_ASSERT_MACRO when using clang. don't know how to test it though.

The presence of clang and llvm are tested with #ifs in place, I don't know about the standard policy in Eigen. Caution: both pretend to be GCC 4.2.1 for compatibility reasons!

Marton
Comment 2 Ben Gamari 2011-10-18 19:56:08 UTC
What is keeping this from being merged?
Comment 3 Marton Danoczy 2011-10-20 11:38:22 UTC
(In reply to comment #2)
> What is keeping this from being merged?

No idea, I asked already in the mailing list, but to no avail. I guess both Gael and Benoit are quite busy at the moment. Maybe Jitse could do it? He is usually quite active.

Jitse, all that has to be done is to transplant
https://bitbucket.org/marton78/eigen/changeset/b8010783daf7

Thanks,
Marton
Comment 4 Benoit Jacob 2011-10-20 16:16:51 UTC
Sorry... yes we're a bit busy at the moment. Will do ASAP.
Comment 5 Benoit Jacob 2011-10-24 15:37:13 UTC
Comment on attachment 209 [details]
Fixes errors with both Clang and LLVM-GCC

I think this patch is very useful but can be significantly improved before we check it in.


>-static uint32x4_t p4ui_CONJ_XOR = { 0x00000000, 0x80000000, 0x00000000, 0x80000000 };
>-static uint32x2_t p2ui_CONJ_XOR = { 0x00000000, 0x80000000 };
>+static uint32x4_t p4ui_CONJ_XOR =
>+#if __llvm__ && !__clang__

use defined(__llvm__) instead of __llvm__ as it is relying on undefined behavior to assume that undefined preprocessor identifiers evaluate to 0. same for __clang__.


>+    {{ 0x00000000, 0x80000000, 0x00000000, 0x80000000 }};

Need a little comment there to explain that this is a union for clang.

But, have you tried instead:

static uint32x2_t p2ui_CONJ_XOR({ 0x00000000, 0x80000000 });

or

static uint32x2_t p2ui_CONJ_XOR = uint32x2_t({ 0x00000000, 0x80000000 });

If either works, that would be nice as that would remove the need for #ifs here. If that doesn't work, how about making a macro once and for all.

>+#else
>+    { 0x00000000, 0x80000000, 0x00000000, 0x80000000 };
>+#endif
>+static uint32x2_t p2ui_CONJ_XOR =
>+#if __llvm__ && !__clang__
>+    {{ 0x00000000, 0x80000000 }};
>+#else
>+    { 0x00000000, 0x80000000 };
>+#endif

Same.

>-#if EIGEN_GNUC_AT_MOST(4,4)
>+#if EIGEN_GNUC_AT_MOST(4,4) && !__llvm__

same as above.

>-  Packet4f countdown = { 0, 1, 2, 3 };
>+  Packet4f countdown = 
>+#if __llvm__ && !__clang__
>+    {{ 0, 1, 2, 3 }};
>+#else
>+    { 0, 1, 2, 3 };
>+#endif

same.


>   return vaddq_f32(pset1<Packet4f>(a), countdown);
> }
> template<> EIGEN_STRONG_INLINE Packet4i plset<int>(const int& a)
> {
>-  Packet4i countdown = { 0, 1, 2, 3 };
>+  Packet4i countdown = 
>+#if __llvm__ && !__clang__
>+    {{ 0, 1, 2, 3 }};
>+#else
>+    { 0, 1, 2, 3 };
>+#endif

same.

>-template<int Offset>
>-struct palign_impl<Offset,Packet4f>
>-{
>-  EIGEN_STRONG_INLINE static void run(Packet4f& first, const Packet4f& second)
>-  {
>-    if (Offset!=0)
>-      first = vextq_f32(first, second, Offset);
>-  }
>-};
>+#define PALIGN_NEON(Offset,Type,Command) \
>+template<>\
>+struct palign_impl<Offset,Type>\
>+{\
>+    EIGEN_STRONG_INLINE static void run(Type& first, const Type& second)\
>+    {\
>+        if (Offset!=0)\
>+            first = Command(first, second, Offset);\
>+    }\
>+};\
> 
>-template<int Offset>
>-struct palign_impl<Offset,Packet4i>
>-{
>-  EIGEN_STRONG_INLINE static void run(Packet4i& first, const Packet4i& second)
>-  {
>-    if (Offset!=0)
>-      first = vextq_s32(first, second, Offset);
>-  }
>-};
>+PALIGN_NEON(0,Packet4f,vextq_f32)
>+PALIGN_NEON(1,Packet4f,vextq_f32)
>+PALIGN_NEON(2,Packet4f,vextq_f32)
>+PALIGN_NEON(3,Packet4f,vextq_f32)
>+PALIGN_NEON(0,Packet4i,vextq_s32)
>+PALIGN_NEON(1,Packet4i,vextq_s32)
>+PALIGN_NEON(2,Packet4i,vextq_s32)
>+PALIGN_NEON(3,Packet4i,vextq_s32)

If that pleases Clang, that's a clang bug. If it's doing compile-time checking of integer constants, it should do it after template parameters have been resolved, so that change shouldn't make any difference. Have you reported this bug to clang?

Alternatively, have you checked if clang offers a built-in to let you tell it that the integer is in range?

Alternatively, how about using (Offset % 4) or some such, to explicitly mark it as being in range?

>-#if EIGEN_GNUC_AT_MOST(4,3)
>+#if EIGEN_GNUC_AT_MOST(4,3) && !__clang__

same as above.
Comment 6 Benoit Jacob 2011-10-31 03:29:10 UTC
Just a quick note to say that long review delays hopefully will get better now that we have a good patch review tool ;-)
Comment 7 Marton Danoczy 2011-11-04 16:52:11 UTC
Created attachment 224 [details]
Revised version of the patches
Comment 8 Marton Danoczy 2011-11-04 16:52:23 UTC
(In reply to comment #5)

Thanks Benoit for the patch review and sorry it took me so long to come back to you.

> use defined(__llvm__) instead of __llvm__ as it is relying on undefined
> behavior to assume that undefined preprocessor identifiers evaluate to 0. same
> for __clang__.
Done.

> But, have you tried instead:
> static uint32x2_t p2ui_CONJ_XOR({ 0x00000000, 0x80000000 });
> or
> static uint32x2_t p2ui_CONJ_XOR = uint32x2_t({ 0x00000000, 0x80000000 });
Unfortunately, both don't work.

> If either works, that would be nice as that would remove the need for #ifs
> here. If that doesn't work, how about making a macro once and for all.
Done.


> >+PALIGN_NEON(0,Packet4i,vextq_s32)
> >+PALIGN_NEON(1,Packet4i,vextq_s32)
> >+PALIGN_NEON(2,Packet4i,vextq_s32)
> >+PALIGN_NEON(3,Packet4i,vextq_s32)
> 
> If that pleases Clang, that's a clang bug. If it's doing compile-time checking
> of integer constants, it should do it after template parameters have been
> resolved, so that change shouldn't make any difference. Have you reported this
> bug to clang?
I agree, this should not happen. The commands vextq_* are macros defined in include/clang/3.0/arm_neon.h that resolve to builtins of the form:
__builtin_neon_vextq_v(first, second, Offset, some_magic_number)

Unfortunately, clang checks the range of Offset even if the template is never instantiated! I will report the bug to the LLVM project.

> Alternatively, have you checked if clang offers a built-in to let you tell it
> that the integer is in range?
Didn't find anything.

> Alternatively, how about using (Offset % 4) or some such, to explicitly mark it
> as being in range?
Doesn't help.
Comment 9 Marton Danoczy 2011-11-04 17:08:38 UTC
FYI: Here's the bug @ LLVM:

http://llvm.org/bugs/show_bug.cgi?id=11311
Comment 10 Benoit Jacob 2011-11-06 23:04:29 UTC
Comment on attachment 224 [details]
Revised version of the patches

Review of attachment 224 [details]:
-----------------------------------------------------------------

Thanks, looks good. Pushing. Thanks for reporting the LLVM bug.

::: Eigen/src/Core/arch/NEON/PacketMath.h
@@ +413,5 @@
> +    {\
> +        if (Offset!=0)\
> +            first = Command(first, second, Offset);\
> +    }\
> +};\

I'm going to add a comment here to say it's for this bug.
Comment 11 Benoit Jacob 2011-11-06 23:12:56 UTC
pushed to default branch (22b98c36023a) and 3.0 branch (6dbc070bef47).

Please verify that this works as intended on both branches. Thanks for the patches!
Comment 12 Marton Danoczy 2011-11-07 17:02:05 UTC
It works for my project. Well, at least as far as I can tell. Additionally, I'd like to run the Eigen test suite on iOS, but I'm not sure how to cross-compile it with CMake.

I found these config iOS-CMake scripts: http://www.ltengsoft.com/node/20

However, CMake gives me the following error:

Setting up iOS toolchain
iOS toolchain loaded
CMake Error at CMakeLists.txt:71 (message):
  Can't link to the standard math library.  Please report to the Eigen
  developers, telling them about your platform.

How can I fix it?

Note You need to log in before you can comment on or make changes to this bug.