This bugzilla service is closed. All entries have been migrated to https://gitlab.com/libeigen/eigen
Bug 1751 - Fix warnings in HIP build
Summary: Fix warnings in HIP build
Status: CONFIRMED
Alias: None
Product: Eigen
Classification: Unclassified
Component: Tests (show other bugs)
Version: 3.4 (development)
Hardware: Other All
: Normal Compilation Problem
Assignee: Nobody
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-09-23 14:35 UTC by Christoph Hertzberg
Modified: 2019-12-04 18:48 UTC (History)
5 users (show)



Attachments

Description Christoph Hertzberg 2019-09-23 14:35:15 UTC
The nightly HIP-builds currently produce hundreds of 
   warning: loop not unrolled: the optimizer was unable to perform the requested transformation

Ideally, these should be fixed (I don't know if that is possible).

If the warnings are actually bogus, we can mask them by adding 
  -Wpass-failed=transform-warning
to /src/Core/util/DisableStupidWarnings.h (if HIP is enabled).
Comment 1 Deven Desai 2019-09-23 19:18:43 UTC
Currently we are unable to fix the cause of these warnings, so disabling these warnings is the better option for now.

This warning seems to be emitted both at compile and link time.

for example

------------------

Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_device_2.dir/cxx11_tensor_device_2_generated_cxx11_tensor_device.cu.o
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_device.cu:17:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:130:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorConvolution.h:577:17: warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]
__global__ void EigenConvolutionKernel1D(
...
...
27 warnings generated.
Linking HIP executable cxx11_tensor_device_2
warning: <unknown>:0:0: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
...
...
--------------------


Using the recommended option does not seem to work :(
(i.e. adding #pragma clang diagnostic ignored -Wpass-failed=transform-warning)
--------------------
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../../../Eigen/src/Core/util/DisableStupidWarnings.h:96:36: warning: unknown warning group '-Wpass-failed=transform-warning', ignored [-Wunknown-warning-option]
  #pragma clang diagnostic ignored "-Wpass-failed=transform-warning"
--------------------


If I make the option more generic 
(i.e. adding #pragma clang diagnostic ignored -Wpass-failed)

then the compile time warnings are gone, but the link time warnings remain :( I do not know how to disable those.

I am hoping the disabling just the compile time warnings is enough avoid "error" classification in the nightly dashboard. I have kicked off a regression run with this change...lets see what we get.
Comment 2 Christoph Hertzberg 2019-09-23 21:12:36 UTC
You could pass custom linker flags (like `-Wno-pass-failed`) via EIGEN_TEST_CUSTOM_LINKER_FLAGS -- this only hides the problem when building the test-suite of course.

Regarding errors vs warnings, it seems to me that this is caused by parallel builds with interleaving warning messages. This could be avoided by passing `-j8 -O` as EIGEN_TEST_BUILD_FLAGS (or by using `cmake -G Ninja ..`).
Comment 3 Deven Desai 2019-09-24 13:28:17 UTC
Adding "#pragma clang diagnostic ignored -Wpass-failed" gets rid of the compile time warnings, but does not "fix" the incorrect error classification in the cdash dashboard
(see http://manao.inria.fr/CDash/buildSummary.php?buildid=9785  )

Passing custom linker flags (via EIGEN_TEST_CUSTOM_LINKER_FLAGS) does not seem to work in suppressing the link time warnings :(

Adding `-j8 -O` does not have the intended effect either 
(see http://manao.inria.fr/CDash/buildSummary.php?buildid=9791 )


The only viable solution here maybe to find all the instances of "#pragma unroll" that result in this warning and #ifdef them out when compiling with HIPCC. I am going to pursue that path, and see if I have better luck
Comment 4 Christoph Hertzberg 2019-09-24 13:37:50 UTC
Can you replace `#pragma unroll` by `EIGEN_UNROLL_LOOP` and change the definition of that macro (in Eigen/src/Core/util/Macros.h) when compiling with HIPCC?

Someone needs to check afterwards if this still works with CUDA.


Or are there some unrolls which are fine with HIP that you want to keep?
Comment 5 Deven Desai 2019-09-24 13:45:26 UTC
> Can you replace `#pragma unroll` by `EIGEN_UNROLL_LOOP` and change the definition of that macro (in Eigen/src/Core/util/Macros.h) when compiling with HIPCC?

That is exactly what I am trying out right now :)


> Or are there some unrolls which are fine with HIP that you want to keep?

do not know yet...still need to evaluate that part.
Comment 6 Deven Desai 2019-09-24 14:31:26 UTC
The problem with the EIGEN_UNROLL_LOOP approach is that there exist instances of both "EIGEN_UNROLL_LOOP" and "#pragma unroll" in the code right now. 

Replacing existing instances of "#pragma unroll" with EIGEN_UNROLL_LOOP should probably be a PR onto itself. There are two things that such a PR will need to address

1. Currently EIGEN_UNROLL_LOOP macro is defined only when "SYCL_DEVICE_ONLY" is defined, and hence replacing "#pragma unroll" with the EIGEN_UNROLL_LOOP will effectively remove that pragma when compiling for the CUDA. That may result in some performance regression. Is there a way to detect any such performance drop-off?

2. There is one instance of "#pragma unroll 7", so there will be a need to introduce"#define EIGEN_UNROLL_LOOP_N(X) " or similar macro to accommodate that one instance.

What are your thoughts on the two issues above?
Comment 7 Christoph Hertzberg 2019-09-26 12:53:41 UTC
Ad 1: You could of course define it for CUDA as well. As far as I understand, there is no code which is run by SYCL and CUDA. If that's not possible, we'd have to rename them to EIGEN_UNROLL_LOOP_SYCL, EIGEN_UNROLL_LOOP_CUDA, etc. But this will be a mess if there are cases which shall be unrolled for non-trivial subsets of {SYCL,CUDA,HIP,CPU,?}.

Ad 2: No objection against adding an extra macro for that.


Overall, it may actually be a good idea to first check if CUDA actually profits from these pragmas (I haven't been coding for CUDA for a while ...)
Comment 8 Rod Burns 2019-10-04 08:46:07 UTC
I'm just cross posting to here from https://bitbucket.org/eigen/eigen/pull-requests/673/sycl-adding-sycl-kernels-for-unsupported/diff

The EIGEN_UNROLL_LOOP is used by SYCL since SYCL compiles the code for a generic SPIR(-V) target and at the time of compilation has no knowledge of the underlying device running the code. So it is extra conservative and avoids unrolling the loop because in embedded devices with a limited amount of registers, unrolling can cause a performance drop. However, since NVIDIA owns the entire stack, all loops will be unrolled whenever the loop variables can be resolved at compile time. If there is a case in CUDA that an unroll is required for a specific size (or unroll cannot be resolve at compile time), from a SYCL point of view it is completely OK to change the EIGEN_UNROLL_LOOP to be:

#define STR(x) #x
#define STR_UNROLL(x) STR( unroll x)
#define UNROLL(y) STR_UNROLL(y)

#if defined(SYCL_DEVICE_ONLY)
#if defined(_MSC_VER)
#define EIGEN_UNROLL_LOOP(X) __pragma(unroll X)
#else
#define EIGEN_UNROLL_LOOP(X) _Pragma(UNROLL(X))
#endif
#else // HERE if any othere backend need macro can add its own macro
#define EIGEN_UNROLL_LOOP (X)
#endif

#endif // EIGEN_MACROS_H
So, in that case, we can capture any specific unroll variable when needed
Comment 9 Christoph Hertzberg 2019-10-04 09:43:12 UTC
Cleaned up version (using existing EIGEN_MAKESTRING macro):

#if defined(SYCL_DEVICE_ONLY)
  #if EIGEN_COMP_MSVC
    #define EIGEN_UNROLL_LOOP(X) __pragma(unroll X)
  #else
    #define EIGEN_UNROLL_LOOP(X) _Pragma(EIGEN_MAKESTRING(unroll X))
  #endif
#else
  // if any other backend needs a macro it can be added here
  #define EIGEN_UNROLL_LOOP (X)
#endif


@Deven: If that works for you, could you make a PR (including changing all `#pragma unroll` and `EIGEN_UNROLL_LOOP` to `EIGEN_UNROLL_LOOP()`)?
Comment 10 Christoph Hertzberg 2019-10-04 16:28:32 UTC
N.B.: Strictly speaking, empty macro arguments are allowed only with C++11 or later. Unless this is never used from C++03 code, we would need a second macro (or always pass EIGEN_EMPTY).

https://godbolt.org/z/ns-m-2
Comment 11 Nobody 2019-12-04 18:48:24 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/1751.

Note You need to log in before you can comment on or make changes to this bug.