This bugzilla service is closed. All entries have been migrated to GitLab.
Bug 1212 - EigenContractionKernel causes PTX error (uses too much shared memory)
Summary: EigenContractionKernel causes PTX error (uses too much shared memory)
Alias: None
Product: Eigen
Classification: Unclassified
Component: Tensor
Version: 3.3 (current stable)
Hardware: GPU (CUDA)
OS: All
Importance: Normal Unknown
Assignee: Nobody
Depends on:
Reported: 2016-04-27 07:56 UTC by xiah
Modified: 2019-12-04 15:44 UTC
CC: 4 users


Description xiah 2016-04-27 07:56:57 UTC
I changed the code as follows to reduce the shared memory usage.

  //__shared__ volatile Scalar lhs_shmem[72 * 64];
  //__shared__ volatile Scalar rhs_shmem[72 * 64];

  __shared__ volatile Scalar lhs_shmem[72 * 32];
  __shared__ volatile Scalar rhs_shmem[72 * 32];

And then I can compile it without errors. Is it possible to redesign the kernel to reduce its shared memory usage? NVIDIA GPUs typically have at most 48 KB of shared memory per block.
Comment 1 Christoph Hertzberg 2016-04-27 17:42:25 UTC
I guess Benoit should have a look at this.
Maybe add a compile time define to declare the size of available shared memory?
Comment 2 Benoit Steiner 2016-04-27 20:19:26 UTC
You want to pack as many values into shared memory as possible in order to maximize performance. Since the amount of shared memory is fixed, this number depends on the size of the scalar used in the contraction. This means that we need to specialize the kernels for each possible input type. Unfortunately we haven't had time to do this so far.

One good strategy would be to write a fallback kernel that does a decent job on the biggest scalar we're likely to encounter (probably complex<double>) and use this unless we have an optimized kernel for the type we care about.

Another strategy would be to call cuBLAS directly whenever possible (i.e. when the input data for the two operands is directly addressable by pointer).
Comment 3 Nobody 2019-12-04 15:44:40 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to Eigen's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug through the link to our GitLab instance:

Note: You need to log in before you can comment on or make changes to this bug.