computeProductBlockingSizes contains the following code, which limits the kc blocking parameter to 360, or to 240 for scalar types larger than 4 bytes: k = std::min<SizeType>(k,sizeof(LhsScalar)<=4 ? 360 : 240); The other blocking parameters, mc and nc, are also clamped, though to higher values. This optimization should be justified by a comment. Could you please add one? Moreover, this optimization is *detrimental* on a Nexus 4 (ARM) device. See the attachment in bug 937 comment 3. It shows that for large enough products, the optimal power-of-two value of kc can easily be 512, and for 1024^3 matrix products, kc=1024 and kc=512 both perform optimally, while kc<=256 performs at least 10% worse (see the bottom of that file for the 1024^3 case). On a Core i7, the data in bug 937 comment 1 does confirm that kc=256 is the highest optimal power-of-two size. Still, I would like to understand where the value 360 comes from.
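For readers without the source at hand, here is a minimal stand-alone sketch of the clamp described above. It is not the Eigen code itself: the names kc_cap and clamp_kc are hypothetical, std::size_t stands in for SizeType, and a ternary replaces std::min so that the functions stay constexpr in C++11.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Eigen): the hard-coded upper bound on kc
// quoted in the report, 360 for scalars of at most 4 bytes and 240 otherwise.
template <typename LhsScalar>
constexpr std::size_t kc_cap() {
  return sizeof(LhsScalar) <= 4 ? 360 : 240;
}

// Hypothetical helper (not part of Eigen): same effect as
//   k = std::min<SizeType>(k, sizeof(LhsScalar) <= 4 ? 360 : 240);
// written with a ternary instead of std::min.
template <typename LhsScalar>
constexpr std::size_t clamp_kc(std::size_t k) {
  return k < kc_cap<LhsScalar>() ? k : kc_cap<LhsScalar>();
}
```

With this clamp, a requested kc of 512 or 1024 (the values that performed best on the Nexus 4 for 1024^3 products) is cut down to the cap, which is the behaviour questioned in this report.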
These numbers (240/360) are the values giving the best performance for very large matrices on an i7. They were introduced when the previous heuristic based on cache sizes was no longer valid. We then forgot to update this part of the code, but clearly those numbers have to be replaced by a more general heuristic. Also, kc does not have to be a power of two; a multiple of 16 will do.
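To illustrate the remark about multiples of 16, here is a small sketch that rounds a candidate kc down to a multiple of 16 instead of a power of two. The helper name round_kc_down and the choice to leave values below 16 untouched are assumptions made for this example, not something taken from the Eigen sources.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Eigen): round a candidate kc down to the
// nearest multiple of `step` (16 by default). Values smaller than `step` are
// returned unchanged so that the result never becomes 0.
constexpr std::size_t round_kc_down(std::size_t k, std::size_t step = 16) {
  return k < step ? k : (k / step) * step;
}
```

For a candidate of 500, this yields 496, whereas restricting kc to powers of two would force a drop to 256.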
Thanks for this explanation. So this is a prime example of ad-hoc logic. I initially thought that ad-hoc was bad and we should be able to have nice universal logic instead, but I was wrong. See bug 937 comment 8. Let's instead embrace ad-hoc logic, keep this, and just make it Intel-only while developing a different ad-hoc logic for ARM.
The situation is much better now: https://bitbucket.org/eigen/eigen/commits/52572e60b5d3/ Changeset: 52572e60b5d3 User: ggael Date: 2015-02-26 15:04:35+00:00 Summary: Implement a more generic blocking-size selection algorithm.
-- GitLab Migration Automatic Message -- This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/939.