Created attachment 649 [details]
Example demonstrating crash
See attached example. Seems pretty low level and not specific to AutoDiffScalar matrices.
Looks like in evaluateProductBlockingSizesHeuristic, evaluating
k = (k%max_kc)==0 ? max_kc
: max_kc - k_peeling * ((max_kc-1-(k%max_kc))/(k_peeling*(k/max_kc+1)));
results in a remainder by zero (max_kc happens to evaluate to 0), which is undefined behavior and triggers
Exception: EXC_ARITHMETIC (code=EXC_I386_DIV, subcode=0x0)
This is on OS X, but we get a crash on Linux too, so I set the OS field to All. This used to work fine with Eigen 3.2.7 but crashes with 3.3 beta 1 and with the default branch. I will attach a stack trace in a follow-up comment.
Created attachment 650 [details]
However, a cleaner solution would be to fall back to another product implementation for heavy scalar types.
Using R.lazyProduct(....) instead of R*... is very likely to be much faster.
(In reply to Gael Guennebaud from comment #2)
> Using R.lazyProduct(....) instead of R*... is very likely to be much faster.
Indeed, a quick bench tells me 1.11ms for the default product versus 0.17ms for lazyProduct.
Thank you for the fix!
> Indeed, a quick bench tell me 1.11ms versus 0.17ms for lazyProduct.
Hmm, that's very interesting. Sorry, I wasn't quite clear on this: did you mean that as a recommendation for us or as a possible future improvement on your end?
A lot of our code is templated on the scalar type and supposed to work for (at least) both double and AutoDiffScalar<...> types, with a heavier performance requirement on the double template instantiations. This means that switching everything to lazyProduct outright is not the best option for us. I suppose we could select which product implementation to use based on the scalar type. But having that done automatically by Eigen would be incredible for us given the potential huge speedup and would mean we could keep our code more readable.
This comment was mostly for Eigen, but actually, if R is always as small as 3x3, then better use lazyProduct even for double.