Eigen internally implements a cost model to control at compile time both the evaluation of nested expressions and explicit unrolling. The read cost (RC) associated to an expression represents an approximation of the effective number of instructions needed to read or evaluate a single coefficient of the expression. Given the following notations:
- SC = NumTraits<Scalar>::ReadCost
- MC = NumTraits<Scalar>::MulCost
- AC = NumTraits<Scalar>::AddCost
the cost of the expression 2*m1+m2 will be RC = (MC + SC) + AC + SC. For native types, we have SC==MC==AC==1, and then, in our example, we have RC = 4.
Such a mechanism is particularly useful to accurately control the generated code size during explicit unrolling. What we simply have to do is to divide a user defined maximal number of instructions per unrolling (EIGEN_UNROLLING_LIMIT) by RC to get the number of iterations to unroll.
This cost model is also used to decide whether a nested expression has to be evaluated or not. Indeed, for instance in (m1+m2)*m3 it might be preferable to first evaluate tmp = m1+m2 before doing the matrix product tmp * m3. This is because each coefficient of the left hand side expression (m1+m2) will be evaluated m3.cols() times. This automatic evaluation mechanism is implemented by the ei_nested<> structure. Assuming the coefficients of a nested expression of cost NC (NestedCost) are read R times each, then not evaluating it will cost:
- R * NC.
On the other hand, evaluating the expression will cost NC + SC to evaluate and store it, plus R loads:
- NC + (R+1)*SC
Finally, it is cheaper to evaluate the nested expression as soon as the following condition vanishes:
- (R+1)*SC < (R-1) * NC
Of course, when we have an equality it might still be preferable to evaluate the expression, therefore the following condition is reasonable too:
- (R+1)*SC <= (R-1) * NC
Further experiments will tell us which version is better in practice.
TODO: discuss when and how an expression gets vectorized.
That's a general C++ technique, here's a page to explain it, may help understand some Eigen code and write more.