I noticed this while using a ConjugateGradient<SparseMatrix<double>, Lower|Upper> solver.
Calling solve() is slow for me with the static scheduling used in SparseDenseProduct.h. A comment there says the tuning was done on 2D/3D Poisson problems. If I'm not mistaken, that is a case where all rows of the sparse matrix have roughly the same number of non-zeros, which is probably best handled by the static scheduler. However, for a non-banded matrix with an irregular sparsity pattern (my case), the dynamic and guided schedulers perform much better (20-30% faster in my case).
How about using the guided scheduler here? Or providing a way to select it? Or choosing it automatically when the non-zero count per row is too irregular?
What do you think?
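To make the last suggestion a bit more concrete, here is a rough sketch of one way the irregularity could be measured: the coefficient of variation (standard deviation divided by mean) of the non-zero count per row, computed from a CSR-style outer index array. The function name rowNnzVariation and the plain std::vector input are assumptions made for illustration; nothing like this exists in Eigen as far as this thread shows, and any threshold for switching schedulers would have to be tuned by measurement.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Eigen): coefficient of variation of the
// number of non-zeros per row. outerIndex is a CSR-style array of size
// rows + 1, where row i owns entries [outerIndex[i], outerIndex[i + 1]).
double rowNnzVariation(const std::vector<int>& outerIndex) {
    assert(!outerIndex.empty());
    const std::size_t rows = outerIndex.size() - 1;
    if (rows == 0) return 0.0;

    // Mean non-zero count per row.
    const double total = static_cast<double>(outerIndex[rows] - outerIndex[0]);
    const double mean = total / static_cast<double>(rows);
    if (mean == 0.0) return 0.0;  // empty matrix: nothing to balance

    // Population variance of the per-row counts.
    double sumSq = 0.0;
    for (std::size_t i = 0; i < rows; ++i) {
        const double count = static_cast<double>(outerIndex[i + 1] - outerIndex[i]);
        sumSq += (count - mean) * (count - mean);
    }
    return std::sqrt(sumSq / static_cast<double>(rows)) / mean;
}
```

A value of 0 means every row has the same number of non-zeros (the case where static scheduling should do well); larger values indicate a more uneven load per row.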
Good point. I've found the following compromise: dynamic scheduling with 4 times as many chunks as threads, which should be enough to balance the work.
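For illustration, a minimal sketch of what "4 times more chunks than threads" can look like with OpenMP dynamic scheduling. This is not the actual change made in SparseDenseProduct.h: the CsrMatrix struct, chunkSize and spmv are hypothetical names, and the rounded-up chunk size n / (4 * threads) is an assumption about how the chunk count is obtained. Without OpenMP enabled, the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical minimal CSR container, used only for this sketch.
struct CsrMatrix {
    int rows;
    std::vector<int> outer;      // size rows + 1
    std::vector<int> inner;      // column index of each non-zero
    std::vector<double> values;  // value of each non-zero
};

// Chunk size giving about 4 chunks per thread: ceil(n / (4 * threads)),
// clamped to at least 1 so it is a valid OpenMP chunk size.
inline int chunkSize(int n, int threads) {
    assert(threads > 0);
    const int chunks = 4 * threads;
    const int size = (n + chunks - 1) / chunks;
    return size > 0 ? size : 1;
}

// y = A * x, with rows handed out to threads in dynamically scheduled chunks.
std::vector<double> spmv(const CsrMatrix& A, const std::vector<double>& x) {
    std::vector<double> y(A.rows, 0.0);
    int threads = 1;
#ifdef _OPENMP
    threads = omp_get_max_threads();
#endif
    const int chunk = chunkSize(A.rows, threads);
    (void)chunk;  // avoids an unused-variable warning when OpenMP is disabled

    #pragma omp parallel for schedule(dynamic, chunk) num_threads(threads)
    for (int i = 0; i < A.rows; ++i) {
        double sum = 0.0;
        for (int k = A.outer[i]; k < A.outer[i + 1]; ++k)
            sum += A.values[k] * x[A.inner[k]];
        y[i] = sum;  // each row is written by exactly one thread
    }
    return y;
}
```

Compared with one static block per thread, the smaller chunks let a thread that finishes a cheap block pick up another one, while keeping the scheduling overhead far below that of a chunk size of 1.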
Summary: Bug 1154: move to dynamic scheduling for spmv products.