698
2013-11-05 11:27:14 +0000
LinSpaced for integer types
2016-10-24 13:51:59 +0000
1
1
1
Unclassified
Eigen
Core - general
3.4 (development)
All
All
RESOLVED
FIXED
JuniorJob
Low
Optimization
---
558
1
chtz
eigen.nobody
gael.guennebaud
jacob.benoit.1
oldest_to_newest
2878
0
chtz
2013-11-05 11:27:14 +0000
At the moment, e.g., Vector4i::LinSpaced(0,5) results in {0 1 2 3} which is unlikely to be intended. The reason is that inside LinSpaced a stepsize is calculated which gets truncated for integer types.
I would suggest to specify LinSpaced(size, low, high)[i] always as
low + (high-low)*i/(size-1);
(that is for size>1, for size==1 see bug 224).
The implementation can then be optimized as low + i*((high-low)/(size-1)) for float-types and some kind of Bresenham algorithm for integer types.
We also need unit tests for LinSpaced on integer types.
3979
1
gael.guennebaud
2014-10-20 16:28:51 +0000
I agree with that proposal.
5409
2
gael.guennebaud
2016-02-01 13:27:40 +0000
https://bitbucket.org/eigen/eigen/commits/5b8b48d4593e/
Summary: Bug 698: fix linspaced for integer types.
Next step is optimizing the integer path for sequential access -> minor.
6355
3
gael.guennebaud
2016-10-11 19:26:38 +0000
Arf, the problem with this new implementation is that when calling:
ArrayXi::LinSpaced(46342, 0, 46341)
then "(high-low)*i" overflows:
...
46337
46338
46339
46340
-46340
This has been reported there: http://stackoverflow.com/questions/39952584/why-does-eigens-linspaced-overflow-at-46340-kind-of
6356
4
gael.guennebaud
2016-10-11 19:52:16 +0000
The only solution I can figure out tonight is to enforce evaluation within a temporary so that we can use Bresenham's algorithm.
6378
5
gael.guennebaud
2016-10-13 08:40:33 +0000
hm, actually this is not so easy as Bresenham's algorithm will only work if high-low<=num_steps and it might also overflow with 8 or 16 bits integer types because we have to track values up to 3 times num_steps.
So we could think about the following strategy:
if high-low is a multiple of num_steps-1, then we can safely compute
step = (high-low)/(num_steps-1);
return low + step*i;
This probably represents 90% of the use cases with integer types.
Otherwise, fall back to floating point double precision for the intermediate computations.
Unfortunately, this requires a runtime decision, and so a "if" within the coeff() method... This can be solved using a different (low,step,high) API in 3.4.
6380
6
gael.guennebaud
2016-10-13 09:09:49 +0000
Sorry for the noise, the following should do the job:
low + (range/(n-1))*i + ((range%(n-1))*i)/(n-1)
The first division and modulo can be precomputed, so this version should be nearly as efficient as the current one.
6381
7
gael.guennebaud
2016-10-13 09:16:13 +0000
too fast again, (range%(n-1)) can still be very large and overflow when multiplied by i. At least, this version works fine in the common case where: "range%(n-1)==0".
6382
8
chtz
2016-10-13 09:26:04 +0000
I was also thinking about splitting the step into parts a=(high-low)/(size-1) and b=(high-low)%(size-1) and return (low + a*i + (b*i)/(size-1));
But unfortunately that will only take care of some cases which currently overflow, even when making some smarter modulo which is always in range (-size/2, +size/2). The (b*i)/(size-1) part could actually be handled by a Bresenham algorithm, though -- or we could assert for cases where b*i can overflow (not ideal, but better than silently calculating wrong values).
To make things worse, for signed integer types, we could actually already have an overflow when calculating (high-low). This could somehow be handled with some if-branches such as (high > low) == (high-low>0) ...
A pragmatic solution could be to calculate with 64bit integers (for sizeof(Scalar)<=4) -- this could actually still overflow if the size is larger than 2^31, though.
Also, for sizeof(Scalar)>4 we would need an integer of size sizeof(Scalar)+sizeof(Index) to be safe (i.e., at least 128bit, which is not available everywhere). Maybe we could implement a generic templated bigInt similar to TensorUInt128 from unsupported/Eigen/CXX11/src/Tensor/TensorUInt128.h
In our use-case we could safe some operations since we know that, e.g.,
(high-low)*i/(size-1) <= (high-low)
Also, I agree that preferably, we should also provide a (low,step,high) or (low,step,size) API (cf Bug 699).
6383
9
gael.guennebaud
2016-10-13 09:47:50 +0000
I've also been thinking about using 64 bits integers. I would not waste too much time for the very unlikely case of size>2^31, but the main problem is that 64 bits integers are not supported everywhere...
6384
10
chtz
2016-10-13 10:51:22 +0000
(In reply to Gael Guennebaud from comment #9)
> but the main problem is that 64 bits integers are not supported everywhere...
We could hand-craft a custom 64 bit integer type (similar to UInt128) for that case as well (ideally in a templated way to avoid code duplication with int128). Of course we should only use it if no int64 is available natively.
6385
11
gael.guennebaud
2016-10-13 12:58:42 +0000
ok, here is another thought: in my opinion, if either (high-low+1)%size or size%(high-low+1) is different than zero, then the notion of "linearly spaced values" is not well defined for integers. If a user make such a call to LinSpaced, then its most likely a bug. So I would suggest to assert on such cases, and then implement it as follows:
range = high-low+1;
if(range>=n)
{
assert((range%n)==0);
return low + i * (range/n);
}
else
{
assert((n%range)==0);
return low + i / (n/range);
}
Note that this different than the current "low + (high-low)*i/(size-1)", but in my opinion this new version makes more sense for integers. For instance, LinSpaced(6,0,2) would produce:
0 0 1 1 2 2
whereas it currently produces:
0 0 0 1 1 2
If the user wants fancy numbers, then it's up to him to call LinSpaced with floating point numbers with appropriate range and rounding.
6400
12
chtz
2016-10-16 11:06:25 +0000
I agree, 0 0 1 1 2 2 makes much more sense than 0 0 0 1 1 2. And an assertion for the invalid cases is certainly better than silently returning bad results.
6419
13
gael.guennebaud
2016-10-24 13:34:50 +0000
I'm preparing a patch, but without the assertion for now. This essentially means that if the input range [low,high] does not admit an even spacing, then high is lowered to the largest value meeting the constraints. I'll document it with examples.
6420
14
gael.guennebaud
2016-10-24 13:51:59 +0000
https://bitbucket.org/eigen/eigen/commits/570884465ad7/
Summary: Bug 698: rewrite LinSpaced for integer scalar types to avoid overflow and guarantee an even spacing when possible.
Otherwise, the "high" bound is implicitly lowered to the largest value allowing for an even distribution.
This changeset also disable vectorization for this integer path.