This bugzilla service is closed. All entries have been migrated to https://gitlab.com/libeigen/eigen

Bug 564

Summary: maxCoeff() returns -nan instead of max, while maxCoeff(&maxRow, &maxCol) works
Product: Eigen
Reporter: Karl <pelonza>
Component: Core - expression templates
Assignee: Nobody <eigen.nobody>
Status: DECISIONNEEDED
Severity: Unknown
CC: chtz, gael.guennebaud, jacob.benoit.1, rmlarsen, smatzek
Priority: Normal
Version: unspecified
Hardware: All
OS: All
Whiteboard:
Bug Depends on: 1687, 1373
Bug Blocks: 814
Attachments:
Add hasNaN and isFinite members. (Flags: gael.guennebaud: review?)

Description Karl 2013-03-13 18:46:17 UTC
I've got code that, due to what I think are errors elsewhere, ends up leaving some "-nan" entries in a matrix I'm searching for max values in. But the two maxCoeff(...) calls return different values....
I'm not sure which version of Eigen is installed (one of the 3.x releases, I think) since I didn't install it myself.

When I call:

double mymax;
ArrayXXd smAry;                  // filled elsewhere; contents shown below
ArrayXXd::Index maxRow, maxCol;

mymax = smAry.maxCoeff();
mymax = smAry.maxCoeff(&maxRow, &maxCol);


--------------------
smAry looks like (when output with cout<<smAry):
   0    4    3 -nan
   4 -nan    5    3
   3    5    0    5
   0    3    5 -nan

Value of mymax after the first call:
-nan
Value of mymax after the second call:
5
Comment 1 Christoph Hertzberg 2013-03-13 21:43:34 UTC
I agree that they should definitely return the same value, and we should specify how they deal with NaNs. Keep in mind that "the biggest" does not suffice as a specification, since both (5 > -nan) and (-nan > 5) are false by IEEE754.

What's definitely wrong though is:

	#include <Eigen/Core>
	#include <iostream>

	Eigen::ArrayXd arr(6);
	arr << 2, 2, 0.0/0.0, 0.0/0.0, 1, 1;  // 0.0/0.0 yields NaN
	std::cout << arr.transpose() << "\nmaxCoeff() = " << arr.maxCoeff();

Output (at least when compiled with vectorization enabled):
  2   2 nan nan   1   1
maxCoeff() = 1

The problem is that maxps and maxpd always return the source (second) operand if at least one operand is NaN.
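For illustration, here is a minimal standalone sketch of that behavior (my own, not part of Eigen; it assumes an SSE-capable x86 target):

#include <immintrin.h>
#include <cmath>
#include <cstdio>

int main() {
  __m128 two = _mm_set1_ps(2.0f);
  __m128 nan = _mm_set1_ps(std::nanf(""));
  float a[4], b[4];
  _mm_storeu_ps(a, _mm_max_ps(two, nan)); // source (second) operand is NaN -> NaN
  _mm_storeu_ps(b, _mm_max_ps(nan, two)); // source (second) operand is 2   -> 2
  std::printf("%f %f\n", a[0], b[0]);     // typically prints: nan 2.000000
}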
Comment 2 Gael Guennebaud 2013-03-19 10:55:21 UTC
I don't really know how to fix that except by documenting that the result of min/max is undefined in the presence of NaN. Vectorization is not the only problem here. For instance, look at this example:

double nan = std::numeric_limits<double>::quiet_NaN(); // requires <limits>
std::cout << std::max(std::max(2.,nan),1.0) << "\n";
std::cout << std::max(std::max(nan,2.),1.0) << "\n";
std::cout << std::max(1.0,std::max(nan,2.)) << "\n";

The result is:

2
nan
1
Comment 3 Gael Guennebaud 2013-03-19 11:01:04 UTC
Well, it seems OK for std::max to return an arbitrary value because it's only a binary operator, but in our case I admit that's not good. However, I don't see how to fix this without introducing a large overhead.
Comment 4 Christoph Hertzberg 2013-03-19 11:26:11 UTC
So std::max seems to propagate the first operand if one operand is NaN. If the SSE intrinsics for maxpX always propagate the same (source) operand, a solution would be to always use the current max values as the source and the newly read values as the destination operand. This would give defined behavior at the cost of a minor overhead (which could be disabled if some fast_math define is present). I'm not sure whether compilers are allowed to swap the operands of SSE intrinsics, given that they are "almost" commutative.

Pseudo code for VectorXd v (std::max can be replaced by the appropriate SSE-intrinsics):

Scalar maxval = v(0);
for(int i=1; i<v.rows(); ++i){
  // defined behavior, NaNs are ignored:
  maxval = std::max(v(i), maxval);

  // spurious behavior:
  // maxval = std::max(maxval, v(i));
}
return maxval;
Comment 5 Christoph Hertzberg 2013-03-20 14:11:51 UTC
I did some testing and I must admit that I was a bit too enthusiastic in my last post (it only works if the first entry/packet(s) is not a NaN).
Otherwise, a test that introduces some overhead in fact needs to be implemented.
Something like this (pseudo code):
stable_max(a,b) = a>b || !(b==b) ? a : b; // always propagates non-NaNs
stable_max(a,b) = a>b || !(a==a) ? a : b; // always propagates NaNs

So maybe documenting it as undefined if NaNs are present is fine (Karl might disagree there).
Comment 6 Gael Guennebaud 2013-03-20 14:44:56 UTC
Yes, I was also thinking about documenting it as undefined, because the purely sequential approach you described would also kill the performance regarding pipelining, probably more than your stable_max.

One can also easily call vec.redux(stable_max) with their favorite stable_max version, or can we find an elegant API to offer these variants?
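(For illustration, a minimal sketch of such a redux call; the functor name stable_max_op is my own, not an Eigen API, and it implements the non-NaN-propagating variant from Comment 5:)

#include <Eigen/Dense>
#include <cmath>
#include <iostream>

// NaN-ignoring maximum: return b only if it is a valid value larger than a
struct stable_max_op {
  double operator()(double a, double b) const {
    return (a > b || b != b) ? a : b;
  }
};

int main() {
  Eigen::ArrayXd a(4);
  a << 2.0, std::nan(""), 5.0, 1.0;
  std::cout << a.redux(stable_max_op()) << "\n"; // prints 5
}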

I also think that we really should add a convenient and easily accessible way to check for the presence of NaN, like adding:

bool hasNaN() const { return !((this->array()==this->array()).all()); }
Comment 7 Karl 2013-03-20 15:03:00 UTC
So, my biggest issue was more that it wasn't documented than that it didn't work. Though I'd certainly prefer it to work as I'd expect...

I'm not precisely sure where the overhead you're talking about for making a stable version comes from, but the code guru in my research group threw together a templated max function for sparse matrices for me. It uses the <limits> header to get more stable behavior:

#include <limits>
#include <algorithm>   // for std::max
(some other includes for sample code, and eigen)

template<typename M>
typename M::Scalar sparse_max(const M& m) {
  typedef typename M::Scalar scalar;
  typedef typename M::Index index;
  typedef typename M::InnerIterator iterator;

  // note: for floating-point types this is the smallest positive value,
  // not the most negative representable one
  scalar res = std::numeric_limits<scalar>::min();

  // visit only the stored entries of the sparse matrix
  for(index k = 0; k < m.outerSize(); ++k)
    for(iterator it(m, k); it; ++it)
      res = std::max(res, it.value());  // a NaN in it.value() is ignored here
  return res;
}


Perhaps using something like the numeric limit would introduce only a very minimal overhead? Also, this function might be useful to include in general for your sparse matrices.
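(A hypothetical usage sketch of the helper above, assuming the sparse_max template is in scope; the example values are made up:)

#include <Eigen/SparseCore>
#include <iostream>

int main() {
  Eigen::SparseMatrix<double> S(3, 3);
  S.insert(0, 1) = 4.0;
  S.insert(2, 2) = 7.0;
  std::cout << sparse_max(S) << "\n"; // prints 7 (only stored entries are visited)
}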

It could also be an overload of the min/max functions that takes a "global minimum" to use for the first comparison, while making sure the more stable operand order is used internally (with the NaN as the second argument).
Something like:
maxCoeff(&maxRow, &maxCol, <matrix type (double, etc.)> <value the max must be greater than>)

I realize it's nearly identical to (mat > <value>).any() ... but...
Comment 8 Karl 2013-03-20 15:06:16 UTC
I also think it is important to specifically document this behavior, since there might be people who want to know whether a NaN is present, or who want the minimum value (there's not really an easy way right now without a self-defined redux), and that seems like a reasonably common thing to check for.
Comment 9 Christoph Hertzberg 2013-03-20 15:16:24 UTC
We could add stable_max, stable_min methods, maybe with a template argument
which decides whether NaNs shall be suppressed or propagated.
Or directly add maxCoeff<Eigen::SuppressNaN>(), etc.

As for hasNaN(), this would certainly come in handy from time to time, maybe
along with an isFinite() method.

@Karl, comment 7: Basically that was what I had in mind with comment 4 (I mixed up the operand order of max, though ...). 
Two minor problems: if your matrix consists only of NaNs or -Inf, numeric_limits::min() will still be returned. And it would require the redux function to have some kind of initial value (which again would introduce some overhead, especially for small matrices/vectors).

And I would also say that I actually expect NaNs to propagate through max/min reductions (i.e., if a mistake happened once, I don't want it to be silently ignored)
Comment 10 Gael Guennebaud 2013-04-09 21:39:42 UTC
Step 1, documentation:

https://bitbucket.org/eigen/eigen/commits/cd15b09304ff/
Changeset:   cd15b09304ff
User:        ggael
Date:        2013-04-09 11:27:54
Summary:     Bug 564: document the fact that minCoeff/maxCoeff members have undefined behavior if the matrix contains NaN.
Comment 11 Gael Guennebaud 2013-04-09 23:01:13 UTC
Created attachment 327 [details]
Add hasNaN and isFinite members.

Here is a patch adding hasNaN and isFinite members. Probably not the most optimal implementation for native types, but it should not be too bad. Since we already had a beta, I'd like someone to approve it before pushing.
Comment 12 Christoph Hertzberg 2013-04-12 14:37:52 UTC
I think the patch is alright. Maybe some more tests could be added asserting that floating point operations on "sane values" lead to finite results?

I've added another Bug 585, which suggests improving the all()/any() reductions. 
Furthermore, hasNaN could be optimized by using the _mm_cmpunord_pX intrinsic on two consecutive packets, thus halving the number of comparisons. But I don't think that's worth blocking 3.2.
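(A rough sketch of that trick, my own and not the attached patch, for 8 consecutive floats on an SSE target:)

#include <immintrin.h>

// One cmpunordps handles two packets at once: a result lane is all-ones
// if either of the corresponding input lanes is NaN.
bool has_nan_8(const float* p) {
  __m128 p0 = _mm_loadu_ps(p);
  __m128 p1 = _mm_loadu_ps(p + 4);
  return _mm_movemask_ps(_mm_cmpunord_ps(p0, p1)) != 0;
}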
Comment 13 Christoph Hertzberg 2013-04-12 14:43:19 UTC
Another thing: what should hasNaN() and isFinite() return for types which do not support special values (such as integers)? Simply always return false or true, respectively? The current implementation already supports that (not in the most efficient way; however, compilers are pretty good at optimizing integer code, so it could be reduced to a NOP on most compilers?)
Comment 14 Gael Guennebaud 2013-04-16 15:06:01 UTC
I checked with GCC and clang, and indeed, isFinite and hasNaN boil down to a NOP.

Using _mm_cmpord_pX (not _mm_cmpunord_pX) is a nice trick, however I'm afraid this is not easily integrable within our general framework.
Comment 15 Gael Guennebaud 2013-04-16 15:12:06 UTC
https://bitbucket.org/eigen/eigen/commits/8ad4e281a3ce/
Changeset:   8ad4e281a3ce
User:        ggael
Date:        2013-04-16 15:10:40
Summary:     Bug 564: add hasNaN and isFinite members

The "stable" versions still remain to be done -> for 3.3
Comment 16 Christoph Hertzberg 2013-04-17 10:47:13 UTC
(In reply to comment #14)
> Using _mm_cmpord_pX (not _mm_cmpunord_pX) is a nice trick, however I'm afraid
> this is not easily integrable within our general framework.

Yes, I guess that would require the "meta package" approach which has been vaguely discussed for some time (is there a bz entry for that?).
And I agree that it is not really worth the effort for now, since hasNaN will most likely hardly ever be performance critical, and even if it were, I think memory throughput is the limiting factor most of the time.
Comment 17 Gael Guennebaud 2013-04-19 14:08:48 UTC
(In reply to comment #16)
> I think memory throughput is the limiting factor most of the time.

very good point. Definitely not worth the effort!
Comment 18 Gael Guennebaud 2013-07-18 11:31:47 UTC
For the record, isFinite has been renamed allFinite to avoid a naming collision with a coefficient-wise isFinite method.
Comment 19 Christoph Hertzberg 2018-03-01 18:35:47 UTC
Continuing a discussion from Bug 1494:

The current IEEE754-2008 standard defines new functions `minNum` and `maxNum` (section 5.3.1), which propagate non-NaNs (like my first `stable_max` variant in Comment 5).
We could add new functions with the same names, copying that behavior, and leave the min/max functions undefined for NaN inputs (IEEE754-2008 does not define min or max functions, as far as I can tell).
Comment 20 Samuel Matzek 2018-03-05 19:08:51 UTC
I agree that two sets of functions may be the way to go. The various places that currently call pmin/pmax can then change as necessary. For instance, the scalar_ops used by tensors would probably want the NaN-propagating versions, since Bug 1373 (http://eigen.tuxfamily.org/bz/show_bug.cgi?id=1373) noted this.

To re-iterate the two behaviors:

One set, like IEEE754-2008's `minNum` and `maxNum`, which propagates numbers over NaNs:

min(x, NaN) = min(NaN, x) = x
max(x, NaN) = max(NaN, x) = x

Another set, like std::min/std::max, which returns the first argument whenever a NaN is involved:

min(NaN, x) = NaN
min(x, NaN) = x

max(NaN, x) = NaN
max(x, NaN) = x
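(A small self-contained illustration of the two behaviors, using std::fmax for the first set and std::max for the second:)

#include <algorithm>
#include <cmath>
#include <iostream>

int main() {
  double nan = std::nan("");
  std::cout << std::fmax(2.0, nan) << " " << std::fmax(nan, 2.0) << "\n"; // 2 2  (maxNum: the number wins)
  std::cout << std::max(nan, 2.0)  << " " << std::max(2.0, nan)  << "\n"; // nan 2 (first argument wins with NaN)
}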
Comment 21 Christoph Hertzberg 2018-03-05 19:42:54 UTC
I would keep the default behavior as 'most efficient implementation' and document that the behavior is undefined if NaNs are involved (maybe describe what the typical behavior on some platforms is, and try to unify that behavior if it does not make a performance difference). In my opinion, NaN propagation would usually make more sense than number propagation (but we could of course implement all variants).
Comment 22 Christoph Hertzberg 2019-06-21 07:46:33 UTC
Related pull-request: https://bitbucket.org/eigen/eigen/pull-requests/658/
Comment 23 Rasmus Munk Larsen 2019-06-24 23:04:00 UTC
The IEEE754-2008 behavior is also "codified" in std::fmin/std::fmax now. My PR https://bitbucket.org/eigen/eigen/pull-requests/658/ changes pmin to follow this, but I agree that this is perhaps too conservative, as it sacrifices speed.

The question in my mind is whether we want to bloat the public API to explicitly let users control this corner case. I would tend to lean towards using the EIGEN_FAST_MATH flag to indicate whether the behavior is defined or not. One option would be that EIGEN_FAST_MATH==true gives no guarantees and in practice yields the current platform-dependent behavior, while EIGEN_FAST_MATH==false gives the IEEE754-2008 `minNum` and `maxNum` behavior of std::fmin/std::fmax.
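(A rough standalone sketch of that idea; the macro MY_FAST_MATH and the helper generic_pmax are hypothetical stand-ins, not Eigen's actual names:)

#include <cmath>

#ifndef MY_FAST_MATH
#define MY_FAST_MATH 1
#endif

template <typename T>
T generic_pmax(T a, T b) {
#if MY_FAST_MATH
  return a > b ? a : b;    // fastest; behavior with NaN is left platform-dependent
#else
  return std::fmax(a, b);  // IEEE754-2008 maxNum: the number wins over NaN
#endif
}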
Comment 24 Christoph Hertzberg 2019-06-25 08:26:21 UTC
One problem with making this depend on a compile-time flag is that it is difficult to link together compilation units which want different behavior.
Also, the granularity of control is difficult; e.g., if you want stable min/max behavior but vectorized sin/cos computations, you would have a problem.

And as said in Bug 1687, I'm not happy with EIGEN_FAST_MATH in general, at least not with its current behavior.
Comment 25 Christoph Hertzberg 2019-06-25 18:03:34 UTC
Here is a minimal API-demonstration of what I have in mind regarding a templated version:
https://godbolt.org/z/5VyfDX, i.e., the following calls would be possible:
   
   V.maxCoeff(); // default behavior (fastest)
   V.maxCoeff<Fastest>();
   V.maxCoeff<PropagateNaN>();
   V.maxCoeff<PropagateNumbers>();

Of course, names are open for discussion, and this would need to be extended to all other variants. And the enum may get extended in the future.
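(A rough standalone sketch, not the actual Eigen implementation, of how such a template parameter could dispatch to different scalar comparison ops; the function names max_coeff and scalar_max are placeholders, only the enum values mirror the proposal above:)

#include <cstddef>

enum NaNPolicy { Fastest, PropagateNaN, PropagateNumbers };

template <NaNPolicy P> double scalar_max(double a, double b);
template <> double scalar_max<Fastest>(double a, double b)          { return a > b ? a : b; }
template <> double scalar_max<PropagateNaN>(double a, double b)     { return (a > b || a != a) ? a : b; }
template <> double scalar_max<PropagateNumbers>(double a, double b) { return (a > b || b != b) ? a : b; }

template <NaNPolicy P = Fastest, typename Vec>
double max_coeff(const Vec& v) {
  double m = v[0];
  for (std::size_t i = 1; i < static_cast<std::size_t>(v.size()); ++i)
    m = scalar_max<P>(m, v[i]);
  return m;
}

// usage, mirroring the calls above:
//   max_coeff(v);                      // default behavior (fastest)
//   max_coeff<PropagateNaN>(v);
//   max_coeff<PropagateNumbers>(v);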



There are actually some cases which could be optimized, e.g.,

   float foo(VectorXf const& V) {
       return V.cwiseAbs().maxCoeff<PropagateNaN>();
   }

This could be implemented using a pand and an integer max (because for positive floats, the bit pattern is monotonically increasing). Maybe this could be exposed as V.maxCoeff<AbsValuePropagateNaN>(), or so (but that could easily be added later, as well as variants which return the maximal next power of two, cf. Bug 1666).
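(A scalar illustration, my own, of the bit-pattern claim behind that trick:)

#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static std::uint32_t bits(float x) { std::uint32_t u; std::memcpy(&u, &x, sizeof u); return u; }

int main() {
  // for non-negative floats the raw bit pattern is monotonically increasing,
  // and quiet-NaN patterns compare above +infinity
  assert(bits(0.5f) < bits(1.0f));
  assert(bits(1.0f) < bits(2.0f));
  assert(bits(2.0f) < bits(std::numeric_limits<float>::infinity()));
  assert(bits(std::numeric_limits<float>::infinity()) < bits(std::nanf("")));
}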


And if V.maxCoeff<PropagateNumbers>() is allowed to return -inf when all entries are NaN, we could (in the main loop) do something like:

    Packet4f mx = pset1<Packet4f>(-infinity);
    for(int i=0; i<V.size(); i+=4) mx = pmax(mx, V.packet(i));

Of course, this needs to be unrolled, because `pmax` has some latency.
Comment 26 Gael Guennebaud 2019-07-01 13:34:46 UTC
If we really want to expose the different behaviors, the only option to me is to expose them at the call-site. Then we could go with different function names (A):

  a.maxCoeff();
  a.maxCoeffOrNaN();
  a.maxCoeffIgnoreNaN();
  a.max(b);
  a.maxOrNaN(b);
  a.maxIgnoreNaN(b);

a function parameter (B):

  a.maxCoeff(PropagateNaN);
  a.maxCoeff(PropagateNumbers);
  a.max(PropagateNaN,b);
  a.max(PropagateNumbers,b);

or a template parameter (C) as you proposed.

All variants have downsides:

(C) requires 'a.template maxCoeff<...>()' in template code.

(B) and (C) pollute the Eigen namespace, and in most cases will require the user to write:

   a.maxCoeff(Eigen::PropagateNaN);
   a.template maxCoeff<Eigen::PropagateNaN>();

(B) Which one comes first, the policy (as in the STL) or the second array? I.e., a.max(PropagateNaN,b) or a.max(b,PropagateNaN)?

(A) pollutes the DenseBase API.
Comment 27 Christoph Hertzberg 2019-07-01 16:33:31 UTC
(In reply to Gael Guennebaud from comment #26)
> (C) requires 'a.template maxCoeff<...>()' in template code.

Agreed, this is sub-optimal. But I'd prefer this to options (A) and (B), also because in this case forwarding the template parameter to the internal implementation should be simple. It only requires templatizing (without code duplication) `scalar_max_op`/`scalar_min_op` (and some related methods) and specializing `pmax`/`pmin`, et al.

And `a.maxCoeff()` would of course still be possible (i.e., the current behavior does not change).


But if this has very few potential users, I'm also fine with just closing this (and perhaps providing an example implementation in the documentation showing how to do this manually, like here: https://bitbucket.org/snippets/chtz/oAojXr).
Comment 28 Gael Guennebaud 2019-07-01 16:47:05 UTC
I'm fine with (C).

On my side, I see how PropagateNaN could be useful, at least for debugging, and it seems that TF is rather interested in the PropagateNumbers variant.

Rasmus, is option (C) fine to you? Are you willing to adjust your PR to match this design?
Comment 29 Rasmus Munk Larsen 2019-08-01 16:57:47 UTC
Sorry for the late response, I was on vacation. I'm OK with (C). Do we want to support 3 values of the template parameter, e.g.

PropagateNumbers (fmax/fmin)
PropagateNaN
PropagateFirstArgIfNaN (std::min/std::max)

?
Comment 30 Nobody 2019-12-04 12:09:23 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/564.