This bugzilla service is closed. All entries have been migrated to https://gitlab.com/libeigen/eigen
Bug 618 - Tests containing NaN within > or < operators fail on Windows EC 7
Summary: Tests containing NaN within > or < operators fail on Windows EC 7
Status: DECISIONNEEDED
Alias: None
Product: Eigen
Classification: Unclassified
Component: Tests (show other bugs)
Version: 3.3 (current stable)
Hardware: Other Windows
: Normal Wrong Result
Assignee: Nobody
URL:
Whiteboard:
Keywords: Compatibility
Depends on:
Blocks:
 
Reported: 2013-06-20 08:00 UTC by Christian Schluchter
Modified: 2019-12-04 12:25 UTC (History)
5 users (show)



Attachments
Patch removing tests on results, leaving just a test that function returns (3.25 KB, patch)
2013-06-21 23:18 UTC, Jitse Niesen
no flags Details | Diff
Patch removing tests on results, leaving just a test that function returns (4.01 KB, patch)
2013-06-21 23:25 UTC, Jitse Niesen
no flags Details | Diff

Description Christian Schluchter 2013-06-20 08:00:31 UTC
When NaN and a number is compared using <, >, ... operators, they always return false on Win32, however on Win EC 7 they always return true.
e.g.
(3.3 < NaN) is false on Win32, true on WinEC
(3.3 > NaN) is false on Win32, true on WinEC

This difference causes unit tests with a matrix containing NaN to fail:
- eigensolver_complex
- eigensolver_generic
- eigensolver_selfadjoing
- jacobi_svd_7
- schur_complex
- schur_real
(list might be incomplete)

sample output:
Test eigensolver(MatrixXd(s,s)) failed in ..\..\test\eigensolver_generic.cpp (61)
		    test_is_equal(eiNaN.info(), NoConvergence)
		    actual   = 0
		    expected = 2

I hope the productive code does not rely on these operators returning false if one argument is NaN.
Comment 1 Christoph Hertzberg 2013-06-21 11:30:22 UTC
What kind of CPU does your WinEC run on? And did you activate some kind of fast-math compile options or CPU-flags? (Having NaN comparisons result in true is very non-standard ...)

I think most of the time we say that NaNs will lead to undefined behavior, but it would be a good idea to deactivate corresponding tests if we really don't know what will happen.
Comment 2 Christian Schluchter 2013-06-21 13:06:05 UTC
We run it currently on a TI OMAP 4, but experienced the same differing NaN handling on earlier Windows CE versions and older processors.
There are no optimizations or other special flags activated. 

It seems to be broken in WinCE/EC since VC6, unfortunately the only information I found about it is this excerpt from a perl wince port:
in http://osdir.com/ml/lang.perl.perl5.porters/2006-05/msg00000.html they define 

#if _MSC_VER < 1300
/* VC6 has broken NaN semantics: NaN == NaN returns true instead of false */
#define NAN_COMPARE_BROKEN 1
#endif

Anyway, the NaN handling is inconsistent, so I agree with you that the respective tests are not very useful. Additionally, it might be good to check that the productive code never relies on these comparisons returning true.
Comment 3 Christian Schluchter 2013-06-21 13:07:51 UTC
Forgot to mention: We use _MSC_VER 1500, not < 1300 as the perl port checks, but it obviously is still broken.
Comment 4 Christoph Hertzberg 2013-06-21 13:27:06 UTC
Does your productive code depend on correct NaN handling?

W.r.t making the test suite work:
Do you know the which defines are set to detect Windows CE/EC? 
And do you know if there is a reliable way to detect NaNs on these systems?
Comment 5 Christian Schluchter 2013-06-21 13:59:22 UTC
Our code does not work with NaN within Eigen, so we are not depending on this.

For the tests:
You can check for _Win32_WCE (http://msdn.microsoft.com/en-us/library/ee479161(v=winembedded.60).aspx)

_isnan should do the job (http://msdn.microsoft.com/en-us/library/ms860526.aspx)
Comment 6 Jitse Niesen 2013-06-21 23:17:53 UTC
(In reply to comment #2)
> /* VC6 has broken NaN semantics: NaN == NaN returns true instead of false */

That is how .hasNaN() is implemented and this function is implemented in the  special_numbers test. But that test passes? 

I'm pretty sure the IEEE floating point standard specified how inequality tests with NaN should be handled, though I do not have the text ready, so I can't be absolutely certain. It is annoying if either Windows CE or the hardware does not do this properly, but I guess we'll have to live with it.

I agree with Christoph, it is not necessary to define what happens if the user presents us with a matrix containing NaN. So I prepared a patch removing the test that NoConvergence is reported when the matrix contains NaN. It now only tests that the call to compute() returns. The alternative would be to test the matrix specifically for NaN at the beginning.

I am attaching the patch to this bug report.
Comment 7 Jitse Niesen 2013-06-21 23:18:53 UTC
Created attachment 353 [details]
Patch removing tests on results, leaving just a test that function returns
Comment 8 Jitse Niesen 2013-06-21 23:25:37 UTC
Created attachment 354 [details]
Patch removing tests on results, leaving just a test that function returns

Sorry, I missed a file in the previous patch, so I'm replacing it with a new patch.

By the way, I have no idea what is happening with the jacobisvd test. Could you please run it by hand and report the error message?
Comment 9 Gael Guennebaud 2013-06-21 23:52:08 UTC
What about disabling convergence tests only if NaN==NaN (tested at runtime)?
Comment 10 Christian Schluchter 2013-06-24 12:31:16 UTC
I would say removing the verifies definitely is the right thing, as it is not clear what they verify in the case of NaN.

On WinCE, however, the problem starts earlier: as soon as a comparison with NaN is done, which happens e.g. within jacobisvd_7, a debug message (turned into an assertion in the tests) "invalid operator <" appears, that is originating from Visual Studio's "less than" implementation:

template<class _Ty1, class _Ty2> inline
	bool __CLRCALL_OR_CDECL _Debug_lt(const _Ty1& _Left, const _Ty2& _Right,
		const wchar_t *_Where, unsigned int _Line)
	{	// test if _Left < _Right and operator< is strict weak ordering
	if (!(_Left < _Right))
		return (false);
	else if (_Right < _Left)
		_DEBUG_ERROR2("invalid operator<", _Where, _Line);
	return (true);
	}

This debug error appears as soon as NaN comparisons return true, which is very unfortunately the case on WinCE, and therefore at least a deviating behavior if not a bug within WinCE.
In my opinion, Microsoft should have fixed that years ago. Since they have not done that, I would say there are two possible solutions:

- check that no comparisons with NaN are done in the whole library
- or, just make sure that no NaN gets into Eigen on WinCE

Regarding the effort, I assume the second one is the way to go.
Comment 11 Jitse Niesen 2013-06-24 13:47:52 UTC
(In reply to comment #9)
> What about disabling convergence tests only if NaN==NaN (tested at runtime)?

That is of course possible, but it seemed rather inconsistent to me.

As I see it, the possibilities are:

1. Guarantee that the eigensolvers (and similar routines) signal NoConvergence if a NaN is passed to them, document this, and keep the unit test for this. In that case, I think we should explicitly test for NaN at the start of the compute() routine in the eigensolver. At the moment, the computation just happens to not converge in most cases, but this seems to be mostly an accident; for instance, a 1-by-1 matrix with a NaN does not yield NoConvergence.

2. Guarantee only that the compute() routine in the eigensolvers return, without saying anything on the result. In that case, I would say we should also remove the unit test that the eigensolver signals NoConvergence, because the unit tests and documentation should reflect what we require our functions to do. As a variant, we could also give no guarantees at all and assert that the matrix passed does not have a NaN.

3. Disabling convergence tests only if NaN==NaN (tested at runtime) would mean a guarantee that the eigensolvers (and similar routines) signal NoConvergence if a NaN is passed to them, but not on WinCE. We should try to have the same behaviour for each platform (though in the view of all the inconsistencies with NaNs on WinCE that Christian discovered, it is attractive to just ignore it). Alternatively, it could mean that we don't require our eigensolver to signal NoConvergence, but we test for it anyway. That seems to signal the wrong thing to our users.
Comment 12 Christoph Hertzberg 2014-11-02 19:05:59 UTC
(In reply to Christian Schluchter from comment #0)
> (3.3 < NaN) is false on Win32, true on WinEC
> (3.3 > NaN) is false on Win32, true on WinEC

If that behavior is guaranteed on WinCE (WinEC?) we could try to replace such comparisons with (3.3 < NaN)  -->  !(3.3>=NaN).
This would however significantly clutter our code only to fully support a rather uncommon platform.

The most easy solution (from our side) would be to document that behavior on WinCE platforms is undefined if expressions contain NaNs.
Comment 13 Christian Schluchter 2014-11-03 02:16:31 UTC
All comparisons with NaN are false on Win32 and true on WinCE, so also 3.3 >= NaN is false on Win32 and true on WinCE. That is not really a solution unless you want to add ifdefs to each comparison that might encounter a NaN. If you want that you don't need the double negation at all.

Also, I guess we cannot rely on that violation of the standard to be consistently violated throughout all WinCE versions.

(To your WinEC vs WinCE question: starting with WinCE 7 it is called Windows Embedded Compact and sometimes referred to as WinEC, but I guess you are right, the official abbreviation might still be WinCE.)
Comment 14 Christoph Hertzberg 2014-11-03 12:18:32 UTC
(In reply to Christian Schluchter from comment #13)
> unless you want to add ifdefs to each comparison that might encounter a NaN.

It would only require to implement functions like less_and_not_NaN(a,b) once (depending on WinCE'ness) and use these functions where needed. But, as I said above, that would still severely clutter our code (furthermore, it will be near impossible to ensure that all required places are handled that way).

> Also, I guess we cannot rely on that violation of the standard to be
> consistently violated throughout all WinCE versions.

Yes true, but we could define above mentioned functions depending on the version -- or we could implement of a safer isNaN test and test for that explicitly before doing the actual comparison for all WinCE targets. (I don't really want to advocate this solution, though ...)


My preferred "solution" at the moment would be to run a test program (during configuration) which checks NaN!=NaN. If that program fails we print a warning and disable test cases which rely on correct NaN handling. This might also cover the case if someone runs the test-suite with some sort of -ffast-math compile flag.
Comment 15 Gael Guennebaud 2014-11-04 01:02:14 UTC
(In reply to Christoph Hertzberg from comment #14)
> My preferred "solution" at the moment would be to run a test program (during
> configuration) which checks NaN!=NaN. If that program fails we print a
> warning and disable test cases which rely on correct NaN handling. This
> might also cover the case if someone runs the test-suite with some sort of
> -ffast-math compile flag.

I agree with that approach.
Comment 16 Nobody 2019-12-04 12:25:04 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to gitlab.com's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.com/libeigen/eigen/issues/618.

Note You need to log in before you can comment on or make changes to this bug.