A good suggestion from mattd (http://forum.kde.org/viewtopic.php?f=74&t=93733&p=190551#p190474):
Perhaps CSV support would be possible in the future?
It's standardized (RFC 4180) and seems to be _the_ lowest-common-denominator data-exchange format supported by a very wide range of software, from spreadsheets (e.g. Gnumeric, Microsoft Excel, Open Office), through scientific computing software (e.g. Mathematica, MATLAB, GNU Octave, Scilab), to a variety of statistical packages (E-Views, Ox, OxMetrics, R, SAS, STATA, ...).
Ideally, overloading the operator<< so that in addition to current functionality it accepts a csv-file as a right-operand would fit with the way Eigen does things -- and "comma-initialization" description would still fit perfectly!
I'm not convinced that operator<< is the right fit for that, but why not.
I think it would make more sense to overload operator>>(std::istream&, Eigen::DenseBase<Derived>&), thus being compatible to standard C++ I/O. Of course istream can be an ifstream.
One question is whether CSV streams should always be accepted, or only with something like:
stream >> A.format(...); // with some fitting format description.
I agree with the idea of following the Standard Library.
My thinking is that it would be good to support both std::istream (including std::ifstream) and std::ostream (including std::ofstream), with the respective use of operator>> for input (input_stream >> eigen_object) -- and operator<< for output (output_stream << eigen_object).
CSV output is already possible, using something like this:
IOFormat CSVFmt(FullPrecision, 0, ", ");
// or, without column alignment and with a tab after each comma:
// IOFormat CSVFmt(FullPrecision, DontAlignCols, ",\t");
// and then
std::cout << A.format(CSVFmt);
Created attachment 434
stream operator>> for DenseBase
I overloaded std::istream &operator>>(std::istream &s, DenseBase<Derived> &m). The source is attached as the file eigen.h.
The implementation makes it possible to read (probably) any format that the IOFormat class can express.
What does work:
- formats mentioned in http://eigen.tuxfamily.org/dox/structEigen_1_1IOFormat.html
- CSV Format with IOFormat(FullPrecision, 0, ",") // no blank in ","!!!
I tested it with LibreOffice and it turned out to be quite complicated because LibreOffice does not honor the file format of the source file.
What does not work (yet):
- The matrix needs either fixed dimensions or an already known size. Eigen::Dynamic is not possible because I don't know how to resize a DenseBase object. Also, runtime detection of the 'data size' in the stream is really ugly and not possible for every IOFormat. (In the end I removed this code.)
- No runtime selection of IOFormat. The format has to be selected by defining EIGEN_DEFAULT_IO_FORMAT.
Hope this helps,
(In reply to comment #4)
> Created attachment 434
> stream operator>> for DenseBase
Nice start. Perhaps we can finalize I/O for 3.3
> What does not work (yet):
> - Matrix needs either fixed dimensions or a known size.
There is a resize(rows, cols) method defined in DenseBase with a meaningful specialization for Matrix<...> and Array<...>.
But I agree that this is not easy to integrate: on the one hand, not every format allows automatic detection of where a matrix (or one of its rows) ends; on the other hand, resizing is expensive, because all data needs to be copied to the resized matrix on each resize operation.
A feasible solution would involve some temporary data storage, e.g. in a std::vector<Scalar>.
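To make the temporary-storage idea concrete, here is a minimal sketch using only the standard library. ParsedTable, parseCsvCell and readCsvBuffered are hypothetical names invented for this illustration and are not part of Eigen; the sketch also assumes plain numeric cells, so RFC 4180 quoting is not handled. The stream is read once into a std::vector<double>, the dimensions are discovered along the way, and the caller's object is only touched after everything parsed cleanly.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical holder for the parsed data; this is not an Eigen type.
struct ParsedTable {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<double> values;  // row-major, rows * cols entries
};

// Parses one CSV cell as a double; rejects empty cells and trailing garbage.
inline bool parseCsvCell(const std::string& cell, double& value) {
    std::istringstream in(cell);
    if (!(in >> value)) return false;
    in >> std::ws;    // skip trailing blanks, if any
    return in.eof();  // anything left over means the cell was not a pure number
}

// Reads comma-separated numeric rows until end of stream into a temporary
// buffer. `out` is only modified if the whole input was consistent.
inline bool readCsvBuffered(std::istream& in, ParsedTable& out) {
    ParsedTable tmp;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // CRLF line ends
        if (line.empty()) continue;                                 // skip blank lines
        std::istringstream cells(line);
        std::string cell;
        std::size_t count = 0;
        while (std::getline(cells, cell, ',')) {
            double value = 0.0;
            if (!parseCsvCell(cell, value)) return false;
            tmp.values.push_back(value);
            ++count;
        }
        if (tmp.rows == 0) {
            tmp.cols = count;            // the first row fixes the column count
        } else if (count != tmp.cols) {
            return false;                // ragged row
        }
        ++tmp.rows;
    }
    out = std::move(tmp);
    return true;
}
```

With the dimensions known at the end, an Eigen-based version would presumably need only a single resize(rows, cols) followed by one copy from the buffer, instead of resizing on every row.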
> - No runtime selection of IOFormat. The format has to be selected by defining
I would suggest a syntax analogous to the existing format() output (which returns a WithFormat object), e.g.
stream >> A.withFormat(IOFormat(...));
Furthermore, it would be nice to allow more liberal input formats, i.e. optionally ignore all whitespace or have formats which allow all of
1 2;3 4
[[1 2][3 4]]
But that comes close to writing a full parser, especially if accepted formats shall be definable at runtime.
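As a rough illustration of how far a very small, fixed rule set already gets, the following sketch accepts both of the inputs above. parseLiberal is a hypothetical helper, not an Eigen function, and its rules are assumptions made for this example: whitespace, commas and '[' are skipped, while ';' and ']' end the current row. It does not check that brackets are balanced, and it is not configurable at runtime, which is the part that would need a real parser.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: parses "1 2;3 4" and "[[1 2][3 4]]" into rows of doubles.
// Returns false if a token is not a number, if the rows differ in length,
// or if no numbers were found at all.
inline bool parseLiberal(const std::string& text,
                         std::vector<std::vector<double>>& rows) {
    rows.clear();
    std::vector<double> current;
    const char* p = text.c_str();
    const char* end = p + text.size();
    while (p < end) {
        const char c = *p;
        if (c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == ',' || c == '[') {
            ++p;  // separators and opening brackets carry no information here
        } else if (c == ';' || c == ']') {
            if (!current.empty()) {  // close the row that is currently being filled
                rows.push_back(current);
                current.clear();
            }
            ++p;
        } else {
            char* next = nullptr;
            const double value = std::strtod(p, &next);
            if (next == p) {  // not a number
                rows.clear();
                return false;
            }
            current.push_back(value);
            p = next;
        }
    }
    if (!current.empty()) rows.push_back(current);

    bool rectangular = !rows.empty();
    for (const std::vector<double>& row : rows) {
        if (row.size() != rows.front().size()) rectangular = false;
    }
    if (!rectangular) rows.clear();
    return rectangular;
}
```

Both sample inputs yield the same two rows, {1, 2} and {3, 4}. The price of this leniency is that malformed input such as "[1 2;;3 4" is accepted as well.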
We also need to decide what to do if the format does not fit the input or the input does not suffice to fill the matrix. I don't think assertions are a good way whenever user input is involved.
Basically, I see two alternatives:
1. Set the stream's error state (failbit) -- that's what C++ streams generally do for bad input. I don't really like this, because it needs manual checking after each input and is likely to be forgotten, leading to subtle bugs.
2. Throw an exception. I would generally prefer this if things can go wrong depending on actual input. However, this does not work if compiled with exceptions disabled.
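The two alternatives are not mutually exclusive in standard C++: a stream's exceptions() mask lets the caller decide whether setting failbit also throws std::ios_base::failure. The sketch below illustrates this with a hypothetical reader, readExactly, which is not an Eigen function and only handles whitespace-separated numbers. The reader itself just relies on the normal stream state, and the calling code opts in to exceptions if it wants them.

```cpp
#include <cstddef>
#include <exception>
#include <ios>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical reader: fills `out` with exactly `count` whitespace-separated
// numbers. On malformed or insufficient input the failed extraction leaves
// failbit set on `in`, and `out` is left untouched.
inline std::istream& readExactly(std::istream& in, std::size_t count,
                                 std::vector<double>& out) {
    std::vector<double> tmp;
    tmp.reserve(count);
    double value = 0.0;
    while (tmp.size() < count && in >> value) tmp.push_back(value);
    if (tmp.size() == count) out.swap(tmp);
    return in;
}

// Usage sketch: the same reader, with the caller opting in to exceptions.
inline bool throwsOnShortInput() {
    std::istringstream in("1 2");
    in.exceptions(std::ios::failbit);  // from now on, setting failbit throws
    std::vector<double> values;
    try {
        readExactly(in, 3, values);    // only two numbers are available
    } catch (const std::exception&) {  // std::ios_base::failure derives from it
        return true;
    }
    return false;
}
```

Without the exceptions() call, the caller checks the stream as usual, e.g. if (!readExactly(in, 3, values)) { ... }. This keeps the reader usable when exceptions are disabled, but it does not remove the risk of a forgotten check described in alternative 1.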
Created attachment 435
Dynamic input for vectors
For reference, this is what I once wrote for inputting dynamic sized vectors. It lacks many abilities such as custom braces/separators and it does not work for matrices.
If I may tune in :-)
(a) Input formats.
This sounds like a good candidate for a nice-to-have feature, but I think it may also be important to prioritize. Perhaps the first priority should be to finish the RFC 4180 CSV support (since that's the most widely supported lowest-common-denominator input/output format) -- and only after that's done consider more advanced / customized I/O formats. Thoughts?
(b) Input format errors.
I've found the following informative:
Ideally, the design would make it hard for users who stick to idiomatic C++ to even express code that forgets to check the error state. In general, I think it's a good idea to follow the established (and thus well-known) conventions of the C++ Standard Library, essentially because of the Principle Of Least Astonishment.
Perhaps it would also be reasonable to behave like boost::lexical_cast when a non-numeric value is encountered although the user requested a conversion to a numeric type -- i.e., throwing:
That being said, an interesting alternative may be that of Boost.Math -- let the user choose the error handling policy at compile-time.
Note how the users are allowed to choose from among the following error handling policies:
"The available actions are:
- throw_on_error: Throws the exception most appropriate to the error condition.
- errno_on_error: Sets ::errno to an appropriate value, and then returns the most appropriate result
- ignore_error: Ignores the error and simply the returns the most appropriate result.
- user_error: Calls a user-supplied error handler."
Different application areas and different use cases may imply that different error handling policies are optimal -- perhaps it's very legitimate to open it up as a customization area?
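A compile-time policy of this kind could look roughly like the sketch below. The names ThrowOnError, IgnoreError and parseCell are hypothetical and only loosely modelled on the Boost.Math action names quoted above; this is neither Boost's nor Eigen's interface. Returning NaN as "the most appropriate result" for an unparsable cell is likewise just an assumption made for the example.

```cpp
#include <cstdlib>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical policy: report the bad token by throwing.
struct ThrowOnError {
    static double onError(const std::string& token) {
        throw std::invalid_argument("not a number: " + token);
    }
};

// Hypothetical policy: swallow the error and return a placeholder value.
struct IgnoreError {
    static double onError(const std::string&) {
        return std::numeric_limits<double>::quiet_NaN();
    }
};

// Converts one token to double; what happens on failure is chosen at
// compile time through the ErrorPolicy template parameter.
template <typename ErrorPolicy = ThrowOnError>
double parseCell(const std::string& token) {
    const char* begin = token.c_str();
    char* end = nullptr;
    const double value = std::strtod(begin, &end);
    // Reject empty tokens and tokens with trailing characters.
    if (end == begin || *end != '\0') return ErrorPolicy::onError(token);
    return value;
}
```

A caller would then write parseCell<IgnoreError>(cell) for lenient bulk imports and keep the throwing default elsewhere. A policy matching user_error would simply forward to a user-supplied handler in onError.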
In general, this offers some good advice: