Working notes - Indexing++
The goal of this page is to summarize the different ideas and working plan to (finally!) provide support for flexible row/column indexing in Eigen. See this bug report.
Contents
Needs
We aim to support various indexing mechanisms. For each dimension, we would like to be able to have any of:
- all: as in
A(:)
- singleton: as in
A(3)
(already supported!) - index-array: as in
A([3 1 5])
- range-based: as in
A(3:9)
, but sometimes it is also convenient to specify the start+length similar to whatEigen::Block
offers. - slicing: as in
A(3:2:9)
. Same as for range, we sometime also want to specify start+length+stride - negative indices: as in
A(end-3)
orA(3:2:end-1)
- boolean masking: as in
A(A>0)
All of this with a concise and unambiguous API! We should also be able to provide compile-time information such as lengths and maybe strides.
Some references
- This post about the Range v3 library.
- This comment describes an interesting API for range and slices inspired by APL.
- libigl extends Eigen by a colon and a slice function
- Several questions/feature request on StackOverflow: [1] [2] [3]
C++11 API
Since achieving a lean API in C++03 seems to be rather impossible, let's first see what could be done in C++11 and provide more verbose fallbacks for C++03.
Better than a long story, here is a showcase demo.
Source code
Click expand to see the implementation details.
#include <iostream> #include <array> #include <valarray> #include <vector> #include <Eigen/Dense> using namespace Eigen; using namespace std; struct all_t { all_t() {} }; static const all_t all{}; struct shifted_last { int offset; }; struct last_t { shifted_last operator- (int offset) const { return shifted_last{offset}; } }; static const last_t last{}; struct Range { Range(Index f, Index l) : m_first(f), m_last(l) {} Index m_first, m_last; Index size() const { return m_last-m_first+1; } Index operator[] (Index k) { return m_first + k; } }; struct RangeLast { RangeLast(Index f, last_t) : m_first(f) {} Index m_first; }; struct RangeShiftedLast { RangeShiftedLast(Index f, shifted_last l) : m_first(f), m_last(l) {} Index m_first; shifted_last m_last; }; struct Slice { Slice(Index f, Index s, Index l) : m_first(f), m_step(s), m_last(l) {} Index m_first, m_step, m_last; Index size() const { return (m_last-m_first+m_step)/m_step; } Index operator[] (Index k) { return m_first + k*m_step; } }; struct SliceLast { SliceLast(Index f, Index s, last_t) : m_first(f), m_step(s) {} Index m_first, m_step; }; struct SliceShiftedLast { SliceShiftedLast(Index f, Index s, shifted_last l) : m_first(f), m_step(s), m_last(l) {} Index m_first, m_step; shifted_last m_last; }; Range span(Index f, Index l) { return Range(f,l); } RangeLast span(Index f, last_t l) { return RangeLast(f,l); } RangeShiftedLast span(Index f, shifted_last l) { return RangeShiftedLast(f,l); } Slice span(Index f, Index s, Index l) { return Slice(f,s,l); } SliceLast span(Index f, Index s, last_t l) { return SliceLast(f,s,l); } SliceShiftedLast span(Index f, Index s, shifted_last l) { return SliceShiftedLast(f,s,l); } template<int Size=Dynamic,int Step=1,int Start=0> struct iota_t { iota_t(Index size=Size) : m_size(size), m_step(Step), m_start(Start) {} iota_t(Index size,Index step,Index start) : m_size(size), m_step(step), m_start(start) {} internal::variable_if_dynamic<Index,Size> m_size; internal::variable_if_dynamic<Index,Step> m_step; internal::variable_if_dynamic<Index,Start> m_start; operator Slice() const { return Slice(m_start.value(), m_step.value(), m_start.value()+(m_size.value()-1)*m_step.value()); } friend iota_t<Size,Dynamic,Start> operator*(int stride,const iota_t &x) { return iota_t<Size,Dynamic,Start>{x.m_size.value(),x.m_step.value()*stride,x.m_start.value()}; } iota_t<Size,Step==Dynamic?Dynamic:-Step,Start> operator-() { return iota_t<Size,Step==Dynamic?Dynamic:-Step,Start>{m_size.value(),-m_step.value(),m_start.value()}; } iota_t<Size,Step,Dynamic> operator+(int offset) const { return iota_t<Size,Step,Dynamic>{m_size.value(),m_step.value(),m_start.value()+offset}; } friend iota_t<Size,Step,Dynamic> operator+(int offset,const iota_t &x) { return iota_t<Size,Step,Dynamic>{x.m_size.value(),x.m_step.value(),x.m_start.value()+offset}; } friend iota_t<Size,Step==Dynamic?Dynamic:-Step,Dynamic> operator-(int offset,const iota_t &x) { return iota_t<Size,Step==Dynamic?Dynamic:-Step,Dynamic>{x.m_size.value(),-x.m_step.value(),x.m_start.value()+offset}; } }; iota_t<> iota(Index size) { return iota_t<>(size); } template<int Size> iota_t<Size> iota() { return iota_t<Size>(); } // NOTE that declaring value_type here is needed to make SFINAE works properly in case T does not exhibit value_type // Another solution would be to specialize has_boolean_value_type and has_index_value_type for Range, Slice, etc. template<typename T,typename value_type=typename T::value_type> struct has_boolean_value_type { enum { value = bool(internal::is_same<value_type,bool>::value) #if EIGEN_COMP_CLANG // Workaround clang's issue: // (valarray::operator<)::value_type is int instead of bool, // so we check the return type of (valarray::operator<).operator[], which is bool! || bool(internal::is_same<typename internal::remove_all<decltype(declval<T>()[0])>::type,bool>::value) #endif }; }; template<typename T> struct is_index { enum { value = std::is_integral<T>::value }; }; template<> struct is_index<bool> { enum { value = false }; }; template<typename T,typename value_type=typename T::value_type> struct has_index_value_type { enum { value = is_index<value_type>::value }; }; struct MyMat { void operator()(Index k) { cout << "singleton: "; cout << k << '\n'; } void operator()(const char*) { cout << "all\n"; } void operator()(all_t) { cout << "all\n"; } void operator()(Range ids) { cout << "range-based: "; print_indices(ids); } void operator()(RangeLast ids) { return operator()({ids.m_first,m_size-1}); } void operator()(RangeShiftedLast ids) { return operator()({ids.m_first,m_size-ids.m_last.offset-1}); } void operator()(Slice ids) { cout << "slice-based: "; print_indices(ids); } void operator()(SliceLast ids) { return operator()(Slice{ids.m_first,ids.m_step,m_size-1}); } void operator()(SliceShiftedLast ids) { return operator()(Slice{ids.m_first,ids.m_step,m_size-ids.m_last.offset-1}); } template<typename Indices> typename internal::enable_if<has_index_value_type<Indices>::value>::type operator()(const Indices &ids) { cout << "index-based: "; print_indices(ids); } template<typename Mask> typename internal::enable_if<has_boolean_value_type<Mask>::value>::type operator()(const Mask &mask) { cout << "mask-based: "; for(int k=0; k<mask.size(); ++k) if(mask[k]) cout << k << " "; cout << "\n"; } template<typename T> void F(T) {} protected: template<typename T> void print_indices(T& ids) { for(int k=0; k<ids.size(); ++k) cout << ids[k] << " "; cout << "\n"; } Index m_size = 11; }; #define PRINT(X,P) cout << #X << P << " -> "; X int main() { ArrayXd eia(10); eia.setRandom(); ArrayXi eii(4); eii << 3, 1, 7, 5; valarray<double> vala(10); Map<ArrayXd>(&vala[0],10) = eia; valarray<int> vali(4); Map<ArrayXi>(&vali[0],4) = eii; vector<int> veci(4); Map<ArrayXi>(veci.data(),4) = eii; MyMat mat; cout << "\nc++98 API:\n" << "-------------\n"; PRINT( mat(5)," " ); // singleton PRINT( mat(all)," " ); // all PRINT( mat("")," " ); // all PRINT( mat(span(3,9))," " ); // range PRINT( mat(span(3,last-2))," " ); // range PRINT( mat(span(9,3))," " ); // empty range (as in MatLab), need to use a slice: PRINT( mat(span(9,-1,3))," " ); // slice PRINT( mat(span(3,3,last-3))," " ); // slice PRINT( mat(eii)," " ); // index-based PRINT( mat(vali)," " ); // index-based PRINT( mat(veci)," " ); // index-based PRINT( mat(eia>0)," " ); // mask PRINT( mat(vala>0.0)," " ); // mask cout << "\nc++11 shortcuts:\n" << "-------------\n"; PRINT( mat({3,9})," " ); // range PRINT( mat({3,last})," " ); // range PRINT( mat({3,last-2})," " ); // range PRINT( mat({9,3})," " ); // empty range (as in MatLab), need to use a slice: PRINT( mat({9,-1,3})," " ); // slice PRINT( mat({9,-2,1})," " ); // slice PRINT( mat({3,2,9})," " ); // slice PRINT( mat({3,2,last})," " ); // slice PRINT( mat({3,3,last-3})," " ); // slice PRINT( mat(valarray<int>{5,2,5,6}),"" ); // index-based PRINT( mat(array<int,4>{5,2,5,6})," " ); // index-based PRINT( mat( vector<bool>{false,true,true,false})," " ); // mask PRINT( mat(!valarray<bool>{false,true,true,false}),"" ); // mask cout << "\nAPL-like API (c++98 compatible):\n" << "--------------------------------\n"; PRINT( mat(iota(3))," " ); PRINT( mat(2+iota(3))," " ); PRINT( mat(3+2*iota(3))," " ); PRINT( mat(9-iota(3))," " ); PRINT( mat(9-2*iota(3))," " ); PRINT( mat(9-2*iota<3>())," " ); }
Generated output
c++98 API: ---------- mat(5) -> singleton: 5 mat(all) -> all mat("") -> all mat(span(3,9)) -> range-based: 3 4 5 6 7 8 9 mat(span(3,last-2)) -> range-based: 3 4 5 6 7 8 mat(span(9,3)) -> range-based: mat(span(9,-1,3)) -> slice-based: 9 8 7 6 5 4 3 mat(span(3,3,last-3)) -> slice-based: 3 6 mat(eii) -> index-based: 3 1 7 5 mat(vali) -> index-based: 3 1 7 5 mat(veci) -> index-based: 3 1 7 5 mat(eia>0) -> mask-based: 2 4 7 8 9 mat(vala>0.0) -> mask-based: 2 4 7 8 9 c++11 shortcuts: ---------------- mat({3,9}) -> range-based: 3 4 5 6 7 8 9 mat({3,last}) -> range-based: 3 4 5 6 7 8 9 10 mat({3,last-2}) -> range-based: 3 4 5 6 7 8 mat({9,3}) -> range-based: mat({9,-1,3}) -> slice-based: 9 8 7 6 5 4 3 mat({9,-2,1}) -> slice-based: 9 7 5 3 1 mat({3,2,9}) -> slice-based: 3 5 7 9 mat({3,2,last}) -> slice-based: 3 5 7 9 mat({3,3,last-3}) -> slice-based: 3 6 mat(valarray<int>{5,2,5,6}) -> index-based: 5 2 5 6 mat(array<int,4>{5,2,5,6}) -> index-based: 5 2 5 6 mat( vector<bool>{false,true,true,false}) -> mask-based: 1 2 mat(!valarray<bool>{false,true,true,false}) -> mask-based: 0 3 APL-like API (c++98 compatible): -------------------------------- mat(iota(3)) -> slice-based: 0 1 2 mat(2+iota(3)) -> slice-based: 2 3 4 mat(3+2*iota(3)) -> slice-based: 3 5 7 mat(9-iota(3)) -> slice-based: 9 8 7 mat(9-2*iota(3)) -> slice-based: 9 7 5 mat(9-2*iota<3>()) -> slice-based: 9 7 5
Discussions
- For concision, this demo only considers the 1D case, but of course the idea is to implement it in the nD settings. For instance, with a 3D tensor, we could imagine:
mat(span(0,2,last), "", vec>0)
which is why an API likemat.slice(...)
is a no go. This is also a crucial limitation of the currentEigen::Block
-based API. - We use the word 'last' instead of Matlab's 'end' to (1) reduce collision with std::end, and (2) make it more consistent with STL's end as 'last' refer to the last element whereas 'end' refers to the last+1.
- Using
""
for all might looks odd, but it would avoid introducing a polluting identifierall
(nobody wants to writeEigen::all
). We have the same issue withlast
though. - c++98
span(first,[step,]last)
API:- Pros:
- mimics MatLab first:[step:]last API
- enable indexing from the end
- Cons:
- the argument order is not explicit
- we could specify compile time size with
span<3>(4,6)
orspan<3>(4,2,8)
but then we have redundant information...
- Pros:
- APL-like API:
- Pros:
- easy to specify compile-time info
- no argument-order confusion
- can be built on top of the range/slice API (hence, can be done outside Eigen)
- Cons:
- cumbersome?
- needs to specify length instead of first-last
- Pros:
- c++11
mat({first,[step,]last})
API:- Pros:
- very compact
- closely mimics MatLab first:step:last API
- no need to import a possibly polluting name as
Eigen::span
- enable indexing from the end
- Cons:
- might be confusing with
mat({i0,i1,i2})
(which is whymat({i0,i1,i2,i3,...})
is not allowed) - does not permit to specify compile-time size
- the argument order is not explicit
- might be confusing with
- Pros:
- What does not work:
mat({5,2,6,7,1}); // needs to explicitly use a valarray, array, or ... mat({true,false,false,true,true,false,true}); // needs to explicitly use a valarray, array, or ...
C++03 fallback API
Internal implementation details
Performance
One could be tempted to leverage AVX2 gather/scatter instructions, but those are horribly slow. Better emulate them for now.