Working notes - Indexing++
From Eigen
The goal of this page is to summarize the different ideas and working plan to (finally!) provide support for flexible row/column indexing in Eigen. See this bug report.
Contents
Needs
We aim to support various indexing mechanisms. For each dimension, we would like to be able to have any of:
- all: as in
A(:)
- singleton: as in
A(3)
(already supported!) - index-array: as in
A([3 1 5])
- range-based: as in
A(3:9)
, but sometimes it is also convenient to specify the start+length similar to whatEigen::Block
offers. - slicing: as in
A(3:2:9)
. Same as for range, we sometime also want to specify start+length+stride - negative indices: as in
A(end-3)
orA(3:2:end-1)
- boolean masking: as in
A(A>0)
All of this with a concise and unambiguous API! We should also be able to provide compile-time information such as lengths and maybe strides.
Some references
- Some overviews what other programming languages do: [1] [2]
- This post about the Range v3 library.
- This comment describes an interesting API for range and slices inspired by APL.
- libigl extends Eigen by a colon and a slice function
- Several questions/feature request on StackOverflow: [3] [4] [5] [6]
API playground
To get some support for the discussions, let's start with a demo of a work-in-progress demo.
Source code
Click expand to see the implementation details.
#include <iostream> #include <array> #include <valarray> #include <vector> #include <Eigen/Dense> using namespace Eigen; using namespace std; struct all_t { all_t() {} }; static const all_t all{}; struct shifted_last { explicit shifted_last(int o) : offset(o) {} int offset; shifted_last operator+ (int x) const { return shifted_last(offset+x); } int operator- (shifted_last x) const { return offset-x.offset; } }; struct last_t { shifted_last operator- (int offset) const { return shifted_last(-offset); } int operator- (last_t) const { return 0; } int operator- (shifted_last x) const { return -x.offset; } }; static const last_t last{}; template<int V> struct Index_c { static const int value = V; operator int() const { return value; } }; //-------------------------------------------------------------------------------- // Range(first,last) and Slice(first,step,last) //-------------------------------------------------------------------------------- template<typename FirstType=Index,typename StepType=Index_c<1>,typename LastType=Index> struct Span_t { Span_t(FirstType f, StepType s, LastType l) : m_first(f), m_step(s), m_last(l) {} FirstType m_first; StepType m_step; LastType m_last; Index size() const { return (m_last-m_first+m_step)/m_step; } Index operator[] (Index k) const { return m_first + k*m_step; } }; template<typename FirstType,typename LastType> struct Span_t<FirstType,Index_c<1>,LastType> { Span_t(FirstType f, LastType l) : m_first(f), m_last(l) { } Span_t(FirstType f, Index_c<1> s, LastType l) : m_first(f), m_last(l) {} FirstType m_first; LastType m_last; Index size() const { return (m_last-m_first+1); } Index operator[] (Index k) const { return m_first + k; } }; template<typename T> struct cleanup_span_type { typedef Index type; }; template<> struct cleanup_span_type<last_t> { typedef last_t type; }; template<> struct cleanup_span_type<shifted_last> { typedef shifted_last type; }; template<typename FirstType,typename LastType> Span_t<typename cleanup_span_type<FirstType>::type,Index_c<1>,typename cleanup_span_type<LastType>::type > span(FirstType f, LastType l) { return Span_t<typename cleanup_span_type<FirstType>::type,Index_c<1>,typename cleanup_span_type<LastType>::type>(f,l); } template<typename FirstType,typename LastType> Span_t<typename cleanup_span_type<FirstType>::type,Index,typename cleanup_span_type<LastType>::type > span(FirstType f, Index s, LastType l) { return Span_t<typename cleanup_span_type<FirstType>::type,Index,typename cleanup_span_type<LastType>::type>(f,s,l); } //-------------------------------------------------------------------------------- // APL-like implementation based on iota(n) <=> {0,1,2,3,...n-1} //-------------------------------------------------------------------------------- template<typename StartType> struct iota2slice { typedef Span_t<Index,Index,Index> type; typedef Index last_type; }; template<> struct iota2slice<shifted_last> { typedef Span_t<shifted_last,Index,shifted_last> type; typedef shifted_last last_type; }; template<int Size=Dynamic,int Step=1,typename StartType=Index_c<0> > struct iota_t { iota_t(Index size=Size) : m_size(size), m_step(Step) {} iota_t(Index size,Index step,StartType start) : m_size(size), m_step(step), m_start(start) {} internal::variable_if_dynamic<Index,Size> m_size; internal::variable_if_dynamic<Index,Step> m_step; StartType m_start; typedef typename iota2slice<StartType>::type SliceType; typedef typename iota2slice<StartType>::last_type last_type; SliceType span() const { return SliceType(m_start, m_step.value(), m_start+(m_size.value()-1)*m_step.value()); } operator SliceType() const { return span(); } // set a runtime step friend iota_t<Size,Dynamic,StartType> operator*(int stride,const iota_t &x) { return iota_t<Size,Dynamic,StartType>{x.m_size.value(),x.m_step.value()*stride,x.m_start}; } // negate the step iota_t<Size,Step==Dynamic?Dynamic:-Step,StartType> operator-() { return iota_t<Size,Step==Dynamic?Dynamic:-Step,StartType>{m_size.value(),-m_step.value(),m_start}; } // set first element iota_t<Size,Step,Index> operator+(int offset) const { return iota_t<Size,Step,Index>{m_size.value(),m_step.value(),m_start+offset}; } // set first element friend iota_t<Size,Step,Index> operator+(int offset,const iota_t &x) { return iota_t<Size,Step,Index>{x.m_size.value(),x.m_step.value(),x.m_start+offset}; } // set first index and negative step friend iota_t<Size,Step==Dynamic?Dynamic:-Step,Index> operator-(int offset,const iota_t &x) { return iota_t<Size,Step==Dynamic?Dynamic:-Step,Index>{x.m_size.value(),-x.m_step.value(),x.m_start+offset}; } // set first index to last element and negate the step friend iota_t<Size,Step==Dynamic?Dynamic:-Step,shifted_last> operator-(last_t,const iota_t &x) { return iota_t<Size,Step==Dynamic?Dynamic:-Step,shifted_last>{x.m_size.value(),-x.m_step.value(),shifted_last(x.m_start)}; } // set first index to last element and negate the step friend iota_t<Size,Step==Dynamic?Dynamic:-Step,shifted_last> operator-(shifted_last l,const iota_t &x) { return iota_t<Size,Step==Dynamic?Dynamic:-Step,shifted_last>{x.m_size.value(),-x.m_step.value(),shifted_last(l.offset+x.m_start)}; } }; iota_t<> iota(Index size) { return iota_t<>(size); } template<int Size> iota_t<Size> iota() { return iota_t<Size>(); } #if __cplusplus > 201103L template<int X> static const Index_c<X> c{}; template<int Size> static const iota_t<Size> Iota{}; #endif template<int Size> iota_t<Size> iota(Index_c<Size>) { return iota_t<Size>(); } //-------------------------------------------------------------------------------- // Matrix implementation //-------------------------------------------------------------------------------- // NOTE that declaring value_type here is needed to make SFINAE works properly in case T does not exhibit value_type // Another solution would be to specialize has_boolean_value_type and has_index_value_type for Range, Slice, etc. template<typename T,typename value_type=typename T::value_type> struct has_boolean_value_type { enum { value = bool(internal::is_same<value_type,bool>::value) #if EIGEN_COMP_CLANG // Workaround clang's issue: // (valarray::operator<)::value_type is int instead of bool, // so we check the return type of (valarray::operator<).operator[], which is bool! || bool(internal::is_same<typename internal::remove_all<decltype(declval<T>()[0])>::type,bool>::value) #endif }; }; template<typename T> struct is_index { enum { value = std::is_integral<T>::value }; }; template<> struct is_index<bool> { enum { value = false }; }; template<typename T,typename value_type=typename T::value_type> struct has_index_value_type { enum { value = is_index<value_type>::value }; }; struct MyMat { MyMat(Index n) : m_size(n) {} void operator()(Index k) { cout << "singleton: "; cout << k << '\n'; } void operator()(const char*) { cout << "all\n"; } void operator()(all_t) { cout << "all\n"; } template<typename FirstType,typename LastType> void operator()(Span_t<FirstType,Index_c<1>,LastType> ids) { cout << "range-based: "; print_indices(Span_t<Index,Index_c<1>,Index>(getIndex(ids.m_first),getIndex(ids.m_last))); } template<typename FirstType,typename StepType,typename LastType> void operator()(Span_t<FirstType,StepType,LastType> ids) { cout << "slice-based: "; print_indices(Span_t<Index,StepType,Index>(getIndex(ids.m_first),ids.m_step,getIndex(ids.m_last))); } template<int Size,int Step,typename StartType> void operator()(iota_t<Size,Step,StartType> x) { return operator()(x.span()); } template<typename Indices> typename internal::enable_if<has_index_value_type<Indices>::value>::type operator()(const Indices &ids) { cout << "index-based: "; print_indices(ids); } template<typename Mask> typename internal::enable_if<has_boolean_value_type<Mask>::value>::type operator()(const Mask &mask) { cout << "mask-based: "; for(int k=0; k<mask.size(); ++k) if(mask[k]) cout << k << " "; cout << "\n"; } template<typename T,std::size_t N> void operator()(const T (&ids)[N] ) { cout << "index-based: "; for(int k=0; k<N; ++k) cout << ids[k] << " "; cout << "\n"; } template<std::size_t N> void operator()(const bool (&mask)[N] ) { cout << "mask-based: "; for(int k=0; k<N; ++k) if(mask[k]) cout << k << " "; cout << "\n"; } protected: Index getIndex(Index x) const { return x; } Index getIndex(last_t) const { return m_size-1; } Index getIndex(shifted_last x) const { return m_size+x.offset-1; } template<typename T> void print_indices(const T& ids) { for(int k=0; k<ids.size(); ++k) cout << ids[k] << " "; cout << "\n"; } Index m_size; }; #define PRINT(X,P) cout << #X << P << " -> "; X int main() { int n = 13; ArrayXd eia(n); eia.setRandom(); ArrayXi eii(4); eii << 3, 1, 6, 5; valarray<double> vala(n); Map<ArrayXd>(&vala[0],n) = eia; valarray<int> vali(4); Map<ArrayXi>(&vali[0],4) = eii; vector<int> veci(4); Map<ArrayXi>(veci.data(),4) = eii; MyMat mat(n); cout << "\nc++98 API:\n" << "----------\n"; PRINT( mat(5)," " ); // singleton PRINT( mat(all)," " ); // all PRINT( mat("")," " ); // all PRINT( mat(span(3,9))," " ); // range PRINT( mat(span(3,last))," " ); // range PRINT( mat(span(3,last-2))," " ); // range PRINT( mat(span(9,3))," " ); // empty range (as in MatLab), need to use a slice: PRINT( mat(span(9,-1,3))," " ); // slice PRINT( mat(span(9,-2,1))," " ); // slice PRINT( mat(span(last,-2,3))," " ); // slice PRINT( mat(span(last-1,-2,3))," " ); // slice PRINT( mat(span(3,3,last-3))," " ); // slice PRINT( mat(span(last-8,2,last-1))," " ); // slice PRINT( mat(eii)," " ); // index-based PRINT( mat(eii*2-1)," " ); // index-based PRINT( mat(vali)," " ); // index-based PRINT( mat(veci)," " ); // index-based PRINT( mat(eia>0)," " ); // mask PRINT( mat(vala>0.0)," " ); // mask cout << "\nc++11 possibilities:\n" << "--------------------\n"; // PRINT( mat({3,9})," " ); // range // PRINT( mat({3,last})," " ); // range // PRINT( mat({3,last-2})," " ); // range // PRINT( mat({9,3})," " ); // empty range (as in MatLab), need to use a slice: // PRINT( mat({9,-1,3})," " ); // slice // PRINT( mat({9,-2,1})," " ); // slice // PRINT( mat({3,2,9})," " ); // slice // PRINT( mat({3,2,last})," " ); // slice // PRINT( mat({3,3,last-3})," " ); // slice PRINT( mat({5,2,5,6})," " ); // index-based PRINT( mat({false,true,true,false})," " ); // mask PRINT( mat(valarray<int>{5,2,5,6}*2-1)," " ); // index-based PRINT( mat(array<int,4>{5,2,5,6})," " ); // index-based PRINT( mat( vector<bool>{false,true,true,false})," " ); // mask PRINT( mat(!valarray<bool>{false,true,true,false}),"" ); // mask cout << "\nAPL-like API (c++98 compatible):\n" << "--------------------------------\n"; PRINT( mat(iota(3))," " ); PRINT( mat(2+iota(3))," " ); PRINT( mat(3+2*iota(3))," " ); PRINT( mat(9-iota(3))," " ); PRINT( mat(9-2*iota(3))," " ); PRINT( mat(9-2*iota<3>())," " ); PRINT( mat(last-2*iota<3>())," " ); PRINT( mat(last-1-2*iota<3>())," " ); #if __cplusplus > 201103L int c; // check naming collision PRINT( mat(9-2*iota(c<3>))," [C++14]" ); PRINT( mat(9-2*Iota<3>)," [C++14]" ); #endif }
Generated output
c++98 API: ---------- mat(5) -> singleton: 5 mat(all) -> all mat("") -> all mat(span(3,9)) -> range-based: 3 4 5 6 7 8 9 mat(span(3,last)) -> range-based: 3 4 5 6 7 8 9 10 11 12 mat(span(3,last-2)) -> range-based: 3 4 5 6 7 8 9 10 mat(span(9,3)) -> range-based: mat(span(9,-1,3)) -> slice-based: 9 8 7 6 5 4 3 mat(span(9,-2,1)) -> slice-based: 9 7 5 3 1 mat(span(last,-2,3)) -> slice-based: 12 10 8 6 4 mat(span(last-1,-2,3)) -> slice-based: 11 9 7 5 3 mat(span(3,3,last-3)) -> slice-based: 3 6 9 mat(span(last-8,2,last-1)) -> slice-based: 4 6 8 10 mat(eii) -> index-based: 3 1 6 5 mat(eii*2-1) -> index-based: 5 1 11 9 mat(vali) -> index-based: 3 1 6 5 mat(veci) -> index-based: 3 1 6 5 mat(eia>0) -> mask-based: 2 4 7 8 9 11 12 mat(vala>0.0) -> mask-based: 2 4 7 8 9 11 12 c++11 possibilities: -------------------- mat({5,2,5,6}) -> index-based: 5 2 5 6 mat({false,true,true,false}) -> mask-based: 1 2 mat(valarray<int>{5,2,5,6}*2-1) -> index-based: 9 3 9 11 mat(array<int,4>{5,2,5,6}) -> index-based: 5 2 5 6 mat( vector<bool>{false,true,true,false}) -> mask-based: 1 2 mat(!valarray<bool>{false,true,true,false}) -> mask-based: 0 3 APL-like API (c++98 compatible): -------------------------------- mat(iota(3)) -> slice-based: 0 1 2 mat(2+iota(3)) -> slice-based: 2 3 4 mat(3+2*iota(3)) -> slice-based: 3 5 7 mat(9-iota(3)) -> slice-based: 9 8 7 mat(9-2*iota(3)) -> slice-based: 9 7 5 mat(9-2*iota<3>()) -> slice-based: 9 7 5 mat(last-2*iota<3>()) -> slice-based: 12 10 8 mat(last-1-2*iota<3>()) -> slice-based: 11 9 7 mat(9-2*iota(c<3>)) [C++14] -> slice-based: 9 7 5 mat(9-2*Iota<3>) [C++14] -> slice-based: 9 7 5
Discussions
- For concision, this demo only considers the 1D case, but of course the idea is to implement it in the nD settings. For instance, with a 3D tensor, we could imagine:
mat(span(0,2,last), "", vec>0)
which is why an API likemat.slice(...)
is a no go. This is also a crucial limitation of the currentEigen::Block
-based API. - We use the word 'last' instead of Matlab's 'end' to (1) reduce collision with std::end, and (2) make it more consistent with STL's end as 'last' refer to the last element whereas 'end' refers to the last+1.
- Using
""
for all might looks odd, but it would avoid introducing a polluting identifierall
(nobody wants to writeEigen::all
). We have the same issue withlast
though. - c++98
span(first,[step,]last)
API:- Pros:
- mimics MatLab first:[step:]last API
- enable indexing from the end
- Cons:
- the argument order is not explicit
- we could specify compile time size with
span<3>(4,6)
orspan<3>(4,2,8)
but then we have redundant information...
- Pros:
- APL-like API:
- Pros:
- easy to specify compile-time info
- no argument-order confusion
- can be built on top of the range/slice API (hence, can be done outside Eigen)
- Cons:
- cumbersome?
- needs to specify length instead of first-last
- Pros:
C++03 fallback API
Internal implementation details
Performance
One could be tempted to leverage AVX2 gather/scatter instructions, but those are horribly slow. Better emulate them for now.