Difference between revisions of "Working notes - Indexing++"

From Eigen
Jump to: navigation, search
(Discussions)
(Some references: references to wikipedia)
Line 16: Line 16:
 
==Some references==
 
==Some references==
  
 +
* Some overviews what other programming languages do: [https://en.wikipedia.org/wiki/Array_slicing] [https://en.wikipedia.org/wiki/Comparison_of_programming_languages_(array)#Slicing]
 
* This [http://ericniebler.com/2014/12/07/a-slice-of-python-in-c post] about the Range v3 library.
 
* This [http://ericniebler.com/2014/12/07/a-slice-of-python-in-c post] about the Range v3 library.
 
* This [http://ericniebler.com/2014/12/07/a-slice-of-python-in-c/#comment-130118 comment] describes an interesting API for range and slices inspired by [https://en.wikipedia.org/wiki/APL_(programming_language) APL].
 
* This [http://ericniebler.com/2014/12/07/a-slice-of-python-in-c/#comment-130118 comment] describes an interesting API for range and slices inspired by [https://en.wikipedia.org/wiki/APL_(programming_language) APL].

Revision as of 16:06, 22 December 2016

The goal of this page is to summarize the different ideas and working plan to (finally!) provide support for flexible row/column indexing in Eigen. See this bug report.

Needs

We aim to support various indexing mechanisms. For each dimension, we would like to be able to have any of:

  • all: as in A(:)
  • singleton: as in A(3) (already supported!)
  • index-array: as in A([3 1 5])
  • range-based: as in A(3:9), but sometimes it is also convenient to specify the start+length similar to what Eigen::Block offers.
  • slicing: as in A(3:2:9). Same as for range, we sometime also want to specify start+length+stride
  • negative indices: as in A(end-3) or A(3:2:end-1)
  • boolean masking: as in A(A>0)

All of this with a concise and unambiguous API! We should also be able to provide compile-time information such as lengths and maybe strides.

Some references

  • Some overviews what other programming languages do: [1] [2]
  • This post about the Range v3 library.
  • This comment describes an interesting API for range and slices inspired by APL.
  • libigl extends Eigen by a colon and a slice function
  • Several questions/feature request on StackOverflow: [3] [4] [5] [6]

C++11 API

Since achieving a lean API in C++03 seems to be rather impossible, let's first see what could be done in C++11 and provide more verbose fallbacks for C++03.

Better than a long story, here is a showcase demo.


Source code

Click expand to see the implementation details.

#include <iostream>
#include <array>
#include <valarray>
#include <vector>
#include <Eigen/Dense>
using namespace Eigen;
using namespace std;
 
struct all_t { all_t() {} };
static const all_t all{};
 
struct shifted_last {
  explicit shifted_last(int o) : offset(o) {}
  int offset;
  shifted_last operator+ (int x) const { return shifted_last(offset+x); }
  int operator- (shifted_last x) const { return offset-x.offset; }
};
 
struct last_t {
  shifted_last operator- (int offset) const { return shifted_last(-offset); }
  int operator- (last_t) const { return 0; }
  int operator- (shifted_last x) const { return -x.offset; }
};
static const last_t last{};
 
template<int V> struct Index_c {
  static const int value = V;
  operator int() const { return value; }
};
 
//--------------------------------------------------------------------------------
// Range(first,last) and Slice(first,step,last)
//--------------------------------------------------------------------------------
 
template<typename FirstType=Index,typename StepType=Index_c<1>,typename LastType=Index>
struct Span_t {
  Span_t(FirstType f, StepType s, LastType l) : m_first(f), m_step(s), m_last(l) {}
 
  FirstType m_first;
  StepType  m_step;
  LastType  m_last;
 
  Index size() const { return (m_last-m_first+m_step)/m_step; }
  Index operator[] (Index k) const { return m_first + k*m_step; }
};
 
template<typename FirstType,typename LastType>
struct Span_t<FirstType,Index_c<1>,LastType> {
  Span_t(FirstType f, LastType l) : m_first(f), m_last(l) { }
  Span_t(FirstType f, Index_c<1> s, LastType l) : m_first(f), m_last(l) {}
 
  FirstType m_first;
  LastType  m_last;
 
  Index size() const { return (m_last-m_first+1); }
  Index operator[] (Index k) const { return m_first + k; }
};
 
template<typename T> struct cleanup_span_type { typedef Index type; };
template<> struct cleanup_span_type<last_t> { typedef last_t type; };
template<> struct cleanup_span_type<shifted_last> { typedef shifted_last type; };
 
template<typename FirstType,typename LastType>
Span_t<typename cleanup_span_type<FirstType>::type,Index_c<1>,typename cleanup_span_type<LastType>::type >
span(FirstType f, LastType l)  {
  return Span_t<typename cleanup_span_type<FirstType>::type,Index_c<1>,typename cleanup_span_type<LastType>::type>(f,l);
}
 
template<typename FirstType,typename LastType>
Span_t<typename cleanup_span_type<FirstType>::type,Index,typename cleanup_span_type<LastType>::type >
span(FirstType f, Index s, LastType l)  {
  return Span_t<typename cleanup_span_type<FirstType>::type,Index,typename cleanup_span_type<LastType>::type>(f,s,l);
}
 
//--------------------------------------------------------------------------------
// APL-like implementation based on iota(n) <=> {0,1,2,3,...n-1}
//--------------------------------------------------------------------------------
 
template<typename StartType> struct iota2slice {
  typedef Span_t<Index,Index,Index> type;
  typedef Index last_type;
};
template<> struct iota2slice<shifted_last> {
  typedef Span_t<shifted_last,Index,shifted_last> type;
  typedef shifted_last last_type;
};
 
template<int Size=Dynamic,int Step=1,typename StartType=Index_c<0> >
struct iota_t {
  iota_t(Index size=Size) : m_size(size), m_step(Step) {}
  iota_t(Index size,Index step,StartType start) : m_size(size), m_step(step), m_start(start) {}
  internal::variable_if_dynamic<Index,Size> m_size;
  internal::variable_if_dynamic<Index,Step> m_step;
  StartType m_start;
 
  typedef typename iota2slice<StartType>::type SliceType;
  typedef typename iota2slice<StartType>::last_type last_type;
 
  SliceType span() const {
    return SliceType(m_start, m_step.value(),
                     m_start+(m_size.value()-1)*m_step.value());
  }
 
  operator SliceType() const { return span(); }
 
  // set a runtime step
  friend iota_t<Size,Dynamic,StartType> operator*(int stride,const iota_t &x) {
    return iota_t<Size,Dynamic,StartType>{x.m_size.value(),x.m_step.value()*stride,x.m_start};
  }
 
  // negate the step
  iota_t<Size,Step==Dynamic?Dynamic:-Step,StartType> operator-() {
    return iota_t<Size,Step==Dynamic?Dynamic:-Step,StartType>{m_size.value(),-m_step.value(),m_start};
  }
 
  // set first element
  iota_t<Size,Step,Index> operator+(int offset) const {
    return iota_t<Size,Step,Index>{m_size.value(),m_step.value(),m_start+offset};
  }
 
  // set first element
  friend iota_t<Size,Step,Index> operator+(int offset,const iota_t &x) {
    return iota_t<Size,Step,Index>{x.m_size.value(),x.m_step.value(),x.m_start+offset};
  }
 
  // set first index and negative step
  friend iota_t<Size,Step==Dynamic?Dynamic:-Step,Index> operator-(int offset,const iota_t &x) {
    return iota_t<Size,Step==Dynamic?Dynamic:-Step,Index>{x.m_size.value(),-x.m_step.value(),x.m_start+offset};
  }
 
  // set first index to last element and negate the step
  friend iota_t<Size,Step==Dynamic?Dynamic:-Step,shifted_last> operator-(last_t,const iota_t &x) {
    return iota_t<Size,Step==Dynamic?Dynamic:-Step,shifted_last>{x.m_size.value(),-x.m_step.value(),shifted_last(x.m_start)};
  }
 
  // set first index to last element and negate the step
  friend iota_t<Size,Step==Dynamic?Dynamic:-Step,shifted_last> operator-(shifted_last l,const iota_t &x) {
    return iota_t<Size,Step==Dynamic?Dynamic:-Step,shifted_last>{x.m_size.value(),-x.m_step.value(),shifted_last(l.offset+x.m_start)};
  }
};
 
iota_t<> iota(Index size) { return iota_t<>(size); }
template<int Size>
iota_t<Size> iota() { return iota_t<Size>(); }
 
#if __cplusplus > 201103L
template<int X>
static const Index_c<X> c{};
 
template<int Size>
static const iota_t<Size> Iota{};
#endif
 
template<int Size>
iota_t<Size> iota(Index_c<Size>) { return iota_t<Size>(); }
 
//--------------------------------------------------------------------------------
// Matrix implementation
//--------------------------------------------------------------------------------
 
// NOTE that declaring value_type here is needed to make SFINAE works properly in case T does not exhibit value_type
// Another solution would be to specialize has_boolean_value_type and has_index_value_type for Range, Slice, etc.
template<typename T,typename value_type=typename T::value_type>
struct has_boolean_value_type {
  enum {
    value =  bool(internal::is_same<value_type,bool>::value)
#if EIGEN_COMP_CLANG
    // Workaround clang's issue:
    //  (valarray::operator<)::value_type is int instead of bool,
    //  so we check the return type of (valarray::operator<).operator[], which is bool!
          || bool(internal::is_same<typename internal::remove_all<decltype(declval<T>()[0])>::type,bool>::value)
#endif
  };
};
 
template<typename T> struct is_index { enum { value = std::is_integral<T>::value }; };
template<> struct is_index<bool> { enum { value = false }; };
 
template<typename T,typename value_type=typename T::value_type>
struct has_index_value_type {
  enum {
    value = is_index<value_type>::value
  };
};
 
struct MyMat {
 
  MyMat(Index n) : m_size(n) {}
 
  void operator()(Index k) { cout << "singleton: "; cout << k << '\n'; }
 
  void operator()(const char*) { cout << "all\n"; }
 
  void operator()(all_t) { cout << "all\n"; }
 
  template<typename FirstType,typename LastType>
  void operator()(Span_t<FirstType,Index_c<1>,LastType> ids) {
    cout << "range-based: ";
    print_indices(Span_t<Index,Index_c<1>,Index>(getIndex(ids.m_first),getIndex(ids.m_last)));
  }
 
  template<typename FirstType,typename StepType,typename LastType>
  void operator()(Span_t<FirstType,StepType,LastType> ids) {
    cout << "slice-based: ";
    print_indices(Span_t<Index,StepType,Index>(getIndex(ids.m_first),ids.m_step,getIndex(ids.m_last)));
  }
 
  template<int Size,int Step,typename StartType>
  void operator()(iota_t<Size,Step,StartType> x) {
    return operator()(x.span());
  }
 
  template<typename Indices>
  typename internal::enable_if<has_index_value_type<Indices>::value>::type
  operator()(const Indices &ids) { cout << "index-based: "; print_indices(ids); }
 
  template<typename Mask>
  typename internal::enable_if<has_boolean_value_type<Mask>::value>::type
  operator()(const Mask &mask) {
    cout << "mask-based: ";
    for(int k=0; k<mask.size(); ++k)
      if(mask[k]) cout << k << " ";
    cout << "\n";
  }
 
  template<typename T,std::size_t N>
  void operator()(const T (&ids)[N] ) {
    cout << "index-based: ";
    for(int k=0; k<N; ++k)
      cout << ids[k] << " ";
    cout << "\n";
  }
 
  template<std::size_t N>
  void operator()(const bool (&mask)[N] ) {
    cout << "mask-based: ";
    for(int k=0; k<N; ++k)
      if(mask[k]) cout << k << " ";
    cout << "\n";
  }
 
protected:
  Index getIndex(Index x) const       { return x; }
  Index getIndex(last_t) const        { return m_size-1; }
  Index getIndex(shifted_last x) const  { return m_size+x.offset-1; }
 
  template<typename T>
  void print_indices(const T& ids) {
    for(int k=0; k<ids.size(); ++k)
      cout << ids[k] << " ";
    cout << "\n";
  }
 
  Index m_size;
};
 
#define PRINT(X,P) cout << #X << P << "  -> "; X
 
int main()
{
  int n = 13;
  ArrayXd eia(n); eia.setRandom();
  ArrayXi eii(4); eii << 3, 1, 6, 5;
  valarray<double> vala(n); Map<ArrayXd>(&vala[0],n) = eia;
  valarray<int> vali(4); Map<ArrayXi>(&vali[0],4) = eii;
  vector<int> veci(4); Map<ArrayXi>(veci.data(),4) = eii;
 
  MyMat mat(n);
  cout << "\nc++98 API:\n"
       <<   "----------\n";
  PRINT( mat(5),"                     " );  // singleton
  PRINT( mat(all),"                   " );  // all
  PRINT( mat(""),"                    " );  // all
  PRINT( mat(span(3,9)),"             " );  // range
  PRINT( mat(span(3,last)),"          " );  // range
  PRINT( mat(span(3,last-2)),"        " );  // range
  PRINT( mat(span(9,3)),"             " );  // empty range (as in MatLab), need to use a slice:
  PRINT( mat(span(9,-1,3)),"          " );  // slice
  PRINT( mat(span(9,-2,1)),"          " );  // slice
  PRINT( mat(span(last,-2,3)),"       " );  // slice
  PRINT( mat(span(last-1,-2,3)),"     " );  // slice
  PRINT( mat(span(3,3,last-3)),"      " );  // slice
  PRINT( mat(span(last-8,2,last-1))," " );  // slice
  PRINT( mat(eii),"                   " );  // index-based
  PRINT( mat(eii*2-1),"               " );  // index-based
  PRINT( mat(vali),"                  " );  // index-based
  PRINT( mat(veci),"                  " );  // index-based
  PRINT( mat(eia>0),"                 " );  // mask
  PRINT( mat(vala>0.0),"              " );  // mask
 
  cout << "\nc++11 possibilities:\n"
       <<   "--------------------\n";
//   PRINT( mat({3,9}),"                 " );  // range
//   PRINT( mat({3,last}),"              " );  // range
//   PRINT( mat({3,last-2}),"            " );  // range
//   PRINT( mat({9,3}),"                 " );  // empty range (as in MatLab), need to use a slice:
//   PRINT( mat({9,-1,3}),"              " );  // slice
//   PRINT( mat({9,-2,1}),"              " );  // slice
//   PRINT( mat({3,2,9}),"               " );  // slice
//   PRINT( mat({3,2,last}),"            " );  // slice
//   PRINT( mat({3,3,last-3}),"          " );  // slice
  PRINT( mat({5,2,5,6}),"                             " );  // index-based
  PRINT( mat({false,true,true,false}),"               " );  // mask
  PRINT( mat(valarray<int>{5,2,5,6}*2-1),"            " );  // index-based
  PRINT( mat(array<int,4>{5,2,5,6}),"                 " );  // index-based
  PRINT( mat( vector<bool>{false,true,true,false}),"  " );  // mask
  PRINT( mat(!valarray<bool>{false,true,true,false}),"" );  // mask
 
  cout << "\nAPL-like API (c++98 compatible):\n"
       <<   "--------------------------------\n";
  PRINT( mat(iota(3)),"               " );
  PRINT( mat(2+iota(3)),"             " );
  PRINT( mat(3+2*iota(3)),"           " );
  PRINT( mat(9-iota(3)),"             " );
  PRINT( mat(9-2*iota(3)),"           " );
  PRINT( mat(9-2*iota<3>()),"         " );
  PRINT( mat(last-2*iota<3>()),"      " );
  PRINT( mat(last-1-2*iota<3>()),"    " );
#if __cplusplus > 201103L
  int c; // check naming collision
  PRINT( mat(9-2*iota(c<3>))," [C++14]" );
  PRINT( mat(9-2*Iota<3>),"    [C++14]" );
#endif
 
}

Generated output

c++98 API:
----------
mat(5)                       -> singleton: 5
mat(all)                     -> all
mat("")                      -> all
mat(span(3,9))               -> range-based: 3 4 5 6 7 8 9
mat(span(3,last))            -> range-based: 3 4 5 6 7 8 9 10 11 12
mat(span(3,last-2))          -> range-based: 3 4 5 6 7 8 9 10
mat(span(9,3))               -> range-based:
mat(span(9,-1,3))            -> slice-based: 9 8 7 6 5 4 3
mat(span(9,-2,1))            -> slice-based: 9 7 5 3 1
mat(span(last,-2,3))         -> slice-based: 12 10 8 6 4
mat(span(last-1,-2,3))       -> slice-based: 11 9 7 5 3
mat(span(3,3,last-3))        -> slice-based: 3 6 9
mat(span(last-8,2,last-1))   -> slice-based: 4 6 8 10
mat(eii)                     -> index-based: 3 1 6 5
mat(eii*2-1)                 -> index-based: 5 1 11 9
mat(vali)                    -> index-based: 3 1 6 5
mat(veci)                    -> index-based: 3 1 6 5
mat(eia>0)                   -> mask-based: 2 4 7 8 9 11 12
mat(vala>0.0)                -> mask-based: 2 4 7 8 9 11 12

c++11 possibilities:
--------------------
mat({5,2,5,6})                               -> index-based: 5 2 5 6
mat({false,true,true,false})                 -> mask-based: 1 2
mat(valarray<int>{5,2,5,6}*2-1)              -> index-based: 9 3 9 11
mat(array<int,4>{5,2,5,6})                   -> index-based: 5 2 5 6
mat( vector<bool>{false,true,true,false})    -> mask-based: 1 2
mat(!valarray<bool>{false,true,true,false})  -> mask-based: 0 3

APL-like API (c++98 compatible):
--------------------------------
mat(iota(3))                 -> slice-based: 0 1 2
mat(2+iota(3))               -> slice-based: 2 3 4
mat(3+2*iota(3))             -> slice-based: 3 5 7
mat(9-iota(3))               -> slice-based: 9 8 7
mat(9-2*iota(3))             -> slice-based: 9 7 5
mat(9-2*iota<3>())           -> slice-based: 9 7 5
mat(last-2*iota<3>())        -> slice-based: 12 10 8
mat(last-1-2*iota<3>())      -> slice-based: 11 9 7
mat(9-2*iota(c<3>)) [C++14]  -> slice-based: 9 7 5
mat(9-2*Iota<3>)    [C++14]  -> slice-based: 9 7 5

Discussions

  1. For concision, this demo only considers the 1D case, but of course the idea is to implement it in the nD settings. For instance, with a 3D tensor, we could imagine:
    mat(span(0,2,last), "", vec>0)
    which is why an API like mat.slice(...) is a no go. This is also a crucial limitation of the current Eigen::Block-based API.
  2. We use the word 'last' instead of Matlab's 'end' to (1) reduce collision with std::end, and (2) make it more consistent with STL's end as 'last' refer to the last element whereas 'end' refers to the last+1.
  3. Using "" for all might looks odd, but it would avoid introducing a polluting identifier all (nobody wants to write Eigen::all). We have the same issue with last though.
  4. c++98 span(first,[step,]last) API:
    • Pros:
      • mimics MatLab first:[step:]last API
      • enable indexing from the end
    • Cons:
      • the argument order is not explicit
      • we could specify compile time size with span<3>(4,6) or span<3>(4,2,8) but then we have redundant information...
  5. APL-like API:
    • Pros:
      • easy to specify compile-time info
      • no argument-order confusion
      • can be built on top of the range/slice API (hence, can be done outside Eigen)
    • Cons:
      • cumbersome?
      • needs to specify length instead of first-last

C++03 fallback API

Internal implementation details

Performance

One could be tempted to leverage AVX2 gather/scatter instructions, but those are horribly slow. Better emulate them for now.