Difference between revisions of "Functors"

From Eigen

Latest revision as of 19:51, 26 September 2009

In pure C, when one wants to pass a function as a parameter to another function, one passes its address. In C++, however, this technique should generally be avoided, as one can do much better.

The functors approach with static polymorphism

One can define a class "functor" that contains the wanted function as a method, and pass an object of class "functor" instead. If the function needs to take arguments, they can be passed to the constructor of "functor" and stored in the functor object. The compiler is typically very good at optimizing the functor object away when possible. Here's an example:

functors.cpp:

#include<iostream>
#include<string>
 
/*** print the name of some types... ***/
template<typename type> std::string name_of_type() { return "other"; }
template<> std::string name_of_type<int>() { return "int"; }
template<> std::string name_of_type<float>() { return "float"; }
template<> std::string name_of_type<double>() { return "double"; }
 
struct sum_of_ints_functor
{
  sum_of_ints_functor(int a, int b) : m_a(a), m_b(b)
  {
    std::cout << "Type: int. Computing the sum of the two ints " << a << " and " << b << ".";
  }
 
  int f() const { return m_a + m_b; }
 
  private:
  int m_a, m_b;
};
 
template<typename scalar=int>
struct product_functor
{
  product_functor(scalar a, scalar b) : m_a(a), m_b(b)
  {
    std::cout << "Type: " << name_of_type<scalar>() << ". Computing the product of " << a << " and " << b << ".";
  }
 
  scalar f() const { return m_a * m_b; }
 
  private:
  scalar m_a, m_b;
};
 
template<typename functor_type> void call_and_print_return_value(const functor_type& functor_object)
{
  std::cout << " The result is: " << functor_object.f() << std::endl;
}
 
int main()
{
  call_and_print_return_value(sum_of_ints_functor(3,5));
  call_and_print_return_value(product_functor<float>(0.2f,0.4f));
  call_and_print_return_value(product_functor<>(7,8));    
}

Output:

$ g++ functors.cpp -o functors && ./functors
Type: int. Computing the sum of the two ints 3 and 5. The result is: 8
Type: float. Computing the product of 0.2 and 0.4. The result is: 0.08
Type: int. Computing the product of 7 and 8. The result is: 56

The immediate advantages of this technique, over the C technique of passing function pointers, are that:

  • This allows the functor calls to be inlined, in which case the compiler can usually optimize the functor object away completely. This matters when inlining the functor call is important for performance.
  • This allows all sorts of additional information to be passed as part of the functor type.

But there is also a drawback: the code for the function call_and_print_return_value has been generated 3 times instead of 2. Here 2 is the minimum possible, because we use 2 different scalar types, int and float; in C, one would have to write 2 separate functions too. The 3 instantiations show up in the symbol table:

$ nm --demangle functors | grep call_and_print
08048b2c W void call_and_print_return_value<product_functor<float> >(product_functor<float> const&)
08048af8 W void call_and_print_return_value<product_functor<int> >(product_functor<int> const&)
08048ac4 W void call_and_print_return_value<sum_of_ints_functor>(sum_of_ints_functor const&)

Here, of course, the code needs to be generated separately for int and for float, but one may at least want to share the code between the two int versions. This is possible by having the functors inherit from a common base class with a virtual method, as explained in the next section.

The virtual functors approach: functors with dynamic polymorphism

It goes like this:

virtual.cpp:

#include<iostream>
#include<string>
 
/*** print the name of some types... ***/
template<typename type> std::string name_of_type() { return "other"; }
template<> std::string name_of_type<int>() { return "int"; }
template<> std::string name_of_type<float>() { return "float"; }
template<> std::string name_of_type<double>() { return "double"; }
 
template<typename scalar> class functor
{
  public:
    virtual scalar f() const = 0;
};
 
struct sum_of_ints_functor : public functor<int>
{
  sum_of_ints_functor(int a, int b) : m_a(a), m_b(b)
  {
    std::cout << "Type: int. Computing the sum of the two ints " << a << " and " << b << ".";
  }
 
  int f() const { return m_a + m_b; }
 
  private:
  int m_a, m_b;
};
 
template<typename scalar=int>
struct product_functor : public functor<scalar>
{
  product_functor(scalar a, scalar b) : m_a(a), m_b(b)
  {
    std::cout << "Type: " << name_of_type<scalar>() << ". Computing the product of " << a << " and " << b << ".";
  }
 
  scalar f() const { return m_a * m_b; }
 
  private:
  scalar m_a, m_b;
};
 
template<typename scalar> void call_and_print_return_value(const functor<scalar>& functor_object)
{
  std::cout << " The result is: " << functor_object.f() << std::endl;
}
 
int main()
{
  call_and_print_return_value(sum_of_ints_functor(3,5));
  call_and_print_return_value(product_functor<float>(0.2f,0.4f));
  call_and_print_return_value(product_functor<>(7,8));    
}

Output:

$ g++ virtual.cpp -o virtual && ./virtual
Type: int. Computing the sum of the two ints 3 and 5. The result is: 8
Type: float. Computing the product of 0.2 and 0.4. The result is: 0.08
Type: int. Computing the product of 7 and 8. The result is: 56

This new version looks very similar to the functors.cpp discussed earlier; in fact, the diff is very small:

$ diff -u functors.cpp virtual.cpp
--- functors.cpp
+++ virtual.cpp
@@ -7,7 +7,13 @@
 template<> std::string name_of_type<float>() { return "float"; }
 template<> std::string name_of_type<double>() { return "double"; }
 
-struct sum_of_ints_functor
+template<typename scalar> class functor
+{
+  public:
+    virtual scalar f() const = 0;
+};
+
+struct sum_of_ints_functor : public functor<int>
 {
   sum_of_ints_functor(int a, int b) : m_a(a), m_b(b)
   {
@@ -21,7 +27,7 @@
 };
 
 template<typename scalar=int>
-struct product_functor
+struct product_functor : public functor<scalar>
 {
   product_functor(scalar a, scalar b) : m_a(a), m_b(b)
   {
@@ -34,7 +40,7 @@
   scalar m_a, m_b;
 };
 
-template<typename functor_type> void call_and_print_return_value(const functor_type& functor_object)
+template<typename scalar> void call_and_print_return_value(const functor<scalar>& functor_object)
 {
   std::cout << " The result is: " << functor_object.f() << std::endl;
 }

The advantage of this new version is that the code of call_and_print_return_value is now shared by all functors that have the same base class (here, that means the same scalar type). Indeed:

$ nm --demangle virtual | grep call_and_print
08048d0f W void call_and_print_return_value<float>(functor<float> const&)
08048bec W void call_and_print_return_value<int>(functor<int> const&)

So we now have only 2 instantiations instead of 3.

On the other hand, this version also has drawbacks compared to the first one (functors.cpp), and either one may be preferable depending on circumstances. In functors.cpp, the polymorphism is resolved at compile time, which allows for compile-time optimizations that generally aren't possible in the virtual.cpp version. To begin with, functors.cpp allows the functor calls to be inlined, which virtual.cpp generally doesn't. To make things worse, in virtual.cpp the functor calls are virtual function calls, which are a bit more expensive than normal function calls. This may or may not be a problem, depending on your context.

There's no universal rule; only you can know what's best in your context.

The unified approach: write once, let the user decide between the two

The good thing is that, as a library developer, you can write your library code in such a way that users can choose for themselves which approach is best for their context. Indeed, you can do this:

unified.cpp:

#include<iostream>
#include<string>
 
/*** print the name of some types... ***/
template<typename type> std::string name_of_type() { return "other"; }
template<> std::string name_of_type<int>() { return "int"; }
template<> std::string name_of_type<float>() { return "float"; }
template<> std::string name_of_type<double>() { return "double"; }
 
/*** functors inheriting a common virtual base ***/
 
template<typename scalar=int> class functor
{
  public:
    virtual scalar f() const = 0;
};
 
struct sum_of_ints_functor : public functor<int>
{
  sum_of_ints_functor(int a, int b) : m_a(a), m_b(b)
  {
    std::cout << "Type: int. Computing the sum of the two ints " << a << " and " << b << ".";
  }
 
  int f() const { return m_a + m_b; }
 
  private:
  int m_a, m_b;
};
 
template<typename scalar=int>
struct product_functor : public functor<scalar>
{
  product_functor(scalar a, scalar b) : m_a(a), m_b(b)
  {
    std::cout << "Type: " << name_of_type<scalar>() << ". Computing the product of " << a << " and " << b << ". ";
  }
 
  scalar f() const { return m_a * m_b; }
 
  private:
  scalar m_a, m_b;
};
 
template<typename functor_type> void call_and_print_return_value(const functor_type& functor_object)
{
  std::cout << " The result is: " << functor_object.f() << "." << std::endl;
}
 
int main()
{
  // by default, the function is instantiated separately for each functor type
  call_and_print_return_value(sum_of_ints_functor(3,5));
  call_and_print_return_value(product_functor<float>(0.2f,0.4f));
  call_and_print_return_value(product_functor<>(7,8));
 
  // but if we want, we may also tell it "hey, instantiate only with respect to the virtual base type"!
  // then it is instantiated only once per scalar type, so we factor the binary code (good!)
  // and the polymorphism (for a given scalar type) is resolved at runtime
  call_and_print_return_value<functor<> >(sum_of_ints_functor(3,5));
  call_and_print_return_value<functor<float> >(product_functor<float>(0.2f,0.4f));
  call_and_print_return_value<functor<> >(product_functor<>(7,8));
}

Output:

$ g++ unified.cpp -o unified && ./unified
Type: int. Computing the sum of the two ints 3 and 5. The result is: 8.
Type: float. Computing the product of 0.2 and 0.4.  The result is: 0.08.
Type: int. Computing the product of 7 and 8.  The result is: 56.
Type: int. Computing the sum of the two ints 3 and 5. The result is: 8.
Type: float. Computing the product of 0.2 and 0.4.  The result is: 0.08.
Type: int. Computing the product of 7 and 8.  The result is: 56.

Let's now examine the instantiations of call_and_print_return_value:

$ nm --demangle unified | grep call_and_print
08048d89 W void call_and_print_return_value<product_functor<float> >(product_functor<float> const&)
08048e9d W void call_and_print_return_value<product_functor<int> >(product_functor<int> const&)
08048c66 W void call_and_print_return_value<sum_of_ints_functor>(sum_of_ints_functor const&)
08048f0b W void call_and_print_return_value<functor<float> >(functor<float> const&)
08048ed4 W void call_and_print_return_value<functor<int> >(functor<int> const&)

There are only 5 instantiations for 6 function calls, because the two calls made through the virtual base for int share the same code.

A note about overloading operator()

Some people like to overload operator() in a functor instead of defining a method f() as we did here. It's just a matter of taste; the two really amount to the same thing. Overloading operator() is nice syntactic sugar, because you then call your functor with exactly the same syntax as a function:

functor_object();

However, the user doesn't call the functor directly (the library does), so this doesn't make the user's life any easier. On the contrary, the syntax for defining the functor class becomes a bit more cumbersome, as the user needs to overload operator().

There's also a situation in which overloading operator() is not an option: if your functor class doesn't need to carry any member data, you may want to make the method static, so that one doesn't even need to construct a functor object. This may further help the compiler optimize away the abstraction. In that case, operator() isn't an option, because it can't be static (at least before C++23, which lifted that restriction).