2019-01-22

Projection, a powerful feature in the C++20 Ranges library

I'm Ryou Ezoe. Today, I'm going to write about projection, a powerful feature in the C++20 Ranges library.

Suppose you have a class that represents a person:

struct Person
{
    std::string name ;
    std::string address ;
    int age ;
    int height ;
    int weight ;
} ;

And a vector of persons.

std::vector<Person> persons ;

Naturally, you want to sort persons by a specific data member.

How can we do that? You can write your own compare function.

std::sort( persons.begin(), persons.end(), []( auto & a, auto & b ) { return a.age < b.age ; } ) ;

I don't want to write this. Not just because it's long boilerplate code, but because the compiler can't catch obvious bugs like this.

// using the wrong comparison operator
std::sort( persons.begin(), persons.end(), []( auto & a, auto & b ) { return a.age > b.age ; } ) ;

Or this.

// It doesn't compare the two parameters.
// The compiler need not warn about it because it's perfectly well-formed code.
std::sort( persons.begin(), persons.end(), []( auto & a, auto & b ) { return a.age < a.age ; } ) ;

Or this.

// comparing the wrong data members.
// the types are the same so the compiler doesn't warn about it.
std::sort( persons.begin(), persons.end(), []( auto & a, auto & b ) { return a.age < b.height ; } ) ;

The C++ compiler is not required to diagnose any of these because they are all perfectly well-formed code. The compiler can't guess the programmer's unwritten intention, and the last time I checked, nobody was seriously researching a trendy machine learning 2.0 based solution that could guess it.

C++20 Ranges has you covered on this problem with projection. You can simply pass a std::ranges::less object and a pointer to the data member as arguments and it just works.

std::ranges::sort( persons, std::ranges::less{}, &Person::age ) ;

Why does it work? ranges::sort without a projection works like this.

auto sort = []( auto && range, auto comp )
{
    // ...
    // i, j are iterators
    // compare two elements for ordering
    if ( comp( *i, *j ) )
    // ...
} ;

But ranges::sort has an extra parameter for the projection.

auto sort = []( auto && range, auto comp, auto proj ) ...

And it works like this.

auto sort = []( auto && range, auto comp, auto proj )
{
    // ...
    if ( comp( std::invoke( proj, *i), std::invoke( proj, *j ) ) )
    // ...
} ;

std::invoke is an ugly version of a function call. std::invoke( f, args... ) is equivalent to f( args... ) if f is a function or a function object. In that sense, the above code is equivalent to

if ( comp( proj(*i), proj(*j) ) )

But if f is a pointer to a data member, and args... is exactly one argument that is an object of the class type, std::invoke( f, a ) is equivalent to a.*f. So the code becomes


if ( comp( (*i).*proj, (*j).*proj ) )

So if the value_type of the iterators i and j is Person, and proj is &Person::age, which is our case, it works like this.

if ( comp ( i->age, j->age ) )

Thus it just works.

Since it uses std::invoke, you can also pass a pointer to a member function that takes no arguments and it just works.

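#include <algorithm>
#include <functional>
#include <vector>

class Person
{
    int age ;
public :
    int get_age() const noexcept { return age ; }
} ;

int main()
{
    std::vector<Person> persons ;
    // the projection calls get_age() on each element
    std::ranges::sort( persons, std::ranges::less{}, &Person::get_age ) ;
}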
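
The two pointer-to-member forms can be tried in isolation. The following is a small sketch; the Person class with a constructor and the helpers height_of and age_of are assumptions for illustration. It shows that std::invoke accepts both a pointer to a data member and a pointer to a member function with the same call syntax.

```cpp
#include <functional>

// A Person with one public data member and one private one behind a getter.
// This layout is assumed for the sketch only.
class Person
{
    int age_ ;
public :
    int height ;
    Person( int a, int h ) : age_( a ), height( h ) { }
    int get_age() const noexcept { return age_ ; }
} ;

// Pointer to data member: std::invoke( f, p ) behaves like p.*f
int height_of( Person const & p )
{
    return std::invoke( &Person::height, p ) ;
}

// Pointer to member function: std::invoke( f, p ) behaves like (p.*f)()
int age_of( Person const & p )
{
    return std::invoke( &Person::get_age, p ) ;
}
```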

You can also pass a function object.

std::ranges::sort( persons, std::ranges::less{}, []( auto && n ) { return n.age ; } ) ;

This code may look like boilerplate to you too. But it's actually better than the C++17 era code, because the projection function deals with only one parameter and how to project it. You don't need to write the rest of the boilerplate, so you are immune to the typical problems above.

So, what other algorithms support projection? Well, most of them. Those which take a function object from the user also take a projection function object as the last parameter.


all_of( range, pred, proj ) ;
for_each( range, function, proj ) ;

It's also interesting that std::ranges::transform also supports projection.

std::vector<bool> out ;
std::ranges::transform( persons, std::back_inserter( out )
    , []( auto age ) { return age < 40 ; }
    , &Person::age ) ;

This code takes each Person value from persons, projects it to its data member age, transforms it to bool with a certain condition, and push_backs it to the out vector.

Since transform's user-supplied function object is essentially the same as a projection, this feels odd. But it's good for consistency, and you don't need to precombine the function object and the projection yourself.

Speaking of transform, std::ranges::transform_view (created by std::views::transform) calls its function with std::invoke too. Strictly speaking this isn't a projection, but it works like one.

for ( auto age : persons | std::views::transform( &Person::age ) )
    std::cout << age << '\n' ; 

This code takes a range (persons), then applies a transform_view whose function is just a pointer to a data member. Since transform_view calls its function via std::invoke, it just works, and the variable auto age takes each age value of the Person objects inside the range.
