2019-01-10

The overview of C++20 Range view

The latest C++ draft as of this writing incorporated One Range proposal.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/n4791.pdf

So what is the Range anyway? A C++ Standard Comittee member, Eric Nibeler, summrised it well.

Eric Niebler – Eric Niebler

Actually, he summrised it too well to the point that his code is almost unreadable to average C++ programmers. It's practically useless. So this article serve as the quick tutorial for the Range view.

The Range is a beefed up Iterator. Just like the Iterator was modeled after the pointer, the Range was modeled after the pair of iterators [begin, end).

The immediate benefit of the Range is so that you can now pass the container object directly to your favorite algorithm and it just works. Because the Containers satisfy the concept of Range.

std::vector v = {1,2,3,4,5} ;
std::for_each( v,
[]( auto x )
{ std::cout << x ; }
) ;

Or, as its name suggests, Range-based for.

for ( auto x : v )
    std::cout << x ;

"But Range-based for existed from C++11! It was already there!" You might say. Well C++11's Range-based for didn't have the power it was originally intended. Because the Concept didn't make it to the C++11. Those good-for-nothing standard commitee members couldn't agreed on whether the concept map shall be implicitly generated or not. The same bunch of idiots who can't accept char8_t until C++20, blubbing: "but char is the suitable type for representing the contagious raw bytes" blah blah blah.

Now the Concept is there, we can finally emblace the full power of Range-based for.

The Range library has the View. A View is a range adaptor. Views behaves like a range, so it is a range. Sort of.

Now suppose we want to iterate over the above vector, but backwards.

C++17 Range-based for is useless for this very very simple task. We have to fallback to the C++98 era reverse iterator.

std::for_each( rbegin(v), rend(v),
    []( auto i )
    {
        std::cout << i ;
    } ) ;

At least, we have lambda expression. But it doesn't save much of the ugliness.

In C++20, you can simply use reverse_view.

for ( auto i : v | reverse )
    std::cout << i ;

Yes. That's it. This very simple, and actually readable code prints "54321". Don't worry. There is absolutely no performance penalty at all! C++ is better than Haskell.

Now, suppose that, you want to iterate over 1 to n. The n is a runtime value. Creating a vector object is inefficient.

std::vector<int> v ;
int n ;
std::cin >> n ;
for ( int i = 1 ; i != n+1 ; ++i ) 
    v.push_back(i) ;

Fortunately, C++20 has iota_view. iota(a, b) create a range of integers incrementing in range \(a \leq; n < b\) .

int n = 10 ;
for ( auto i : iota( 1, n ) | reverse )
    std::cout << i ;

Now, this code prints "987654321".

There are a lot of numbers. We want to get rid of the even numbers so that we only deal with odd numbers. We can use filter_view.

for ( auto i : iota(1, 10)
    | filter( [](auto x){ return x % 2 == 1 ; }
)
    std::cout << i ;

It prints "13579".

filter(rule) drop elements x where function rule(x) returns false.

Now let's suppose that we have a function is_prime(n) which returns true if n is probably a prime number. I don't go into details how we can implement is_prime. If you want to know, search for Miller–Rabin.

This code prints all the prime between number 1-1000.

for ( auto i : iota(1, 1001)
    | filter( is_prime )
)
    std::cout << i << ', ' ;

This works. But what if you want the first 10 prime numbers? We can use take_view. take(n) takes n elements from the range.

for ( auto i : iota(1)
    | filter( is_prime )
    | take ( 10 )
)
    std::cout << i << ', ' ;

It prints "2, 3, 5, 7, 11, 13, 17, 19, 23, 29, "

You may notice that above code pass only one argument to iota_view. iota(n) create a range start from n and increment indefinitely. That means if you wrote like this:

for ( auto i : iota(1) )
    std::cout << i << '\n' ;

It prints numbers until it overflows and still continues printing overflowed numbers. It's a inifinite loop. It never stops.

take_view can stop the execution such inifinte loop because it only take n elements from the previous range.

for ( auto i : iota(1) | take(10) )
    std::cout << i '\n' ;

This code prints 1 to 10 and stop the loop.

We can use iota_view to write index loop. Suppose we want to iterate integer from 0 to 100. Traditionally, we write like this.

for ( int i = 0 ; i != 101 ; ++i )
    ... ;

This works. But frankly, I don't want to write this code. I have to manually initialize a variable, check for loop terminating condition, and increment the variable, all by my hands. What I want is to iterate over integer of range a to b. You see, I can achieve this by just specify a and b. You can achieve that with iota(a, b+1).

for ( auto i : iota( 1, 101 ) )
    std::cout << i ;

Speaking of index loop, have you ever heard of the FizzBuzz problem? It goes like this "Print natural numbers 1 to 100. But for numbers multiple of 3, print Fizz instead of that number. For multiples of 5, print Buzz instead. For both multiple of 3 and 5, print FizzBuzz."

We have already written the index loop of 1 to 100. Let's write a function fizzbuzz(n) which take number n and return a string it should print to.

auto fizzbuzz = []( auto n ) -> std::string
{
    if ( n % 15 == 0 )
        return "FizzBuzz" ;
    else if ( n % 3 == 0 )
        return "Fizz" ;
    else if ( n % 5 = 0 )
        return "Buzz" ;
    else
        return std::to_string(n) ;
}

Now we wrote function fizzbuzz, we can use transform_view to transform the each elements in the range to corresponding string it should print to.

for ( auto msg : iota( 1, 101 )
    | transform( fizzbuzz )
)
    std::cout << msg << '\n' ; 

Isn't this fantastic?

Finally, you can combine as many views as you like. You can iota it, filter it, transform it, take it, then reverse it.

for ( auto i : iota(1)
    | filter(filter_rule)
    | transform(transfrom_rule)
    | take(n)
    | reverse
)
    std::cout << i '\n' ;

You can add even more views after reverse if you really want.

All the standard library's view can be use either as piping the function object

for ( auto n : iota(1, 100) | fileter(rule) | reverse )
    std::cout << n ;

or using as _view class.

iota_view i( 1, 100 ) ;
filter_view f( i, rule ) ;
reverse_view r( f ) ;

for ( auto n : r )
    std::cout << n ;

Both code do the same things. Basically, "A | B(args)" means a view object of "B_view( A, args )".

There are other views such as split_view which split the range to range of range separated by a specific value.

auto str = "quick brown fox" ;
std::vector< std::string > words ;
for ( auto word : str | split(' ') )
    words.emplace_back( begin(word), end(word) ) ;

after execution, words is {"quick", "brown", "fox"}

or join_view which flatten range of range to range.

std::string flat ;
for ( auto c : words | join )
    flat.push_back(c) ;

flat is "quickbrownfox".

All the example code assumes we use following using declarations.

using namespace std::ranges ;
using namespace std::ranges::view ;

So how do we write a view? Well, that's rather difficult if you want to write a standard library quality view. But let's try.

Let's write a drop_view which dropss n elements from the range.

for ( auto i : iota(0, 10) | drop(5) )
    std::cout << i ;

This code prints "56789".

Here is the implementation.

template < InputRange V >
    requres View<V>
class drop_view : public view_interface< dropVieww<V> >
{
    V base_ = V() ;
    iterator_t<V> iter = begin(base_) ;
    iter_difference_t<iterator_t<V>> count_ ;
public :
    drop_view() = default ;
    constexpr drop_view( V base, iter_difference_t<iterator_t<V>> count )
        : base_(base), iter( std::next( begin(base_), count ), count_(count)
    { }

    template < ViewableRange R >
        requires Constructible< R, all_view<R> >
    constexpr drop_view( R && base, iter_difference_t<iterator_t<V>> count )
        : base_(std::forward<R>(base)), iter( std::next( begin(base_), count ), count_(count)
    { }

    constexpr V base() const
    { return base_ ; }

    constexpr auto begin()
    { return iter ; }
    constexpr auto begin() const
    { return iter ; }

    constexpr auto end()
    { return end(base_) ; }
    constexpor auto end() const
    { return end(base_) ; }

    constexpr auto size()
        requires SizedRange<V>
    { return size(base_) - count_ ; }
    
    constexpr auto size() const
        requires SizedRange<const V>
    { return size(base_) - count_ ; }

    template < Range R >
    drop_view( R &&, iter_difference_t<iterator_t<V>> )
        -> drop_view< all_view<R> > ;
} ;


// I'm not 100% sure this is the correct way to implement range adaptor object.
// If my interpretation of the draft is correct, it should be.

struct drop_range_adaptor_closure_object
{
    std::size_t count ;
    drop_range_adaptor_closure_object( std::size_t count )
        : count(count)
    { }
    template < ViewableRange R >
    constexpr auto operator( R && r )
    {
        return drop_view( std::forward<R>(r), count ) ;
    }
} ;
struct drop_range_adaptor_object
{
    template < ViewableRange R >
    constexpr auto operator () ( R && r, iter_difference_t<iterator_t<R>> count )
    {
        return drop_view( std::forward<R>(r), count ) ;
    }

    constexpr auto operator () ( std::size_t count )
    {
        return drop_range_adaptor_closure_object( count ) ;
    }

} drop ;

template < ViewableRange R >
constexpr auto operator | ( R && r, drop_range_adaptor_closure_object a )
{
    return a(std::forward<R>(r)) ;
}

Phew, that's Eric Niebler-level of boilarplate-ness. I think we need to wait the metaclass to eliminate the boilarplate code. Hopefully, we can have a metaclass within another decade.

1 comment:

said...

ふえー iota まで盗みやがった