Fallible Iterator Adapters

— 2020-04-12

A great aspect of the Rust stdlib is that many common operations are provided with a shared idiom. Rather than say having to remember how to write a correct sort function by hand, the stdlib provides a fast one for you.

One of the places with the most such conveniences is the std::iter::Iterator trait. This trait provides a rich set of operations for filtering, combining and merging streams of items. And because iterators are lazy they're incredibly efficient too .

However one downside to iterators is that they don't play well with all input. In particular there are some limits around Result (and Try in general). When using any stream adapter with fallible input , we generally want to be able to do several things:

We want to stop iterating (short-circuit) when we encounter an error.

We want to be able to return the error out of the iterator so it can be bubbled up.

We never want to have to ignore errors.

For example with the try_for_each function we're able to take a vec of names, and perform some IO operation on them. This will stop iterating over file names if an error occurs, and will immediately return the error:

use std::io::{stdout, Write}; let data = [" no_tea.txt ", " stale_bread.json ", " torrential_rain.png "]; data. iter (). try_for_each (| x | { writeln!(stdout(), " {} ", x)?; Ok(()) })?;

This works pretty smoothly. Now imagine we wanted to do something with IO operations. Say we had a folder for a database, and we wanted to compare the content of several files with each other, finding the smallest string by index .

Now we would likely want to use something like fs::read and Iterator::min_by , which we could do as such:

let offsets = [ 1 , 4 , 9 , 18 , 22 , 17 ] let res = db_files. iter (). min_by (| x , y | { let fx = fs::read_file(format!(" .db/ {} .index "), x). unwrap (); let fy = fs::read_file(format!(" .db/ {} .index "), y). unwrap (); fx. cmp (fy) });

Now as you might have noticed, we need to use unwrap here. Oh no! What this practically means is that if any error occurs when reading files our program will panic. Which means that Iterator::min_by probably shouldn't be used here, and we should turn this into manual loop logic instead.

Surveying Fallible Adapters

In order to figure out which methods are incompatible with fallible operations we should take a look at all methods on Iterator . The only methods that have the potential to be incompatible are the ones that take closures.

We've created an overview of all iterator methods that take closures. Here we look at the iterator adapter name, the signature of the closure that's passed, and whether it can short-circuit on error. Finally we also look at whether it has a fallible counterpart that can short-circuit.

In the following overview there are probably ways to make certain methods work with errors, or perhaps they work if Result has been passed into the method, but does not originate from the method. The measure we'll be using is whether the ? operator can be used inside the closure.

Name Closure Signature Accepts try operator? Fallible counterpart all FnMut(Self::Item) -> bool No - any FnMut(Self::Item) -> bool No - cmp_by FnMut(Self::Item, <I as IntoIterator>::Item) -> Ordering No - eq_by FnMut(Self::Item, <I as IntoIterator>::Item) -> Ordering No - filter FnMut(&Self::Item) -> bool No - filter_map FnMut(Self::Item) -> Option<B> Yes - find FnMut(&Self::Item) -> bool No try_find find_map FnMut(Self::Item) -> Option<B> Yes - flat_map FnMut(Self::Item) -> impl IntoIterator Yes - fold FnMut(B, Self::Item) -> B No try_fold for_each FnMut(Self::Item) No try_for_each inspect FnMut(&Self::Item) Yes - is_partitioned FnMut(Self::Item) -> bool No - is_sorted_by FnMut(&Self::Item, &Self::Item) -> Option<Ordering> No - is_sorted_by_key FnMut(&Self::Item) -> impl PartialOrd No - map FnMut(Self::Item) -> B Yes - max_by FnMut(&Self::Item, &Self::Item) -> Ordering No - max_by_key FnMut(&Self::Item) -> impl Ord No - min_by FnMut(&Self::Item, &Self::Item) -> Ordering No - min_by_key FnMut(&Self::Item) -> impl Ord No - partial_cmp_by FnMut(Self::Item, <I as IntoIterator>::Item) -> Option<Ordering> No - partition FnMut(&Self::Item) -> bool No - partition_in_place FnMut(&T) -> bool No - position FnMut(Self::Item) -> bool No - rposition FnMut(Self::Item) -> bool No - scan FnMut(&mut St, Self::Item) -> Option<B> No - skip_while FnMut(&Self::Item) -> bool No - take_while FnMut(&Self::Item) -> bool No - try_find FnMut(&Self::Item) -> impl Try<Ok = bool, Error = E> Yes - try_fold FnMut(B, Self::Item) -> impl Try<Ok = B> Yes - try_for_each FnMut(Self::Item) -> impl Try<Ok = ()> Yes -

Out of the 63 current stream adapters available, 30 take closures. Of those 30 adapters only 10 can bubble up a Result or have a counterpart that can do so. That means that 20 methods, or almost a third of the functionality in Iterator is currently incompatible with fallible operations.

Implementing missing adapters

If we wanted to expose the full range of functionality to fallible iterators, we would likely need to create try_ counterparts to the stream adapters that currently don't support it. This is a list of methods and their proposed fallible counterparts:

Name Fallible counterpart all try_all any try_any cmp_by try_cmp_any eq_by try_eq_by filter try_filter inspect try_inspect is_partitioned try_is_partitioned is_sorted_by try_is_sorted_by is_sorted_by_key try_is_sorted_by_key max_by try_max_by max_by_key try_max_by_key min_by try_min_by min_by_key try_min_by_key partial_cmp_by try_partial_cmp_by partition try_partition partition_in_place try_partition_in_place position try_position rposition try_rposition scan try_scan skip_while try_skip_while take_while try_take_while

Conclusion

In this post we've looked at the various adapters std::iter::Iterator provides, talked about error handling, and surveyed how existing methods interact with them. Finally we've provided an overview of methods that currently don't provide support for the try operator in the closure body, and which counterparts would need to be added so they could work.

Some readers may be wondering now whether it's actually worth adding several new methods to Iterator , and whether that's perhaps too much. But I don't think counting methods is a particularly effective way of making API decisions . Instead it's better to look at overall cohesion and usability.

I think a more useful framework for answering this design question is to ask: "Are we okay that a third of the methods on iterator only work some of the time?" I think the answer there should be a clear "no". If a person wants to perform IO operations while filtering some input, they should be able to, without needing to start over and write manual loops from scratch.