[iterators in rust]
written 12/17/2024
around and around we go
Sandbox
The git repo where I played around with these ideas can be found here.
Also, my conversation with chatGPT regarding this topic can be found here.
Pitstop
While I was learning hyper, I realized I needed to take a pitstop and learn a bit more about Rust iterators. From what I understand, iterators are preferred over loops and I see them used a lot around Rust.
I know with the zero-cost abstraction model Rust has, it makes using iterators the same cost as running a loop. Let's see whats up.
The Book
First, I read the entry in The Book regarding iterators to see what the Rust team has to say about them.
Lazy
First, I learned iterators are lazy and they do not do anything until you consume the iterators.
This means that iterators are first created, then consumed. Can an iterator be consumed more than once? Nope.
Here is an example of iterating through a vector:
1fn using_iter_on_vec() {
2 let some_vec = vec!["a", "b", "c"];
3 let my_iterator = some_vec.iter();
4 for val in my_iterator {
5 println!("Value: {}", val);
6 }
7}
Many Different Methods
Iterators are really Rust's take on higher order functions. It looks like many different methods can be combined to get different results with your iterator. I think the key here is really understanding what these methods do and how they work.
Here is an example of transforming data with map
and gathering the output into a vector with collect
:
1fn using_map_and_collect() {
2 let v = vec![0, 1, 2];
3 let incremented: Vec<_> = v.iter().map(|x| x + 1).collect();
4 println!("{:?}", incremented);
5}
In this example, I see two concepts I am not entirely sure of so I need to diverge and research them.
Vec<_>
I am guessing the Vec<_>
represents some generic type. Maybe it means the lack of a type? I don't know. But I know I want to know!..
..OK it looks like the Vec<_>
syntax is a way of stating the variable is a vector with a type that should be inferred based on how the vector is used. This is a concept that has come up a few times but I'd like to state it here:
Rust is always attempting to guess the types in your code base and it is able to do so because of how strict the compiler is. Here, when we use Vec<_>
, we are knowingly taking advantage of the fact that the compiler can make inferences about the code base and about how types are used.
Inspecting Closures
I am taking note of this line here:
1let incremented: Vec<_> = v.iter().map(|x| x + 1).collect();
I notice a closure is used in map()
and I need to do a little bit of research about how closures work in Rust.
I do want to address them a bit right here, though. Let me do a bit of digging to get some light context..
..OK, the first thing I am finding is that closures are intended for short-lived computations. They are similar to lambda
functions in other languages. Basically, a closure is an unnamed function that lives only for the time it is used.
The basic syntax of a closure is:
1|parameters| expression
Another unique thing about closures versus functions is that closures capture variables from their surrounding environment while functions do not.
Oh man, this might be a bigger trail than I thought..
Closures
It seems I've made an error. Iterators are useful, but they make heavy use of closures. And if we do not understand closures, then we cannot make proper use of iterators.
We need to spend some time looking into closures and getting a good grasp on how they are used.
I mentioned it above, but the first thing to know is that closures capture their surrounding environment. For example:
1let y = 5;
2let add_y = |x| x + y; // `y` is captured from the environment
3println!("{}", add_y(3)); // 8
How Closures Capture Their Environment
I am learning the majority of this from chatGPT, so be sure to checkout the chat as this content is being directly influenced from it's output.
It looks like a closure can capture the surrounding environment by:
borrowing:
1fn using_closure_with_borrow() {
2 let x = 10;
3 let my_closure = || println!("x is {}", x);
4 my_closure();
5}
mutably borrowing:
1fn using_closure_with_mut_borrow() {
2 let mut x = 10;
3 let mut increment_x = |y| x += y;
4 increment_x(1);
5 println!("x is {}", x);
6}
And take note in the above example, we marked the closure itself as mut
along with x
, the value being mutated.
And finally, taking ownership:
1fn using_closure_with_ownership() {
2 let x = String::from("I am owned!");
3 let take_x = move || println!("{}", x);
4 take_x();
5 // println!("{}", x); // Error: `x` has been moved
6}
Where Can Closures Be Used?
Anywhere a function or callback is expected. This is what makes closures so popular with iterators, the topic we are currently discussing, remember?
Creating a Function Which Takes a Closure
And I am just too curious to walk away. Here is a function which takes a closure as input:
1fn using_fn_with_closure_input<F>(func: F) where F: Fn(String) {
2 func(String::from("I am injected into the closure!"));
3}
4
5using_fn_with_closure_input(|str| println!("{}", str));
Oh and this is very helpful. Here, I discovered the way in which we use Fn
, FnMut
, and FnOnce
, will dictate the way our closure captures its environment when being utilized as input into a function.
1fn call_fn<F: Fn()>(f: F) { f(); } // Fn
2fn call_fnmut<F: FnMut()>(mut f: F) { f(); } // FnMut
3fn call_fnonce<F: FnOnce()>(f: F) { f(); } // FnOnce
4
5let x = String::from("Hello");
6
7call_fnonce(move || println!("{}", x)); // Takes ownership (FnOnce)
8call_fn(|| println!("This just borrows")); // Borrows (Fn)
Just know, if we are ever in a situation where we are using a closure as input to a function, choosing between the three above options is part of the process and influences how the closure handles memory in relation to it's environment.
Back To Iterators
We can filter a list of elements using filter
:
1fn using_filter() {
2 let nums = vec![1,2,3,4,5,6,7,8,9,10];
3 let even: Vec<_> = nums.iter().filter(|&&x| x % 2 == 0).collect();
4 println!("{:?}", even);
5}
I can see why people love iterators with how concise the syntax is. However, in the above example, I am wondering why &&x
was used.
For some reason, these two options run fine as well:
1fn using_filter() {
2 let nums = vec![1,2,3,4,5,6,7,8,9,10];
3 let even: Vec<_> = nums.iter().filter(|x| *x % 2 == 0).collect();
4 println!("{:?}", even);
5}
6
7fn using_filter() {
8 let nums = vec![1,2,3,4,5,6,7,8,9,10];
9 let even: Vec<_> = nums.iter().filter(|&x| x % 2 == 0).collect();
10 println!("{:?}", even);
11}
Now, it makes sense and I understand how dereferencing works in the first example, but I do not understand why both &&x
and &x
work in this scenario..
..OK it looks like all three have the same outcome. I still don't fully understand why, but I do know in the above example everything resolves to i32
. Using &&x
is somehow communicating a double-unwrapping of a reference, but also seeing &x
is the "correct" way to go about it. All in all, I feel confortable with both *x
and &x
so I will just stick to &x
until I learn more.
Implementing Your Own iterators
I think the most important question is, "How do we make our own types iterators?". Well, Iterator
is just a trait. We like with any trait, we can implement the trait on our custom types.
Here is a TimeBomb
type I implemented the Iterator
trait on and then made use of in the function using_custom_iterator
:
1struct TimeBomb {
2 count: u32,
3 limit: u32,
4}
5
6impl Iterator for TimeBomb {
7 type Item = u32;
8 fn next(&mut self) -> Option<Self::Item> {
9 self.count += 1;
10 if self.count < self.limit {
11 Some(self.count)
12 } else {
13 println!("💥");
14 None
15 }
16 }
17}
18
19fn using_custom_iterator() {
20 let mut tb = TimeBomb{
21 count: 0,
22 limit: 10,
23 };
24 while let Some(count) = tb.next() {
25 println!("{}", count);
26 }
27}
Other Examples
Here are some of the other examples I ended up coming across, all of which are in the repo listed above:
1// combining methods like map and filter
2fn using_map_and_filter() {
3 let nums = vec![1,2,3,4,5,6];
4 let doubled_even: Vec<_> = nums
5 .iter()
6 .filter(|&x| x % 2 == 0)
7 .map(|&x| x * 2)
8 .collect();
9 println!("{:?}", doubled_even);
10}
11
12// consuming an iterator using sum
13fn using_sum_to_consume() {
14 let nums = vec![1,2,3,4];
15 let total: i32 = nums.iter().sum();
16 println!("{:?}", total);
17}
18
19// chaining operations with fold and building accumlative values
20fn using_fold_to_accumulate() {
21 let nums = vec![1,2,3,4,5];
22 let sum = nums.iter().fold(0, |acc, &x| acc + x);
23 println!("{:?}", sum);
24}
25
26// using into_iter to consume and take ownership of a collection
27fn using_into_iter_to_take_ownership() {
28 let nums = vec![1,2,3,4,5];
29 for num in nums.into_iter() {
30 println!("{}", num);
31 }
32 // nums is no longer valid
33}
34
35// using iter_mut to mutate values in place
36fn using_iter_mut_for_mutation() {
37 let mut nums = vec![1,2,3,4];
38 for num in nums.iter_mut() {
39 *num = *num + 1;
40 }
41 println!("{:?}", nums);
42}
43
44// using enumerate to add an index to each iter_mut
45fn using_enumerate_to_index() {
46 let colors = vec!["red", "green", "blue"];
47 for (i, color) in colors.iter().enumerate() {
48 println!("{}: {}", i, color);
49 }
50}
Conclusion
I think iterators are an important concept in Rust and they seem very widespread throughout the language. Iterators are tied closely to closures, and so having a solid understanding on how closures interact with their environment is really the key to making solid use of iterators.