Chapter 14: Closures - 《Rust程序设计》 (Rust Programming)

Introducing closures
- C++
- Python
Three kinds of closures
Closures that "steal" values
Closure performance
Copy and Clone for closures

:::info This chapter is adapted from Chapter 14 of Programming Rust, 2nd Edition. :::

Introducing closures

Most mainstream languages support closures. Here are a few simple examples.

C++

int base = 10;
auto add_base_value = [=] (int x) { return x + base; };
auto add_base_ref = [&] (int x) { return x + base; };
base = 20;
assert(add_base_value(30) == 10 + 30);
assert(add_base_ref(30) == 20 + 30);

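This example defines two anonymous functions. One captures base by value and the other captures it by reference, so the two versions of add_base behave differently once base changes.

Python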

base = [1]
add_base_value = lambda x, base=base.copy(): base + x
add_base_ref = lambda x: base + x
base.append(2) # now base is [1, 2]
assert add_base_value([3, 5]) == [1, 3, 5]
assert add_base_ref([3, 4]) == [1, 2, 3, 4]

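Python always passes references, so to build a closure that captures a value you have to copy the captured value yourself where needed.
Put simply, a closure is a function that has captured some values.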
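For comparison, the same value-versus-reference distinction can be sketched in Rust. This is only an illustration and not part of the original chapter: the function name capture_demo and the use of std::cell::Cell are assumptions made here, because Rust does not let you reassign a plain variable while a closure still borrows it.

```rust
use std::cell::Cell;

/// Returns (result of the by-value closure, result of the by-reference closure).
/// `capture_demo` is a hypothetical helper written for this sketch.
fn capture_demo() -> (i32, i32) {
    let base = Cell::new(10);

    // Copy the current value out and move that copy into the closure.
    let snapshot = base.get();
    let add_base_value = move |x: i32| x + snapshot;

    // Borrow `base` by shared reference; `Cell` allows updates through `&Cell`.
    let add_base_ref = |x: i32| x + base.get();

    base.set(20);
    (add_base_value(30), add_base_ref(30))
}
```

Calling capture_demo() yields (40, 50): the first closure still sees the old value 10, while the second observes the update to 20.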

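Three kinds of closures

Rust has three kinds of closures: Fn, FnMut, and FnOnce.

Fn

An Fn closure only reads what it captures: in the common case it holds shared (immutable) references, or it captures nothing at all.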

struct City {
    name: String,
    population: i64,
    country: String,
    ...
}
/// Helper function for sorting cities by population.
fn city_population_descending(city: &City) -> i64 {
    -city.population
}
fn sort_cities_by_helperfunction(cities: &mut Vec<City>) {
    cities.sort_by_key(city_population_descending);
}
fn sort_cities_by_closure(cities: &mut Vec<City>) {
    cities.sort_by_key(|city| -city.population);
}

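Compare sort_cities_by_helperfunction with sort_cities_by_closure. The latter defines the sort key for City with a closure: given a city, the closure returns the negated population, which sorts the cities by population in descending order.
This closure clearly does not capture anything.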

fn sort_by_statistic(cities: &mut Vec<City>, stat: Statistic) {
    cities.sort_by_key(|city| -city.get_statistic(stat));
}

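sort_by_statistic uses stat to choose the attribute to sort by, so the closure here captures the stat variable. Because the closure never modifies stat, Rust automatically borrows a shared reference to stat when it creates the closure, and the closure is still an Fn (it holds only a shared reference).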
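A runnable version of this might look as follows. The chapter does not show Statistic or get_statistic, so the enum, its variants, the area field, and the sample cities below are all hypothetical stand-ins chosen for this sketch.

```rust
// Hypothetical stand-ins: the chapter does not define `Statistic` or `get_statistic`.
#[derive(Clone, Copy)]
enum Statistic {
    Population,
    Area,
}

struct City {
    name: String,
    population: i64,
    area: i64,
}

impl City {
    fn get_statistic(&self, stat: Statistic) -> i64 {
        match stat {
            Statistic::Population => self.population,
            Statistic::Area => self.area,
        }
    }
}

fn sort_by_statistic(cities: &mut Vec<City>, stat: Statistic) {
    // The closure only reads `stat`, so it borrows it immutably and is an `Fn`.
    cities.sort_by_key(|city| -city.get_statistic(stat));
}

/// Builds three made-up cities, sorts them, and returns their names in order.
fn sorted_names(stat: Statistic) -> Vec<String> {
    let mut cities = vec![
        City { name: "A".to_string(), population: 100, area: 30 },
        City { name: "B".to_string(), population: 300, area: 10 },
        City { name: "C".to_string(), population: 200, area: 20 },
    ];
    sort_by_statistic(&mut cities, stat);
    cities.into_iter().map(|city| city.name).collect()
}
```

With these made-up numbers, sorting by Statistic::Population gives B, C, A, and sorting by Statistic::Area gives A, C, B.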

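FnOnce

Given Rust's ownership model, not every closure can be called an unlimited number of times: if a previous call has already consumed a value the closure captured, the closure cannot be called again. Rust therefore has a category for closures that can be called only once, FnOnce.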

let my_str = "hello".to_string();
let f = || drop(my_str);
f();
f();

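In this example the closure f calls drop, which frees the memory owned by my_str. If f could be called again, the second attempt to free my_str would be a double free, a bug familiar from C++ programming. The Rust compiler is not fooled so easily, and compiling this code produces an error: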

error[E0382]: use of moved value: `f`
 --> test.rs:5:5
  |
4 |     f();
  |     --- `f` moved due to this call
5 |     f();
  |     ^ value used here after move
  |

Next, let's try another way to "fool" the Rust compiler:

fn call_twice<F>(closure: F) where F: Fn() {
    closure();
    closure();
}
let my_str = "hello".to_string();
let f = || drop(my_str);
call_twice(f);

call_twice takes a closure as its argument and calls it twice in its body. Here we try to pass it the unsafe closure that may be called only once:

error[E0525]: expected a closure that implements the `Fn` trait, but this closure only implements `FnOnce`
 --> test.rs:8:13
  |
8 |     let f = || drop(my_str);
  |             ^^^^^^^^------^
  |             |       |
  |             |       closure is `FnOnce` because it moves the variable `my_str` out of its environment
  |             this closure implements `FnOnce`, not `Fn`
9 |     call_twice(f);
  |     ---------- the requirement to implement `Fn` derives from here

The compiler tells us that the closure f moves my_str out of its environment, so it implements FnOnce rather than Fn and cannot be passed to call_twice.
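By contrast, a function that promises to call its argument only once can accept such a closure. The following is a minimal sketch; call_once and consume_and_measure are hypothetical names introduced here, not functions from the chapter.

```rust
/// Calls the closure exactly once. Taking `F` by value with an `FnOnce`
/// bound is what allows the closure to consume its captured values.
fn call_once<F>(closure: F) -> usize
where
    F: FnOnce() -> usize,
{
    closure()
}

/// Hypothetical helper: builds a closure that drops its captured `String`.
fn consume_and_measure() -> usize {
    let my_str = "hello".to_string();
    let f = || {
        let len = my_str.len();
        drop(my_str); // moves `my_str` out of the closure, so `f` is `FnOnce` only
        len
    };
    call_once(f)
    // Calling `f()` again here would be rejected with error E0382.
}
```

consume_and_measure() returns 5, the length of "hello", and the string is freed exactly once.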

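FnMut

FnMut is another kind of closure: one that contains mutable data or mut references.
Rust considers non-mut values safe to share across threads. It would not be safe, however, to share a non-mut closure that contains mut data: calling it from several threads could cause data races.
Rust therefore defines one more category, FnMut, for closures that write to the values they capture but do not drop them.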

// Pseudocode for `Fn`, `FnMut`, and `FnOnce` traits.
trait Fn() -> R {
    fn call(&self) -> R;
}
trait FnMut() -> R {
    fn call_mut(&mut self) -> R;
}
trait FnOnce() -> R {
    fn call_once(self) -> R;
}

You can imagine that Rust distinguishes the three kinds of closures internally in roughly this way.
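The difference between the three receivers can be made concrete with an ordinary struct. This is only an analogy under the assumption that a closure behaves like a struct holding its captures: Counter and its three methods are hypothetical and merely mirror the shape of call, call_mut, and call_once in the pseudocode.

```rust
/// Hypothetical stand-in for a closure that has captured a counter.
struct Counter {
    count: u32,
}

impl Counter {
    /// Like `Fn::call`: only a shared borrow, so it can read but not modify.
    fn call(&self) -> u32 {
        self.count
    }

    /// Like `FnMut::call_mut`: a mutable borrow, so it can update its state.
    fn call_mut(&mut self) -> u32 {
        self.count += 1;
        self.count
    }

    /// Like `FnOnce::call_once`: takes ownership, so the value is gone afterwards.
    fn call_once(self) -> u32 {
        self.count
    }
}

fn receiver_demo() -> (u32, u32, u32) {
    let mut counter = Counter { count: 0 };
    let read = counter.call(); // may be repeated freely
    let bumped = counter.call_mut(); // needs `counter` to be `mut`
    let last = counter.call_once(); // consumes `counter`
    // `counter.call()` here would not compile: `counter` has been moved.
    (read, bumped, last)
}
```

receiver_demo() returns (0, 1, 1).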

let mut i = 0;
let incr = || {
    i += 1;  // incr borrows a mut reference to i
    println!("Ding! i is now: {}", i);
};
call_twice(incr);

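This is a simple example of an FnMut closure: incr increments i by 1 and prints it.
The code above still fails to compile, because call_twice as declared requires Fn and, just like an FnOnce closure, an FnMut closure does not satisfy that bound. Even so, this is a good point to summarize how the three kinds of closures relate.

Fn is the family of closures and functions that can be called any number of times without restriction; it includes all fn functions.
FnMut is the family of closures that can be called multiple times, provided the closure itself is declared mut.
FnOnce is the family of closures that can be called once, provided the caller owns the closure.

The three are therefore nested:

FnMut is a subset of FnOnce
Fn is a subset of FnMut
fn (ordinary functions) is a subset of Fn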
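The nesting can be checked with three small generic functions, one per bound. This is a sketch for illustration; needs_fn, needs_fn_mut, needs_fn_once, seven, and hierarchy_demo are hypothetical names, not part of the chapter.

```rust
fn needs_fn<F: Fn() -> i32>(f: F) -> i32 {
    f() + f()
}

fn needs_fn_mut<F: FnMut() -> i32>(mut f: F) -> i32 {
    f() + f()
}

fn needs_fn_once<F: FnOnce() -> i32>(f: F) -> i32 {
    f()
}

/// An ordinary function; it can be used wherever an `Fn` is expected.
fn seven() -> i32 {
    7
}

fn hierarchy_demo() -> ((i32, i32, i32), (i32, i32, i32)) {
    let n = 5;
    // Reads `n` through a shared reference, so this closure is an `Fn`
    // (and, holding only a shared reference, it can be passed several times).
    let read_only = || n;

    let from_fn_item = (needs_fn(seven), needs_fn_mut(seven), needs_fn_once(seven));
    let from_closure = (
        needs_fn(read_only),
        needs_fn_mut(read_only),
        needs_fn_once(read_only),
    );
    (from_fn_item, from_closure)
}
```

Both the plain function and the read-only closure are accepted by all three bounds; hierarchy_demo() returns ((14, 14, 7), (10, 10, 5)).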

With this in mind, call_twice can be relaxed to accept any FnMut closure:

fn call_twice<F>(mut closure: F) where F: FnMut() {
    closure();
    closure();
}

With this signature, the incr closure from above can be passed to call_twice.
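Putting the pieces together gives a version that compiles. The wrapper count_with_call_twice is a hypothetical function added here so the result can be inspected; the println! from the original closure is left out to keep the sketch short.

```rust
fn call_twice<F>(mut closure: F)
where
    F: FnMut(),
{
    closure();
    closure();
}

/// Hypothetical wrapper: runs an incrementing closure twice and returns the counter.
fn count_with_call_twice() -> i32 {
    let mut i = 0;
    let incr = || {
        i += 1; // `incr` borrows a mut reference to `i`
    };
    call_twice(incr);
    // The closure was moved into `call_twice` and dropped there,
    // so the mutable borrow of `i` has ended and `i` can be read again.
    i
}
```

count_with_call_twice() returns 2. An Fn closure or a plain fn would be accepted by this call_twice as well, since every Fn is also an FnMut.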

Closures that "steal" values

use std::thread;
fn start_sorting_thread(mut cities: Vec<City>, stat: Statistic)
    -> thread::JoinHandle<Vec<City>>
{
    let key_fn = |city: &City| -> i64 { -city.get_statistic(stat) };
    thread::spawn(|| {
        cities.sort_by_key(key_fn);
        cities
    })
}

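In this example, the function start_sorting_thread spawns a new thread to do the sorting.
As in the earlier example, the closure key_fn holds a reference to stat. The difference is that stat goes out of scope when the function returns, while the sort running in the new thread still needs stat to decide how to order the cities. Rust therefore reports an error at compile time.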

error[E0373]: closure may outlive the current function, but it borrows `stat`,
              which is owned by the current function
  --> closures_sort_thread.rs:33:18
   |
33 | let key_fn = |city: &City| -> i64 { -city.get_statistic(stat) };
   |              ^^^^^^^^^^^^^^^^^^^^                       ^^^^
   |              |                                      `stat` is borrowed here
   |              may outlive borrowed value `stat`

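The problem is not limited to stat: cities would be shared unsafely as well. The root cause is that the new thread is not guaranteed to finish its work before cities and stat go out of scope.
To solve this, Rust lets a closure move variables into itself instead of borrowing references to them.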

fn start_sorting_thread(mut cities: Vec<City>, stat: Statistic)
    -> thread::JoinHandle<Vec<City>>
{
    let key_fn = move |city: &City| -> i64 { -city.get_statistic(stat) };
    thread::spawn(move || {
        cities.sort_by_key(key_fn);
        cities
    })
}

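The only change from the previous code is the move keyword in front of both closures. It tells Rust to move the captured values rather than borrow them: the first closure takes ownership of stat, and the second takes ownership of cities and key_fn.
For Copy types, Rust copies the value instead of moving it, so such values can still be used normally after the closure is created, even when the closure is marked move.
A non-Copy value such as Vec<City> can no longer be accessed once the closure has been created.
These strict rules exist to guarantee thread safety.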
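The difference between moved and copied captures can be seen in a small self-contained sketch. It uses a plain Vec<i64> and an i64 offset instead of City and Statistic; the function move_demo and the sample numbers are assumptions made for this illustration.

```rust
use std::thread;

/// Hypothetical demo: sorts numbers on another thread and adds an offset to each.
fn move_demo() -> (Vec<i64>, i64) {
    let mut numbers = vec![3, 1, 2]; // `Vec` is not `Copy`: it will be moved
    let offset: i64 = 10; // `i64` is `Copy`: it will be copied

    let handle = thread::spawn(move || {
        numbers.sort();
        numbers.iter().map(|n| n + offset).collect::<Vec<i64>>()
    });

    // `offset` was copied into the closure, so it is still usable here.
    // Using `numbers` here would not compile: it was moved into the closure.
    let result = handle.join().expect("sorting thread panicked");
    (result, offset)
}
```

move_demo() returns ([11, 12, 13], 10): the thread owns the vector outright, while the caller keeps its own copy of offset.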

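Closure performance

In other languages such as Python and JavaScript, closures are typically allocated on the heap and later reclaimed by a garbage collector, and these extra steps add run-time overhead. Rust closures avoid these costs. They are not heap-allocated unless you put them in a Box, Vec, or other container, and the compiler knows the concrete type of every closure, so calls can often be inlined. They are designed to be faster than function pointers, fast enough to use in very hot, performance-sensitive code, while staying safe and as compact as possible.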
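A small sketch can make the layout claim tangible. The helper names apply_static, apply_dynamic, and closure_sizes are hypothetical. The sizes reported reflect how current compilers lay out closures in practice and are not a documented guarantee.

```rust
use std::mem::size_of_val;

/// Static dispatch: `F` is a concrete closure type, so the call can be inlined.
fn apply_static<F: Fn(u64) -> u64>(f: F, x: u64) -> u64 {
    f(x)
}

/// Dynamic dispatch through a trait object, shown for comparison;
/// the call goes through a vtable, much like a function pointer.
fn apply_dynamic(f: &dyn Fn(u64) -> u64, x: u64) -> u64 {
    f(x)
}

/// Returns the in-memory sizes of a non-capturing and a capturing closure.
fn closure_sizes() -> (usize, usize) {
    let no_capture = |x: u64| x + 1;
    let n: u64 = 41;
    let with_capture = move |x: u64| x + n;
    // A closure stores only what it captures: nothing in the first case,
    // a single `u64` in the second.
    (size_of_val(&no_capture), size_of_val(&with_capture))
}
```

Both apply_static and apply_dynamic produce the same result for the same closure; the generic version simply gives the compiler more room to optimize.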

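Copy and Clone for closures

A closure can be thought of as a struct whose fields are the captured values or references, with a call method attached. Rust works out whether a closure is Copy and Clone from the types of the values it captures.

A non-move closure that mutates no variables holds only shared references. Shared references are both Clone and Copy, so such a closure is Clone and Copy as well: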

let y = 10;
let add_y = |x| x + y;
let copy_of_add_y = add_y;
assert_eq!(add_y(copy_of_add_y(22)), 42);

A non-move closure that mutates a value holds a mutable reference in its internal representation, so it is neither Copy nor Clone:

let mut x = 0;
let mut add_to_x = |n| { x += n; x };
let copy_of_add_to_x = add_to_x; // move
assert_eq!(add_to_x(copy_of_add_to_x(1)), 2); // error

For move closures:

If every captured value is Copy, the closure is Copy.

If every captured value implements Clone, the closure is Clone.

let mut greeting = String::from("Hello, ");
let greet = move |name| {
  greeting.push_str(name);
  println!("{}", greeting);
};
greet.clone()("Alfred"); // Hello, Alfred
greet.clone()("Bruce"); // Hello, Bruce

When greet is cloned, the greeting it holds is cloned too, so each clone appends to its own copy of the string.
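The same independence can be observed with a closure that owns a Vec. This is a sketch for illustration only; clone_demo, record, and record_copy are hypothetical names.

```rust
/// Hypothetical demo: a move closure that owns a `Vec` and its clone
/// each keep their own, independent state.
fn clone_demo() -> (Vec<i32>, Vec<i32>) {
    let mut log: Vec<i32> = Vec::new();
    let mut record = move |n: i32| {
        log.push(n);
        log.clone() // return a snapshot of this closure's own log
    };

    // `Vec<i32>` implements `Clone`, so the closure does too.
    let mut record_copy = record.clone();

    let first = record(1);
    let second = record_copy(2);
    (first, second)
}
```

clone_demo() returns ([1], [2]): the push made through record is not visible to record_copy, because cloning the closure cloned the captured vector.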