Pattern Matching in Python

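In June 2020, PEP 622 proposed a new syntax for Python: Structural Pattern Matching. It is a very large language feature. Because the PEP covered too much ground, it was later split into three PEPs: [PEP 634](https://www.python.org/dev/peps/pep-0634/), [PEP 635](https://www.python.org/dev/peps/pep-0635/), and [PEP 636](https://www.python.org/dev/peps/pep-0636/). The suggested reading order is PEP 636 -> PEP 634 -> PEP 635.
PEP 636: a tutorial on applying pattern matching, in which the reader follows along and writes an interactive text adventure game.
PEP 634: the syntax and semantics of pattern matching.
PEP 635: the rationale behind the rules, explaining why the syntax was designed this way.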

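The rest of this post gives a rough tour of PEP 636 to show what pattern matching is and where it is useful.

Put simply, pattern matching does two things: destructuring assignment and control flow.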

Matching multiple patterns

command = input("What are you doing next? ")  # like "go north, quit"
match command.split():
    case [action]:
        ... # interpret single-verb action
    case [action, obj]:
        ... # interpret action, obj

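A pattern match does the following two things:

It checks whether the subject has a particular structure. In the code above, if command.split() returns a list of length 1, the first pattern matches; if it returns a list of length 2, the second pattern matches.
It binds the matched elements to variables. In this example, if the list returned by command.split() has one element, the match binds action = list[0]; if it has two elements, the second pattern matches and binds action = list[0] and obj = list[1].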

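Patterns are tried from top to bottom. On the first pattern that matches, the statements in that case block run with the variables bound by the pattern. If nothing matches, nothing happens (although in practice there is a significant pitfall here). This looks a lot like the unpacking (destructuring) assignment we use in everyday code, and most of the unpacking syntax also works in patterns, but the two are not fully compatible.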

Matching constants

match command.split(): 
    case ["quit"]:
        print("Goodbye!")
        quit_game()
    case ["look"]:
        current_room.describe()
    case ["get", obj]:
        character.get(obj, current_room)
    case ["go", direction]:
        current_room = current_room.neighbor(direction)

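Constant matching resembles switch/case in other languages: an element matches only if comparing it to the constant with == yields True.

The third case matches a list of length 2 whose first element equals "get", and binds obj = list[1].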

Matching multiple values

match command.split():
    case ["drop", *objects]:
        for obj in objects:
            character.drop(obj, current_room)

As in unpacking assignment, * can be used to capture multiple elements of a sequence.

Wildcard

match command.split():
    case ["quit"]: ... # Code omitted for brevity
    case ["go", _,]: ...
    case ["drop", *objects]: ...
    ... # Other cases
    case _:
        print(f"Sorry, I couldn't understand {command!r}")

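A special pattern: the wildcard _ matches anything but binds no variable. Inside a sequence pattern it can stand for any single element.

When _ is used on its own as the whole pattern, it must be the last case (as in the snippet above); otherwise a SyntaxError is raised.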

Nested patterns

match command.split():
    case ["first", (left, right), _, *rest]:
        print(left, right, rest)

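Variable bindings: left = list[1][0], right = list[1][1], and rest = list[3:]. Note that sequence patterns never match str values, so the inner (left, right) pattern requires the second element to be a real sequence such as a tuple or list; the words produced by command.split() would not match it.

Support for nested patterns means we can process deeply layered structures recursively.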

Or patterns

match command.split():
    ... # Other cases
    case ["north"] | ["go", "north"]:
        current_room = current_room.neighbor("north")
    case ["get", obj] | ["pick", "up", obj] | ["pick", obj, "up"]:
        ... # Code for picking up the given object

Use | to combine alternatives into an or pattern.

Sub-patterns

match command.split():
    case ["go", ("north" | "south" | "east" | "west")]:
        current_room = current_room.neighbor(...)
        # how do I know which direction to go?

Adding conditions to patterns

match command.split():
    case ["go", direction] if direction in current_room.exits:
        current_room = current_room.neighbor(direction)
    case ["go", _]:
        print("Sorry, you can't go that way")

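The condition (guard) is not part of the pattern; it is part of the case. The pattern is matched first; if it succeeds, the guard is evaluated; and only if the guard is true do the statements in the case block run.

If the guard fails, matching moves on to the next case, but there is a side effect: the variables captured by the pattern that just succeeded stay bound. In the example above, if command.split() returns ["go", "down"] and "down" is not in current_room.exits, the second case is tried, yet direction is already bound to "down".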

Matching objects

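from dataclasses import dataclass

@dataclass
class Click:
    position: tuple
    button: Button  # Button is assumed to be defined elsewhere, e.g. as an Enum

# Keyword form: name the attribute to match.
match event.get():  # e.g. Click(position=(1, 2), button=3)
    case Click(position=(x, y)):
        handle_click_at(x, y)

# Positional form: relies on Click.__match_args__.
match event.get():
    case Click((x, y)):
        handle_click_at(x, y)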

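Both forms match the (x, y) pattern against the position attribute.

The first is the standard form: a keyword argument names the attribute to match.

The second form is more special: positional sub-patterns are only allowed when the class defines __match_args__, which maps positions to attribute names.

The underlying reason it works here is that the built-in dataclass decorator generates __match_args__ in field-declaration order. An ordinary class that wants to support positional patterns can set __match_args__ by hand, as follows: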

class Click:
    __match_args__ = ("position", "button")  # must be a tuple of attribute names
    def __init__(self, position, button):
        ...

Matching enum members

match event.get():
    case Click((x, y), button=Button.LEFT):  # This is a left click
        handle_click_at(x, y)
    case Click():
        pass  # ignore other clicks

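The key point is that a constant must be written as a dotted name (Class.attribute). A bare name would be treated as a capture pattern and simply bound to the subject.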

Matching dictionaries

for action in actions:
    match action:
        case {"text": message, "color": c}:
            ui.set_text_color(c)
            ui.display(message)
        case {"sleep": duration}:
            ui.wait(duration)
        case {"sound": url, "format": "ogg"}:
            ui.play(url)
        case {"sound": _, "format": _}:
            warning("Unsupported audio format")

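Keys in a mapping pattern must be literals or dotted names (strings are the common case); values can be any pattern. The whole pattern matches only when every listed key is present and its sub-pattern matches. **rest can be used to capture all remaining keys.

The subject only needs to contain the keys named in the pattern; it does not have to match key for key, and extra keys are ignored. For example, {"text": "111", "a": "b", "color": "red"} is matched by the first pattern.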

Matching built-in classes

for action in actions:
    match action:
        case {"text": str(message), "color": str(c)}:
            ui.set_text_color(c)
            ui.display(message)
        case {"sleep": float(duration)}:
            ui.wait(duration)
        case {"sound": str(url), "format": "ogg"}:
            ui.play(url)
        case {"sound": _, "format": _}:
            warning("Unsupported audio format")

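Any class is a valid match target, including built-in classes. There is some syntactic sugar here. Taking the first pattern as an example, case {"text": str(message), "color": str(c)} is equivalent to case {"text": str() as message, "color": str() as c}; the first form is clearly more readable.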

[PEP 636]: https://www.python.org/dev/peps/pep-0636/
[Unclear partial binding semantics]: https://github.com/gvanrossum/patma/issues/110 "Binding semantics when a match fails"
[Revisit load vs. store]: https://github.com/gvanrossum/patma/issues/90 "How to distinguish constants from variables"
[tracker]: https://github.com/gvanrossum/patma "Design discussion and progress tracking"
[cpython]: https://github.com/brandtbucher/cpython/tree/patma "CPython development branch for pattern matching"
[jupyter]: https://mybinder.org/v2/gh/gvanrossum/patma/master?urlpath=lab/tree/playground-622.ipynb "Try the pattern matching syntax online"