11. Type conversions


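There are three fundamentally different ways to handle a type that crosses the Python/C++ boundary:

  1. Use a native C++ type on both sides. In this case the type must be wrapped with pybind11-generated bindings before Python can use it.
  2. Use a native Python type on both sides. It likewise has to be wrapped before C++ functions can use it.
  3. Use a native C++ type on the C++ side and a native Python type on the Python side. pybind11 refers to this as type conversion.

In a sense, type conversion is the most natural option, because each side works with its own native type. Its main drawback is that the data must be copied on every transition between Python and C++, because the two languages lay out the same type differently in memory.

pybind11 can perform many kinds of conversions automatically. A table of all built-in conversions is given later in this section.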


11.1 Overview

1. Native type in C++, wrapper in Python


2. Wrapper in C++, native type in Python


    void print_list(py::list my_list) {
        for (auto item : my_list)
            std::cout << item << " ";
    }

    >>> print_list([1, 2, 3])
    1 2 3

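The Python list is not converted in any way; it is merely wrapped in the C++ py::list class. At its core it is still a Python object. Copying a py::list increments the reference count, just as in Python. Returning the object to Python simply removes the thin wrapper.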

3. Converting between native C++ and Python types


    void print_vector(const std::vector<int> &v) {
        for (auto item : v)
            std::cout << item << "\n";
    }

    >>> print_vector([1, 2, 3])
    1
    2
    3

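In this example, pybind11 constructs a new std::vector<int> and copies each element from the Python list into it. The new vector is then passed to print_vector. The same thing happens in the other direction: a new list is created and filled with the element values of the C++ vector.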
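The practical consequence of converting by copy can be sketched in pure Python. The decorator below is a hypothetical stand-in for a converting binding layer (it is not part of pybind11): it builds a fresh container from the argument, much as a new std::vector<int> would be built, and hands only that copy to the wrapped function.

```python
from typing import Callable, List


def with_copy_conversion(func: Callable[[List[int]], None]) -> Callable[[List[int]], None]:
    """Hypothetical model of a binding layer that converts its argument by copying."""

    def wrapper(py_list: List[int]) -> None:
        # Build a brand-new container, element by element.
        converted = [int(item) for item in py_list]
        # The wrapped function only ever sees the copy.
        func(converted)

    return wrapper


@with_copy_conversion
def append_1(v: List[int]) -> None:
    v.append(1)  # mutates the copy, not the caller's list


data = [5, 6]
append_1(data)
print(data)  # the caller's list is unchanged: [5, 6]
```

Under this model, any in-place change made by the callee is lost when the call returns, which is the behaviour to expect from copy-based conversion.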

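As the table below shows, many conversions work out of the box. They are very convenient, but keep in mind that they are fundamentally based on copying data. This is perfectly fine for small immutable types but can become quite expensive for large data structures. The cost can be avoided by overriding the automatic conversion with a custom wrapper (approach 1 above). This requires some manual effort; more details are available in the Making opaque types section.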



Data type                                    Description                Header file
int8_t, uint8_t                              8-bit integers             pybind11/pybind11.h
int16_t, uint16_t                            16-bit integers            pybind11/pybind11.h
int32_t, uint32_t                            32-bit integers            pybind11/pybind11.h
int64_t, uint64_t                            64-bit integers            pybind11/pybind11.h
ssize_t, size_t                              Platform-dependent size    pybind11/pybind11.h
float, double                                Floating point types       pybind11/pybind11.h
bool                                         Two-state Boolean type     pybind11/pybind11.h
char                                         Character literal          pybind11/pybind11.h
char16_t                                     UTF-16 character literal   pybind11/pybind11.h
char32_t                                     UTF-32 character literal   pybind11/pybind11.h
wchar_t                                      Wide character literal     pybind11/pybind11.h
const char *                                 UTF-8 string literal       pybind11/pybind11.h
const char16_t *                             UTF-16 string literal      pybind11/pybind11.h
const char32_t *                             UTF-32 string literal      pybind11/pybind11.h
const wchar_t *                              Wide string literal        pybind11/pybind11.h
std::string                                  STL dynamic UTF-8 string   pybind11/pybind11.h
std::u16string                               STL dynamic UTF-16 string  pybind11/pybind11.h
std::u32string                               STL dynamic UTF-32 string  pybind11/pybind11.h
std::wstring                                 STL dynamic wide string    pybind11/pybind11.h
std::string_view, std::u16string_view, etc.  STL C++17 string views     pybind11/pybind11.h
std::pair<T1, T2>                            Pair of two custom types   pybind11/pybind11.h
std::tuple<...>                              Arbitrary tuple of types   pybind11/pybind11.h
std::reference_wrapper<...>                  Reference type wrapper     pybind11/pybind11.h
std::complex<T>                              Complex numbers            pybind11/complex.h
std::array<T, Size>                          STL static array           pybind11/stl.h
std::vector<T>                               STL dynamic array          pybind11/stl.h
std::deque<T>                                STL double-ended queue     pybind11/stl.h
std::valarray<T>                             STL value array            pybind11/stl.h
std::list<T>                                 STL linked list            pybind11/stl.h
std::map<T1, T2>                             STL ordered map            pybind11/stl.h
std::unordered_map<T1, T2>                   STL unordered map          pybind11/stl.h
std::set<T>                                  STL ordered set            pybind11/stl.h
std::unordered_set<T>                        STL unordered set          pybind11/stl.h
std::optional<T>                             STL optional type (C++17)  pybind11/stl.h
std::experimental::optional<T>               STL optional type (exp.)   pybind11/stl.h
std::variant<...>                            Type-safe union (C++17)    pybind11/stl.h
std::filesystem::path                        STL path (C++17)           pybind11/stl.h
std::function<...>                           STL polymorphic function   pybind11/functional.h
std::chrono::duration<...>                   STL time duration          pybind11/chrono.h
std::chrono::time_point<...>                 STL date/time              pybind11/chrono.h
Eigen::Matrix<...>                           Eigen: dense matrix        pybind11/eigen.h
Eigen::Map<...>                              Eigen: mapped memory       pybind11/eigen.h
Eigen::SparseMatrix<...>                     Eigen: sparse matrix       pybind11/eigen.h

11.2 Strings, bytes and Unicode conversions

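Note: This section discusses string handling in terms of Python 3 strings. For Python 2.7, read unicode wherever str appears and str wherever bytes appears. Python 2.7 users may find it best to use from __future__ import unicode_literals to avoid unintentionally using str instead of unicode.

11.2.1 Passing Python strings to C++

When a Python str is passed from Python to a C++ function that accepts std::string or char * as arguments, pybind11 encodes the Python string to UTF-8. All Python str can be encoded in UTF-8, so this operation does not fail.

The C++ language is encoding agnostic. It is the programmer's responsibility to track encodings, and the simplest approach is to use UTF-8 everywhere.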

    m.def("utf8_test",
        [](const std::string &s) {
            std::cout << "utf-8 is icing on the cake.\n";
            std::cout << s;
        }
    );
    m.def("utf8_charptr",
        [](const char *s) {
            std::cout << "My favorite food is\n";
            std::cout << s;
        }
    );

    >>> utf8_test("🎂")
    utf-8 is icing on the cake.
    🎂
    >>> utf8_charptr("🍕")
    My favorite food is
    🍕

Note: Some terminal emulators do not support UTF-8 or emoji fonts and may not display the example above correctly.



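A Python bytes object is passed without conversion to C++ functions that accept std::string or char *. On Python 3, to make a function accept only bytes (and not str), declare the parameter as py::bytes.

11.2.2 Returning C++ strings to Python

When a C++ function returns a std::string or char* to a Python caller, pybind11 decodes the string as UTF-8 into a native Python str, similar to bytes.decode('utf-8') in Python. If this implicit conversion fails, pybind11 raises a UnicodeDecodeError.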

    m.def("std_string_return",
        []() {
            return std::string("This string needs to be UTF-8 encoded");
        }
    );

    >>> isinstance(example.std_string_return(), str)
    True


Warning: The implicit conversion assumes that a char * string is null-terminated. If it is not, a buffer overrun will occur.



    // This uses the Python C API to convert Latin-1 to Unicode
    m.def("str_output",
        []() {
            std::string s = "Send your r\xe9sum\xe9 to Alice in HR"; // Latin-1
            py::str py_s = PyUnicode_DecodeLatin1(s.data(), s.length());
            return py_s;
        }
    );

    >>> str_output()
    'Send your résumé to Alice in HR'

The Python C API provides several built-in codecs that can be used for this. A third-party library such as libiconv can also be used to transcode to UTF-8.
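The decoding rules in this section can be reproduced on the Python side alone. The sketch below uses a hypothetical helper, decode_returned_string, to mimic the implicit UTF-8 decoding of a returned string; it shows that Latin-1 bytes fail that step and need an explicit codec instead.

```python
def decode_returned_string(raw: bytes) -> str:
    """Hypothetical helper: decode the bytes of a returned C++ string as UTF-8."""
    return raw.decode("utf-8")


# Valid UTF-8 decodes cleanly.
utf8_payload = "r\u00e9sum\u00e9".encode("utf-8")
decoded_ok = decode_returned_string(utf8_payload)

# The same word encoded as Latin-1 is not valid UTF-8.
latin1_payload = b"r\xe9sum\xe9"
try:
    decode_returned_string(latin1_payload)
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Decoding explicitly with the right codec recovers the text.
latin1_text = latin1_payload.decode("latin-1")
```

Here decode_failed ends up True, while latin1_text equals decoded_ok, which is why a string in any encoding other than UTF-8 has to be converted explicitly before it is returned as a str.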


If the data in a C++ std::string does not represent text, it should be passed to Python as bytes. In that case, return a py::bytes object.

    m.def("return_bytes",
        []() {
            std::string s("\xba\xd0\xba\xd0");  // Not valid UTF-8
            return py::bytes(s);  // Return the data without transcoding
        }
    );

    >>> example.return_bytes()
    b'\xba\xd0\xba\xd0'


    m.def("asymmetry",
        [](std::string s) {  // Accepts str or bytes from Python
            return s;  // Looks harmless, but implicitly converts to str
        }
    );

    >>> isinstance(example.asymmetry(b"have some bytes"), str)
    True
    >>> example.asymmetry(b"\xba\xd0\xba\xd0")  # invalid utf-8 as bytes
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xba in position 0: invalid start byte

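11.2.3 Wide character strings

When a Python str is passed to a C++ function that takes std::wstring, wchar_t*, std::u16string or std::u32string, the str is encoded as UTF-16 or UTF-32, depending on how the C++ compiler implements each type. When a C++ function returns a string of one of these types to Python as a str, the string must be valid UTF-16 or UTF-32.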

    #define UNICODE
    #include <windows.h>

    m.def("set_window_text",
        [](HWND hwnd, std::wstring s) {
            // Call SetWindowText with null-terminated UTF-16 string
            ::SetWindowText(hwnd, s.c_str());
        }
    );
    m.def("get_window_text",
        [](HWND hwnd) {
            const int buffer_size = ::GetWindowTextLength(hwnd) + 1;
            auto buffer = std::make_unique< wchar_t[] >(buffer_size);
            // unique_ptr<T[]> has no data() member; get() yields the raw pointer
            ::GetWindowText(hwnd, buffer.get(), buffer_size);
            std::wstring text(buffer.get());
            // wstring will be converted to Python str
            return text;
        }
    );

Warning: Wide character strings may not work as described on Python 2.7 or Python 3.3 builds compiled with --enable-unicode=ucs2.
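To get a feel for what the different encodings mean for the receiving C++ string, the following sketch encodes one short string three ways using only Python's built-in codecs. Little-endian variants are chosen here purely for illustration (so that no byte-order mark is added); the byte order actually used for a wide string depends on the platform.

```python
text = "a\U0001F382"  # the letter 'a' followed by the birthday-cake emoji

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")
utf32_bytes = text.encode("utf-32-le")

# Number of code units each C++ character type would have to hold.
utf8_units = len(utf8_bytes)         # 1-byte units
utf16_units = len(utf16_bytes) // 2  # 2-byte units
utf32_units = len(utf32_bytes) // 4  # 4-byte units

print(len(text), utf8_units, utf16_units, utf32_units)  # 2 5 3 2

# Decoding with the matching codec restores the original text.
round_trip = utf16_bytes.decode("utf-16-le")
```

The emoji lies outside the Basic Multilingual Plane, so it occupies four UTF-8 bytes and a surrogate pair (two units) in UTF-16, but a single unit in UTF-32.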


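11.2.4 Character literals

When a Python str is passed to a C++ function that takes a character type (char, wchar_t), the C++ function receives the first character of the str. If the string is longer than one Unicode character, the trailing characters are ignored.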


    m.def("pass_char", [](char c) { return c; });
    m.def("pass_wchar", [](wchar_t w) { return w; });

    >>> example.pass_char("A")
    'A'

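Although C++ can convert an integer to a character type (char c = 0x65;), pybind11 does not implicitly convert Python integers to character types. Use Python's chr() function to convert an integer to a character.

    >>> example.pass_char(0x65)
    TypeError
    >>> example.pass_char(chr(0x65))
    'e'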


11.2.5 Grapheme clusters

A single grapheme may be represented by two or more Unicode characters. For example ‘é’ is usually represented as U+00E9 but can also be expressed as the combining character sequence U+0065 U+0301 (that is, the letter ‘e’ followed by a combining acute accent). The combining character will be lost if the two-character sequence is passed as an argument, even though it renders as a single grapheme.

    >>> example.pass_wchar("é")
    'é'
    >>> combining_e_acute = "e" + "\u0301"
    >>> combining_e_acute
    'é'
    >>> combining_e_acute == "é"
    False
    >>> example.pass_wchar(combining_e_acute)
    'e'

Normalizing combining characters before passing the character literal to C++ may resolve some of these issues:

    >>> import unicodedata
    >>> example.pass_wchar(unicodedata.normalize("NFC", combining_e_acute))
    'é'
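The effect of normalization can be checked without any C++ code. The sketch below uses only the standard unicodedata module and compares the combining sequence with its NFC form; taking index 0 stands in for "the first character", which is what a function taking a single character would receive.

```python
import unicodedata

combining_e_acute = "e" + "\u0301"  # 'e' followed by a combining acute accent
composed = unicodedata.normalize("NFC", combining_e_acute)

# Two code points before normalization, one afterwards.
length_before = len(combining_e_acute)
length_after = len(composed)

# The first code point alone: the accent is lost without normalization.
first_before = combining_e_acute[0]
first_after = composed[0]

print(length_before, length_after)                       # 2 1
print(first_before == "e", first_after == "\u00e9")      # True True
```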

In some languages (Thai for example), there are graphemes that cannot be expressed as a single Unicode code point, so there is no way to capture them in a C++ character type.

11.2.6 C++17 string views

C++17 string views are automatically supported when compiling in C++17 mode. They follow the same rules for encoding and decoding as the corresponding STL string type (for example, a std::u16string_view argument will be passed UTF-16-encoded data, and a returned std::string_view will be decoded as UTF-8).

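11.3 STL containers

11.3.1 Automatic conversion

When the header pybind11/stl.h is included, conversions between std::vector<>/std::deque<>/std::list<>/std::array<>/std::valarray<>, std::set<>/std::unordered_set<>, and std::map<>/std::unordered_map<> and the Python list, set and dict types are enabled automatically. Conversions for std::pair<> and std::tuple<> are already supported by pybind11/pybind11.h.


Note: These types can be nested arbitrarily.

11.3.2 C++17 library containers

pybind11/stl.h supports std::optional<> and std::variant<> from C++17, as well as the C++14 std::experimental::optional<>.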


    // `boost::optional` as an example -- can be any `std::optional`-like container
    namespace pybind11 { namespace detail {
        template <typename T>
        struct type_caster<boost::optional<T>> : optional_caster<boost::optional<T>> {};
    }}

The above should be placed in a header file and included wherever it is needed. Similarly, a specialization can be provided for custom variant types:

    // `boost::variant` as an example -- can be any `std::variant`-like container
    namespace pybind11 { namespace detail {
        template <typename... Ts>
        struct type_caster<boost::variant<Ts...>> : variant_caster<boost::variant<Ts...>> {};

        // Specifies the function used to visit the variant -- `apply_visitor` instead of `visit`
        template <>
        struct visit_helper<boost::variant> {
            template <typename... Args>
            static auto call(Args &&...args) -> decltype(boost::apply_visitor(args...)) {
                return boost::apply_visitor(args...);
            }
        };
    }} // namespace pybind11::detail

The visit_helper specialization is not required if your name::variant provides a name::visit() function. For any other function name, the specialization must be included to tell pybind11 how to visit the variant.

Warning: When converting a variant type, pybind11 follows the same rules as when determining which function overload to call (Overload resolution order), and so the same caveats hold. In particular, the order in which the variant’s alternatives are listed is important, since pybind11 will try conversions in this order. This means that, for example, when converting variant<int, bool>, the bool variant will never be selected, as any Python bool is already an int and is convertible to a C++ int. Changing the order of alternatives (and using variant<bool, int>, in this example) provides a solution.
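The ordering caveat can be illustrated with a deliberately simplified model in pure Python. The function convert_first_match is hypothetical and uses isinstance as a stand-in for "this conversion succeeds"; pybind11's real conversion rules are more involved, but the first-match behaviour is the same idea.

```python
from typing import Any, Tuple, Type


def convert_first_match(value: Any, alternatives: Tuple[Type, ...]) -> str:
    """Hypothetical model: return the name of the first alternative that accepts value."""
    for alternative in alternatives:
        if isinstance(value, alternative):
            return alternative.__name__
    raise TypeError("no alternative accepts this value")


# bool is a subclass of int in Python, so an int alternative listed first wins.
int_first = convert_first_match(True, (int, bool))
bool_first = convert_first_match(True, (bool, int))

print(int_first)   # int
print(bool_first)  # bool
```

Listing the more specific alternative first gives it a chance to match before the more general one does.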

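11.3.3 Making opaque types

pybind11 relies heavily on a template matching mechanism to convert parameters and return values built from STL types such as vectors, linked lists and hash tables. This even works recursively, for instance for lists of hash maps of pairs of elementary and custom types.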



    void append_1(std::vector<int> &v) {
        v.push_back(1);
    }


    >>> v = [5, 6]
    >>> append_1(v)
    >>> print(v)
    [5, 6]


    /* ... definition ... */
    class MyClass {
    public:  // def_readwrite needs access to the member
        std::vector<int> contents;
    };

    /* ... binding code ... */
    py::class_<MyClass>(m, "MyClass")
        .def(py::init<>())
        .def_readwrite("contents", &MyClass::contents);


    >>> m = MyClass()
    >>> m.contents = [5, 6]
    >>> print(m.contents)
    [5, 6]
    >>> m.contents.append(7)
    >>> print(m.contents)
    [5, 6]


    PYBIND11_MAKE_OPAQUE(std::vector<int>);


    py::class_<std::vector<int>>(m, "IntVector")
        .def(py::init<>())
        .def("clear", &std::vector<int>::clear)
        .def("pop_back", &std::vector<int>::pop_back)
        .def("__len__", [](const std::vector<int> &v) { return v.size(); })
        .def("__iter__", [](std::vector<int> &v) {
            return py::make_iterator(v.begin(), v.end());
        }, py::keep_alive<0, 1>()) /* Keep vector alive while iterator is used */
        // ....

11.3.4 Binding STL containers


    // Don't forget this
    #include <pybind11/stl_bind.h>

    PYBIND11_MAKE_OPAQUE(std::vector<int>);
    PYBIND11_MAKE_OPAQUE(std::map<std::string, double>);

    // ...

    // later in binding code:
    py::bind_vector<std::vector<int>>(m, "VectorInt");
    py::bind_map<std::map<std::string, double>>(m, "MapStringDouble");

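When binding STL containers, pybind11 looks at the types of the container elements to decide whether the container should be confined to the local module (see the Module-local class bindings feature). If the element types are anything other than already-bound custom types that were bound without py::module_local(), the container binding has py::module_local() applied. This covers numeric types, strings, Eigen types, and any types that have not yet been bound at the time the STL container is bound. The intent of module-local bindings is to avoid potential conflicts between modules (for example, two separate modules each trying to bind std::vector<int>).


    py::bind_vector<std::vector<int>>(m, "VectorInt", py::module_local(false));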


11.4 Functional




The following function takes as its argument an arbitrary callable (stateful or stateless) with the signature int -> int:

    int func_arg(const std::function<int(int)> &f) {
        return f(10);
    }


    std::function<int(int)> func_ret(const std::function<int(int)> &f) {
        return [f](int i) {
            return f(i) + 1;
        };
    }


    py::cpp_function func_cpp() {
        return py::cpp_function([](int i) { return i+1; },
           py::arg("number"));
    }


    #include <pybind11/functional.h>

    PYBIND11_MODULE(example, m) {
        m.def("func_arg", &func_arg);
        m.def("func_ret", &func_ret);
        m.def("func_cpp", &func_cpp);
    }


    $ python
    >>> import example
    >>> def square(i):
    ...     return i * i
    ...
    >>> example.func_arg(square)
    100
    >>> square_plus_1 = example.func_ret(square)
    >>> square_plus_1(4)
    17
    >>> plus_1 = example.func_cpp()
    >>> plus_1(number=43)
    44
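For readers more at home in Python, the three C++ functions in this section behave roughly like the closures below. This is only a pure-Python analogue written for illustration; the names mirror the bound functions but are defined here from scratch and involve no pybind11 code.

```python
from typing import Callable

IntFunc = Callable[[int], int]


def func_arg(f: IntFunc) -> int:
    """Call the given function with a fixed argument, like the C++ func_arg."""
    return f(10)


def func_ret(f: IntFunc) -> IntFunc:
    """Return a new function that captures f, like the C++ lambda [f](int i)."""
    def wrapped(i: int) -> int:
        return f(i) + 1
    return wrapped


def func_cpp() -> IntFunc:
    """Return a function with a named parameter, like py::arg("number")."""
    def plus_one(number: int) -> int:
        return number + 1
    return plus_one


def square(i: int) -> int:
    return i * i


result_arg = func_arg(square)     # 100
square_plus_1 = func_ret(square)
result_ret = square_plus_1(4)     # 17
plus_1 = func_cpp()
result_cpp = plus_1(number=43)    # 44
```

The function returned by func_ret keeps a reference to f for as long as it lives, which corresponds to the by-value capture in the C++ version.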



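There is one exception: when a stateless function is passed as an argument to another C++ function exposed in Python, no extra overhead is incurred. pybind11 extracts the underlying C++ function pointer from the wrapped function, sidestepping a potential C++ -> Python -> C++ round trip.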

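11.5 Chrono

Including pybind11/chrono.h enables automatic conversion between C++11 chrono types and Python datetime objects. It also enables conversion of Python floats (such as those obtained from time.monotonic(), time.perf_counter() and time.process_time()) into durations.

11.5.1 An overview of clocks in C++11


The first clock defined by the standard is std::chrono::system_clock. It measures the current date and time. However, it changes whenever the operating system's clock changes; for example, it is adjusted when the system time is synchronized with a time server. That makes it a poor choice for timing purposes, but it is useful for measuring wall time.


Another clock defined by the standard is std::chrono::high_resolution_clock. It is the clock with the highest resolution available on the system. It is normally an alias for either the system clock or the steady clock, but it can also be a separate clock of its own. Note that the value you obtain in Python after converting a time point from this clock may differ depending on the implementation: if it is an alias for the system clock, Python receives a datetime object; otherwise it receives a timedelta object.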

11.5.2 Provided conversions

C++ to Python:

  • std::chrono::system_clock::time_point -> datetime.datetime
  • std::chrono::duration -> datetime.timedelta
  • std::chrono::[other_clocks]::time_point -> datetime.timedelta

Python to C++:

  • datetime.datetime or datetime.date or datetime.time -> std::chrono::system_clock::time_point
  • datetime.timedelta -> std::chrono::duration
  • datetime.timedelta -> std::chrono::[other_clocks]::time_point
  • float -> std::chrono::duration
  • float -> std::chrono::[other_clocks]::time_point
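The Python-side objects involved in these conversions can be explored with the standard datetime module alone. The sketch below is only an illustration of how a float number of seconds relates to a timedelta and how a timedelta relates two datetime values; it says nothing about how pybind11 rounds values finer than a microsecond.

```python
from datetime import datetime, timedelta

# A float number of seconds, e.g. the difference of two time.perf_counter() readings.
elapsed_seconds = 1.5
as_duration = timedelta(seconds=elapsed_seconds)

# timedelta stores whole days, seconds and microseconds.
print(as_duration.seconds, as_duration.microseconds)  # 1 500000

# A point in time plus a duration is another point in time.
epoch = datetime(1970, 1, 1)
one_day_later = epoch + timedelta(days=1)

# The difference of two points in time is a duration again.
offset = one_day_later - epoch
```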

11.6 Eigen


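11.7 Custom type casters

In very rare cases, an application may need custom type casters that pybind11 does not provide, which requires using the raw Python C API. This is fairly advanced usage and should only be attempted by experts who are familiar with the intricacies of Python reference counting.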

The following snippets demonstrate how this works for a very simple inty type that should be convertible from Python types that provide an __int__(self) method.

    struct inty { long long_value; };

    void print(inty s) {
        std::cout << s.long_value << std::endl;
    }

The following Python snippet demonstrates the intended usage from the Python side:

    class A:
        def __int__(self):
            return 123

    from example import print
    print(A())

To register the necessary conversion routines, it is necessary to add an instantiation of the pybind11::detail::type_caster<T> template. Although this is an implementation detail, adding an instantiation of this type is explicitly allowed.

    namespace pybind11 { namespace detail {
        template <> struct type_caster<inty> {
        public:
            /**
             * This macro establishes the name 'inty' in
             * function signatures and declares a local variable
             * 'value' of type inty
             */
            PYBIND11_TYPE_CASTER(inty, _("inty"));

            /**
             * Conversion part 1 (Python->C++): convert a PyObject into an inty
             * instance or return false upon failure. The second argument
             * indicates whether implicit conversions should be applied.
             */
            bool load(handle src, bool) {
                /* Extract PyObject from handle */
                PyObject *source = src.ptr();
                /* Try converting into a Python integer value */
                PyObject *tmp = PyNumber_Long(source);
                if (!tmp) {
                    PyErr_Clear();
                    return false;
                }
                /* Now try to convert into a C++ long */
                value.long_value = PyLong_AsLong(tmp);
                Py_DECREF(tmp);
                /* PyLong_AsLong signals failure (e.g. out of range) by returning
                   -1 with an error set; a plain -1 without an error is valid */
                if (value.long_value == -1 && PyErr_Occurred()) {
                    PyErr_Clear();
                    return false;
                }
                return true;
            }

            /**
             * Conversion part 2 (C++ -> Python): convert an inty instance into
             * a Python object. The second and third arguments are used to
             * indicate the return value policy and parent object (for
             * ``return_value_policy::reference_internal``) and are generally
             * ignored by implicit casters.
             */
            static handle cast(inty src, return_value_policy /* policy */, handle /* parent */) {
                return PyLong_FromLong(src.long_value);
            }
        };
    }} // namespace pybind11::detail
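The load() step above relies on PyNumber_Long, which is the C API counterpart of calling int(obj) in Python. The sketch below models that step in pure Python with a hypothetical helper, load_inty, which returns None where the caster would return false; it is an analogy for the conversion logic only, not a replacement for the caster.

```python
from typing import Any, Optional


class A:
    def __int__(self) -> int:
        return 123


class NoInt:
    """A class without __int__, so int() cannot convert it."""


def load_inty(obj: Any) -> Optional[int]:
    """Hypothetical model of load(): the converted value, or None on failure."""
    try:
        return int(obj)
    except (TypeError, ValueError):
        return None


from_a = load_inty(A())          # 123, via A.__int__
from_noint = load_inty(NoInt())  # None: no conversion available
from_float = load_inty(3.7)      # 3: floats are truncated by int()
```

One thing this model makes visible is that anything int() accepts is accepted by the conversion, which is broader than "objects that define __int__" in the strict sense.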

Note: A type_caster<T> defined with PYBIND11_TYPE_CASTER(T, ...) requires that T is default-constructible (value is first default constructed and then load() assigns to it).

Warning: When using custom type casters, it’s important to declare them consistently in every compilation unit of the Python extension module. Otherwise, undefined behavior can ensue.