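12. Python C++ Interface

pybind11 exposes Python types and functions through thin C++ wrappers, which makes it convenient to call Python code from C++ without resorting to the Python C API.

12.1 Python Types

12.1.1 Available Wrappers

All major Python types are available as thin C++ wrapper classes, which can also be used as function parameters. These include: handle, object, bool_, int_, float_, str, bytes, tuple, list, dict, slice, none, capsule, iterable, iterator, function, buffer, array, and array_t.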

Warning: Be sure to review the Gotchas before using this heavily in your C++ API.

12.1.2 Instantiating Compound Python Types from C++

A dictionary can be initialized through the py::dict constructor:

    using namespace pybind11::literals; // to bring in the `_a` literal
    py::dict d("spam"_a=py::none(), "eggs"_a=42);

A tuple of Python objects can be created with py::make_tuple():

    py::tuple tup = py::make_tuple(42, py::none(), "spam");

A simple namespace can be instantiated like this:

    using namespace pybind11::literals; // to bring in the `_a` literal
    py::object SimpleNamespace = py::module_::import("types").attr("SimpleNamespace");
    py::object ns = SimpleNamespace("spam"_a=py::none(), "eggs"_a=42);

Attributes on a namespace can be modified with py::delattr(), py::getattr(), and py::setattr(). Simple namespaces can serve as lightweight stand-ins for class instances.
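For comparison, the same namespace manipulation can be written directly in Python. This is only a sketch of the Python-side behaviour that the C++ helpers mirror; the attribute values are arbitrary.

```python
from types import SimpleNamespace

# Mirror of the C++ call SimpleNamespace("spam"_a=py::none(), "eggs"_a=42)
ns = SimpleNamespace(spam=None, eggs=42)

# Python-side counterparts of py::getattr(), py::setattr(), and py::delattr()
eggs = getattr(ns, "eggs")      # read an attribute
setattr(ns, "spam", "ham")      # overwrite an attribute
delattr(ns, "eggs")             # remove an attribute

remaining = vars(ns)            # attributes still present on the namespace
print(remaining)
```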

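12.1.3 Casting Back and Forth

py::cast() converts a C++ object into a Python object:

    MyClass *cls = ...;
    py::object obj = py::cast(cls);

The reverse direction uses obj.cast<T>():

    py::object obj = ...;
    MyClass *cls = obj.cast<MyClass *>();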


12.1.4 Accessing Python Libraries from C++

    // Equivalent to "from decimal import Decimal"
    py::object Decimal = py::module_::import("decimal").attr("Decimal");

    // Try to import scipy
    py::object scipy = py::module_::import("scipy");
    return scipy.attr("__version__");
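The import-then-attr pattern above has a close Python counterpart in importlib. The sketch below is an illustration only; library_version is a hypothetical helper, and it returns None instead of raising when the module is missing, which is a choice made for this example rather than what the C++ snippet does.

```python
import importlib

# Counterpart of py::module_::import("decimal").attr("Decimal")
Decimal = getattr(importlib.import_module("decimal"), "Decimal")

def library_version(name):
    """Return a module's __version__, or None if the module cannot be
    imported or defines no version (hypothetical helper)."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)

pi = Decimal("3.14159")
print(pi, library_version("scipy"))
```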

12.1.5 Calling Python Functions

    // Construct a Python object of class Decimal
    py::object pi = Decimal("3.14159");

    // Use Python to make our directories
    py::object os = py::module_::import("os");
    py::object makedirs = os.attr("makedirs");
    makedirs("/tmp/path/to/somewhere");

One can convert the result obtained from Python to a pure C++ version if a py::class_ or type conversion is defined.

    py::function f = <...>;
    py::object result_py = f(1234, "hello", some_instance);
    // Cast to a reference; a non-const reference cannot bind to a by-value result
    MyClass &result = result_py.cast<MyClass &>();

12.1.6 Calling Methods of Python Objects

    // Calculate e^π in decimal
    py::object exp_pi = pi.attr("exp")();
    py::print(py::str(exp_pi));

In the example above pi.attr("exp") is a bound method: it will always call the method for that same instance of the class. Alternately one can create an unbound method via the Python class (instead of instance) and pass the self object explicitly, followed by other arguments.

    py::object decimal_exp = Decimal.attr("exp");

    // Compute the e^n for n=0..4
    for (int n = 0; n < 5; n++) {
        py::print(decimal_exp(Decimal(n)));
    }
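The distinction between bound and unbound methods is a Python concept, so it may help to see it in Python itself. A minimal sketch using the same Decimal class:

```python
from decimal import Decimal

pi = Decimal("3.14159")

# Bound method: looked up on the instance, so `self` is already fixed to `pi`.
bound_exp = pi.exp
exp_pi = bound_exp()

# Unbound method: looked up on the class, so `self` is passed explicitly.
decimal_exp = Decimal.exp
same_exp_pi = decimal_exp(pi)

# e^n for n = 0..4, passing each instance explicitly
powers = [decimal_exp(Decimal(n)) for n in range(5)]
print(exp_pi, powers[0])
```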

12.1.7 Keyword Arguments

Keyword arguments are supported as well. In Python, a keyword call looks like this:

    def f(number, say, to):
        ...  # function code

    f(1234, say="hello", to=some_instance)  # keyword call in Python

The same call in C++ uses the `_a` literal:

    using namespace pybind11::literals; // to bring in the `_a` literal
    f(1234, "say"_a="hello", "to"_a=some_instance); // keyword call in C++

12.1.8 Unpacking Arguments

    // * unpacking
    py::tuple args = py::make_tuple(1234, "hello", some_instance);
    f(*args);

    // ** unpacking
    py::dict kwargs = py::dict("number"_a=1234, "say"_a="hello", "to"_a=some_instance);
    f(**kwargs);

    // mixed keywords, * and ** unpacking (reusing the variables declared above)
    args = py::make_tuple(1234);
    kwargs = py::dict("to"_a=some_instance);
    f(*args, "say"_a="hello", **kwargs);

Generalized unpacking according to PEP 448 is also supported:

    py::dict kwargs1 = py::dict("number"_a=1234);
    py::dict kwargs2 = py::dict("to"_a=some_instance);
    f(**kwargs1, "say"_a="hello", **kwargs2);
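The C++ calls above are modelled on Python's own call syntax. The following sketch shows the Python forms side by side; the body of f and the some_instance placeholder are assumptions made purely for illustration.

```python
def f(number, say, to):
    """Hypothetical stand-in for the function called in the text."""
    return (number, say, to)

some_instance = object()  # placeholder for any Python object

# * unpacking of positional arguments
args = (1234, "hello", some_instance)
r1 = f(*args)

# ** unpacking of keyword arguments
kwargs = {"number": 1234, "say": "hello", "to": some_instance}
r2 = f(**kwargs)

# Mixed: positional unpacking, an explicit keyword, and ** unpacking
r3 = f(*(1234,), say="hello", **{"to": some_instance})

# PEP 448: more than one ** unpacking in a single call
kwargs1 = {"number": 1234}
kwargs2 = {"to": some_instance}
r4 = f(**kwargs1, say="hello", **kwargs2)

print(r1 == r2 == r3 == r4)
```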

12.1.9 Implicit Casting

    #include <pybind11/numpy.h>
    using namespace pybind11::literals;

    py::module_ os = py::module_::import("os");
    py::module_ path = py::module_::import("os.path"); // like 'import os.path as path'
    py::module_ np = py::module_::import("numpy");     // like 'import numpy as np'

    py::str curdir_abs = path.attr("abspath")(path.attr("curdir"));
    py::print(py::str("Current directory: ") + curdir_abs);

    py::dict environ = os.attr("environ");
    py::print(environ["HOME"]);

    py::array_t<float> arr = np.attr("ones")(3, "dtype"_a="float32");
    py::print(py::repr(arr + py::int_(1)));



If a trivial conversion via move constructor is not possible, both implicit and explicit casting (calling obj.cast()) will attempt a “rich” conversion. For instance, py::list env = os.attr("environ"); will succeed and is equivalent to the Python code env = list(os.environ) that produces a list of the dict keys.

12.1.10 Handling Exceptions


12.1.11 Gotchas

Default-Constructed Wrappers


Assigning py::none() to wrappers


12.2 NumPy

12.2.1 Buffer Protocol

    class Matrix {
    public:
        Matrix(size_t rows, size_t cols) : m_rows(rows), m_cols(cols) {
            m_data = new float[rows*cols];
        }
        float *data() { return m_data; }
        size_t rows() const { return m_rows; }
        size_t cols() const { return m_cols; }
    private:
        size_t m_rows, m_cols;
        float *m_data;
    };

The following binding code exposes Matrix as a buffer object, so that Matrix instances can be cast to NumPy arrays. It is even possible to avoid copying entirely, as with the Python statement np.array(matrix_instance, copy = False).

    py::class_<Matrix>(m, "Matrix", py::buffer_protocol())
        .def_buffer([](Matrix &m) -> py::buffer_info {
            return py::buffer_info(
                m.data(),                               /* Pointer to buffer */
                sizeof(float),                          /* Size of one scalar */
                py::format_descriptor<float>::format(), /* Python struct-style format descriptor */
                2,                                      /* Number of dimensions */
                { m.rows(), m.cols() },                 /* Buffer dimensions */
                { sizeof(float) * m.cols(),             /* Strides (in bytes) for each index */
                  sizeof(float) }
            );
        });


    struct buffer_info {
        void *ptr;
        py::ssize_t itemsize;
        std::string format;
        py::ssize_t ndim;
        std::vector<py::ssize_t> shape;
        std::vector<py::ssize_t> strides;
    };
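The fields of buffer_info have direct counterparts on Python's built-in memoryview, which makes them easy to inspect without NumPy. The sketch below builds a 2 x 3 row-major float view with the standard library; the data values and the 2 x 3 shape are arbitrary choices for illustration.

```python
from array import array

# Six floats stored contiguously, viewed as a 2 x 3 row-major matrix.
data = array("f", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
rows, cols = 2, 3

# A memoryview can only change format or shape via a byte view, hence two casts.
view = memoryview(data).cast("B").cast("f", shape=[rows, cols])

print(view.format)    # struct-style format descriptor
print(view.itemsize)  # size of one scalar in bytes
print(view.ndim)      # number of dimensions
print(view.shape)     # buffer dimensions
print(view.strides)   # strides in bytes for each index
```

As in the binding code, the row stride is the item size times the number of columns, and the column stride is the item size.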

To create a C++ function that accepts a Python buffer object as an argument, simply use py::buffer as one of its parameters. Buffers can exist in many configurations, so some sanity checks are usually needed in the function body. The example below shows how to define a custom constructor for a double-precision Eigen matrix that supports initialization from a compatible buffer object (e.g. a NumPy matrix).

    /* Bind MatrixXd (or some other Eigen type) to Python */
    typedef Eigen::MatrixXd Matrix;

    typedef Matrix::Scalar Scalar;
    constexpr bool rowMajor = Matrix::Flags & Eigen::RowMajorBit;

    py::class_<Matrix>(m, "Matrix", py::buffer_protocol())
        .def(py::init([](py::buffer b) {
            typedef Eigen::Stride<Eigen::Dynamic, Eigen::Dynamic> Strides;

            /* Request a buffer descriptor from Python */
            py::buffer_info info = b.request();

            /* Some sanity checks ... */
            if (info.format != py::format_descriptor<Scalar>::format())
                throw std::runtime_error("Incompatible format: expected a double array!");

            if (info.ndim != 2)
                throw std::runtime_error("Incompatible buffer dimension!");

            auto strides = Strides(
                info.strides[rowMajor ? 0 : 1] / (py::ssize_t)sizeof(Scalar),
                info.strides[rowMajor ? 1 : 0] / (py::ssize_t)sizeof(Scalar));

            auto map = Eigen::Map<Matrix, 0, Strides>(
                static_cast<Scalar *>(info.ptr), info.shape[0], info.shape[1], strides);

            return Matrix(map);
        }));


The matching def_buffer() call for this Eigen type looks as follows:

    .def_buffer([](Matrix &m) -> py::buffer_info {
        return py::buffer_info(
            m.data(),                                /* Pointer to buffer */
            sizeof(Scalar),                          /* Size of one scalar */
            py::format_descriptor<Scalar>::format(), /* Python struct-style format descriptor */
            2,                                       /* Number of dimensions */
            { m.rows(), m.cols() },                  /* Buffer dimensions */
            { sizeof(Scalar) * (rowMajor ? m.cols() : 1),
              sizeof(Scalar) * (rowMajor ? 1 : m.rows()) }
                                                     /* Strides (in bytes) for each index */
        );
    })


12.2.2 Arrays

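By replacing py::buffer with py::array in the code above, we can restrict the function so that it only accepts NumPy arrays (rather than any Python type that satisfies the buffer protocol).

In many situations, we want a function to accept only NumPy arrays of a particular data type, which can be achieved with py::array_t<T>. For example, the following function requires a NumPy array of double-precision floating-point values.

    void f(py::array_t<double> array);

When the function above is called with another type (e.g. int), the binding code attempts to cast the input into a NumPy array of the requested type. This feature requires the pybind11/numpy.h header. The header does not depend on the NumPy headers, so it can be compiled independently of NumPy. At runtime, NumPy 1.7.0 or newer is required.

The data in a NumPy array is not guaranteed to be densely packed; in addition, entries may be separated by arbitrary row and column strides. Sometimes a function should only accept dense arrays in C (row-major) or Fortran (column-major) order. This requires specifying py::array::c_style or py::array::f_style as the second template argument:

    void f(py::array_t<double, py::array::c_style | py::array::forcecast> array);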


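Arrays offer several methods modeled on the NumPy API:

  • .dtype() returns the type of the array's elements.
  • .strides() returns a pointer to the array's strides.
  • .squeeze() removes single-dimensional entries from the shape of the array.
  • .view(dtype) returns a view of the array with the given dtype.
  • .reshape({i, j, ...}) returns a view of the array with the given shape. .resize({...}) is also available.
  • .index_at(i, j, ...) returns the item count from the start of the data to the given index.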


12.2.3 Structured Types

    struct A {
        int x;
        double y;
    };

    struct B {
        int z;
        A a;
    };

    // ...
    PYBIND11_MODULE(test, m) {
        // ...

        PYBIND11_NUMPY_DTYPE(A, x, y);
        PYBIND11_NUMPY_DTYPE(B, z, a);
        /* now both A and B can be used as template arguments to py::array_t */
    }


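12.2.4 Vectorizing Functions

Consider a function with the following signature:

    double my_func(int x, float y, double z);

It can be bound as a vectorized function with py::vectorize:

    m.def("vectorized_func", py::vectorize(my_func));

The function is then invoked once for each element of the input arrays. Compared with solutions such as numpy.vectorize(), the significant advantage is that the element-wise loop runs entirely on the C++ side, where the compiler can collapse it into a tight, optimized loop. The result is returned as a NumPy array of type numpy.dtype.float64.

    x = np.array([[1, 3], [5, 7]])
    y = np.array([[2, 4], [6, 8]])
    z = 3
    result = vectorized_func(x, y, z)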



If the computation is too complicated to be vectorized, the buffer contents have to be created and accessed manually. The following snippet shows how this can be done (the code is somewhat contrived, since it could have been done more simply using vectorize).

    #include <pybind11/pybind11.h>
    #include <pybind11/numpy.h>
    #include <stdexcept>

    namespace py = pybind11;

    py::array_t<double> add_arrays(py::array_t<double> input1, py::array_t<double> input2) {
        py::buffer_info buf1 = input1.request(), buf2 = input2.request();

        if (buf1.ndim != 1 || buf2.ndim != 1)
            throw std::runtime_error("Number of dimensions must be one");

        if (buf1.size != buf2.size)
            throw std::runtime_error("Input shapes must match");

        /* No pointer is passed, so NumPy will allocate the buffer */
        auto result = py::array_t<double>(buf1.size);

        py::buffer_info buf3 = result.request();

        double *ptr1 = static_cast<double *>(buf1.ptr);
        double *ptr2 = static_cast<double *>(buf2.ptr);
        double *ptr3 = static_cast<double *>(buf3.ptr);

        /* shape entries are py::ssize_t, so use a signed index */
        for (py::ssize_t idx = 0; idx < buf1.shape[0]; idx++)
            ptr3[idx] = ptr1[idx] + ptr2[idx];

        return result;
    }

    PYBIND11_MODULE(test, m) {
        m.def("add_arrays", &add_arrays, "Add two NumPy arrays");
    }

12.2.5 Direct Access

    m.def("sum_3d", [](py::array_t<double> x) {
        auto r = x.unchecked<3>(); // x must have ndim = 3; can be non-writeable
        double sum = 0;
        for (py::ssize_t i = 0; i < r.shape(0); i++)
            for (py::ssize_t j = 0; j < r.shape(1); j++)
                for (py::ssize_t k = 0; k < r.shape(2); k++)
                    sum += r(i, j, k);
        return sum;
    });
    m.def("increment_3d", [](py::array_t<double> x) {
        auto r = x.mutable_unchecked<3>(); // Will throw if ndim != 3 or flags.writeable is false
        for (py::ssize_t i = 0; i < r.shape(0); i++)
            for (py::ssize_t j = 0; j < r.shape(1); j++)
                for (py::ssize_t k = 0; k < r.shape(2); k++)
                    r(i, j, k) += 1.0;
    }, py::arg().noconvert());

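To obtain a proxy from an array object, you must specify both the data type and the number of dimensions as template arguments, e.g. auto r = myarray.mutable_unchecked<float, 2>().


Note that the returned proxy object directly references the array's data, and only reads its shape, strides, and writeable flag when constructed. You must take care to ensure that the referenced array is not destroyed or reshaped for the duration of the returned object, typically by limiting the scope of the returned instance.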

The returned proxy object supports some of the same methods as py::array so that it can be used as a drop-in replacement for some existing, index-checked uses of py::array:

  • .ndim() returns the number of dimensions.
  • .data(1, 2, ...) and r.mutable_data(1, 2, ...) return a pointer to the const T or T data, respectively, at the given indices. The latter is only available to proxies obtained via a.mutable_unchecked().
  • .itemsize() returns the size of an item in bytes, i.e. sizeof(T).
  • .shape(n) returns the size of dimension n.
  • .size() returns the total number of elements (i.e. the product of the shapes).
  • .nbytes() returns the number of bytes used by the referenced elements (i.e. itemsize() times size()).
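The relationships between these quantities can be checked with a plain Python memoryview, which exposes similarly named attributes. This is an analogy rather than the proxy itself; the 2 x 3 array of doubles is an arbitrary example.

```python
import math
from array import array

# A 2 x 3 view over six doubles stands in for the array proxy here.
data = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
view = memoryview(data).cast("B").cast("d", shape=[2, 3])

ndim = view.ndim                # number of dimensions
size = math.prod(view.shape)    # total element count: product of the shape
nbytes = view.itemsize * size   # bytes covered by the referenced elements

print(ndim, size, nbytes)
```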

12.2.6 Ellipsis

Python 3 provides a convenient ... ellipsis notation that is often used to slice multidimensional arrays. For instance, the following snippet extracts the middle dimensions of a tensor with the first and last index set to zero. In Python 2, the syntactic sugar ... is not available, but the singleton Ellipsis (of type ellipsis) can still be used directly.

    a = ...  # a NumPy array
    b = a[0, ..., 0]

The py::ellipsis() function can be used to perform the same operation on the C++ side:

    py::array a = /* A NumPy array */;
    py::array b = a[py::make_tuple(0, py::ellipsis(), 0)];
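The reason the C++ version builds a tuple is visible from Python: a comma-separated subscript reaches __getitem__ as a single tuple, with ... appearing as the Ellipsis singleton. The IndexRecorder class below is a hypothetical helper written only to show this.

```python
class IndexRecorder:
    """Hypothetical helper that returns whatever index object it receives."""
    def __getitem__(self, index):
        return index

recorder = IndexRecorder()

# The literal `...` inside a subscript is the Ellipsis singleton, and a
# comma-separated subscript arrives as one tuple.
index = recorder[0, ..., 0]
print(index)  # (0, Ellipsis, 0)
```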

12.2.7 Memory View

When we only want to provide access to a C/C++ buffer without constructing a class object, we can return a memoryview object. Suppose we want to expose a memoryview over a 2x4 uint8_t array; this can be done as follows:

    const uint8_t buffer[] = {
        0, 1, 2, 3,
        4, 5, 6, 7
    };
    m.def("get_memoryview2d", []() {
        return py::memoryview::from_buffer(
            buffer,                                    // buffer pointer
            { 2, 4 },                                  // shape (rows, cols)
            { sizeof(uint8_t) * 4, sizeof(uint8_t) }   // strides in bytes
        );
    });

For a flat, one-dimensional byte buffer, py::memoryview::from_memory can be used instead:

    m.def("get_memoryview1d", []() {
        return py::memoryview::from_memory(
            buffer,               // buffer pointer
            sizeof(uint8_t) * 8   // buffer size
        );
    });
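On the Python side, the two views would be expected to look like the ones built below from the same eight bytes. This sketch constructs them with the built-in memoryview rather than through the bound functions, so it only illustrates the shape and stride layout.

```python
raw = bytes([0, 1, 2, 3, 4, 5, 6, 7])

# One-dimensional view over the raw bytes (comparable to from_memory)
view1d = memoryview(raw)

# Two-dimensional 2 x 4 view over the same bytes (comparable to from_buffer)
view2d = view1d.cast("B", shape=[2, 4])

print(view1d.shape, view1d.strides)  # (8,) (1,)
print(view2d.shape, view2d.strides)  # (2, 4) (4, 1)
print(view2d.tolist())               # [[0, 1, 2, 3], [4, 5, 6, 7]]
```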

Note: memoryview::from_memory is not available in Python 2.

12.3 Utilities

12.3.1 Using Python's print Function in C++


py::print accepts the same sep, end, file, and flush arguments as Python's print.

    py::print(1, 2.0, "three"); // 1 2.0 three
    py::print(1, 2.0, "three", "sep"_a="-"); // 1-2.0-three

    auto args = py::make_tuple("unpacked", true);
    py::print("->", *args, "end"_a="<-"); // -> unpacked True<-
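Since py::print follows the semantics of Python's print, the expected output of the calls above can be checked in Python. The sketch captures the output in an io.StringIO object so that it can be inspected as a string.

```python
import io

out = io.StringIO()

print(1, 2.0, "three", file=out)
print(1, 2.0, "three", sep="-", file=out)

args = ("unpacked", True)
print("->", *args, end="<-", file=out)  # `end` replaces the trailing newline

captured = out.getvalue()
```

Note that end is appended directly after the last item, without a separating space.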

12.3.2 Capturing Standard Output from ostream

C++ libraries commonly print through std::cout and std::cerr, which do not cooperate well with Python's standard sys.stdout and sys.stderr. Replacing a library's printing with py::print is not realistic. The problem can be addressed by redirecting the library's output to the corresponding Python streams:

    #include <pybind11/iostream.h>

    ...

    // Add a scoped redirect for your noisy code
    m.def("noisy_func", []() {
        py::scoped_ostream_redirect stream(
            std::cout,                                // std::ostream&
            py::module_::import("sys").attr("stdout") // Python output
        );
        call_noisy_func();
    });



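This method flushes the output stream, and flushes it again as needed when the scoped_ostream_redirect is destroyed. This allows output to be redirected in real time, for example to a Jupyter notebook. Both arguments, the C++ stream and the Python output, are optional and default to standard output when omitted. py::scoped_estream_redirect is the counterpart for standard error. Both can be set up conveniently through py::call_guard.

    // Alternative: Call single function using call guard
    m.def("noisy_func", &call_noisy_function,
          py::call_guard<py::scoped_ostream_redirect,
                         py::scoped_estream_redirect>());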

The redirection can also be done in Python with the addition of a context manager, using the py::add_ostream_redirect() function:

    py::add_ostream_redirect(m, "ostream_redirect");

The name in Python defaults to ostream_redirect if no name is passed. This creates the following context manager in Python:

    with ostream_redirect(stdout=True, stderr=True):
        noisy_function()

It defaults to redirecting both streams, though you can use the keyword arguments to disable one of the streams if needed.
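The scoped nature of such a context manager can be illustrated with the standard library's contextlib.redirect_stdout. This is only an analogy: redirect_stdout affects Python-level writes to sys.stdout and does not capture C++ streams, and noisy_function here is a hypothetical pure-Python stand-in.

```python
import contextlib
import io

def noisy_function():
    """Hypothetical stand-in for a function that prints to standard output."""
    print("noisy output")

buffer = io.StringIO()

# Output is redirected only while the `with` block is active.
with contextlib.redirect_stdout(buffer):
    noisy_function()

captured = buffer.getvalue()
```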

12.3.3 Evaluating Python Expressions from Strings and Files

pybind11 provides the eval, exec and eval_file functions to evaluate Python expressions and statements. The following example illustrates how they can be used.

    // At beginning of file
    #include <pybind11/eval.h>

    ...

    // Evaluate in scope of main module
    py::object scope = py::module_::import("__main__").attr("__dict__");

    // Evaluate an isolated expression
    int result = py::eval("my_variable + 10", scope).cast<int>();

    // Evaluate a sequence of statements
    py::exec(
        "print('Hello')\n"
        "print('world!');",
        scope);

    // Evaluate the statements in a separate Python file on disk
    py::eval_file("script.py", scope);

C++11 raw string literals are also supported and quite handy for this purpose. The only requirement is that the first statement must be on a new line following the raw string delimiter R"(, ensuring all lines have common leading indent:

    py::exec(R"(
        x = get_answer()
        if x == 42:
            print('Hello World!')
        else:
            print('Bye!')
        )", scope
    );
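The same evaluate-in-a-scope pattern can be tried with Python's built-in eval and exec. In this sketch the scope is a fresh dictionary rather than the __main__ module's, my_variable and get_answer are hypothetical names, and textwrap.dedent is used here to strip the common leading indent from the multi-line source.

```python
import textwrap

scope = {"my_variable": 32}  # hypothetical scope contents

# Evaluate an isolated expression in the scope
result = eval("my_variable + 10", scope)

# Execute a sequence of statements; new names land in the same scope
exec(textwrap.dedent("""
    def get_answer():
        return 42
    x = get_answer()
    greeting = 'Hello World!' if x == 42 else 'Bye!'
"""), scope)

print(result, scope["greeting"])
```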