References
The code discussed in this article comes from PyTorch 1.7.0: https://github.com/pytorch/pytorch/tree/v1.7.0/torch
Background
When using the PyTorch deep learning framework, whether for training or inference, the first line of code that pulls in PyTorch is always:
import torch
Following Python's import rules, this resolves to __init__.py under the torch package directory, which in turn executes:
from torch._C import *
For PyTorch, the module name here is _C. Since _C is a C extension module, CPython expects it to export an initialization function named PyInit__C, so we can guess that torch/csrc/stub.cpp must implement PyInit__C. That is exactly what PyTorch does; the code in torch/csrc/stub.cpp looks like this:
#include <Python.h>
extern PyObject* initModule(void);
PyMODINIT_FUNC PyInit__C(void)
{
return initModule();
}
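From the Python side we can confirm that _C is indeed a compiled extension module rather than a .py file (a quick check of my own, not part of the PyTorch sources); CPython locates its PyInit__C entry point through the standard extension-module naming convention:

import importlib.machinery
import torch._C

# _C is a compiled extension (.so / .pyd), so its __file__ points at a binary, not a .py file
print(torch._C.__file__)
# the suffixes CPython probes when looking for extension modules such as _C
print(importlib.machinery.EXTENSION_SUFFIXES)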
This article starts from the initModule function and walks through PyTorch's initialization in full. initModule is the first frame on PyTorch's initialization call stack, since all of the initialization work happens inside this function. Because there is a lot of it, the work is divided into seven parts:
1. The birth of torch._C
This step creates the torch._C Python module and registers a large number of functions on it:
[torch/csrc/Module.cpp]
PyObject* initModule() {
  ...
  THPUtils_addPyMethodDefs(methods, TorchMethods);
  THPUtils_addPyMethodDefs(methods, DataLoaderMethods);
  THPUtils_addPyMethodDefs(methods, torch::autograd::python_functions());
  THPUtils_addPyMethodDefs(methods, torch::multiprocessing::python_functions());
#ifdef USE_CUDA
  THPUtils_addPyMethodDefs(methods, THCPModule_methods());
#endif
#if defined(USE_DISTRIBUTED) && defined(USE_C10D)
  THPUtils_addPyMethodDefs(methods, torch::distributed::c10d::python_functions());
#ifndef _WIN32
  THPUtils_addPyMethodDefs(methods, torch::distributed::rpc::python_functions());
  THPUtils_addPyMethodDefs(
      methods, torch::distributed::autograd::python_functions());
  THPUtils_addPyMethodDefs(methods, torch::distributed::rpc::testing::python_functions());
#endif
#endif
  static struct PyModuleDef torchmodule = {
    PyModuleDef_HEAD_INIT,
    "torch._C",
    nullptr,
    -1,
    methods.data()
  };
  ASSERT_TRUE(module = PyModule_Create(&torchmodule));
  ...
}
- TorchMethods registers 48 methods; see 【torch/csrc/Module.cpp#L574】
- DataLoaderMethods registers 4 methods; see 【torch/csrc/DataLoader.cpp#L217】
- torch::autograd::python_functions() registers 9 methods; see 【torch/csrc/autograd/init.cpp#L192】
- torch::multiprocessing::python_functions() registers 1 method; see 【torch/csrc/multiprocessing/init.cpp#L53】
- THCPModule_methods(); see 【torch/csrc/cuda/Module.cpp#L527】
- torch::distributed::c10d::python_functions()
In short, this step reaches a milestone: the torch._C symbol now exists, and over a hundred functions have been registered on it, covering torch, dataloader, autograd, multiprocessing, cuda, distributed, and c10d.
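We can spot-check the result from Python (a quick sanity check of my own; _initExtension and _show_config are two entries from the TorchMethods table mentioned above):

import torch

# methods registered through PyMethodDef tables show up as builtin functions on torch._C
print(type(torch._C._initExtension))  # <class 'builtin_function_or_method'>
print(type(torch._C._show_config))    # <class 'builtin_function_or_method'>
# a rough count of everything that ends up on torch._C once initialization is complete
print(len(dir(torch._C)))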
2. Some key types
[torch/csrc/Module.cpp,https://github.com/pytorch/pytorch/blob/v1.7.0/torch/csrc/Module.cpp#L681]
PyObject* initModule() {
  ...
  ASSERT_TRUE(THPGenerator_init(module));
  ASSERT_TRUE(THPException_init(module));
  THPSize_init(module);
  THPDtype_init(module);
  THPDTypeInfo_init(module);
  THPLayout_init(module);
  THPMemoryFormat_init(module);
  THPQScheme_init(module);
  THPDevice_init(module);
  THPStream_init(module);
  ASSERT_TRUE(THPVariable_initModule(module));
  ASSERT_TRUE(THPFunction_initModule(module));
  ASSERT_TRUE(THPEngine_initModule(module));
  ...
}
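The Python-visible results of these calls are types that every PyTorch user touches daily. A few quick checks of my own (torch re-exports the names registered on torch._C through from torch._C import *):

import torch

# THPSize_init / THPDtype_init / THPLayout_init / THPDevice_init / THPMemoryFormat_init
# are what put these familiar classes in place
print(type(torch.Size([2, 3])))                                   # <class 'torch.Size'>
print(isinstance(torch.float32, torch.dtype))                     # True
print(isinstance(torch.strided, torch.layout))                    # True
print(isinstance(torch.device("cpu"), torch.device))              # True
print(isinstance(torch.contiguous_format, torch.memory_format))   # True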
3. The birth of torch._C._TensorBase
The following three initialization calls are grouped into this section:
[torch/csrc/Module.cpp,https://github.com/pytorch/pytorch/blob/v1.7.0/torch/csrc/Module.cpp#L681]
PyObject* initModule() {
  ...
  ASSERT_TRUE(THPVariable_initModule(module));
  ASSERT_TRUE(THPFunction_initModule(module));
  ASSERT_TRUE(THPEngine_initModule(module));
  ...
}
Why group them? Because of how prominent they are. THPVariable_initModule(module) creates torch._C._TensorBase, the base class of every Tensor. THPFunction_initModule(module) creates torch._C._FunctionBase; in torch/autograd/function.py, the following two classes use torch._C._FunctionBase as a base class:
[torch/autograd/function.py]
class BackwardCFunction(_C._FunctionBase, _ContextMethodMixin, _HookMixin):
...
class Function(with_metaclass(FunctionMeta, _C._FunctionBase, _ContextMethodMixin, _HookMixin)):
...
This Function class hierarchy is what the autograd DAG is built on.
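As a concrete illustration (a minimal example of my own, not taken from the PyTorch sources), every custom autograd op written against torch.autograd.Function ultimately sits on top of _C._FunctionBase:

import torch

class Square(torch.autograd.Function):
    # toy autograd op: forward computes x**2, backward applies the chain rule
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        return 2 * x * grad_output

x = torch.tensor([3.0], requires_grad=True)
y = Square.apply(x)
y.backward()
print(x.grad)                                      # tensor([6.])
print(issubclass(Square, torch._C._FunctionBase))  # True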
THPEngine_initModule(module) creates torch._C._EngineBase. _EngineBase is responsible for the preprocessing that happens before dynamic-graph execution: it takes requests such as torch.autograd's backward, preprocesses them, and then hands them to the real Engine for execution.
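Two Python-side checks (my own sketch, assuming the 1.7 layout in which torch.Tensor subclasses _TensorBase and Variable._execution_engine holds the engine instance) make these roles visible:

import torch
from torch.autograd import Variable

# torch.Tensor, defined in Python, inherits directly from the C type _TensorBase
print(torch.Tensor.__mro__)
# (<class 'torch.Tensor'>, <class 'torch._C._TensorBase'>, <class 'object'>)

# the engine object that torch.autograd.backward() eventually calls into;
# its type is the one created by THPEngine_initModule
engine = Variable._execution_engine
print(type(engine))                     # <class 'torch._C._EngineBase'>
print(hasattr(engine, "run_backward"))  # True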
4. pybind11 bindings
All of the initialization in this section is pybind11-related:
[torch/csrc/Module.cpp,https://github.com/pytorch/pytorch/blob/v1.7.0/torch/csrc/Module.cpp#L681]
PyObject* initModule() {
  ...
  // NOTE: We need to be able to access OperatorExportTypes from ONNX for use in
  // the export side of JIT, so this ONNX init needs to appear before the JIT
  // init.
  torch::onnx::initONNXBindings(module);
  torch::jit::initJITBindings(module);
  torch::impl::dispatch::initDispatchBindings(module);
  torch::throughput_benchmark::initThroughputBenchmarkBindings(module);
  torch::autograd::initNNFunctions(module);
  torch::autograd::initFFTFunctions(module);
  torch::autograd::initLinalgFunctions(module);
  torch::autograd::init_legacy_variable(module);
  torch::python::init_bindings(module);
#ifdef USE_CUDA
  torch::cuda::initModule(module);
#endif
  ...
}
initONNXBindings provides the ONNX Python bindings, torch._C._onnx.TensorProtoDataType and torch._C._onnx.OperatorExportTypes:
>>> dir(torch._C._onnx.TensorProtoDataType)
['BOOL', 'COMPLEX128', 'COMPLEX64', 'DOUBLE', 'FLOAT', 'FLOAT16', 'INT16', 'INT32', 'INT64', 'INT8', 'STRING', 'UINT16', 'UINT32', 'UINT64', 'UINT8', 'UNDEFINED', '__class__', '__delattr__', '__dir__', '__doc__', '__entries', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__int__', '__le__', '__lt__', '__members__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', 'name']
>>> dir(torch._C._onnx.OperatorExportTypes)
['ONNX', 'ONNX_ATEN', 'ONNX_ATEN_FALLBACK', 'ONNX_FALLTHROUGH', 'RAW', '__class__', '__delattr__', '__dir__', '__doc__', '__entries', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__int__', '__le__', '__lt__', '__members__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', 'name']
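These are ordinary pybind11 enums, so they behave like Python enums and convert to integers. A small check of my own (operator_export_type is, to my knowledge, the torch.onnx.export argument that accepts the second enum):

import torch

ExportTypes = torch._C._onnx.OperatorExportTypes
print(ExportTypes.ONNX)                       # OperatorExportTypes.ONNX
print(int(ExportTypes.ONNX_ATEN_FALLBACK))    # the underlying integer value
print(list(ExportTypes.__members__))          # the same variants as in the dir() output above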
initJITBindings, in turn, registers a batch of JIT-related C++ functions and objects on torch._C through pybind11.
initNNFunctions creates a torch._C._nn object and registers a number of nn-related functions on it:
>>> dir(torch._C._nn)
['__doc__', '__loader__', '__name__', '__package__', '__spec__', '_parse_to', '_test_optional_filled_intlist', '_test_optional_floatlist', '_test_optional_intlist', 'adaptive_avg_pool2d', 'adaptive_avg_pool3d', 'adaptive_max_pool2d', 'adaptive_max_pool3d', 'avg_pool2d', 'avg_pool3d', 'binary_cross_entropy', 'col2im', 'elu', 'elu_', 'fractional_max_pool2d', 'fractional_max_pool3d', 'gelu', 'glu', 'hardsigmoid', 'hardsigmoid_', 'hardswish', 'hardswish_', 'hardtanh', 'hardtanh_', 'im2col', 'l1_loss', 'leaky_relu', 'leaky_relu_', 'linear', 'log_sigmoid', 'max_pool2d_with_indices', 'max_pool3d_with_indices', 'max_unpool2d', 'max_unpool3d', 'mkldnn_linear', 'mkldnn_reorder_conv2d_weight', 'mkldnn_reorder_conv3d_weight', 'mse_loss', 'multi_margin_loss', 'multilabel_margin_loss', 'nll_loss', 'nll_loss2d', 'one_hot', 'reflection_pad1d', 'reflection_pad2d', 'replication_pad1d', 'replication_pad2d', 'replication_pad3d', 'rrelu_with_noise', 'rrelu_with_noise_', 'silu', 'silu_', 'slow_conv3d', 'slow_conv_dilated2d', 'slow_conv_dilated3d', 'slow_conv_transpose2d', 'slow_conv_transpose3d', 'smooth_l1_loss', 'soft_margin_loss', 'softplus', 'softshrink', 'thnn_conv2d', 'thnn_conv_depthwise2d', 'upsample_bicubic2d', 'upsample_bilinear2d', 'upsample_linear1d', 'upsample_nearest1d', 'upsample_nearest2d', 'upsample_nearest3d', 'upsample_trilinear3d']
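Many of these are the C entry points that torch.nn.functional dispatches to. For instance (a quick check of my own), F.gelu and the binding registered here agree:

import torch
import torch.nn.functional as F

x = torch.randn(4)
# torch.nn.functional.gelu is a thin wrapper around the torch._C._nn.gelu binding
print(torch.allclose(F.gelu(x), torch._C._nn.gelu(x)))  # True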
init_legacy_variable registers torch._C._LegacyVariableBase:
>>> dir(torch._C._LegacyVariableBase)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
_LegacyVariableBase is the base of the Variable class (whose _execution_engine attribute is initialized to an instance of the torch._C._EngineBase type):
[torch/autograd/variable.py]
# mypy doesn't understand torch._six.with_metaclass
class Variable(with_metaclass(VariableMeta, torch._C._LegacyVariableBase)):  # type: ignore
    pass
init_bindings registers some functions on torch._C via pybind11. torch::cuda::initModule is similar: it also registers functions on torch._C via pybind11, except that its content is CUDA-related.
5. Registering the StorageBase classes on torch._C
[torch/csrc/Module.cpp,https://github.com/pytorch/pytorch/blob/v1.7.0/torch/csrc/Module.cpp#L681]
PyObject* initModule() {
  ...
  ASSERT_TRUE(THPDoubleStorage_init(module));
  ASSERT_TRUE(THPFloatStorage_init(module));
  ASSERT_TRUE(THPHalfStorage_init(module));
  ASSERT_TRUE(THPLongStorage_init(module));
  ASSERT_TRUE(THPIntStorage_init(module));
  ASSERT_TRUE(THPShortStorage_init(module));
  ASSERT_TRUE(THPCharStorage_init(module));
  ASSERT_TRUE(THPByteStorage_init(module));
  ASSERT_TRUE(THPBoolStorage_init(module));
  ASSERT_TRUE(THPQUInt8Storage_init(module));
  ASSERT_TRUE(THPQInt8Storage_init(module));
  ASSERT_TRUE(THPQInt32Storage_init(module));
  ASSERT_TRUE(THPQUInt4x2Storage_init(module));
  ASSERT_TRUE(THPBFloat16Storage_init(module));
  ASSERT_TRUE(THPComplexDoubleStorage_init(module));
  ASSERT_TRUE(THPComplexFloatStorage_init(module));
#ifdef USE_CUDA
  // This will only initialise base classes and attach them to library namespace
  // They won't be ready for real usage until importing cuda module, that will
  // complete the process (but it defines Python classes before calling back into
  // C, so these lines have to execute first)..
  ASSERT_TRUE(THCPDoubleStorage_init(module));
  ASSERT_TRUE(THCPFloatStorage_init(module));
  ASSERT_TRUE(THCPHalfStorage_init(module));
  ASSERT_TRUE(THCPLongStorage_init(module));
  ASSERT_TRUE(THCPIntStorage_init(module));
  ASSERT_TRUE(THCPShortStorage_init(module));
  ASSERT_TRUE(THCPCharStorage_init(module));
  ASSERT_TRUE(THCPByteStorage_init(module));
  ASSERT_TRUE(THCPBoolStorage_init(module));
  ASSERT_TRUE(THCPBFloat16Storage_init(module));
  ASSERT_TRUE(THCPComplexDoubleStorage_init(module));
  ASSERT_TRUE(THCPComplexFloatStorage_init(module));
  ...
}
This initialization work mainly registers the Storage classes on torch._C, for example:
- DoubleStorageBase
- FloatStorageBase
- CudaFloatStorageBase
Taking FloatStorageBase as an example, we can inspect its registered methods like this:
>>> dir(torch._C.FloatStorageBase)
['__class__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', '_cdata', '_expired', '_free_weak_ref', '_get_shared_fd', '_new_shared_fd', '_new_shared_filename', '_new_using_fd', '_new_using_filename', '_new_with_file', '_new_with_weak_ptr', '_set_cdata', '_set_from_file', '_share_fd_', '_share_filename_', '_shared_decref', '_shared_incref', '_weak_ref', '_write_file', 'copy_', 'data_ptr', 'device', 'dtype', 'element_size', 'fill_', 'from_buffer', 'from_file', 'is_pinned', 'is_shared', 'new', 'resize_', 'size']
These classes are then subclassed on the Python side:
[torch/__init__.py]
class FloatStorage(_C.FloatStorageBase, _StorageBase):
...
_C.FloatStorageBase and its siblings are generated with C macros; for the details of how that is done, see this column's article on supporting generics with C macros.
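To see the inheritance in action (a small check of my own against PyTorch 1.7; the storage API changed in later releases), torch.FloatStorage really is a subclass of the registered base class and can be constructed directly:

import torch

s = torch.FloatStorage(4)                        # a float storage holding 4 elements
s.fill_(1.0)
print(isinstance(s, torch._C.FloatStorageBase))  # True
print(torch.FloatStorage.__bases__)              # (torch._C.FloatStorageBase, torch.storage._StorageBase)
print(s.size(), s.element_size())                # 4 4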