2. Preliminaries - 2.2 Data Manipulation - Machine Learning

2.2.1 Creating a Tensor
2.2.2 Operations
2.2.3 Broadcasting
2.2.4 Memory Overhead of Operations
2.2.5 Converting Between Tensor and NumPy
2.2.6 Tensor on GPU

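In PyTorch, torch.Tensor is the main tool for storing and transforming data. A Tensor closely resembles NumPy's multi-dimensional array, ndarray, but it additionally supports GPU computation and automatic differentiation, which makes it better suited to deep learning.

A tensor can be viewed as a multi-dimensional array: a scalar is a 0-dimensional tensor, a vector is a 1-dimensional tensor, and a matrix is a 2-dimensional tensor.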
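
As a small illustration of the dimensionality described above, the following sketch builds one tensor of each kind and prints its number of dimensions. The variable names are chosen here for illustration only.

```python
import torch

scalar = torch.tensor(3.14)              # 0-dimensional tensor (scalar)
vector = torch.tensor([1.0, 2.0, 3.0])   # 1-dimensional tensor (vector)
matrix = torch.zeros(2, 3)               # 2-dimensional tensor (matrix)

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    # dim() is the number of dimensions; shape lists the size of each one
    print(name, t.dim(), tuple(t.shape))
```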

2.2.1 Creating a `Tensor`

Common creation methods

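# Creating Tensors
import torch

# Create an uninitialized Tensor: memory is allocated but no values are set
x = torch.empty(5, 3)
# Fill with random values drawn uniformly from [0, 1)
x = torch.rand(5, 3)
# Fill with zeros
x = torch.zeros(5, 3)
# Fill with ones, with dtype set to long
x = torch.ones(5, 3, dtype=torch.long)
# Create directly from data
x = torch.tensor([5.5, 3])
# Create from an existing Tensor; dtype, device, etc. are reused unless overridden
x = x.new_ones(5, 3, dtype=torch.float64)
x = torch.rand_like(x, dtype=torch.float)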

Getting the shape of a Tensor

# Get the shape; both calls return a torch.Size, which is a subclass of tuple
print(x.size())
print(x.shape)
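
Because the returned torch.Size behaves like a tuple, it can be unpacked and compared directly. The sketch below shows a few common ways to query a shape; it is an illustration, not an exhaustive list.

```python
import torch

x = torch.zeros(5, 3)
rows, cols = x.shape               # torch.Size unpacks like a tuple
print(rows, cols)
print(isinstance(x.shape, tuple))  # torch.Size is a tuple subclass
print(x.size(0), x.size(1))        # size(dim) returns a single dimension as an int
print(x.numel())                   # total number of elements
```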

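Other creation functions are listed below for reference. A dtype and a device (CPU or GPU) can be specified at creation time.

Function	Description
Tensor(*sizes)	Basic constructor
tensor(data,)	Constructor similar to np.array
ones(*sizes)	Tensor of all ones
zeros(*sizes)	Tensor of all zeros
eye(n, m)	Ones on the diagonal, zeros elsewhere
arange(s,e,step)	From s up to (but not including) e, with step size step
linspace(s,e,steps)	steps evenly spaced points from s to e, both endpoints included
rand/randn(*sizes)	Uniform on [0, 1) / standard normal distribution
normal(mean,std)/uniform_(from,to)	Normal distribution / uniform distribution (uniform_ is an in-place Tensor method)
randperm(m)	Random permutation of 0 to m-1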
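
A minimal sketch of several functions from the table, to make the argument conventions concrete. The specific sizes and ranges are arbitrary choices for illustration.

```python
import torch

print(torch.arange(0, 10, 2))    # tensor([0, 2, 4, 6, 8]); the end value is excluded
print(torch.linspace(0, 1, 5))   # 5 evenly spaced points, both endpoints included
print(torch.eye(3))              # 3x3 identity matrix

perm = torch.randperm(5)         # random permutation of 0..4
normal = torch.normal(mean=0.0, std=1.0, size=(2, 3))   # samples from N(0, 1)
uniform = torch.empty(2, 3).uniform_(-1, 1)             # in-place fill with uniform samples
print(perm, normal.shape, uniform.shape)
```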


2.2.2 Operations

Arithmetic operations, using addition as an example

# Addition
x = torch.ones(5, 3)
y = torch.ones(5, 3)
# Overloaded operator
z = x + y
# add function and add method
z = torch.add(x, y)
x = x.add(y)
# Write the result into an existing Tensor
torch.add(x, y, out=z)

Methods in torch with a trailing underscore (_) are in-place operations: after the call, the object itself holds the result of the operation.

# Out-of-place versus in-place operations
x = torch.ones(1, 1)
y = torch.ones(1, 1)
print(x.add(y))
print(x)
print(x.add_(y))
print(x)

After an out-of-place operation, the original object is unchanged

tensor([[2.]])
tensor([[1.]])

After an in-place operation, the original object is modified

tensor([[2.]])
tensor([[2.]])

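Indexing

Indexing syntax is the same as in NumPy. Note that the result of basic indexing shares its **data** with the original object, so modifying one also modifies the other.

# Indexing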
x = torch.rand(5, 3)
y = x[0, :]
print(y)
y += 1
print(y)
print(x[0, :])

Output

tensor([0.1743, 0.4485, 0.3080])
tensor([1.1743, 1.4485, 1.3080])
tensor([1.1743, 1.4485, 1.3080])

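Changing the shape

view and reshape change only the dimensions, never the total number of elements. The object returned by view shares its **data** with the original, whereas the object returned by reshape may or may not be a copy.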

# Changing the shape
y = x.view(15)
y = x.reshape(5, 3)
# A dimension given as -1 is inferred from the other dimensions
z = x.view(-1, 5)
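
The difference between view and reshape shows up on non-contiguous tensors. The following sketch, which assumes a transposed tensor as the non-contiguous case, checks storage sharing with data_ptr(); it is one illustration of when reshape copies, not a full description of its rules.

```python
import torch

x = torch.arange(6).reshape(2, 3)
v = x.view(6)                          # view always shares storage with x
print(v.data_ptr() == x.data_ptr())    # True

t = x.t()                              # transposing makes the tensor non-contiguous
try:
    t.view(6)                          # view cannot reinterpret this memory layout
except RuntimeError as err:
    print("view failed:", type(err).__name__)

r = t.reshape(6)                       # reshape falls back to copying the data
print(r.data_ptr() == x.data_ptr())    # False
```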

To obtain an independent copy, call clone() first.

xCopy = x.clone().view(15)
xCopy += 1
print(x)
print(xCopy)

Output

tensor([[1.7682, 1.5818, 1.7810],
        [0.8496, 0.8107, 0.1221],
        [0.5390, 0.1897, 0.5532],
        [0.9929, 0.5308, 0.0106],
        [0.7495, 0.8266, 0.8893]])
tensor([2.7682, 2.5818, 2.7810, 1.8496, 1.8107, 1.1221, 1.5390, 1.1897, 1.5532,
        1.9929, 1.5308, 1.0106, 1.7495, 1.8266, 1.8893])

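Another benefit of clone is that it is recorded in the computation graph, so gradients flowing back to the copy also propagate to the source Tensor.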
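
The gradient behavior of clone can be checked with a small sketch. The loss used here is an arbitrary choice for illustration.

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x.clone()            # differentiable copy with its own memory
loss = (y * 2).sum()
loss.backward()          # the gradient flows through clone() back to x
print(x.grad)            # tensor([2., 2., 2.])
```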

Scalar conversion

item() converts a single-element Tensor to a Python number

# Scalar conversion
x = torch.rand(1)
print(x)
print(x.item())

Output

tensor([0.5314])
0.5313704013824463
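
item() only works when the Tensor holds exactly one element. For larger tensors, tolist() is one alternative, as sketched below; the exception type raised by item() is caught broadly here because it is assumed to vary between PyTorch versions.

```python
import torch

x = torch.tensor([1.5, 2.5])
print(x.sum().item())    # 4.0: a 0-dimensional result converts to a Python float
print(x.tolist())        # [1.5, 2.5]: tolist() handles tensors of any shape

try:
    x.item()             # more than one element, so the conversion is ambiguous
except (ValueError, RuntimeError) as err:
    print("item() failed:", type(err).__name__)
```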

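Linear algebra

A brief look at a few linear algebra operations

# Linear algebra
x = torch.arange(0, 12).reshape(3, 4)
print(x)
# Trace (sum of the diagonal elements)
print(x.trace())
# Transpose
print(x.t())
# Diagonal elements
print(x.diag())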

Output

tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
tensor(15)
tensor([[ 0,  4,  8],
        [ 1,  5,  9],
        [ 2,  6, 10],
        [ 3,  7, 11]])
tensor([ 0,  5, 10])

In short, Tensor in PyTorch supports well over a hundred operations. When you need a particular operation, check the API documentation first.


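2.2.3 Broadcasting

When an operation is applied to two Tensors of different shapes, broadcasting may be triggered, much as in NumPy: the Tensors are first expanded to a common shape, and the operation is then applied elementwise.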

x = torch.arange(1, 3).reshape(1, 2)
y = torch.arange(1, 4).reshape(3, 1)
print(x)
print(y)
print(x+y)

Output

tensor([[1, 2]])
tensor([[1],
        [2],
        [3]])
tensor([[2, 3],
        [3, 4],
        [4, 5]])
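
To see what broadcasting does to each operand, the expansion can be made explicit. This sketch uses torch.broadcast_tensors on the same x and y as above; it is shown only to visualize the intermediate step, since the + operator performs it implicitly.

```python
import torch

x = torch.arange(1, 3).reshape(1, 2)
y = torch.arange(1, 4).reshape(3, 1)

# broadcast_tensors expands both inputs to the common shape (3, 2)
xb, yb = torch.broadcast_tensors(x, y)
print(xb)    # the single row of x repeated three times
print(yb)    # the single column of y repeated twice
print(torch.equal(xb + yb, x + y))    # True
```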


2.2.4 Memory Overhead of Operations

Python's built-in id() function can be used to check whether an operation created a new object.

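# Use id() to check object identity (the object's memory address in CPython)
# y = y + x: the result is a new object bound to the name y
x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
idBefore = id(y)
y = y + x
print(id(y) == idBefore)    # False
# y[:] = y + x: the result is written back into the existing y
x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
idBefore = id(y)
y[:] = y + x
print(id(y) == idBefore)    # True
# y += x: in-place addition
x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
idBefore = id(y)
y += x
print(id(y) == idBefore)    # True
# y.add(x): returns a new Tensor that is discarded here; y itself is untouched
x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
idBefore = id(y)
y.add(x)
print(id(y) == idBefore)    # True, but only because y was never rebound
# y.add_(x): in-place addition
x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
idBefore = id(y)
y.add_(x)
print(id(y) == idBefore)    # True
# out=y: the result is written into the existing y
x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
idBefore = id(y)
torch.add(x, y, out=y)
print(id(y) == idBefore)    # True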
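
id() reports the identity of the Python object, not of the underlying buffer. As a complementary check, the sketch below compares data_ptr(), the address of the first element, before and after each kind of addition. The variable names new_buffer and same_buffer are introduced here for illustration.

```python
import torch

x = torch.tensor([1, 2])
y = torch.tensor([3, 4])

ptr_before = y.data_ptr()
y = y + x                                    # out-of-place: the result lives in a new buffer
new_buffer = y.data_ptr() != ptr_before
print(new_buffer)                            # True

ptr_before = y.data_ptr()
y += x                                       # in-place: the existing buffer is reused
same_buffer = y.data_ptr() == ptr_before
print(same_buffer)                           # True
```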

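Note: although the Tensor returned by view shares data with the source Tensor, it is still a new Tensor object (a Tensor holds other attributes besides its data), so the two have different ids (memory addresses).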
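
The note above can be verified with a short sketch: the view is a distinct Python object, yet it points at the same underlying data.

```python
import torch

x = torch.ones(2, 3)
v = x.view(6)
print(v is x)                          # False: two different Tensor objects
print(v.data_ptr() == x.data_ptr())    # True: one shared buffer
v[0] = 7                               # writing through the view...
print(x[0, 0].item())                  # 7.0: ...changes the source as well
```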


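2.2.5 Converting Between Tensor and NumPy

The first kind of conversion shares data between the two objects, so it is cheap, and a change to one is visible in the other.
All Tensors on the CPU (except CharTensor, according to older PyTorch documentation) support conversion to and from NumPy arrays.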

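import numpy as np
import torch

# Convert a NumPy array to a Tensor
a = np.ones(5)
b = torch.from_numpy(a)
print(a)
print(b)
# Convert a Tensor to a NumPy array
a = torch.ones(5)
b = a.numpy()
print(a)
print(b)
# Both conversions share data: changing one changes the other
a += 1
print(a, b)
b += 1
print(a, b)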

Output

[1. 1. 1. 1. 1.]
tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
tensor([1., 1., 1., 1., 1.])
[1. 1. 1. 1. 1.]
tensor([2., 2., 2., 2., 2.]) [2. 2. 2. 2. 2.]
tensor([3., 3., 3., 3., 3.]) [3. 3. 3. 3. 3.]

The second kind converts a NumPy array to a Tensor without sharing data: torch.tensor() always copies.

# No shared data
c = torch.tensor(b)
b += 1
print(b, c)

Output

[4. 4. 4. 4. 4.] tensor([3., 3., 3., 3., 3.])
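
The two kinds of conversion can be compared side by side. The sketch below also includes torch.as_tensor, which is not covered in the text above; it is added on the assumption that it is useful to know, since it avoids a copy when the dtype and device already match.

```python
import numpy as np
import torch

arr = np.ones(3)
shared = torch.from_numpy(arr)   # shares memory with arr
copied = torch.tensor(arr)       # always makes a copy
alias = torch.as_tensor(arr)     # reuses arr's memory when no conversion is needed

arr += 1                         # modify the NumPy array in place
print(shared)                    # reflects the change
print(copied)                    # still all ones
print(alias)                     # reflects the change
```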


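2.2.6 Tensor on GPU

Tensor supports computation on the GPU for better parallel performance; the Tensor's device only needs to be a GPU.
The to method makes it easy to move a Tensor between the CPU and the GPU.
Initializing the GPU and copying data between host memory and GPU memory both take time, so this small example may pause noticeably at startup. The GPU's advantage shows only as the data grows larger.

# GPU acceleration requires a GPU build of PyTorch and supported hardware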
x = torch.tensor([1, 2])
y = torch.tensor([3, 4])
if torch.cuda.is_available():
    GPU = torch.device("cuda")
    y = torch.ones_like(x, device=GPU)
    x = x.to(GPU)
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))

Output

tensor([2, 3], device='cuda:0')
tensor([2., 3.], dtype=torch.float64)
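
A common way to write code that runs with or without a GPU is to pick the device once and pass it along. The following is a minimal sketch of that pattern, assuming a single default CUDA device when one is available; it falls back to the CPU otherwise.

```python
import torch

# Pick the GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor([1, 2], device=device)
y = torch.ones_like(x)       # ones_like inherits the device and dtype of x
z = (x + y).cpu()            # move the result back to the CPU for printing
print(device.type, z)
```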
