PyTorch
This post collects six vector similarity measures commonly used in PyTorch and gives a reference implementation of each (following AllenNLP's similarity functions). The six are:

  1. CosineSimilarity
  2. DotProductSimilarity
  3. ProjectedDotProductSimilarity
  4. BiLinearSimilarity
  5. TriLinearSimilarity
  6. MultiHeadedSimilarity

### 1. CosineSimilarity

Cosine similarity uses the cosine of the angle between two vectors in vector space as a measure of how different they are. The closer the cosine is to 1, the closer the angle is to 0 degrees and the more similar the two vectors are; hence the name "cosine similarity".

```python
import torch
import torch.nn as nn
import math


class CosineSimilarity(nn.Module):

    def forward(self, tensor_1, tensor_2):
        # normalize each vector to unit length, then take the dot product
        normalized_tensor_1 = tensor_1 / tensor_1.norm(dim=-1, keepdim=True)
        normalized_tensor_2 = tensor_2 / tensor_2.norm(dim=-1, keepdim=True)
        return (normalized_tensor_1 * normalized_tensor_2).sum(dim=-1)
```
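As a quick sanity check (the tensors below are made-up examples, not from the original post), the manual normalization above should agree with PyTorch's built-in `torch.nn.functional.cosine_similarity`:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
a = torch.randn(2, 5)
b = torch.randn(2, 5)

# manual version: normalize each vector, then take the dot product
manual = (a / a.norm(dim=-1, keepdim=True) * (b / b.norm(dim=-1, keepdim=True))).sum(dim=-1)
# built-in version
builtin = F.cosine_similarity(a, b, dim=-1)
print(torch.allclose(manual, builtin, atol=1e-6))  # True
```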
### 2. DotProductSimilarity

This similarity function simply computes the dot product between each pair of vectors, with optional scaling to reduce the variance of the output.

```python
class DotProductSimilarity(nn.Module):

    def __init__(self, scale_output=False):
        super(DotProductSimilarity, self).__init__()
        self.scale_output = scale_output

    def forward(self, tensor_1, tensor_2):
        result = (tensor_1 * tensor_2).sum(dim=-1)
        if self.scale_output:
            # TODO: why does AllenNLP apply the scaling here?
            result /= math.sqrt(tensor_1.size(-1))
        return result
```
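For illustration (tensor shapes chosen arbitrarily), the `scale_output=True` branch is the same 1/√d scaling used in scaled dot-product attention:

```python
import math
import torch

torch.manual_seed(0)
a = torch.randn(3, 8)
b = torch.randn(3, 8)

dot = (a * b).sum(dim=-1)             # plain dot product, shape (3,)
scaled = dot / math.sqrt(a.size(-1))  # what scale_output=True computes
```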

### 3. ProjectedDotProductSimilarity

This similarity function first projects each input vector with a learned linear map, then computes the dot product of the projections. With projection matrices W1 and W2 and an optional bias b, the similarity is:

sim(x, y) = (x W1) · (y W2) + b

An optional activation function can be applied to the result; by default no activation is used.

```python
class ProjectedDotProductSimilarity(nn.Module):

    def __init__(self, tensor_1_dim, tensor_2_dim, projected_dim,
                 reuse_weight=False, bias=False, activation=None):
        super(ProjectedDotProductSimilarity, self).__init__()
        self.reuse_weight = reuse_weight
        self.projecting_weight_1 = nn.Parameter(torch.Tensor(tensor_1_dim, projected_dim))
        if self.reuse_weight:
            if tensor_1_dim != tensor_2_dim:
                raise ValueError('if reuse_weight=True, tensor_1_dim must equal tensor_2_dim')
        else:
            self.projecting_weight_2 = nn.Parameter(torch.Tensor(tensor_2_dim, projected_dim))
        self.bias = nn.Parameter(torch.Tensor(1)) if bias else None
        self.activation = activation
        self.reset_parameters()  # initialize the weights (as the other modules below do)

    def reset_parameters(self):
        nn.init.xavier_uniform_(self.projecting_weight_1)
        if not self.reuse_weight:
            nn.init.xavier_uniform_(self.projecting_weight_2)
        if self.bias is not None:
            self.bias.data.fill_(0)

    def forward(self, tensor_1, tensor_2):
        projected_tensor_1 = torch.matmul(tensor_1, self.projecting_weight_1)
        if self.reuse_weight:
            projected_tensor_2 = torch.matmul(tensor_2, self.projecting_weight_1)
        else:
            projected_tensor_2 = torch.matmul(tensor_2, self.projecting_weight_2)
        result = (projected_tensor_1 * projected_tensor_2).sum(dim=-1)
        if self.bias is not None:
            result = result + self.bias
        if self.activation is not None:
            result = self.activation(result)
        return result
```
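A shape/equivalence check (dimensions here are arbitrary): projecting each input and summing the elementwise product is the same as a single `einsum` contraction over both projection dimensions:

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 6)   # batch of 4 vectors, tensor_1_dim=6
y = torch.randn(4, 9)   # batch of 4 vectors, tensor_2_dim=9
w1 = torch.randn(6, 5)  # projects x to projected_dim=5
w2 = torch.randn(9, 5)  # projects y to projected_dim=5

result = (torch.matmul(x, w1) * torch.matmul(y, w2)).sum(dim=-1)
# one-shot contraction: sum_{i,p,j} x[b,i] w1[i,p] y[b,j] w2[j,p]
alt = torch.einsum('bi,ip,bj,jp->b', x, w1, y, w2)
```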

### 4. BiLinearSimilarity

This similarity function performs a bilinear transformation of the two input vectors. It has a weight matrix W and a bias b, and the similarity between vectors x and y is computed as:

sim(x, y) = x^T W y + b

An optional activation function can be applied to the result; by default no activation is used.

```python
class BiLinearSimilarity(nn.Module):

    def __init__(self, tensor_1_dim, tensor_2_dim, activation=None):
        super(BiLinearSimilarity, self).__init__()
        self.weight_matrix = nn.Parameter(torch.Tensor(tensor_1_dim, tensor_2_dim))
        self.bias = nn.Parameter(torch.Tensor(1))
        self.activation = activation
        self.reset_parameters()

    def reset_parameters(self):
        nn.init.xavier_uniform_(self.weight_matrix)
        self.bias.data.fill_(0)

    def forward(self, tensor_1, tensor_2):
        intermediate = torch.matmul(tensor_1, self.weight_matrix)
        result = (intermediate * tensor_2).sum(dim=-1) + self.bias
        if self.activation is not None:
            result = self.activation(result)
        return result
```
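As a cross-check (shapes are illustrative), the same bilinear form can be computed with `torch.nn.functional.bilinear`, whose weight carries an extra leading output dimension:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(3, 4)   # tensor_1_dim = 4
y = torch.randn(3, 6)   # tensor_2_dim = 6
W = torch.randn(4, 6)
b = torch.zeros(1)

manual = (torch.matmul(x, W) * y).sum(dim=-1) + b
# F.bilinear expects weight of shape (out_features, in1, in2)
builtin = F.bilinear(x, y, W.unsqueeze(0), b).squeeze(-1)
print(torch.allclose(manual, builtin, atol=1e-5))  # True
```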

### 5. TriLinearSimilarity

This similarity function performs a "trilinear" transformation of the two input vectors: it concatenates x, y, and their elementwise product, and takes the dot product of the result with a weight vector w, plus a bias b:

sim(x, y) = w^T [x; y; x ∘ y] + b

An optional activation function can be applied to the result; by default no activation is used.

```python
class TriLinearSimilarity(nn.Module):

    def __init__(self, input_dim, activation=None):
        super(TriLinearSimilarity, self).__init__()
        self.weight_vector = nn.Parameter(torch.Tensor(3 * input_dim))
        self.bias = nn.Parameter(torch.Tensor(1))
        self.activation = activation
        self.reset_parameters()

    def reset_parameters(self):
        std = math.sqrt(6 / (self.weight_vector.size(0) + 1))
        self.weight_vector.data.uniform_(-std, std)
        self.bias.data.fill_(0)

    def forward(self, tensor_1, tensor_2):
        combined_tensors = torch.cat([tensor_1, tensor_2, tensor_1 * tensor_2], dim=-1)
        result = torch.matmul(combined_tensors, self.weight_vector) + self.bias
        if self.activation is not None:
            result = self.activation(result)
        return result
```
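To see what the concatenation is doing (example values are arbitrary), the weight vector can be split into three parts, one per block of the concatenated input:

```python
import torch

torch.manual_seed(0)
d = 5
x = torch.randn(3, d)
y = torch.randn(3, d)
w = torch.randn(3 * d)

combined = torch.cat([x, y, x * y], dim=-1)  # shape (3, 15)
result = torch.matmul(combined, w)           # shape (3,)

# equivalent: w^T [x; y; x*y] = w_x . x + w_y . y + w_xy . (x*y)
w_x, w_y, w_xy = w[:d], w[d:2 * d], w[2 * d:]
alt = x @ w_x + y @ w_y + (x * y) @ w_xy
```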

### 6. MultiHeadedSimilarity

This similarity function uses multiple "heads" to compute similarity: the input tensors are projected into several new tensors, and a similarity is computed separately for each projected head. The result therefore has one more dimension than a typical similarity function would return.

```python
class MultiHeadedSimilarity(nn.Module):

    def __init__(self,
                 num_heads,
                 tensor_1_dim,
                 tensor_1_projected_dim=None,
                 tensor_2_dim=None,
                 tensor_2_projected_dim=None,
                 internal_similarity=DotProductSimilarity()):
        super(MultiHeadedSimilarity, self).__init__()
        self.num_heads = num_heads
        self.internal_similarity = internal_similarity
        tensor_1_projected_dim = tensor_1_projected_dim or tensor_1_dim
        tensor_2_dim = tensor_2_dim or tensor_1_dim
        tensor_2_projected_dim = tensor_2_projected_dim or tensor_2_dim
        if tensor_1_projected_dim % num_heads != 0:
            raise ValueError("Projected dimension not divisible by number of heads: %d, %d"
                             % (tensor_1_projected_dim, num_heads))
        if tensor_2_projected_dim % num_heads != 0:
            raise ValueError("Projected dimension not divisible by number of heads: %d, %d"
                             % (tensor_2_projected_dim, num_heads))
        self.tensor_1_projection = nn.Parameter(torch.Tensor(tensor_1_dim, tensor_1_projected_dim))
        self.tensor_2_projection = nn.Parameter(torch.Tensor(tensor_2_dim, tensor_2_projected_dim))
        self.reset_parameters()

    def reset_parameters(self):
        torch.nn.init.xavier_uniform_(self.tensor_1_projection)
        torch.nn.init.xavier_uniform_(self.tensor_2_projection)

    def forward(self, tensor_1, tensor_2):
        projected_tensor_1 = torch.matmul(tensor_1, self.tensor_1_projection)
        projected_tensor_2 = torch.matmul(tensor_2, self.tensor_2_projection)

        # Here we split the last dimension of the tensors from (..., projected_dim) to
        # (..., num_heads, projected_dim / num_heads), using tensor.view().
        last_dim_size = projected_tensor_1.size(-1) // self.num_heads
        new_shape = list(projected_tensor_1.size())[:-1] + [self.num_heads, last_dim_size]
        split_tensor_1 = projected_tensor_1.view(*new_shape)
        last_dim_size = projected_tensor_2.size(-1) // self.num_heads
        new_shape = list(projected_tensor_2.size())[:-1] + [self.num_heads, last_dim_size]
        split_tensor_2 = projected_tensor_2.view(*new_shape)

        # And then we pass this off to our internal similarity function. Because the similarity
        # functions don't care what dimension their input has, and only look at the last dimension,
        # we don't need to do anything special here. It will just compute similarity on the
        # projection dimension for each head, returning a tensor of shape (..., num_heads).
        return self.internal_similarity(split_tensor_1, split_tensor_2)
```
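The head-splitting step can be illustrated on its own (shapes below are made up): a `(..., projected_dim)` tensor is reshaped to `(..., num_heads, projected_dim / num_heads)`, and a dot product over the last dimension then yields one similarity per head:

```python
import torch

torch.manual_seed(0)
num_heads = 4
x = torch.randn(2, 3, 16)  # (batch, seq, projected_dim=16)
y = torch.randn(2, 3, 16)

head_dim = x.size(-1) // num_heads
split_x = x.view(*x.shape[:-1], num_heads, head_dim)  # (2, 3, 4, 4)
split_y = y.view(*y.shape[:-1], num_heads, head_dim)

# dot product over the last dim, i.e. DotProductSimilarity applied per head
sims = (split_x * split_y).sum(dim=-1)
print(sims.shape)  # torch.Size([2, 3, 4])
```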