排序、搜索、计数和集合的操作 - 集合操作 - 《深度学习》

构造集合
布尔运算

构造集合

numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)Find the unique elements of an array.
- return_index=True 表示返回新列表元素在旧列表中的位置。
- return_inverse=True表示返回旧列表元素在新列表中的位置。
- return_counts=True表示返回新列表元素在旧列表中出现的次数。

【例】找出数组中的唯一值并返回已排序的结果。

import numpy as np
x = np.unique([1, 1, 3, 2, 3, 3])
print(x)  # [1 2 3]
x = sorted(set([1, 1, 3, 2, 3, 3]))
print(x)  # [1, 2, 3]
x = np.array([[1, 1], [2, 3]])
u = np.unique(x)
print(u)  # [1 2 3]
x = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]])
y = np.unique(x, axis=0)
print(y)
# [[1 0 0]
#  [2 3 4]]
x = np.array(['a', 'b', 'b', 'c', 'a'])
u, index = np.unique(x, return_index=True)
print(u)  # ['a' 'b' 'c']
print(index)  # [0 1 3]
print(x[index])  # ['a' 'b' 'c']
x = np.array([1, 2, 6, 4, 2, 3, 2])
u, index = np.unique(x, return_inverse=True)
print(u)  # [1 2 3 4 6]
print(index)  # [0 1 4 3 1 2 1]
print(u[index])  # [1 2 6 4 2 3 2]
u, count = np.unique(x, return_counts=True)
print(u)  # [1 2 3 4 6]
print(count)  # [1 3 1 1 1]

布尔运算

numpy.in1d(ar1, ar2, assume_unique=False, invert=False) Test whether each element of a 1-D array is also present in a second array.

Returns a boolean array the same length as ar1 that is True where an element of ar1 is in ar2 and False otherwise.

【例】前面的数组是否包含于后面的数组，返回布尔值。返回的值是针对第一个参数的数组的，所以维数和第一个参数一致，布尔值与数组的元素位置也一一对应。

import numpy as np
test = np.array([0, 1, 2, 5, 0])
states = [0, 2]
mask = np.in1d(test, states)
print(mask)  # [ True False  True False  True]
print(test[mask])  # [0 2 0]
mask = np.in1d(test, states, invert=True)
print(mask)  # [False  True False  True False]
print(test[mask])  # [1 5]

求两个集合的交集

numpy.intersect1d(ar1, ar2, assume_unique=False, return_indices=False) Find the intersection of two arrays.

Return the sorted, unique values that are in both of the input arrays.

【例】求两个数组的唯一化+求交集+排序函数。

import numpy as np
from functools import reduce
x = np.intersect1d([1, 3, 4, 3], [3, 1, 2, 1])
print(x)  # [1 3]
x = np.array([1, 1, 2, 3, 4])
y = np.array([2, 1, 4, 6])
xy, x_ind, y_ind = np.intersect1d(x, y, return_indices=True)
print(x_ind)  # [0 2 4]
print(y_ind)  # [1 0 2]
print(xy)  # [1 2 4]
print(x[x_ind])  # [1 2 4]
print(y[y_ind])  # [1 2 4]
x = reduce(np.intersect1d, ([1, 3, 4, 3], [3, 1, 2, 1], [6, 3, 4, 2]))
print(x)  # [3]

求两个集合的并集

numpy.union1d(ar1, ar2) Find the union of two arrays.

Return the unique, sorted array of values that are in either of the two input arrays.
【例】计算两个集合的并集，唯一化并排序。

import numpy as np
from functools import reduce
x = np.union1d([-1, 0, 1], [-2, 0, 2])
print(x)  # [-2 -1  0  1  2]
x = reduce(np.union1d, ([1, 3, 4, 3], [3, 1, 2, 1], [6, 3, 4, 2]))
print(x)  # [1 2 3 4 6]
'''
functools.reduce(function, iterable[, initializer])
将两个参数的 function 从左至右积累地应用到 iterable 的条目，以便将该可迭代对象缩减为单一的值。 例如，reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) 是计算 ((((1+2)+3)+4)+5) 的值。 左边的参数 x 是积累值而右边的参数 y 则是来自 iterable 的更新值。 如果存在可选项 initializer，它会被放在参与计算的可迭代对象的条目之前，并在可迭代对象为空时作为默认值。 如果没有给出 initializer 并且 iterable 仅包含一个条目，则将返回第一项。
大致相当于：
def reduce(function, iterable, initializer=None):
    it = iter(iterable)
    if initializer is None:
        value = next(it)
    else:
        value = initializer
    for element in it:
        value = function(value, element)
    return value
'''

求两个集合的差集

numpy.setdiff1d(ar1, ar2, assume_unique=False) Find the set difference of two arrays.

Return the unique values in ar1 that are not in ar2.
【例】集合的差，即元素存在于第一个函数不存在于第二个函数中。

import numpy as np
a = np.array([1, 2, 3, 2, 4, 1])
b = np.array([3, 4, 5, 6])
x = np.setdiff1d(a, b)
print(x)  # [1 2]

求两个集合的异或

setxor1d(ar1, ar2, assume_unique=False) Find the set exclusive-or of two arrays.

【例】集合的对称差，即两个集合的交集的补集。简言之，就是两个数组中各自独自拥有的元素的集合。

import numpy as np
a = np.array([1, 2, 3, 2, 4, 1])
b = np.array([3, 4, 5, 6])
x = np.setxor1d(a, b)
print(x)  # [1 2 5 6]