Metric criteria

A distance measure is a metric if it satisfies four criteria: non-negativity, identity (d(a, b) = 0 only if a = b), symmetry (d(a, b) = d(b, a)), and the triangle inequality.

Euclidean distance
Manhattan distance

Minkowski distance

- The Euclidean (p = 2) and Manhattan (p = 1) distances are special cases of the Minkowski distance.

- The larger the value of p, the more emphasis is placed on features with large differences in values, because these differences are raised to the power of p.
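A minimal sketch of the Minkowski distance in plain Python (the helper name `minkowski` is illustrative), showing how Manhattan and Euclidean fall out as p = 1 and p = 2:

```python
def minkowski(a, b, p):
    """Minkowski distance between two feature vectors a and b.

    p = 1 gives the Manhattan distance, p = 2 the Euclidean distance;
    larger p puts more emphasis on features with large differences.
    """
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

a, b = (0.0, 0.0), (3.0, 4.0)
manhattan = minkowski(a, b, 1)   # |3| + |4| = 7.0
euclidean = minkowski(a, b, 2)   # sqrt(9 + 16) = 5.0
```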
Jaccard similarity (index)
- based on four counts over pairs of binary features: co-presence (CP), co-absence (CA), presence-absence (PA), and absence-presence (AP); the Jaccard similarity is CP / (CP + PA + AP), ignoring co-absences.
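A sketch of this count-based computation for binary feature vectors (the helper name `jaccard` is illustrative):

```python
def jaccard(a, b):
    """Jaccard similarity for binary feature vectors.

    Counts co-presences (CP), presence-absences (PA) and
    absence-presences (AP); co-absences (CA) are ignored.
    """
    cp = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    pa = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    ap = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    return cp / (cp + pa + ap)

# one co-presence, one mismatch in each direction -> 1 / 3
sim = jaccard([1, 1, 0, 0], [1, 0, 1, 0])
```

The shared absence in the last position has no effect on the result, which is what distinguishes Jaccard from simple matching.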
Cosine Similarity

- the cosine of the inner angle between the two vectors
- an especially useful measure of similarity when the descriptive features describing instances in a dataset are related to each other
- All instances are normalized so as to lie on a hypersphere of radius 1.0 with its center at the origin of the feature space
- This normalization is what makes cosine similarity so useful in scenarios in which we are interested in the relative spread of values across a set of descriptive features rather than the magnitudes of the values themselves.
- If both customers use about four times as many SMS messages as voice calls, the cosine similarity will be 1: even though the magnitudes of their feature values are different, the relationship between the feature values is the same for both instances.
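The SMS/voice example above can be sketched as follows (feature values are made up for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# both customers send about four times as many SMS messages as voice
# calls, so the vectors point in the same direction even though their
# magnitudes differ
c1 = (40.0, 10.0)    # (sms, voice)
c2 = (160.0, 40.0)
sim = cosine_similarity(c1, c2)   # 1.0 (up to floating point)
```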
Mahalanobis Distance

- measures the similarity between instances with continuous descriptive features
- it allows us to take into account how spread out the instances in a dataset are when judging similarities
- uses covariance to scale distances so that distances along a direction where the dataset is very spread out are scaled down, and distances along directions where the dataset is tightly packed are scaled up
- Mahalanobis distance between B and A will be less than the Mahalanobis distance between C and A

(figure: three scatterplots) a) equally distributed in all directions; b) negative covariance; c) positive covariance
- in b), Mahalanobis distance(A, B) < Mahalanobis distance(A, C)
- Σ−1 represents the inverse covariance matrix computed across all instances in the dataset
- effects:
- the larger the variance of a feature, the less weight the difference between the values for that feature will contribute to the distance calculation.
- the larger the correlation between two features, the less weight they contribute to the distance
- The rotation and scaling of the axes are the result of the multiplication by the inverse covariance matrix of the dataset (Σ−1)
- points on the same ellipse are all at the same Mahalanobis distance from A
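The scaling behavior described above can be sketched with NumPy (the helper name `mahalanobis` and the toy dataset are made up for illustration):

```python
import numpy as np

def mahalanobis(x, y, data):
    """Mahalanobis distance between x and y, scaled by the inverse
    covariance matrix (Sigma^-1) of the whole dataset."""
    inv_cov = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(d @ inv_cov @ d))

# toy dataset: spread out along the first feature, tightly packed
# along the second (the covariance matrix is diagonal here)
data = np.array([[-2.0, -1.0], [-2.0, 1.0], [2.0, -1.0], [2.0, 1.0]])
A = [0.0, 0.0]

# a unit step along the spread-out direction is scaled down, while a
# unit step along the tightly packed direction is scaled up
d_spread = mahalanobis([1.0, 0.0], A, data)
d_packed = mahalanobis([0.0, 1.0], A, data)
```

Both candidate points are at Euclidean distance 1 from A, yet `d_spread < d_packed`, which is exactly the effect the bullets above describe.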

