metric criteria

A distance measure is a metric if it satisfies four criteria:

  1. non-negativity: metric(a, b) ≥ 0
  2. identity: metric(a, b) = 0 if and only if a = b
  3. symmetry: metric(a, b) = metric(b, a)
  4. triangle inequality: metric(a, b) ≤ metric(a, c) + metric(c, b)
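As an illustration, the four metric criteria (non-negativity, identity, symmetry, and the triangle inequality) can be checked numerically for the Euclidean distance; the `euclidean` helper and the test points are my own, not from the notes:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors given as lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a, b, c = [0.0, 0.0], [3.0, 4.0], [6.0, 0.0]

assert euclidean(a, b) >= 0                                   # non-negativity
assert euclidean(a, a) == 0                                   # identity
assert euclidean(a, b) == euclidean(b, a)                     # symmetry
assert euclidean(a, c) <= euclidean(a, b) + euclidean(b, c)   # triangle inequality
```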

Euclidean distance

Euclidean(a, b) = sqrt( Σ (a[i] − b[i])² )

Manhattan distance

Manhattan(a, b) = Σ |a[i] − b[i]|

The Euclidean and Manhattan distances are special cases of the Minkowski distance.

Minkowski distance

Minkowski(a, b) = ( Σ |a[i] − b[i]|^p )^(1/p)

The larger the value of p, the more emphasis is placed on the features with large differences in values, because these differences are raised to the power of p.
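A minimal sketch of all three distances through a single Minkowski function; plain Python lists stand in for feature vectors, and the example values of a and b are my own:

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two feature vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

a = [2.0, 3.0]
b = [5.0, 7.0]

# p = 1 gives the Manhattan distance: |2-5| + |3-7| = 7
print(minkowski_distance(a, b, p=1))  # 7.0
# p = 2 gives the Euclidean distance: sqrt(3**2 + 4**2) = 5
print(minkowski_distance(a, b, p=2))  # 5.0
# as p grows, the result is dominated by the largest per-feature
# difference (here |3-7| = 4), so the distance approaches 4
print(minkowski_distance(a, b, p=10))
```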

similarity index

Jaccard similarity

For binary features, each pair of values falls into one of four categories: co-presence (CP), co-absence (CA), presence-absence (PA), and absence-presence (AP).

Jaccard(a, b) = CP / (CP + PA + AP)
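Jaccard similarity is the number of co-presences divided by the number of pairs that are not co-absences, CP / (CP + PA + AP). A sketch on two hypothetical binary feature vectors of my own:

```python
def jaccard_similarity(a, b):
    """Jaccard similarity between two binary feature vectors.

    CP counts positions where both are 1; PA and AP count the two
    kinds of mismatch; co-absence (CA) is deliberately ignored.
    """
    cp = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    pa = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    ap = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    return cp / (cp + pa + ap)

a = [1, 1, 0, 0, 1]
b = [1, 0, 0, 1, 1]
print(jaccard_similarity(a, b))  # CP=2, PA=1, AP=1 -> 2/4 = 0.5
```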

Cosine Similarity

cosine(a, b) = (a · b) / (|a| × |b|) = Σ a[i] b[i] / ( sqrt(Σ a[i]²) × sqrt(Σ b[i]²) )

  1. the cosine of the inner angle between the two vectors
  2. an especially useful measure of similarity when the descriptive features describing the instances in a dataset are related to each other
  3. All instances are normalized so as to lie on a hypersphere of radius 1.0 with its center at the origin of the feature space
  4. This normalization is what makes cosine similarity so useful in scenarios in which we are interested in the relative spread of values across a set of descriptive features rather than the magnitudes of the values themselves.
  5. If both customers use about four times as many SMS messages as VOICE calls, the cosine similarity will be 1, because even though the magnitudes of their feature values are different, the relationship between the feature values for both instances is the same.
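Point 5 can be checked directly. The (SMS, VOICE) counts below are hypothetical, chosen so both customers have the same 4:1 ratio at different magnitudes:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# hypothetical customers described by (SMS, VOICE) usage counts
c1 = [40.0, 10.0]   # 4x as many SMS messages as voice calls
c2 = [160.0, 40.0]  # different magnitudes, same 4:1 ratio
print(cosine_similarity(c1, c2))  # ~1.0, up to floating-point rounding
```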

Mahalanobis Distance

Mahalanobis(a, b) = sqrt( (a − b)ᵀ Σ⁻¹ (a − b) )

  1. measures the similarity between instances with continuous descriptive features
  2. it allows us to take into account how spread out the instances in a dataset are when judging similarities
  3. uses covariance to scale distances so that distances along a direction where the dataset is very spread out are scaled down, and distances along directions where the dataset is tightly packed are scaled up
  4. Mahalanobis distance between B and A will be less than the Mahalanobis distance between C and A

image.png
(a) equally distributed in all directions; (b) negative covariance; (c) positive covariance
In (b), Mahalanobis distance(A, B) < Mahalanobis distance(A, C)

  1. Σ⁻¹ represents the inverse covariance matrix computed across all instances in the dataset
  2. effects:
    • the larger the variance of a feature, the less weight the difference between the values for that feature will contribute to the distance calculation.
    • the larger the correlation between two features, the less weight they contribute to the distance calculation.
  3. The rotation and scaling of the axes are the result of the multiplication by the inverse covariance matrix of the dataset (Σ⁻¹)
  4. all points on the ellipse are at the same Mahalanobis distance from A

image.png
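A sketch of the Mahalanobis distance using numpy. The synthetic positively correlated dataset and the points a, b, c are my own illustration; b lies along the direction of spread and c across it, at the same Euclidean length, so b comes out closer to a:

```python
import numpy as np

def mahalanobis_distance(a, b, inv_cov):
    """Mahalanobis distance given the dataset's inverse covariance matrix."""
    diff = np.asarray(a) - np.asarray(b)
    return float(np.sqrt(diff @ inv_cov @ diff))

# synthetic dataset with positive covariance between the two features
rng = np.random.default_rng(0)
x = rng.normal(0, 3, 500)
data = np.column_stack([x, 0.9 * x + rng.normal(0, 1, 500)])
inv_cov = np.linalg.inv(np.cov(data, rowvar=False))

a = [0.0, 0.0]
b = [3.0, 2.7]   # along the direction the dataset is spread out
c = [2.7, -3.0]  # across it, at the same Euclidean distance from a
print(mahalanobis_distance(a, b, inv_cov))  # smaller: spread-out direction
print(mahalanobis_distance(a, c, inv_cov))  # larger: tightly packed direction
```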