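Scenario 1

[Figure: image.png]
How are the weights tied to distance? Through the reciprocal: each neighbor's vote is weighted by 1/distance, so closer neighbors count for more.
Purpose: solves …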
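
The reciprocal weighting described above can be sketched in a few lines. This is a minimal illustration, not scikit-learn's implementation: the helper names `uniform_vote` and `weighted_vote`, the small `eps` guard against division by zero, and the three sample neighbors are all assumptions made for the example.

```python
from collections import defaultdict


def uniform_vote(neighbors):
    """Majority vote: every neighbor counts once, distance is ignored."""
    counts = defaultdict(int)
    for label, _distance in neighbors:
        counts[label] += 1
    return max(counts, key=counts.get)


def weighted_vote(neighbors, eps=1e-12):
    """Each neighbor adds 1/distance to its label's total.

    eps is a hypothetical guard so an exact match (distance 0)
    does not divide by zero.
    """
    totals = defaultdict(float)
    for label, distance in neighbors:
        totals[label] += 1.0 / (distance + eps)
    return max(totals, key=totals.get)


# Hypothetical (label, distance) pairs: one close "red", two farther "blue".
neighbors = [("red", 1.0), ("blue", 3.0), ("blue", 4.0)]

print(uniform_vote(neighbors))   # blue: 2 votes against 1
print(weighted_vote(neighbors))  # red: 1 against 1/3 + 1/4 = 7/12
```

With these made-up distances the two schemes disagree, which is the situation where the choice of `weights` matters.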

Scenario 2

[Figure: image.png]

Implementation

The `weights` parameter:

  • uniform: distance is ignored
  • distance: votes are weighted by distance

Euclidean distance is used by default.

```python
from sklearn.neighbors import KNeighborsClassifier

# X_train, y_train, X_test, y_test are assumed to be defined earlier.
best_score = 0.0
best_k = -1
best_method = ""
for method in ["uniform", "distance"]:
    for k in range(1, 11):
        knn_clf = KNeighborsClassifier(n_neighbors=k, weights=method)
        knn_clf.fit(X_train, y_train)
        score = knn_clf.score(X_test, y_test)
        # Keep the combination with the highest test accuracy so far.
        if score > best_score:
            best_k = k
            best_score = score
            best_method = method

print("best_method =", best_method)  # best_method = uniform
print("best_k =", best_k)  # best_k = 4
print("best_score =", best_score)  # best_score = 0.9916666666666667
```

<a name="wcAqt"></a>
# Definition of distance
<a name="zL4QS"></a>
## Euclidean distance
![image.png](https://cdn.nlark.com/yuque/0/2021/png/12405790/1638702360677-01778785-8800-4739-9240-5a8b20d48a61.png#clientId=u6f681875-98ff-4&from=paste&height=88&id=u744f2fc8&margin=%5Bobject%20Object%5D&name=image.png&originHeight=351&originWidth=807&originalType=binary&ratio=1&size=56907&status=done&style=none&taskId=u5518c09d-51c3-4a5f-b9f4-dd17b5160cd&width=202)
<a name="JNMXa"></a>
## Manhattan distance
![image.png](https://cdn.nlark.com/yuque/0/2021/png/12405790/1638702273499-34ff515a-2933-4e0b-8884-3444d7b31ae5.png#clientId=u6f681875-98ff-4&from=paste&height=222&id=ueedfef59&margin=%5Bobject%20Object%5D&name=image.png&originHeight=887&originWidth=1806&originalType=binary&ratio=1&size=347294&status=done&style=none&taskId=u9535ff0c-1836-4054-8a53-aa3f485c280&width=452)
<a name="d7vYC"></a>
## Minkowski distance
The `Minkowski` distance is the more general form of the two distances above.<br />Purpose: it gives the KNN algorithm a hyperparameter, `p`.<br />![image.png](https://cdn.nlark.com/yuque/0/2021/png/12405790/1638702521969-f298c615-7851-4353-94c5-dfb5a1f980b1.png#clientId=u6f681875-98ff-4&from=paste&height=255&id=u0d63aa76&margin=%5Bobject%20Object%5D&name=image.png&originHeight=1019&originWidth=1284&originalType=binary&ratio=1&size=170135&status=done&style=none&taskId=u87534457-8044-406c-90b1-35dd464a7aa&width=321)
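
To make the "more general form" concrete, here is a small sketch of the Minkowski distance, $\left(\sum_i |a_i - b_i|^p\right)^{1/p}$, written in plain Python. The function name `minkowski_distance` and the sample points are assumptions for illustration; with `p=1` it reduces to the Manhattan distance and with `p=2` to the Euclidean distance.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# Hypothetical points chosen so the results are easy to check by hand.
a = (0.0, 0.0)
b = (3.0, 4.0)

print(minkowski_distance(a, b, 1))  # 7.0, the Manhattan distance |3| + |4|
print(minkowski_distance(a, b, 2))  # 5.0, the Euclidean distance sqrt(9 + 16)
```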
```python
# Search over k and the Minkowski order p, with distance weighting.
best_score = 0.0
best_k = -1
best_p = -1
for k in range(1, 11):
    for p in range(1, 6):
        knn_clf = KNeighborsClassifier(n_neighbors=k, weights="distance", p=p)
        knn_clf.fit(X_train, y_train)
        score = knn_clf.score(X_test, y_test)
        if score > best_score:
            best_k = k
            best_p = p
            best_score = score

print("best_k =", best_k)  # best_k = 3
print("best_p =", best_p)  # best_p = 2
print("best_score =", best_score)  # best_score = 0.9888888888888889
```

```python
sk_knn_clf = KNeighborsClassifier(n_neighbors=4, weights="distance", p=1)
sk_knn_clf.fit(X_train, y_train)
sk_knn_clf.score(X_test, y_test)  # 0.9833333333333333
```
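
The snippets in this section rely on `X_train`, `y_train`, `X_test`, and `y_test` that are defined elsewhere. The following is a minimal self-contained sketch of the same search over `k` and `p`. It assumes scikit-learn's bundled digits dataset and an arbitrary split, neither of which is confirmed by these notes, so its scores need not match the numbers quoted above. The helper `search_k_and_p` and its reduced search ranges are hypothetical.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def search_k_and_p(X_train, y_train, X_test, y_test,
                   ks=range(1, 6), ps=(1, 2, 3)):
    """Try every (k, p) pair with distance weighting; keep the best test score."""
    best = {"k": -1, "p": -1, "score": 0.0}
    for k in ks:
        for p in ps:
            clf = KNeighborsClassifier(n_neighbors=k, weights="distance", p=p)
            clf.fit(X_train, y_train)
            score = clf.score(X_test, y_test)
            if score > best["score"]:
                best = {"k": k, "p": p, "score": score}
    return best


# Assumption: the dataset and the split are illustrative only.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0
)

best = search_k_and_p(X_train, y_train, X_test, y_test)
print(best)
```

Returning a dictionary keeps the three best values together, so the caller can pass `best["k"]` and `best["p"]` straight into a new `KNeighborsClassifier`.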