Scenario 1
Scenario 2
Implementation
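The `weights` parameter:

- `uniform`: every neighbor gets an equal vote; distance is ignored
- `distance`: votes are weighted by distance, so closer neighbors count more

Euclidean distance is used by default.

```python
from sklearn.neighbors import KNeighborsClassifier

# X_train, y_train, X_test, y_test are assumed to come from an earlier train/test split
best_score = 0.0
best_k = -1
best_method = ""
for method in ["uniform", "distance"]:
    for k in range(1, 11):
        knn_clf = KNeighborsClassifier(n_neighbors=k, weights=method)
        knn_clf.fit(X_train, y_train)
        score = knn_clf.score(X_test, y_test)
        if score > best_score:
            best_k = k
            best_score = score
            best_method = method

print("best_method =", best_method)  # best_method = uniform
print("best_k =", best_k)  # best_k = 4
print("best_score =", best_score)  # best_score = 0.9916666666666667
```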
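To make the difference between `uniform` and `distance` concrete, here is a minimal sketch of the voting step only. The helper `weighted_vote` and the sample neighbors are hypothetical and written for illustration; the sketch assumes that distance weighting means each vote is worth the inverse of the neighbor's distance, which is how scikit-learn describes `weights="distance"`.

```python
from collections import defaultdict


def weighted_vote(distances, labels, weights="uniform"):
    """Return the winning label among the k nearest neighbors.

    distances: distance from the query point to each neighbor
    labels: class label of each neighbor
    weights: "uniform" (one vote each) or "distance" (vote = 1 / distance)
    """
    votes = defaultdict(float)
    for d, label in zip(distances, labels):
        if weights == "uniform":
            votes[label] += 1.0
        else:
            # Closer neighbors cast larger votes; assumes d > 0.
            votes[label] += 1.0 / d
    return max(votes, key=votes.get)


# Hypothetical neighbors: one red at distance 1, two blue at distances 3 and 4
distances = [1.0, 3.0, 4.0]
labels = ["red", "blue", "blue"]

uniform_winner = weighted_vote(distances, labels, "uniform")
distance_winner = weighted_vote(distances, labels, "distance")
print(uniform_winner, distance_winner)
```

With uniform weights the two blue neighbors outvote the single red one. With distance weights red scores 1 while blue scores 1/3 + 1/4, so the closer red neighbor wins.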
<a name="wcAqt"></a>
# Definition of Distance
<a name="zL4QS"></a>
## Euclidean Distance
![image.png](https://cdn.nlark.com/yuque/0/2021/png/12405790/1638702360677-01778785-8800-4739-9240-5a8b20d48a61.png#clientId=u6f681875-98ff-4&from=paste&height=88&id=u744f2fc8&margin=%5Bobject%20Object%5D&name=image.png&originHeight=351&originWidth=807&originalType=binary&ratio=1&size=56907&status=done&style=none&taskId=u5518c09d-51c3-4a5f-b9f4-dd17b5160cd&width=202)
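Since the formula is given only as an image, a small sketch may help: the Euclidean distance between two points is the square root of the sum of the squared differences of their coordinates. The function name `euclidean_distance` and the sample points are chosen for illustration only.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Sample points chosen so that the result is a 3-4-5 right triangle
d_euclid = euclidean_distance([0, 0], [3, 4])
print(d_euclid)  # 5.0
```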
<a name="JNMXa"></a>
## Manhattan Distance
![image.png](https://cdn.nlark.com/yuque/0/2021/png/12405790/1638702273499-34ff515a-2933-4e0b-8884-3444d7b31ae5.png#clientId=u6f681875-98ff-4&from=paste&height=222&id=ueedfef59&margin=%5Bobject%20Object%5D&name=image.png&originHeight=887&originWidth=1806&originalType=binary&ratio=1&size=347294&status=done&style=none&taskId=u9535ff0c-1836-4054-8a53-aa3f485c280&width=452)
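As a rough illustration, the Manhattan distance sums the absolute differences along each coordinate, like walking along a street grid rather than cutting diagonally. The function name `manhattan_distance` and the sample points are illustrative assumptions.

```python
def manhattan_distance(a, b):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Same sample points as a 3-4-5 triangle: 3 steps across plus 4 steps up
d_manhattan = manhattan_distance([0, 0], [3, 4])
print(d_manhattan)  # 7
```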
<a name="d7vYC"></a>
## Minkowski Distance
The `Minkowski` distance is a more general form of the two distances above.<br />Its role: it gives the KNN algorithm an additional hyperparameter, `p`.<br />![image.png](https://cdn.nlark.com/yuque/0/2021/png/12405790/1638702521969-f298c615-7851-4353-94c5-dfb5a1f980b1.png#clientId=u6f681875-98ff-4&from=paste&height=255&id=u0d63aa76&margin=%5Bobject%20Object%5D&name=image.png&originHeight=1019&originWidth=1284&originalType=binary&ratio=1&size=170135&status=done&style=none&taskId=u87534457-8044-406c-90b1-35dd464a7aa&width=321)
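The generalization can be sketched in a few lines: raise each absolute coordinate difference to the power `p`, sum, and take the `p`-th root. With `p=1` this reduces to the Manhattan distance and with `p=2` to the Euclidean distance, which is why scikit-learn's `KNeighborsClassifier` exposes `p` as a tunable parameter. The function name `minkowski_distance` and the sample points are illustrative assumptions.

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p (p >= 1) between two points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


point_a, point_b = [0, 0], [3, 4]
d_p1 = minkowski_distance(point_a, point_b, 1)  # same as Manhattan
d_p2 = minkowski_distance(point_a, point_b, 2)  # same as Euclidean
d_p3 = minkowski_distance(point_a, point_b, 3)
print(d_p1, d_p2, d_p3)
```

For these points the distance shrinks as `p` grows, because larger exponents let the largest coordinate difference dominate the sum.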
```python
from sklearn.neighbors import KNeighborsClassifier

# Search over k and the Minkowski exponent p.
# X_train, y_train, X_test, y_test are assumed to be defined earlier.
best_score = 0.0
best_k = -1
best_p = -1
for k in range(1, 11):
    for p in range(1, 6):
        knn_clf = KNeighborsClassifier(n_neighbors=k, weights="distance", p=p)
        knn_clf.fit(X_train, y_train)
        score = knn_clf.score(X_test, y_test)
        if score > best_score:
            best_k = k
            best_p = p
            best_score = score

print("best_k =", best_k)  # best_k = 3
print("best_p =", best_p)  # best_p = 2
print("best_score =", best_score)  # best_score = 0.9888888888888889

# Fit one specific combination (k=4, p=1) for comparison
sk_knn_clf = KNeighborsClassifier(n_neighbors=4, weights="distance", p=1)
sk_knn_clf.fit(X_train, y_train)
sk_knn_clf.score(X_test, y_test)  # 0.9833333333333333