Movatterモバイル変換


[0]ホーム

URL:


Menu
×
Sign In
+1 Get Certified For Teachers Spaces Plus Get Certified For Teachers Spaces Plus
   ❮     
     ❯   

Python Tutorial

Python HOMEPython IntroPython Get StartedPython SyntaxPython CommentsPython VariablesPython Data TypesPython NumbersPython CastingPython StringsPython BooleansPython OperatorsPython ListsPython TuplesPython SetsPython DictionariesPython If...ElsePython MatchPython While LoopsPython For LoopsPython FunctionsPython LambdaPython ArraysPython OOPPython Classes/ObjectsPython InheritancePython IteratorsPython PolymorphismPython ScopePython ModulesPython DatesPython MathPython JSONPython RegExPython PIPPython Try...ExceptPython String FormattingPython User InputPython VirtualEnv

File Handling

Python File HandlingPython Read FilesPython Write/Create FilesPython Delete Files

Python Modules

NumPy TutorialPandas TutorialSciPy TutorialDjango Tutorial

Python Matplotlib

Matplotlib IntroMatplotlib Get StartedMatplotlib PyplotMatplotlib PlottingMatplotlib MarkersMatplotlib LineMatplotlib LabelsMatplotlib GridMatplotlib SubplotMatplotlib ScatterMatplotlib BarsMatplotlib HistogramsMatplotlib Pie Charts

Machine Learning

Getting StartedMean Median ModeStandard DeviationPercentileData DistributionNormal Data DistributionScatter PlotLinear RegressionPolynomial RegressionMultiple RegressionScaleTrain/TestDecision TreeConfusion MatrixHierarchical ClusteringLogistic RegressionGrid SearchCategorical DataK-meansBootstrap AggregationCross ValidationAUC - ROC CurveK-nearest neighbors

Python DSA

Python DSALists and ArraysStacksQueuesLinked ListsHash TablesTreesBinary TreesBinary Search TreesAVL TreesGraphsLinear SearchBinary SearchBubble SortSelection SortInsertion SortQuick SortCounting SortRadix SortMerge Sort

Python MySQL

MySQL Get StartedMySQL Create DatabaseMySQL Create TableMySQL InsertMySQL SelectMySQL WhereMySQL Order ByMySQL DeleteMySQL Drop TableMySQL UpdateMySQL LimitMySQL Join

Python MongoDB

MongoDB Get StartedMongoDB Create DBMongoDB CollectionMongoDB InsertMongoDB FindMongoDB QueryMongoDB SortMongoDB DeleteMongoDB Drop CollectionMongoDB UpdateMongoDB Limit

Python Reference

Python OverviewPython Built-in FunctionsPython String MethodsPython List MethodsPython Dictionary MethodsPython Tuple MethodsPython Set MethodsPython File MethodsPython KeywordsPython ExceptionsPython Glossary

Module Reference

Random ModuleRequests ModuleStatistics ModuleMath ModulecMath Module

Python How To

Remove List DuplicatesReverse a StringAdd Two Numbers

Python Examples

Python ExamplesPython CompilerPython ExercisesPython QuizPython ServerPython SyllabusPython Study PlanPython Interview Q&APython BootcampPython CertificatePython Training

Machine Learning - K-nearest neighbors (KNN)


KNN

KNN is a simple, supervised machine learning (ML) algorithm that can be used for classification or regression tasks - and is also frequently used in missing value imputation. It is based on the idea that the observations closest to a given data point are the most "similar" observations in a data set, and we can therefore classify unforeseen points based on the values of the closest existing points. By choosingK, the user can select the number of nearby observations to use in the algorithm.

Here, we will show you how to implement the KNN algorithm for classification, and show how different values ofK affect the results.


How does it work?

K is the number of nearest neighbors to use. For classification, a majority vote is used to determined which class a new observation should fall into. Larger values ofK are often more robust to outliers and produce more stable decision boundaries thanvery small values (K=3 would be better thanK=1, which might produce undesirable results.

Example

Start by visualizing some data points:

import matplotlib.pyplot as plt

x = [4, 5, 10, 4, 3, 11, 14 , 8, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
classes = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

plt.scatter(x, y, c=classes)
plt.show()

Result

Run example »

Now we fit the KNN algorithm with K=1:

from sklearn.neighbors import KNeighborsClassifier

data = list(zip(x, y))
knn = KNeighborsClassifier(n_neighbors=1)

knn.fit(data, classes)

And use it to classify a new data point:

Example

new_x = 8
new_y = 21
new_point = [(new_x, new_y)]

prediction = knn.predict(new_point)

plt.scatter(x + [new_x], y + [new_y], c=classes + [prediction[0]])
plt.text(x=new_x-1.7, y=new_y-0.7, s=f"new point, class: {prediction[0]}")
plt.show()

Result

Run example »

Now we do the same thing, but with a higher K value which changes the prediction:

Example

knn = KNeighborsClassifier(n_neighbors=5)

knn.fit(data, classes)

prediction = knn.predict(new_point)

plt.scatter(x + [new_x], y + [new_y], c=classes + [prediction[0]])
plt.text(x=new_x-1.7, y=new_y-0.7, s=f"new point, class: {prediction[0]}")
plt.show()

Result

Run example »

Example Explained

Import the modules you need.

You can learn about the Matplotlib module in our"Matplotlib Tutorial.

scikit-learn is a popular library for machine learning in Python.

import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

Create arrays that resemble variables in a dataset. We have two input features (x andy) and then a target class (class). The input features that are pre-labeled with our target class will be used to predict the class of new data. Note that while we only use two input features here, this method will work with any number of variables:

x = [4, 5, 10, 4, 3, 11, 14 , 8, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
classes = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

Turn the input features into a set of points:

data = list(zip(x, y))
print(data)

Result:

[(4, 21), (5, 19), (10, 24), (4, 17), (3, 16), (11, 25), (14, 24), (8, 22), (10, 21), (12, 21)]

Using the input features and target class, we fit a KNN model on the model using 1 nearest neighbor:

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(data, classes)

Then, we can use the same KNN object to predict the class of new, unforeseen data points. First we create new x and y features, and then callknn.predict() on the new data point to get a class of 0 or 1:

new_x = 8
new_y = 21
new_point = [(new_x, new_y)]
prediction = knn.predict(new_point)
print(prediction)

Result:

[0]

When we plot all the data along with the new point and class, we can see it's been labeled blue with the1 class. The text annotation is just to highlight the location of the new point:

plt.scatter(x + [new_x], y + [new_y], c=classes + [prediction[0]])
plt.text(x=new_x-1.7, y=new_y-0.7, s=f"new point, class: {prediction[0]}")
plt.show()

Result:

However, when we changes the number of neighbors to 5, the number of points used to classify our new point changes. As a result, so does the classification of the new point:

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(data, classes)
prediction = knn.predict(new_point)
print(prediction)

Result:

[1]

When we plot the class of the new point along with the older points, we note that the color has changed based on the associated class label:

plt.scatter(x + [new_x], y + [new_y], c=classes + [prediction[0]])
plt.text(x=new_x-1.7, y=new_y-0.7, s=f"new point, class: {prediction[0]}")
plt.show()

Result:


 
Track your progress - it's free!
 

×

Contact Sales

If you want to use W3Schools services as an educational institution, team or enterprise, send us an e-mail:
sales@w3schools.com

Report Error

If you want to report an error, or if you want to make a suggestion, send us an e-mail:
help@w3schools.com

W3Schools is optimized for learning and training. Examples might be simplified to improve reading and learning.
Tutorials, references, and examples are constantly reviewed to avoid errors, but we cannot warrant full correctness
of all content. While using W3Schools, you agree to have read and accepted ourterms of use,cookie and privacy policy.

Copyright 1999-2025 by Refsnes Data. All Rights Reserved.W3Schools is Powered by W3.CSS.


[8]ページ先頭

©2009-2025 Movatter.jp