How to write a homemade K Nearest Neighbors model

I am currently in the first half of my computer science studies, and for this project I chose to write a data science algorithm. I got to pick from a list, and I chose a KNN algorithm. Why? Because it is easy to build and implement. After working on this project, I have a much better understanding of how KNN models work.

This article is geared toward readers who already have a basic understanding of Python and predictive models.

So to start out, we need to build a rough draft of where we are going with our algorithm.

The first thing we need to do is create a function that finds the Euclidean distance between two vectors. Wikipedia defines it this way: “In mathematics, the Euclidean distance or Euclidean metric is the ‘ordinary’ straight-line distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space.” In practice, that means taking the square root of the sum of the squared differences between the two vectors' components.
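A minimal sketch of such a helper, assuming the vectors can be converted to NumPy arrays. The name euclidean_distance is only illustrative.

```python
import numpy as np


def euclidean_distance(a, b):
    """Return the straight-line distance between two equal-length vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Square root of the sum of squared component-wise differences.
    return np.sqrt(np.sum((a - b) ** 2))


# A 3-4-5 right triangle makes a quick sanity check.
distance = euclidean_distance([0, 0], [3, 4])
```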

Now that we have our function made, we can start working on building our class.

In the init function, we set k=3, which means the model looks at the three nearest neighbors. After that, we create our fit function, which stores the training data.

We can then start on the predict function. It builds knn_predictions by looping over P: for each i in P, run a prediction on i. This is where our little helper function comes in. For each i, we find the Euclidean distance to every training point and then sort the indices by distance. We keep the first k indices, look up the matching labels in y_train, and finish by using a Counter. What this does is find the single most common label among those neighbors.
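Put together, the class might look something like the sketch below. The names k, fit, predict, P, knn_predictions, and y_train come from the description above. The class name KNN and the private helper _predict are my own placeholders, so treat this as one possible layout, not the exact code from the project.

```python
from collections import Counter

import numpy as np


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sqrt(np.sum((a - b) ** 2))


class KNN:
    def __init__(self, k=3):
        # Number of nearest neighbors that get a vote.
        self.k = k

    def fit(self, X_train, y_train):
        # KNN has no real training step: fitting just stores the data.
        self.X_train = np.asarray(X_train, dtype=float)
        self.y_train = np.asarray(y_train)

    def predict(self, P):
        # Run a prediction for every row i in P.
        knn_predictions = [self._predict(i) for i in P]
        return np.array(knn_predictions)

    def _predict(self, i):
        # Distance from i to every stored training point.
        distances = [euclidean_distance(i, x) for x in self.X_train]
        # Indices of the k closest training points.
        k_indices = np.argsort(distances)[: self.k]
        # Labels of those neighbors.
        k_labels = [self.y_train[j] for j in k_indices]
        # Majority vote: the single most common label wins.
        return Counter(k_labels).most_common(1)[0][0]


# Tiny made-up dataset: two well-separated clusters.
model = KNN(k=3)
model.fit(
    [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]],
    ["a", "a", "a", "b", "b", "b"],
)
predictions = model.predict([[0.2, 0.2], [5.5, 5.5]])
```

Because fit only stores the data, all of the work happens at prediction time, which is why KNN is often called a lazy learner.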

So how does this compare to a KNN Classifier from Sklearn?

It is pretty close in its accuracy score, if not the same. I used the iris dataset, as it seems to be relatively standard.

Here is the code for the sklearn model (after the data has been split with train_test_split).
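A minimal sketch of the sklearn side, assuming the iris data is loaded with load_iris and split 80/20. The test_size and random_state values are my own choices for reproducibility, not settings taken from the original project.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the iris features and labels as NumPy arrays.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the rows for testing (assumed split, fixed seed).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same k as the homemade model: three nearest neighbors.
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)

sk_predictions = clf.predict(X_test)
sk_accuracy = accuracy_score(y_test, sk_predictions)
print("sklearn accuracy:", sk_accuracy)
```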

Here is the code for running my KNN model on the same processed data.
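A sketch of how that could look, again assuming an 80/20 split of iris with a fixed seed. The class is repeated in compact form so the snippet runs on its own. The class name KNN and the variable names are placeholders, not the project's exact code.

```python
from collections import Counter

import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


class KNN:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X_train, y_train):
        # Store the training data for use at prediction time.
        self.X_train = np.asarray(X_train, dtype=float)
        self.y_train = np.asarray(y_train)

    def predict(self, P):
        return np.array([self._predict(i) for i in P])

    def _predict(self, i):
        # Euclidean distance from i to every training row.
        distances = np.sqrt(np.sum((self.X_train - i) ** 2, axis=1))
        # Labels of the k closest rows, then a majority vote.
        k_labels = self.y_train[np.argsort(distances)[: self.k]]
        return Counter(k_labels.tolist()).most_common(1)[0][0]


# Same assumed split as for the sklearn model.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

my_model = KNN(k=3)
my_model.fit(X_train, y_train)

my_predictions = my_model.predict(X_test)
my_accuracy = accuracy_score(y_test, my_predictions)
print("homemade accuracy:", my_accuracy)
```

The fit and predict calls deliberately mirror the sklearn interface, so the two models can be swapped in and out of the same script.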

Here are the predictions for both models. I kept running the code and it keeps outputting the same predictions, so I am happy with the way this turned out.

28. Data Science Student. Nerd. Lambda School DSPT4

Jessica Kimbril