Same students. Different algorithm. No training — just memory.
KNN asks one simple question: "Who are your closest neighbors, and what are they?"
No formulas during training. No tree to build. It just memorizes all the data and when a new student appears, it finds the K most similar students already seen — and takes a majority vote.
If most of your K neighbors passed → you pass.
If most failed → you fail.
You're new to a city. You want to know whether you've landed in a nice neighborhood. Instead of checking a map, you just look at the 3 nearest houses.
The ❓ new house looks at its 3 nearest neighbors: 2 nice 🏠, 1 rundown 🏚.
Majority says → nice neighborhood!
That's exactly how KNN classifies students.
| Student | Study Hrs | Attendance % | Prev. Marks | Result |
|---|---|---|---|---|
| A | 2 | 60 | 45 | Fail |
| B | 5 | 75 | 60 | Pass |
| C | 8 | 90 | 80 | Pass |
| D | 1 | 50 | 35 | Fail |
| E | 6 | 80 | 70 | Pass |
KNN stores this entire table. At prediction time, a new student walks in and the algorithm finds their K closest matches from this table.
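In code, that "model" is nothing more than the table itself. A minimal sketch in Python (the `students` list and its layout are just one way to store it):

```python
# "Training" KNN = keeping the table in memory. Nothing is learned.
# Each row: (name, study_hours, attendance_pct, prev_marks, result)
students = [
    ("A", 2, 60, 45, "Fail"),
    ("B", 5, 75, 60, "Pass"),
    ("C", 8, 90, 80, "Pass"),
    ("D", 1, 50, 35, "Fail"),
    ("E", 6, 80, 70, "Pass"),
]
```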
We need a number to express how similar two students are. The most common way: Euclidean Distance — the straight-line distance between two points in feature space.
Think of it as the ruler between two dots on a graph. Shorter ruler = more similar.
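In code, that ruler is a single line of math. A quick sketch (the `euclidean` helper is just an illustrative name):

```python
import math

def euclidean(p, q):
    # Straight-line distance between two equal-length feature vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean((2, 60, 45), (5, 75, 60)))  # distance between students A and B
```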
A new student walks in:
| Student | Study Hrs | Attendance % | Prev. Marks | Result |
|---|---|---|---|---|
| X (new) | 4 | 70 | 55 | ??? |
We calculate the distance from X to every student in our data. Closest ones become the neighbors.
Using all 3 features (hours, attendance, marks):
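For example, the distance from X (4, 70, 55) to B (5, 75, 60) is:

√((4−5)² + (70−75)² + (55−60)²) = √(1 + 25 + 25) = √51 ≈ 7.14

Repeating this for every stored student gives the distances used below.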
K is the number of nearest neighbors that vote on the prediction. You choose K before running the algorithm.
Odd K is preferred to avoid tie votes (e.g. 2 vs 2).
We pick the 3 nearest neighbors: B, A, E
B (dist 7.14) → Pass
A (dist 14.28) → Fail
E (dist 18.14) → Pass
Majority vote: 2 Pass vs 1 Fail → Student X is predicted to Pass! 🎉
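That vote is a one-liner with a counter. A hedged sketch, assuming the three nearest (name, result) pairs are already in hand:

```python
from collections import Counter

# The 3 nearest neighbors and their known outcomes
nearest = [("B", "Pass"), ("A", "Fail"), ("E", "Pass")]

votes = Counter(result for _, result in nearest)
print(votes.most_common(1)[0][0])  # "Pass" (2 votes to 1)
```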
With our 5 students, let's see what happens at different K values for Student X:
| K | Neighbors Used | Pass Votes | Fail Votes | Prediction |
|---|---|---|---|---|
| K=1 | B | 1 | 0 | Pass ✅ |
| K=3 | B, A, E | 2 | 1 | Pass ✅ |
| K=5 | B, A, E, D, C | 3 | 2 | Pass ✅ |
In this case, all K values agree. But with different data, the wrong K can flip the result, so choosing K matters. A common approach: try several K values and pick the one with the best accuracy on held-out test data.
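One way to run that search, sketched with scikit-learn and synthetic stand-in data (our 5-row table is far too small to split, so `X` and `y` here are invented purely for illustration):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Invented stand-in data shaped like our features: hours, attendance, marks
rng = np.random.default_rng(0)
X = rng.uniform([1, 50, 35], [8, 90, 80], size=(200, 3))
y = (X[:, 0] + X[:, 1] / 10 + X[:, 2] / 10 > 18).astype(int)  # crude pass/fail rule

# Try several K values, keep the one with the best cross-validated accuracy
best_k = max(
    [1, 3, 5, 7, 9],
    key=lambda k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean(),
)
print(best_k)
```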
Look at our features: Hours (1–8), Attendance (50–90), Marks (35–80).
Attendance and previous marks sit on much bigger scales than study hours. Even modest differences in those features dominate the distance calculation, drowning out study hours entirely.
Solution: Normalize all features to the same scale (0 to 1) before computing distances. This gives each feature an equal voice.
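A minimal sketch of that rescaling using min-max normalization, with the feature ranges taken straight from our table:

```python
# Feature ranges from the data: hours 1-8, attendance 50-90, marks 35-80
ranges = [(1, 8), (50, 90), (35, 80)]

def normalize_row(row):
    # Min-max scaling: maps the minimum to 0.0 and the maximum to 1.0
    return tuple((value - lo) / (hi - lo) for value, (lo, hi) in zip(row, ranges))

print(normalize_row((4, 70, 55)))  # roughly (0.43, 0.50, 0.44)
```

After this, a 1-hour study gap and a 10-point attendance gap contribute on comparable scales instead of attendance swamping everything.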
Now let's walk through the full prediction. New student X = (4 hrs, 70%, 55 marks), and we use K = 3.
KNN has the simplest training phase of any algorithm — there is no training. You just store the 5 rows of student data in memory. That's it.
The real work happens at prediction time.
Student X arrives. Calculate distance from X to every stored student:
| Student | Result | Distance from X | Rank |
|---|---|---|---|
| B | Pass | 7.14 | 🥇 #1 |
| A | Fail | 14.28 | 🥈 #2 |
| E | Pass | 18.14 | 🥉 #3 |
| D | Fail | 28.44 | #4 |
| C | Pass | 32.26 | #5 |
Take the top 3: B, A, E. Ignore D and C — they're too far away.
The decision is made. No formula was "learned" — we just asked: who are the most similar people I've seen before?
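Here is the whole walk-through as one self-contained sketch (function and variable names are just illustrative):

```python
import math
from collections import Counter

# The stored "model": the full table, nothing learned
students = [
    ("A", (2, 60, 45), "Fail"),
    ("B", (5, 75, 60), "Pass"),
    ("C", (8, 90, 80), "Pass"),
    ("D", (1, 50, 35), "Fail"),
    ("E", (6, 80, 70), "Pass"),
]

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn_predict(new_point, data, k=3):
    # 1. Distance from the new student to every stored student
    ranked = sorted(data, key=lambda row: euclidean(new_point, row[1]))
    # 2. Keep only the k closest
    nearest = ranked[:k]
    # 3. Majority vote on their results
    votes = Counter(result for _, _, result in nearest)
    return votes.most_common(1)[0][0], nearest

prediction, nearest = knn_predict((4, 70, 55), students, k=3)
print(prediction)                        # Pass
print([name for name, _, _ in nearest])  # ['B', 'A', 'E']
```

For simplicity this sketch uses the raw feature values, matching the worked distances above; with real data you would normalize first.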
Try changing the student's features and the K value to see which neighbors are chosen and how they vote.