Decision Tree

A complete visual lesson — from raw data to prediction

1 · The Idea
2 · Impurity
3 · Info Gain
4 · Build the Tree
5 · Try It!

🎓 The Big Idea

A Decision Tree is just a series of yes/no questions. Like a professor guessing if a student will pass — no complex math, just logic.


The machine learns which questions to ask, and in what order, automatically from the data.

📊 Our Training Data

5 students, 3 features, 1 outcome. The tree will learn from this.

Student   Study Hrs/day   Attendance %   Prev. Marks   Result
A         2               60             45             Fail
B         5               75             60             Pass
C         8               90             80             Pass
D         1               50             35             Fail
E         6               80             70             Pass
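
For readers who want to follow along in code, here is the same table as a small Python list (the key names are my own choice, not anything the lesson prescribes):

    students = [
        {"name": "A", "hours": 2, "attendance": 60, "marks": 45, "result": "Fail"},
        {"name": "B", "hours": 5, "attendance": 75, "marks": 60, "result": "Pass"},
        {"name": "C", "hours": 8, "attendance": 90, "marks": 80, "result": "Pass"},
        {"name": "D", "hours": 1, "attendance": 50, "marks": 35, "result": "Fail"},
        {"name": "E", "hours": 6, "attendance": 80, "marks": 70, "result": "Pass"},
    ]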

❓ The Core Question

We have 3 features. Which one should we ask about first?


Not randomly — we pick the feature that best separates Pass from Fail students. That's the Root Node.


To measure "best separation" we use a concept called Entropy (Impurity).

🔀 What is Impurity (Entropy)?

Entropy tells us how mixed a group is.


Goal: Find splits that make groups as pure as possible (low entropy).

Pure node · Attendance ≥ 70% · all students PASS · H = 0

Impure node · Hours ≥ 4 · mixed result · H = high

🧮 The Formula

For a node with proportion p of one class and q = 1 − p of the other:

Entropy(node) = −p·log₂(p) − q·log₂(q)

Example: 3 Pass, 2 Fail (5 students total)
  p(Pass) = 3/5 = 0.6
  p(Fail) = 2/5 = 0.4
  H = −0.6·log₂(0.6) − 0.4·log₂(0.4) ≈ 0.971 ← quite mixed

Don't memorize the formula. Remember the concept: lower entropy = purer group = better split.
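
The formula in runnable form, as a minimal Python sketch (the function name is mine):

    import math

    def entropy(p_pass, p_fail):
        # Two-class entropy; zero-probability terms contribute nothing (x·log₂x → 0)
        h = 0.0
        for p in (p_pass, p_fail):
            if p > 0:
                h -= p * math.log2(p)
        return h

    print(round(entropy(3/5, 2/5), 3))  # 0.971, the mixed node from the example
    print(entropy(1.0, 0.0))            # 0.0, a pure node: all one class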

💡 Analogy

Imagine a bag of colored balls. If every ball is the same color, you know exactly what you will pull out: zero surprise, zero entropy. A 50/50 mix makes every draw a coin flip: maximum surprise, maximum entropy.


A good split creates bags where you're not surprised by what you pull out.

📈 Information Gain — Picking the Root Node

After splitting on a feature, how much did entropy drop? That drop is the Information Gain.

Info Gain = Entropy(before split) − Weighted Entropy(after split)

(each child group's entropy is weighted by its share of the students)

Higher Info Gain = better feature to split on first.

We calculate this for every feature, then pick the winner.
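
The gain calculation as a self-contained Python sketch (function names are my own):

    import math

    def entropy_of(labels):
        # Two-class entropy of a list of "Pass"/"Fail" labels
        if not labels:
            return 0.0
        p = labels.count("Pass") / len(labels)
        return sum(-q * math.log2(q) for q in (p, 1 - p) if q > 0)

    def information_gain(parent, children):
        # Entropy drop from splitting `parent` into the label lists in `children`
        n = len(parent)
        weighted = sum(len(g) / n * entropy_of(g) for g in children)
        return entropy_of(parent) - weighted

    # Splitting our 5 students on Attendance ≥ 70:
    #   YES → B, C, E (all Pass); NO → A, D (all Fail)
    print(information_gain(
        ["Fail", "Pass", "Pass", "Fail", "Pass"],       # A..E
        [["Pass", "Pass", "Pass"], ["Fail", "Fail"]],   # the two pure children
    ))  # ≈ 0.971: the entire parent entropy, since both children are pure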

🏆 Race: Which Feature Wins?

The algorithm tests all 3 features on our 5 students:

Candidate question     Groups created   Info Gain
Attendance ≥ 70?       Pure groups!     HIGH ⭐ → ROOT NODE
Hours ≥ 4?             Mixed groups     MED
Prev. Marks ≥ 55?      Mixed groups     MED

Attendance wins — it creates the two purest groups. So it becomes the Root Node.
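
The race itself fits in a few lines: compute the gain of every candidate question and keep the best. A sketch with my own naming (thresholds are the ones the lesson tests):

    import math

    # training data: (hours, attendance, marks, result) per student A..E
    students = [(2, 60, 45, "Fail"), (5, 75, 60, "Pass"), (8, 90, 80, "Pass"),
                (1, 50, 35, "Fail"), (6, 80, 70, "Pass")]

    def entropy_of(labels):
        if not labels:
            return 0.0
        p = labels.count("Pass") / len(labels)
        return sum(-q * math.log2(q) for q in (p, 1 - p) if q > 0)

    def gain(feature_index, threshold):
        # Info gain of splitting the students on feature >= threshold
        labels = [s[3] for s in students]
        yes = [s[3] for s in students if s[feature_index] >= threshold]
        no = [s[3] for s in students if s[feature_index] < threshold]
        weighted = sum(len(g) / len(labels) * entropy_of(g) for g in (yes, no))
        return entropy_of(labels) - weighted

    # the three candidate questions from the race above
    candidates = [("Attendance >= 70", 1, 70), ("Hours >= 4", 0, 4),
                  ("Prev. Marks >= 55", 2, 55)]
    best = max(candidates, key=lambda c: gain(c[1], c[2]))
    print(best[0])  # the root question (ties, if any, go to the first candidate tested)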

🔄 The Loop — How We Move to the Next Node

After the root node splits the data, each branch gets its own subset of students, and the exact same process repeats on that subset: measure impurity, test the remaining questions, pick the one with the highest Info Gain. (A code sketch of this loop follows the tree below.)

🌳 Building the Tree — Level by Level

Level 1 · Root Node (the question with the highest Info Gain across all features):

Attendance ≥ 70?
  ↙ NO: ❌ FAIL · Students A, D · Entropy = 0 · Pure! Stop here.
  ↘ YES: Students B, C, E · repeat the loop on this subset ↓

Level 2 · Hours ≥ 4?
  ↙ NO: ❌ FAIL
  ↘ YES: ✅ PASS · B, C, E · Entropy = 0 · Pure! Stop.

Note: on our 5 training students, B, C and E all study ≥ 4 hrs, so the YES branch is already pure after Level 1; the Hours question still matters for new students who attend well but study little (see the Decision Map below).
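
That loop, written out as a minimal recursive sketch (my own helpers and simplifications, not any library's exact algorithm; running it on our 5 students stops at depth 1, matching the note above):

    import math

    def entropy_of(labels):
        if not labels:
            return 0.0
        p = labels.count("Pass") / len(labels)
        return sum(-q * math.log2(q) for q in (p, 1 - p) if q > 0)

    def build_tree(rows, questions):
        # rows: (features_dict, label) pairs; questions: (name, feature_key, threshold)
        labels = [label for _, label in rows]
        majority = max(set(labels), key=labels.count)
        if entropy_of(labels) == 0 or not questions:
            return majority                              # leaf: pure, or nothing left to ask

        def gain(question):
            _, key, thr = question
            yes = [l for f, l in rows if f[key] >= thr]
            no = [l for f, l in rows if f[key] < thr]
            after = sum(len(g) / len(labels) * entropy_of(g) for g in (yes, no))
            return entropy_of(labels) - after

        name, key, thr = max(questions, key=gain)        # highest info gain wins
        yes_rows = [(f, l) for f, l in rows if f[key] >= thr]
        no_rows = [(f, l) for f, l in rows if f[key] < thr]
        if not yes_rows or not no_rows:
            return majority                              # split separated nothing: stop
        rest = [q for q in questions if q[0] != name]    # (a real tree may reuse features)
        return {"question": name,
                "yes": build_tree(yes_rows, rest),
                "no": build_tree(no_rows, rest)}

    students = [({"hours": 2, "attendance": 60}, "Fail"),
                ({"hours": 5, "attendance": 75}, "Pass"),
                ({"hours": 8, "attendance": 90}, "Pass"),
                ({"hours": 1, "attendance": 50}, "Fail"),
                ({"hours": 6, "attendance": 80}, "Pass")]
    questions = [("Attendance >= 70", "attendance", 70), ("Hours >= 4", "hours", 4)]
    print(build_tree(students, questions))
    # {'question': 'Attendance >= 70', 'yes': 'Pass', 'no': 'Fail'}
    # depth 1: on this tiny dataset the first split already makes both branches pure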

📋 The Tree as Code

The whole tree is literally just if-else:

    def predict(attendance, hours):
        if attendance >= 70:
            if hours >= 4:
                return "PASS"  # ✅
            else:
                return "FAIL"  # ❌
        else:
            return "FAIL"  # ❌

No magic. Just structured questions learned from data. Scores 100% on training data.
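
To check that claim, push all 5 training students through predict() from the block above:

    # every training student, as (name, attendance, hours, expected result)
    training = [("A", 60, 2, "FAIL"), ("B", 75, 5, "PASS"), ("C", 90, 8, "PASS"),
                ("D", 50, 1, "FAIL"), ("E", 80, 6, "PASS")]
    correct = sum(predict(att, hrs) == expected for _, att, hrs, expected in training)
    print(f"{correct}/5 correct")  # 5/5 correct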

⚠️ But Is It Too Perfect?

100% on training data sounds great — but the tree might have memorized these 5 students instead of learning general rules. This is Overfitting.


Solution: Limit tree depth, require minimum samples per split, or use a Random Forest (many trees vote together).
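
In practice those knobs are constructor parameters in standard libraries. A sketch with scikit-learn (assuming it is installed, and treating our 5-row toy table as if it were real data):

    from sklearn.tree import DecisionTreeClassifier

    # features: [study hrs/day, attendance %, prev. marks] for students A..E
    X = [[2, 60, 45], [5, 75, 60], [8, 90, 80], [1, 50, 35], [6, 80, 70]]
    y = ["Fail", "Pass", "Pass", "Fail", "Pass"]

    # cap the depth and demand at least 3 samples before a node may split,
    # so the tree cannot carve out a leaf for every individual student
    clf = DecisionTreeClassifier(criterion="entropy", max_depth=2, min_samples_split=3)
    clf.fit(X, y)
    print(clf.predict([[3, 65, 50]]))  # classify one new student

For the "many trees vote together" option, sklearn.ensemble.RandomForestClassifier drops in the same way.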

🔮 Try the Tree — Predict a New Student

Adjust the sliders and see the tree's decision step by step. The interactive demo starts from Attendance = 65%, Study = 3 hrs/day.
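
A few lines of Python reproduce the same step-by-step trace for that default student (the logic mirrors the Decision Map below):

    # step-by-step trace of the tree for one new student (the demo's defaults)
    attendance, hours = 65, 3
    print("Q1: attendance >= 70?", "YES" if attendance >= 70 else "NO")
    if attendance >= 70:
        print("Q2: hours >= 4?", "YES" if hours >= 4 else "NO")
        print("✅ PASS" if hours >= 4 else "❌ FAIL")
    else:
        print("❌ FAIL")  # attendance < 70 → always Fail, no second question needed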

🗺️ Decision Map

❌ FAIL if:
  • Attendance < 70% → always Fail
  • Attendance ≥ 70% BUT Hours < 4
✅ PASS if:
  • Attendance ≥ 70% AND Hours ≥ 4

🧠 Key Takeaways