"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." (Tom Mitchell, 1997)
Understanding AI, ML, and Deep Learning
Dr. Dhaval Patel • 2025
Concepts
AI, Machine Learning, and Deep Learning
Artificial Intelligence represents humanity's ambitious goal of creating machines that can "think." The field began in the 1950s with a simple but profound question: Can machines exhibit intelligent behavior?
Here's what makes this interesting: AI doesn't necessarily require learning. For decades, the dominant approach was symbolic AI, where programmers explicitly wrote rules. Think of early chess programs that followed thousands of hand-coded strategies like "control the center" or "protect your king."
This approach worked brilliantly for logical, rule-based problems like chess, but struggled with fuzzy, real-world challenges.
How do you write explicit rules for recognizing your grandmother's face in a photo, or understanding sarcasm in speech? These limitations led us to seek a different approach.
Machine Learning emerged as a revolutionary response to symbolic AI's limitations. Instead of programming explicit rules, we realized we could let machines discover patterns themselves.
This represents a fundamental shift in thinking. Traditional programming follows this logic:
Rules + Data → Answers
But machine learning flips this equation:
Data + Answers → Rules
This "program" we create is called a model, and it captures the learned patterns from our examples. Once trained, we can apply this model to new, unseen data to make predictions or decisions.
Think of it like teaching a child to recognize animals. Instead of describing every possible feature of a cat, you show them hundreds of pictures labeled "cat" and "not cat." Eventually, they learn to recognize cats they've never seen before. Machine learning works similarly, but with mathematical precision.
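The flip from hand-written rules to learned rules can be sketched in a few lines. The toy task, data, and threshold strategy below are invented purely for illustration; real models learn far richer patterns:

```python
# Illustrative toy task: label temperatures "hot" or "cold".

# Traditional programming: a human writes the rule.
def rule_based(temp):
    return "hot" if temp > 25 else "cold"   # hand-coded threshold

# Machine learning: the rule (here, a threshold) is derived from examples.
def learn_threshold(examples):
    # examples: list of (temperature, label) pairs
    hots = [t for t, label in examples if label == "hot"]
    colds = [t for t, label in examples if label == "cold"]
    # Put the boundary midway between the two groups: this is the "model".
    return (min(hots) + max(colds)) / 2

data = [(30, "hot"), (28, "hot"), (10, "cold"), (15, "cold")]
threshold = learn_threshold(data)           # 21.5 for this data

def learned_model(temp):
    return "hot" if temp > threshold else "cold"
```

The learned model generalizes to temperatures it has never seen, which is exactly the point: the program was produced from data and answers, not written by hand.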
Deep Learning represents the most recent breakthrough in machine learning, inspired by how our brains process information through interconnected neurons.
What makes deep learning "deep" is the multiple layers of artificial neurons, each learning different levels of abstraction.
The first layer might detect edges in an image, the second layer combines edges into shapes, the third layer combines shapes into objects, and so on.
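This layering can be sketched in pure Python with hypothetical hand-picked weights (a real network would learn them from data). Each layer is a simple transformation, and "depth" is just the composition of several layers:

```python
# Illustrative only: a two-layer stack with invented weights. Later layers
# build on the representations computed by earlier ones.

def relu(x):
    # Nonlinearity applied between layers
    return [max(0.0, v) for v in x]

def dense(x, weights, biases):
    # One fully connected layer: output_j = sum_i x_i * weights[j][i] + biases[j]
    return [sum(xi * wi for xi, wi in zip(x, w_row)) + b
            for w_row, b in zip(weights, biases)]

W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]   # layer 1: 2 inputs -> 2 units
W2, b2 = [[1.0, 1.0]], [0.1]                     # layer 2: 2 inputs -> 1 unit

x = [2.0, 1.0]
h = relu(dense(x, W1, b1))   # first layer: lower-level features
y = dense(h, W2, b2)         # second layer: combines them into one output
```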
Three key factors enabled deep learning's recent success: vastly larger datasets, far more powerful hardware (especially GPUs), and better algorithms and network architectures.
Deep learning has achieved remarkable success in areas like image recognition, language translation, and game playing, often surpassing human performance on specific tasks.
Tom Mitchell's Formal Definition
Task (T): The specific behavior or problem we want the system to improve at. This could be making predictions, classifying images, or choosing optimal actions.
Experience (E): The data or information the system uses to learn. This is your training material - the examples from which patterns will be discovered.
Performance (P): How we quantify success. This must be measurable and objective, allowing us to track improvement over time.
Different tasks require different performance measures: accuracy for a spam classifier, mean squared error for a price predictor, or percentage of games won for a game-playing program.
Let's make this concrete with a classic example: teaching a computer to play checkers better.
Task (T): playing checkers
Performance (P): percentage of games won against opponents
Experience (E): playing practice games against itself
Notice how this framework forces us to be specific. We can't just say "learn to play checkers better." We must define exactly what constitutes the task, what data we'll use for learning, and how we'll measure improvement.
The Four Essential Steps
Creating any machine learning system requires making four crucial decisions. Think of these as the architectural choices that determine your system's capabilities and limitations.
Step 1 - Training Experience: 100,000 emails labeled as "spam" or "legitimate"
Step 2 - Target Function: A function that maps email features to spam probability
Step 3 - Representation: A mathematical model using word frequencies, sender patterns, and subject line characteristics
Step 4 - Algorithm: A process that adjusts the model's parameters to minimize classification errors on training data
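The four steps might be sketched as follows. This toy scorer, its tiny dataset, and its counting "algorithm" are all invented for illustration; a production filter would use a proper learning algorithm and far more data:

```python
# Toy spam scorer: representation = word frequencies,
# "algorithm" = counting how often each word appears in spam vs. legitimate mail.

from collections import Counter

def train(emails):
    # emails: list of (text, label) pairs with label "spam" or "legit"
    spam_words, legit_words = Counter(), Counter()
    for text, label in emails:
        target = spam_words if label == "spam" else legit_words
        target.update(text.lower().split())
    return spam_words, legit_words

def spam_score(text, spam_words, legit_words):
    # Higher score = more spam-like. A crude stand-in for a learned model.
    score = 0
    for word in text.lower().split():
        score += spam_words[word] - legit_words[word]
    return score

emails = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("meeting agenda for monday", "legit"),
    ("monday lunch with the team", "legit"),
]
spam_w, legit_w = train(emails)
```

Even this crude model captures the essential structure: experience (labeled emails) is distilled into a representation (word counts) that lets us score new, unseen messages.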
[Diagram: a small amount of labeled data mixed with a large amount of unlabeled data being used together for learning]
Four Distinct Approaches
Learning from examples with correct answers. Like studying for an exam with an answer key - you know what the right responses should be.
Goal: Learn to predict outputs for new inputs
Finding hidden patterns without knowing the "right" answers. Like exploring a new city without a map - you discover the structure as you go.
Goal: Discover hidden structures or patterns in data
When your output is a category or discrete class. Think of it as sorting things into labeled boxes.
The key insight: you're predicting which category something belongs to. The output has distinct, separate values with no meaningful order between them.
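A minimal sketch of classification, assuming a toy one-dimensional nearest-neighbor rule (the feature values and labels are invented):

```python
# Tiny nearest-neighbor classifier: assign each new point the label of its
# closest training example. One numeric feature stands in for a real
# feature vector.

def classify(point, training):
    # training: list of (feature_value, label) pairs
    nearest = min(training, key=lambda ex: abs(ex[0] - point))
    return nearest[1]

training = [(1.0, "cat"), (1.2, "cat"), (8.0, "dog"), (9.1, "dog")]
```

The output is one of a fixed set of discrete labels: a point is "cat" or "dog", never something in between.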
When your output is a continuous number. Think of it as predicting a position on a measuring stick.
The key insight: you're predicting a quantity that can take any value within a range. Small changes in input typically lead to small changes in output.
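A minimal sketch of regression, fitting a straight line by ordinary least squares to a handful of invented points:

```python
# Fit y = a*x + b by ordinary least squares on a tiny dataset.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope = covariance(x, y) / variance(x); intercept follows from the means.
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

xs, ys = [1, 2, 3, 4], [2, 4, 6, 8]
a, b = fit_line(xs, ys)   # this data lies exactly on y = 2x, so a = 2, b = 0
```

Unlike classification, the prediction `a * x + b` is a continuous quantity, and nearby inputs produce nearby outputs.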
Finding natural groupings in your data. Imagine having a box of mixed objects and sorting them into piles based on similarity, without knowing ahead of time what groups should exist.
The algorithm discovers that some customers buy luxury items, others focus on discounts, and others prefer eco-friendly products - groups you might not have thought to look for.
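A tiny k-means-style sketch of such grouping (the spending figures and starting centers are invented):

```python
# Minimal 1-D k-means: alternate between assigning points to their nearest
# center and moving each center to the mean of its assigned points.

def kmeans(points, centers, iters=10):
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[i].append(p)
        # Update step: move each center to the mean of its group.
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers

spend = [5, 6, 7, 50, 52, 55]          # e.g. two customer segments
centers = kmeans(spend, [0.0, 100.0])  # converges near 6 and 52.3
```

No labels were provided; the two segments emerge purely from the structure of the data.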
Finding relationships between different items or events. Discovering "if this, then that" patterns in your data.
These discoveries help with recommendations, inventory management, and understanding complex relationships in your domain.
Reinforcement Learning is fundamentally different from both supervised and unsupervised learning. Instead of learning from a fixed dataset, an agent learns by taking actions in an environment and receiving feedback.
Think of teaching a child to ride a bicycle. You don't give them a manual or show them thousands of examples. Instead, they try different actions (pedaling, steering, balancing), experience the consequences (staying upright or falling), and gradually learn what works.
The agent's goal is to learn a policy - a strategy that tells it what action to take in any situation to maximize total rewards over time. This approach has achieved remarkable success in complex scenarios like game playing, robotics, and autonomous systems.
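A tiny Q-learning sketch, using an invented one-dimensional corridor environment: the agent starts at the left end and is rewarded only for reaching the right end, yet it gradually learns a policy of always moving right.

```python
# Q-learning on a 5-cell corridor. States 0..4; actions move left (-1) or
# right (+1); reward 1.0 only on reaching state 4.

import random
random.seed(0)

N_STATES, ACTIONS = 5, (-1, +1)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration

for _ in range(200):                    # episodes
    s = 0
    while s != 4:
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[(s, a)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == 4 else 0.0
        # Update: nudge Q toward reward + discounted best future value.
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned policy: the best action in each state.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
```

Note that no state is ever labeled with a "correct" action; the rightward policy emerges entirely from trial, error, and delayed reward.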
Semi-supervised learning addresses a common real-world problem: you have lots of data, but only a small portion is labeled.
Core Assumption: If two data points are close together in a high-density region, they likely have similar labels.
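A minimal self-training sketch built on exactly this assumption (the points and labels below are invented): unlabeled points inherit labels from their labeled neighbors, one most-confident point at a time.

```python
# 1-D self-training: repeatedly pseudo-label the unlabeled point closest to
# any labeled point, then treat it as labeled for the next round.

def self_train(labeled, unlabeled):
    labeled, unlabeled = list(labeled), list(unlabeled)
    while unlabeled:
        best = None
        for u in unlabeled:
            # Distance from u to its nearest labeled neighbor
            near_val, near_lab = min(labeled, key=lambda ex: abs(ex[0] - u))
            d = abs(near_val - u)
            if best is None or d < best[0]:
                best = (d, u, near_lab)
        _, point, label = best
        labeled.append((point, label))   # pseudo-label the closest point
        unlabeled.remove(point)
    return labeled

labeled = [(0.0, "A"), (10.0, "B")]      # only two labeled examples
unlabeled = [1.0, 2.0, 8.5, 9.0]         # plenty of unlabeled data
result = dict(self_train(labeled, unlabeled))
```

Two labels spread across six points: the dense-neighborhood assumption does the rest of the work.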
Machine Learning in the Real World
Machine learning has quietly revolutionized nearly every aspect of modern life. Let's explore how these algorithms work behind the scenes to power the applications you interact with daily.
Language of Machine Learning