From Hill Climbing to Finding Global Solutions
Dr. Dhaval Patel • 2025
Local search takes a fundamentally different approach to solving optimization problems. Instead of maintaining a frontier of unexplored nodes and systematically exploring the search space, as traditional search algorithms do, local search keeps track of a single current state and moves only to its neighbors.
Imagine you're a hiker trying to reach the highest point in a mountain range, but you're caught in thick fog that limits your visibility to just nearby terrain. This perfectly captures the essence of local search.
Traditional Search (path-finding)
Goal: Find the sequence of actions that reaches a goal state
Examples: Route finding
What Matters: The sequence of moves, the total path cost, the optimality of the route
Local Search (optimization)
Goal: Find the best possible arrangement or configuration
Examples: N-Queens, scheduling
What Matters: The quality of the final configuration, not how we arrived there
Before diving into local search algorithms, we need to understand how to convert problems into optimization format. This is crucial because local search works on objective functions rather than goal tests.
The Foundation of Local Search
Hill climbing is the simplest and most intuitive local search algorithm. Think of it as a greedy algorithm that always takes the locally best step.
Start with a random state or use problem-specific heuristics to choose a good starting point. This initial state becomes your "current" state.
Find all neighboring states - states reachable by one legal move. The definition of "neighbor" depends on your problem domain.
Calculate objective function value for each neighbor. This tells us how "good" each potential move is.
Choose the neighbor with the best value (highest for maximization, lowest for minimization). If there are ties, choose randomly among the best.
If no neighbor is better than current state, STOP - you've reached a local optimum. Otherwise, move to the best neighbor and repeat from step 2.
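The five steps above can be written as a short generic loop. This is a minimal sketch for maximization; `neighbors` and `objective` are problem-specific callables supplied by the caller:

```python
import random

def hill_climb(initial, neighbors, objective):
    """Greedy hill climbing (maximization).

    neighbors(state) -> iterable of states reachable by one legal move
    objective(state) -> numeric score; higher is better
    """
    current = initial
    while True:
        candidates = list(neighbors(current))        # step 2: generate neighbors
        if not candidates:
            return current
        best_value = max(objective(s) for s in candidates)   # step 3: evaluate
        if best_value <= objective(current):
            return current                           # step 5: local optimum, stop
        # step 4: choose the best neighbor, breaking ties randomly
        current = random.choice(
            [s for s in candidates if objective(s) == best_value])
```

For minimization problems such as 8-Queens, one can simply negate the objective (e.g. pass `lambda s: -attacking_pairs(s)`).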
The N-Queens problem is a classic puzzle that perfectly demonstrates local search concepts. Let's start with the simpler 4-Queens version to understand the fundamentals.
Now let's scale up to the classic 8-Queens problem and see how we convert this constraint satisfaction problem into an optimization problem suitable for local search.
State Representation: We can represent each state as a string like "24748552" where position i contains the row number of the queen in column i.
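Under this representation, the objective function (the number of attacking pairs, which we want to drive to 0) can be computed directly from the string. A minimal sketch; note that the particular string "24748552" evaluates to 4 attacking pairs, while the walkthrough below starts from a different, more conflicted board:

```python
def attacking_pairs(state):
    """Count pairs of queens that attack each other.

    state: string like "24748552"; state[i] is the 1-indexed row of
    the queen in column i (one queen per column by construction, so
    only same-row and same-diagonal attacks are possible).
    """
    rows = [int(ch) for ch in state]
    count = 0
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            same_row = rows[i] == rows[j]
            same_diagonal = abs(rows[i] - rows[j]) == j - i
            if same_row or same_diagonal:
                count += 1
    return count
```

A state is a solution exactly when `attacking_pairs(state) == 0`.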
Let's walk through exactly how hill climbing works on the 8-Queens problem, step by step.
Looking at our initial state with h = 17 attacking pairs:
Starting from h = 17, a typical run might progress: 17 → 12 → 8 → 5 → 3 → 2 → 1 → stuck at a local minimum
Understanding the Fundamental Limitations
Before diving into specific problems, let's visualize the optimization landscape that hill climbing algorithms must navigate. This landscape metaphor helps us understand why local search can get stuck.
A local maximum is a state that is better than all its neighbors, but not necessarily the best state in the entire search space.
In 8-Queens, hill climbing might get stuck at h = 1 (one attacking pair), where every single move makes things worse, even though h = 0 solutions exist.
A plateau is a flat area of the search space where all neighboring states have the same objective function value.
Standard hill climbing stops because no neighbor is better, but progress might be possible by moving to equally good states.
Plateaus commonly occur in neural network training when loss functions have flat regions, scheduling problems with equivalent arrangements, and game AI when multiple moves have equal evaluation scores.
A ridge is a long, narrow area of high values with steep sides. The challenge is that progress along the ridge requires diagonal movement, but hill climbing can only move to direct neighbors.
In neural network training, the loss function often has ridges where optimal progress requires adjusting multiple weights simultaneously, while naive coordinate-wise search adjusts only one dimension at a time.
Ridges are less common in 8-Queens because of the discrete nature of the problem and the neighbor definition, but they can occur in continuous optimization problems or in problems with more complex neighbor relationships.
Let's examine the empirical performance of hill climbing on the 8-Queens problem to understand its strengths and limitations.
Trade-off Analysis: Hill climbing trades completeness (guaranteed solution finding) for speed and memory efficiency. This trade-off is often worthwhile in real-world applications where "good enough" solutions are acceptable.
Escaping Local Optima
The simplest modification to basic hill climbing is to allow sideways moves - moves to neighbors with equal objective function values. This helps escape plateaus and shoulders.
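The sideways-move variant can be sketched as follows. This is a minimal sketch for maximization; the cap on consecutive sideways moves (here defaulting to 100, an assumed value) prevents the search from wandering on a plateau forever:

```python
import random

def hill_climb_sideways(initial, neighbors, objective, max_sideways=100):
    """Hill climbing (maximization) that also accepts sideways moves:
    steps to equally good neighbors, capped at max_sideways in a row."""
    current, sideways = initial, 0
    while True:
        candidates = list(neighbors(current))
        if not candidates:
            return current
        best_value = max(objective(s) for s in candidates)
        if best_value < objective(current):
            return current                  # strict local maximum: stop
        if best_value == objective(current):
            sideways += 1                   # sideways move across a plateau
            if sideways > max_sideways:
                return current
        else:
            sideways = 0                    # real progress resets the cap
        current = random.choice(
            [s for s in candidates if objective(s) == best_value])
```

On a landscape with a flat shelf between the start and the peak, this version can cross the plateau where strict hill climbing would stop at its edge.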
"If at first you don't succeed, try, try again!" This is the most practical and widely-used enhancement to hill climbing.
Start from a random initial state and run hill climbing until it terminates (either finds solution or gets stuck).
If hill climbing found a satisfactory solution, return it and stop. Otherwise, record the best solution found so far.
Create a completely new random initial state, independent of previous attempts. This gives us a fresh perspective on the search space.
Go back to step 1 and repeat the entire process. Continue until finding a solution or reaching a time/iteration limit.
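The four steps above amount to a simple wrapper around any single-run hill climber. A minimal sketch; `run_once` is assumed to perform one complete hill-climbing run from a fresh random state and return a `(state, value)` pair, with lower values being better:

```python
def random_restart(run_once, is_solution, max_restarts=100):
    """Random-restart hill climbing.

    run_once()        -> (state, value) from one run starting at a
                         fresh random initial state (lower is better)
    is_solution(state) -> True if the state is satisfactory
    """
    best_state, best_value = None, float("inf")
    for _ in range(max_restarts):
        state, value = run_once()        # step 1: one full greedy run
        if is_solution(state):           # step 2: satisfactory? return it
            return state
        if value < best_value:           # step 2: record best seen so far
            best_state, best_value = state, value
        # steps 3-4: the next iteration starts from a new random state
    return best_state                    # give up: return best found
```

Because each restart samples a new random initial state, the probability of missing a solution shrinks geometrically with the number of restarts.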
Pure hill climbing can never be complete (guaranteed to find a solution if one exists) because it can get permanently stuck in local optima.
Combine the efficiency of greedy hill climbing with the completeness of random exploration.
At each step, make a probabilistic choice between greedy improvement and random exploration.
Instead of always choosing the single best neighbor, stochastic variations introduce randomness in the selection process while still biasing toward better moves.
How it works: If we have neighbors with improvements of +2, +5, and +1, we might select them with probabilities 2/8, 5/8, and 1/8 respectively.
Advantage: A related variant, first-choice hill climbing, generates random neighbors one at a time and takes the first one that improves on the current state. This is useful when the number of neighbors is very large, as we don't need to evaluate all of them.
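The improvement-proportional selection described above can be implemented with a single cumulative pass. A minimal sketch; with improvements `[2, 5, 1]` it returns indices 0, 1, 2 with probabilities 2/8, 5/8, and 1/8:

```python
import random

def stochastic_choice(improvements):
    """Pick an index with probability proportional to its improvement.

    improvements: list of positive deltas, one per improving neighbor.
    """
    total = sum(improvements)
    r = random.uniform(0, total)         # a point on the "roulette wheel"
    cumulative = 0.0
    for i, delta in enumerate(improvements):
        cumulative += delta              # each delta owns a slice of the wheel
        if r <= cumulative:
            return i
    return len(improvements) - 1         # guard against float rounding
```

The same routine doubles as roulette-wheel selection in the genetic algorithms discussed later, with fitness scores in place of improvements.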
Simulated annealing mimics the physical process of annealing in metallurgy - slowly cooling molten metal to form perfect crystal structures.
Simulated annealing extends hill climbing by occasionally accepting moves that worsen the objective function, with the probability of acceptance decreasing over time.
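The acceptance rule is the heart of the algorithm: a worsening move with energy increase Δ is accepted with probability exp(-Δ/T), and the temperature T decreases over time. A minimal sketch for minimization; the geometric cooling schedule and its parameters are assumptions, not the only choice:

```python
import math
import random

def anneal(initial, neighbor, energy, t0=1.0, cooling=0.995, steps=10000):
    """Simulated annealing (minimization).

    neighbor(state) -> a random neighboring state
    energy(state)   -> numeric cost; lower is better
    """
    current = initial
    current_e = energy(current)
    t = t0
    for _ in range(steps):
        candidate = neighbor(current)
        delta = energy(candidate) - current_e
        # Always accept improvements; accept worsening moves with
        # probability exp(-delta / T), which shrinks as T cools.
        if delta <= 0 or random.random() < math.exp(-delta / t):
            current, current_e = candidate, current_e + delta
        t *= cooling
    return current
```

At high temperature the search behaves almost like a random walk; as T approaches zero it degenerates into pure hill climbing, which is exactly the "cooling" intuition from metallurgy.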
Instead of maintaining a single solution, genetic algorithms maintain a population of solutions and evolve them over generations using principles inspired by natural selection.
Start with a diverse population of random solutions (typically 50-1000 individuals). Each solution is encoded as a string (like DNA).
Evaluate each individual using a fitness function (the inverse of a cost function: higher is better). Better solutions get higher fitness scores and a higher probability of reproduction.
Choose pairs of parents for breeding based on fitness. Common methods: roulette wheel selection, tournament selection, rank-based selection.
Combine genetic material from two parents to create offspring. For strings, this might mean swapping segments at random crossover points.
Randomly change small parts of offspring to maintain genetic diversity and explore new areas of the search space.
Let's see how genetic algorithms work on 8-Queens with a concrete example of one generation.
Each individual is a string of 8 digits representing queen positions:
Selection probabilities are proportional to fitness, here the number of non-attacking pairs (28 minus the number of attacking pairs).
Parent 1: "24748552" | Parent 2: "32752411"
Crossover point after position 3: "247|48552" and "327|52411" produce offspring "24752411" and "32748552".
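Crossover and mutation on these string genomes can be sketched as follows. This is a minimal sketch; the per-gene mutation rate and the digit range 1-8 are assumptions for the 8-Queens encoding:

```python
import random

def crossover(p1, p2, point=None):
    """Single-point crossover: swap suffixes at the crossover point."""
    if point is None:
        point = random.randint(1, len(p1) - 1)  # never split at the ends
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(individual, rate=0.1):
    """With probability `rate` per gene, replace a digit with a random row."""
    genes = list(individual)
    for i in range(len(genes)):
        if random.random() < rate:
            genes[i] = str(random.randint(1, 8))
    return "".join(genes)
```

For the parents above, `crossover("24748552", "32752411", point=3)` reproduces exactly the offspring pair shown in the worked example.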
Hill Climbing. Best For: Quick solutions, limited resources
Genetic Algorithms. Best For: Complex problems, high-quality solutions
Real-World Impact: These algorithms power machine learning training, circuit design, scheduling systems, game AI, financial optimization, and countless other applications where finding excellent solutions quickly is more valuable than guaranteeing perfect solutions slowly.
You now understand the foundations of modern optimization - from simple hill climbing to sophisticated evolutionary algorithms. These tools are essential for any AI practitioner tackling real-world optimization challenges!