When everything else fails — heuristics

The last layer: search frameworks with no quality bounds, frequently the most useful at production scale.

When exact methods fail, approximations are too loose, and parameterized algorithms do not apply, the last layer is heuristic search. Simulated annealing accepts worsening moves with temperature-dependent probability, cooling toward greedy. Tabu search maintains a memory of recently visited solutions to force exploration. Genetic algorithms evolve populations. Large neighborhood search alternates destruction and repair. None come with worst-case quality guarantees. All routinely produce strong solutions on real instances when nothing else fits. Modern vehicle-routing solvers are heuristic-based. Modern scheduling SaaS like OptaPlanner combine heuristics with constraint solvers. Rust's argmin library provides a framework for several gradient-free methods. At this layer of the stack, you are doing engineering, not theory. Measure on representative instances. Tune the parameters. Accept that the work is empirical.

Simulated annealing in plain Rust — pick a neighbor, accept improvements always, accept regressions with temperature-dependent probability, cool gradually. For a real codebase reach for argmin to get logging, restarts, and the Metropolis criterion already wired up.

use rand::Rng;

fn anneal<S: Clone>(
    mut state: S,
    cost: impl Fn(&S) -> f64,
    neighbor: impl Fn(&S, &mut dyn rand::RngCore) -> S,
    mut temp: f64,
    cooling: f64,
    iterations: usize,
) -> S {
    let mut rng = rand::thread_rng();
    let mut best = state.clone();
    let mut best_cost = cost(&best);
    for _ in 0..iterations {
        let candidate = neighbor(&state, &mut rng);
        let delta = cost(&candidate) - cost(&state);
        if delta < 0.0 || rng.gen::<f64>() < (-delta / temp).exp() {
            state = candidate;
            let c = cost(&state);
            if c < best_cost { best_cost = c; best = state.clone(); }
        }
        temp *= cooling;
    }
    best
}

cost: impl Fn(&S) -> f64 declares a parameter that accepts any value implementing the trait Fn(&S) -> f64 — any function or closure that takes &S and returns f64. This is impl Trait in argument position, syntactic sugar for a generic with a trait bound (<F: Fn(&S) -> f64>). Rust has three closely-related function traits: Fn(...) -> R can be called any number of times and only borrows its captures (the most permissive); FnMut(...) -> R can be called any number of times but is allowed to mutate captures; FnOnce(...) -> R can be called exactly once and is allowed to move out of captures (the most restrictive caller). The compiler picks the loosest trait each closure can satisfy. Pick the loosest trait your callee actually needs — Fn here, because the cost function is called many times and reads but does not change its captures. Callers can then pass a regular fn item, a closure, or a function pointer interchangeably.

// Three function traits — pick the loosest one your code actually needs.
//
//   Fn(args) -> R       can be called any number of times.
//                       only borrows its captures.  most permissive caller.
//
//   FnMut(args) -> R    can be called any number of times.
//                       can MUTATE its captures.  caller needs &mut access.
//
//   FnOnce(args) -> R   can be called exactly once.
//                       can MOVE out of captures.  most restrictive caller.


// Page 36's signature:
fn anneal<S: Clone>(
    /* ... */
    cost:     impl Fn(&S) -> f64,                       // called many times
    neighbor: impl Fn(&S, &mut dyn rand::RngCore) -> S, // also Fn-compatible
    /* ... */
) -> S { /* ... */ }


// Caller side — any of these work:

// 1. A free function.
fn distance(s: &Point) -> f64 { (s.x * s.x + s.y * s.y).sqrt() }
anneal(/* ... */, distance, /* ... */);

// 2. A closure capturing nothing.
anneal(/* ... */, |s: &Point| s.x.abs() + s.y.abs(), /* ... */);

// 3. A closure capturing a value by reference (Fn-compatible — borrows only).
let target = 10.0;
anneal(/* ... */, |s: &Point| (s.x - target).abs(), /* ... */);

// 4. A closure that MUTATES — needs FnMut, not Fn.  Page 36's Fn bound rejects:
let mut call_count = 0;
anneal(/* ... */, |s: &Point| { call_count += 1; s.x }, /* ... */);
//                                ─────────────  error: would need FnMut


// impl Trait vs explicit generic — same meaning, different syntax:
fn anneal_v1<S, F: Fn(&S) -> f64>(state: S, cost: F) -> f64 { cost(&state) }
fn anneal_v2<S>(state: S, cost: impl Fn(&S) -> f64) -> f64 { cost(&state) }

// impl Trait in argument position is sugar for an anonymous generic
// parameter.  More concise; the explicit form is sometimes preferred
// when you want to refer to the type name elsewhere (e.g. with a turbofish).

<S: Clone> declares S as a generic type parameter with a trait bound: S can be any type, as long as it implements the Clone trait. Trait bounds are how Rust expresses "any type T that supports these specific operations." Without the bound the function would not compile, because let mut best = state.clone(); calls Clone::clone() and the compiler will not let you call a method you have not proven the type supports. Annealing needs Clone because tracking "best state seen so far" means duplicating the state at the moment it became the best — you cannot keep both the current and best state without duplication. Trait bounds compose with +: <S: Clone + Send> requires both Clone and Send. For complex bounds, a where clause keeps the function signature readable.

// Generic with a single trait bound — S is any type that implements Clone.
fn anneal<S: Clone>(
    mut state: S,
    /* ... */
) -> S {
    let mut best = state.clone();         // .clone() requires S: Clone
    /* ... */
    best
}

// Without the Clone bound:
fn anneal_broken<S>(state: S) -> S {
    let best = state.clone();              // error: no method `clone` found
    best                                    //         (S has no traits to call)
}


// Compose bounds with +:
fn save<S: Clone + std::fmt::Debug>(s: &S) {
    println!("saving {:?}", s);             // requires Debug
    let _backup = s.clone();                 // requires Clone
}


// where clauses — same meaning, more readable for many or long bounds:
fn anneal_wide<S>(
    state: S,
    cost: impl Fn(&S) -> f64,
) -> S
where
    S: Clone + Send + 'static,
{
    /* ... */
}


// Common bounds and what they enable:
//   Clone           .clone() — make a duplicate
//   Copy            implicit by-value copy (page 15)
//   Debug           {:?} formatting
//   Display         {} formatting
//   PartialEq, Eq   == comparisons
//   Hash            use as HashMap or HashSet key
//   Send            safe to move to another thread
//   Sync            safe to share between threads
//   Iterator        be a source for for-loops and combinators
//
// Add a bound only for the operations you actually need.  Asking for
// more than necessary forces callers to implement traits they don't need.

temp *= cooling is a compound assignment — equivalent to temp = temp * cooling. Rust has the standard family from C / C++ / Python / JavaScript: +=, -=, *=, /=, %=, plus the bitwise variants &=, |=, ^=, <<=, >>=. On page 36 this line gradually reduces the temperature each iteration — with cooling = 0.99, the temperature drops to about 36% of its starting value after 100 iterations and to about 0.7% after 500. The acceptance probability (-delta / temp).exp() depends on temp, so as the schedule cools, the algorithm becomes more selective: high temp accepts almost any worsening move; low temp behaves like greedy descent. The smooth interpolation between "explore" and "exploit" is the whole point of simulated annealing.

// temp *= cooling  ≡  temp = temp * cooling

let mut temp: f64 = 100.0;
let cooling: f64 = 0.99;

for _ in 0..100 { temp *= cooling; }
// temp ≈ 36.6

for _ in 0..400 { temp *= cooling; }
// temp ≈ 0.66


// The full family of compound assignment operators:
//
//   +=  -=  *=  /=  %=        arithmetic
//   &=  |=  ^=                  bitwise AND / OR / XOR
//   <<= >>=                     bitwise shift


// Why simulated annealing uses geometric cooling:
//
// The acceptance probability for a worsening move is exp(-delta / temp).
//
//   high temp  →  exp(small magnitude) ≈ 1     ← accepts almost any move
//   low temp   →  exp(large magnitude) ≈ 0     ← accepts only improvements
//
// Geometric cooling (temp *= cooling) gives smooth interpolation between
// "explore" and "exploit" — early iterations bounce around the solution
// landscape, late iterations settle into a local minimum.
//
// Linear, logarithmic, and adaptive schedules exist; geometric is the
// canonical default and tunes with one parameter.


// One detail to note: compound assignment is a single operation, not two.
// Rust's borrow checker can treat `x *= y` differently from `x = x * y`
// for operator-overloaded types (implementing AddAssign separately from
// Add can avoid an intermediate value).  For plain numbers there's no
// observable difference.

In plain terms

When exact methods are too slow, approximation algorithms have ratios too loose, parameterized complexity does not apply, and the problem is too unstructured for specialized algorithms, you reach for heuristic search. This is the bottom layer of the optimization toolkit, the place where engineers admit there is no theoretical bound on solution quality and just measure performance on real inputs.

The classics:

Simulated annealing imitates the physical process of cooling a metal. Start at high temperature, accept any neighboring solution including worse ones. Cool gradually, become more selective. At low temperature, behave like greedy descent. The cooling schedule is the art. Originally applied to VLSI layout and TSP; now used across operations research.

Tabu search keeps a list of recently visited solutions and forbids returning to them, forcing the search to explore beyond local optima. Aspiration criteria allow forbidden moves when they would improve the global best. Tabu search is the production workhorse for many scheduling and routing systems.

Genetic algorithms encode candidate solutions as "chromosomes," select parents by fitness, combine them by crossover, mutate, and iterate. The biological metaphor is overstated; most successful genetic algorithms are essentially stochastic local search with a population.

Large neighborhood search (LNS) is the modern workhorse for vehicle routing. Take a current solution, destroy a piece of it (remove a subset of routes), repair the destroyed piece with an exact or heuristic subroutine, iterate. When the destroy-and-repair subproblem is small enough for exact methods, LNS combines the best of both worlds. Production VRP solvers like Vroom and OR-Tools are LNS-based.

In Rust, the argmin library provides a framework for several gradient-free methods including simulated annealing and particle swarm. For specific metaheuristics tailored to your problem, hand-rolling is common — the algorithms are short and the customization needed is usually problem-specific.

The honest reality at this layer: you are doing engineering, not theory. Bounds do not exist. Quality is an empirical question. The pattern that works is to set up a benchmark of representative real instances, try several methods, tune parameters, measure, ship the best.

When a problem has gone past every theoretical category — past exact, past approximation, past parameterization — accept that you are in the engineering layer and plan accordingly. This is where most production optimization actually lives.