Approximation — provably close to optimal

An approximation algorithm trades exactness for speed — with a number attached to how much you sacrificed.

When exact methods are too slow for a hard problem, an approximation algorithm is the next layer of the toolkit. An approximation is a polynomial-time algorithm with a proven bound on how far its answer can be from optimal — for example, "always within a factor of two." Classic results: 2-approximation for vertex cover (greedy edge picking), 3/2-approximation for metric TSP (Christofides 1976), about-1.22 for bin packing (First-Fit-Decreasing), 1+log(n) for set cover (greedy). The bound is the contract: your team knows exactly how much they are sacrificing. Some problems are not even approximable — independent set, clique, graph-coloring" data-term="graph coloring">graph coloring all resist constant-factor approximations. When sizing a hard-problem feature, the relevant question is not "can we solve this" but "what approximation ratio is acceptable to the business and what algorithm achieves it."

The classic 2-approximation for vertex cover: pick any uncovered edge, add both endpoints, repeat. Twenty lines, formal guarantee of at most twice the optimal cover size. Faster than the ILP from page 27 — use it when 2× is good enough.

use std::collections::HashSet;

fn vertex_cover_2approx(n: usize, edges: &[(usize, usize)]) -> HashSet<usize> {
    let mut covered = vec![false; edges.len()];
    let mut cover = HashSet::new();
    for (i, &(u, v)) in edges.iter().enumerate() {
        if covered[i] { continue; }
        cover.insert(u);
        cover.insert(v);
        let _ = n; // unused; kept for signature parity
        for (j, &(a, b)) in edges.iter().enumerate() {
            if cover.contains(&a) || cover.contains(&b) { covered[j] = true; }
        }
    }
    cover
}

.enumerate() is an iterator combinator that pairs every element with its 0-based index, yielding (usize, item) tuples. Same idea as Python's enumerate(list) or JavaScript's Array.prototype.entries(). Page 33 uses it twice — both times to walk the edges while keeping the index handy for indexing into the covered boolean array. The double destructuring (i, &(u, v)) unpacks both layers in one step: the outer tuple is the (index, &edge) pair from enumerate, and the inner &(u, v) reaches past the reference to copy the two usize endpoints (the same Copy trick from page 15). After binding, i is the edge index, u and v are the endpoints — clean to use, no * dereferences needed in the body.

// .enumerate() — pair each element with its 0-based index.
let names = ["Alice", "Bob", "Carol"];

for (i, name) in names.iter().enumerate() {
    println!("{}: {}", i, name);
}
// 0: Alice
// 1: Bob
// 2: Carol


// Page 33's pattern — destructure index AND inner tuple values at once:
for (i, &(u, v)) in edges.iter().enumerate() {
//   ─  ──────
//   │      │
//   │      └── &(u, v): reach past the reference and copy out the
//   │           two usize values (Copy trick from page 15)
//   └──────── i: the index, usize, from enumerate()
    if covered[i] { continue; }
    cover.insert(u);
    cover.insert(v);
    // ...
}


// Equivalent without enumerate — index-heavy, more typing:
for i in 0..edges.len() {
    if covered[i] { continue; }
    let (u, v) = edges[i];               // separate indexing step
    cover.insert(u);
    cover.insert(v);
}


// .enumerate() chains with other combinators — works on any iterator:
let scores = vec![85, 92, 78, 95];

let above_80: Vec<usize> = scores
    .iter()
    .enumerate()
    .filter(|(_, &s)| s > 80)
    .map(|(i, _)| i)
    .collect();
// = [0, 1, 3]    — indices of scores above 80


// Note the order: (index, item).  Same convention as Python's enumerate.
//
// Family of "yield more info with each element" combinators:
//   .enumerate()         (index, item) pairs
//   .zip(other)          (item, other_item) pairs        (page 32)
//   .peekable()          .peek() looks at the next item without consuming it
//   .step_by(n)          take every nth element

In plain terms

An approximation algorithm is the honest reply to NP-completeness. Instead of finding the optimal answer to a hard problem (which takes forever), you find an answer that is provably close — within a known factor — in polynomial time. The bound is the contract.

The textbook starter is the 2-approximation for vertex cover. Pick any uncovered relationship, add both of its endpoints to your cover, repeat. The result is at most twice the optimal cover size. Twenty lines of code, formal guarantee.

Christofides' 3/2-approximation for metric Traveling Salesman uses a more elaborate construction — a minimum spanning tree, a matching on its odd-degree nodes, an Euler tour, shortcuts — and produces a tour at most 50% longer than optimal. This was the best known bound for forty-four years, finally improved to 3/2 − ε in 2020.

First-Fit-Decreasing for bin packing uses at most 22% more bins than optimal. Greedy set cover gets within a logarithmic factor of optimal. These bounds matter because they let you tell the business exactly what you are giving up.

For some problems, no good approximation is possible. Independent set, clique, and graph-coloring" data-term="graph coloring">graph coloring all resist constant-factor approximation — under standard assumptions, no polynomial-time algorithm can guarantee better than a roughly-input-size factor. For these problems, the right reply to "what can we promise" is "nothing, but here is what works in practice."

The PCP theorem from 1998 is the theoretical engine that proves these limits. It is one of the most consequential results in modern computer science: it shows that many NP-hard problems are also hard to approximate beyond specific thresholds. The thresholds vary by problem and they are tight.

For business planning, the question to ask once you have identified a problem as NP-hard is: what approximation ratio does the business accept? 10% off optimal? 50%? Half-orders-of-magnitude? The answer determines whether you need an exact solver (page 32), an approximation algorithm (this page), or a heuristic with no guarantees (page 36).

The bound is the negotiation. Use it.