Reducibility
Glossary Contents
← Ch V
Chapter VI Set Cover
Ch VII →
  1. 16 — Cover the universe
  2. 17 — Greedy in R
  3. 18 — Where to advertise
17

Greedy in R

R thinks in sets and vectors natively. The greedy heuristic is six lines of idiomatic vector code.

  1. 01
    Start with a universe U and a family F as a list of integer vectors. setdiff/union/intersect are built-ins; we don't need to write any glue.
    U <- 1:10
    F <- list(c(1,2,3,8), c(1,2,3,4,5), c(4,5,7),
              c(5,6,7), c(6,7,8,9,10), c(2,9,10))
  2. 02
    Greedy step: among unused sets, pick the one that covers the most still-uncovered elements. sapply lets you score every candidate in one expression.
    pick_best <- function(F, uncovered) {
      scores <- sapply(F, function(s) length(intersect(s, uncovered)))
      which.max(scores)
    }
  3. 03
    Iterate until the universe is covered. The cover is a vector of indices into F. Watch the "uncovered" set shrink each round — that's the analysis exhibit for the ln(n) bound.
    greedy_cover <- function(U, F) {
      cover <- integer(0); uncovered <- U
      while (length(uncovered) > 0) {
        i <- pick_best(F, uncovered)
        cover <- c(cover, i)
        uncovered <- setdiff(uncovered, F[[i]])
        F[[i]] <- integer(0)
      }
      cover
    }
  4. 04
    Run it. R's REPL gives back the indices; F[cover] is the actual chosen subfamily. For 10⁶-element universes you'd swap to a sparse-matrix backend (Matrix package), but the loop is identical.
    > greedy_cover(U, F)
    [1] 5 2 4
    > F[greedy_cover(U, F)]
    [[1]] 6 7 8 9 10
    [[2]] 1 2 3 4 5
    [[3]] 5 6 7
Grammar — R

R is built around vectors. Almost every operation broadcasts elementwise, which is why the set-cover greedy step fits in one line of sapply.

<-
the canonical assignment operator. = also works but is reserved for argument passing in idiomatic R.
c(1, 2, 3)
combine values into a vector. R's universal "make a list" function.
1:n
integer range vector — same as c(1, 2, ..., n).
list(...)
a heterogeneous container. Access by i (single bracket gives a sublist).
function(x) { ... }
an anonymous function. The body is the value of the last expression.
sapply(F, fn)
simplifying apply — run fn over each element of F, return a vector.
setdiff/union/intersect
built-in set operations on vectors. No imports needed.
which.max(v)
the index of the maximum element. Pair with which.min for the other end.
while (cond) { ... }
imperative loop. R has for and repeat too, but vectorized ops usually win.
Set cover's greedy step: for each candidate set, count how many uncovered elements it would cover. sapply gives one number per candidate; which.max picks the winner.
uncovered <- c(1, 2, 3, 4, 5)
F <- list(c(1, 2),    c(3, 4, 5),  c(1, 5))
scores <- sapply(F, function(s) length(intersect(s, uncovered)))
# scores -> c(2, 3, 2)
best <- which.max(scores)  # -> 2 (the middle set covers 3)
The R Project — downloads and manuals →

Scan to come back to page 17

← 16 Cover the universe
2/3 · 17/63
18 Where to advertise →