17
Greedy in R R thinks in sets and vectors natively. The greedy heuristic is six lines of idiomatic vector code.
01
Start with a universe U and a family F as a list of integer vectors. setdiff/union/intersect are built-ins; we don't need to write any glue.
U <- 1:10
F <- list(c(1,2,3,8), c(1,2,3,4,5), c(4,5,7),
c(5,6,7), c(6,7,8,9,10), c(2,9,10))02
Greedy step: among unused sets, pick the one that covers the most still-uncovered elements. sapply lets you score every candidate in one expression.
pick_best <- function(F, uncovered) {
scores <- sapply(F, function(s) length(intersect(s, uncovered)))
which.max(scores)
}03
Iterate until the universe is covered. The cover is a vector of indices into F. Watch the "uncovered" set shrink each round — that's the analysis exhibit for the ln(n) bound.
greedy_cover <- function(U, F) {
cover <- integer(0); uncovered <- U
while (length(uncovered) > 0) {
i <- pick_best(F, uncovered)
cover <- c(cover, i)
uncovered <- setdiff(uncovered, F[[i]])
F[[i]] <- integer(0)
}
cover
}04
Run it. R's REPL gives back the indices; F[cover] is the actual chosen subfamily. For 10⁶-element universes you'd swap to a sparse-matrix backend (Matrix package), but the loop is identical.
> greedy_cover(U, F)
[1] 5 2 4
> F[greedy_cover(U, F)]
[[1]] 6 7 8 9 10
[[2]] 1 2 3 4 5
[[3]] 5 6 7 Grammar — R
R is built around vectors. Almost every operation broadcasts elementwise, which is why the set-cover greedy step fits in one line of sapply.
<- the canonical assignment operator. = also works but is reserved for argument passing in idiomatic R. c(1, 2, 3) c ombine values into a vector. R's universal "make a list" function.1:n integer range vector — same as c(1, 2, ..., n). list(...) a heterogeneous container. Access by i (single bracket gives a sublist). function(x) { ... } an anonymous function. The body is the value of the last expression. sapply(F, fn) simplifying apply — run fn over each element of F, return a vector.setdiff/union/intersect built-in set operations on vectors. No imports needed. which.max(v) the index of the maximum element. Pair with which.min for the other end. while (cond) { ... } imperative loop. R has for and repeat too, but vectorized ops usually win. Set cover's greedy step: for each candidate set, count how many uncovered elements it would cover. sapply gives one number per candidate; which.max picks the winner.
uncovered <- c(1, 2, 3, 4, 5)
F <- list(c(1, 2), c(3, 4, 5), c(1, 5))
scores <- sapply(F, function(s) length(intersect(s, uncovered)))
# scores -> c(2, 3, 2)
best <- which.max(scores) # -> 2 (the middle set covers 3) The R Project — downloads and manuals →