N or P
Contents Glossary
07

Traversing a network

Whenever the data shape is "things connected to other things," the operations are cheap. Plan accordingly.

Any time your data takes the shape of items connected to other items — a social graph, a dependency tree, an org chart, a file system, a workflow — the basic operations are cheap. Visiting every connected item, finding the shortest chain between two items, detecting cycles, finding clusters — all run in time proportional to the number of items plus the number of connections. No clever data structure or specialized solver is needed. The Rust library petgraph implements every standard traversal as one function call. When a feature request looks like "find everyone this person is connected to within three hops," the engineering estimate is a day. When it looks like "tell me the most influential person in our network," the estimate may also be a day with the right algorithm (PageRank, see page 22).

  1. 01
    Friends-of-friends within three hops — the textbook BFS. petgraph exposes Bfs as a visitor you drive with a loop, so you can stop at any depth without writing the queue or the visited set yourself.
    use petgraph::graph::{NodeIndex, UnGraph};
    use petgraph::visit::{Bfs, Walker};
    
    fn friends_within(g: &UnGraph<&str, ()>, start: NodeIndex, hops: usize) -> Vec<NodeIndex> {
        let mut bfs = Bfs::new(g, start);
        let mut out = Vec::new();
        let mut depth = vec![usize::MAX; g.node_count()];
        depth[start.index()] = 0;
        while let Some(n) = bfs.next(g) {
            if depth[n.index()] > hops { continue; }
            out.push(n);
            for nbr in g.neighbors(n) {
                if depth[nbr.index()] == usize::MAX {
                    depth[nbr.index()] = depth[n.index()] + 1;
                }
            }
        }
        out
    }
  2. 02
    BFS stands for Breadth-First Search — one of the two canonical ways to walk a graph. BFS visits the graph in layers — first the start node, then everything one hop away, then everything two hops away, and so on. Its counterpart is DFS (Depth-First Search), which dives down one branch as far as it can before backing up to try another. For a "friends within N hops" feature the layer number IS the answer, so BFS is the right shape; for "does this graph have a cycle" or "find any path that works" DFS is usually shorter. Both are linear time in the size of the graph, and petgraph exposes both as one-line visitor types (Bfs and Dfs).
    Graph (Alice is the start):
    
                      Alice                ← depth 0
                    /   |   \
                  Bob  Carl  Dee           ← depth 1   (1 hop from Alice)
                  /  \        |
               Eve   Frank  Greta          ← depth 2   (2 hops)
                              |
                            Henry          ← depth 3   (3 hops)
    
    
    BFS order  (layer by layer):
        Alice  →  Bob, Carl, Dee  →  Eve, Frank, Greta  →  Henry
    
    DFS order  (dive, then back up):
        Alice  →  Bob  →  Eve  →  Frank  →  Carl  →  Dee  →  Greta  →  Henry
    
    
    Use BFS  when the question is about distance:
        "friends within 3 hops"
        "shortest unweighted path"
        "which nodes are reachable in <= k steps"
    
    Use DFS  when the question is about structure:
        "is there any cycle"
        "what is a topological order"
        "find any path that works"
Cormen, Leiserson, Rivest, Stein (2009) Introduction to Algorithms, chapter 22. Source →
In plain terms

Graph data shows up everywhere in business systems. Customer relationships, employee org charts, product dependency trees, supply chain links, communication threads, recommendation systems — all graphs. The good news for planning is that basic graph operations are some of the cheapest things a computer does.

There are two standard ways to walk a graph: breadth-first (visit everything one hop away, then two hops, then three) and depth-first (follow one branch as far as it goes, then back up and try another). Both touch every node and every connection exactly once. Both run in time proportional to the size of the graph. Most business questions about a graph reduce to one of these two walks plus some bookkeeping.

Breadth-first is the right answer for distance questions. How many degrees of separation between these two users? Who is within three handoffs of this lead? Which servers can be reached from the compromised host? All breadth-first.

Depth-first is the right answer for structural questions. Are there any circular dependencies in this build graph? Which modules form an isolated subsystem? What is the right order to apply these database migrations? All depth-first.

In Rust, the petgraph library exposes both as one-line calls. Build the graph, ask for a traversal, iterate the result. Your team should not be writing their own.

The estimating implication is that any feature whose core operation is "walk the graph" should be measured in days, not weeks. The work is in the data modeling — defining what counts as a node, what counts as an edge, what metadata each carries — not in the traversal itself.

When a graph problem feels expensive, the question to ask is whether you are doing more than one walk per query. If the same walk is hitting the database every time a user clicks, you need a cache or a precomputed result. The walk itself is cheap. Doing it a hundred million times is not.

← 06 Sorting — almost never your problem
7/39
08 Routing — the shortest path through a network →