The edit distance between two strings is the number of character insertions, deletions, or substitutions to transform one into the other. It is the standard measure of string similarity and the foundation of spell checkers, diff tools, fuzzy search, plagiarism detection, and DNA sequence alignment. The algorithm is a textbook dynamic programming routine that runs in time proportional to the product of the two string lengths. Rust's strsim library implements Levenshtein distance, Damerau-Levenshtein (adds transposition), Jaro-Winkler (weights early matches more), and several other common variants as one-line calls. For very long sequences (DNA), the work scales as a square of the length and becomes expensive — specialized libraries with diagonal-band optimization handle it. For short strings (names, addresses, search queries), the work is microseconds and your team should reach for the library.
18
How different are these two strings
Spell check, diff tools, DNA matching, fuzzy search — all the same algorithm, all polynomial.
-
strsimexposes every common edit-distance variant as a one-liner. Levenshtein for spell check, Damerau-Levenshtein for typos that include swaps, Jaro-Winkler for name matching where the first few characters matter most.use strsim::{damerau_levenshtein, jaro_winkler, levenshtein}; fn best_match<'a>(query: &str, candidates: &'a [&'a str]) -> Option<&'a str> { candidates .iter() .min_by_key(|c| levenshtein(query, c)) .copied() } fn name_score(a: &str, b: &str) -> f64 { jaro_winkler(a, b) // 1.0 = identical, 0.0 = no shared prefix } fn typo_distance(a: &str, b: &str) -> usize { damerau_levenshtein(a, b) // counts adjacent swaps as one edit } - Page 4 introduced
'static— the special "lives forever" lifetime. Page 18 introduces a named lifetime parameter:<'a>. The angle brackets are the same place where type generics live (<T>,<K, V>), and lifetime parameters are exactly that — a kind of generic parameter, on the lifetime axis instead of the type axis. Where a type generic says "this function works for any typeT," a lifetime generic says "this function works for any lifetime'a, and here is how the lifetimes of my parameters relate to each other and to my return value." Convention puts lifetimes first inside the angle brackets:fn foo<'a, 'b, T, U>(...). The compiler infers concrete lifetimes at each call site from the references the caller passes in, exactly the way it infers concrete types for type generics.// Type generic — caller picks (or compiler infers) the type: fn first<T>(v: &[T]) -> Option<&T> { v.first() } let n = first(&[1, 2, 3]); // T = i32 let s = first(&["a", "b"]); // T = &str // Lifetime generic — caller picks (or compiler infers) the lifetime: fn longest<'a>(a: &'a str, b: &'a str) -> &'a str { if a.len() > b.len() { a } else { b } } // The returned reference lives at most as long as the shorter of a, b. // Both kinds in one signature — lifetimes first, types second: fn pick<'a, T>(haystack: &'a [T], i: usize) -> &'a T { &haystack[i] } // Same machinery, different axis: // // type generic lifetime generic // declaration <T> <'a> // used in parameter / return parameter / return // chosen by caller / inference caller / inference // meaning "any type T" "any lifetime 'a" // // Lifetimes do not exist at runtime — the compiler erases them. // They are entirely about compile-time relationships between borrows. -
candidates: &'a [&'a str]reuses the same lifetime label'ain two positions, which ties them together. The outer slice and the inner&strviews must both live at least as long as'a. Why unify them? Because the function returnsOption<&'a str>— a reference whose lifetime is'a. The compiler needs to know the returned string is valid for at least that long, so the caller knows how long they can hang on to it. If the slice itself or any of the strings inside it could disappear sooner than'a, the returned reference would dangle. Tying both to one lifetime says "everything the result might point into shares the same lifetime budget." An alternative —&'a [&'b str]— would use two separate lifetimes; more flexible but more to manage. Unifying is the right default; revisit only if a real caller is blocked by the constraint.fn best_match<'a>(query: &str, candidates: &'a [&'a str]) -> Option<&'a str> // ── ─── ──── ─── // declare 'a slice inner return // lives &strs reference // 'a live 'a lives 'a // What the annotations promise the compiler: // • the slice `candidates` is valid for at least 'a // • each &str inside the slice is valid for at least 'a // • the returned &str is valid for exactly 'a — caller can hold it that long // Two-lifetime alternative — more flexible, more to manage: fn best_match_2<'slice, 'item>( query: &str, candidates: &'slice [&'item str], ) -> Option<&'item str> // result tied to the items, not the slice where 'slice: 'item { // and slice must outlive items candidates.iter().min_by_key(|c| levenshtein(query, c)).copied() } // One-lifetime version (the page 18 code) is the usual default — pick the // stricter shape that's still ergonomic. Reach for two only when a real // caller needs the slice and items to come from different scopes. // Note that `query: &str` has no annotation. Lifetime elision (page 4) // fills in an anonymous one — the function does not return anything tied // to query, so its lifetime is independent and need not be named. - The last line of
best_matchcalls.copied(). The name resembles theCopytrait introduced on page 15, and the two are related but not the same thing.Copyis a property a type can implement, saying "bitwise duplication is safe for me.".copied()is an iterator method that converts an iterator of&Tinto an iterator ofTby copying each item — and it only compiles when the items being copied implement theCopytrait. So the method uses the trait to do its job. For types that need a deeper duplication (String,Vec<T>), the sibling method is.cloned(), which requires only theClonetrait and may allocate. The other unfamiliar type in this step is&&str. Read right-to-left:stris the unsized string-of-bytes type,&stris a reference to one (the type we usually call a "string slice"), and&&stris a reference to a&str— two reference layers stacked. This is not a reference to a slice (those look like&[T]); the two ampersands here are just&applied to&str. The double layer arises because.iter()always yields references to the elements of a collection, and here the elements are themselves references —candidatesis&[&str], so iterating gives&&str..copied()strips one layer back, turning the iterator's output from&&strto&str.// Copy is a TRAIT — a property of a type. // i32, bool, char, f64, &T → implements Copy // String, Vec<T>, Box<T> → does NOT implement Copy // .copied() is a METHOD on Iterator<Item = &T> where T: Copy. // It dereferences and bit-copies each item, yielding an iterator of T. let v: Vec<i32> = vec![1, 2, 3]; let refs: Vec<&i32> = v.iter().collect(); // iter of &i32 let copies: Vec<i32> = v.iter().copied().collect(); // iter of i32 // Why we use it on page 18 — one fewer layer of reference for the caller. // candidates has type &[&str], so each element is already a &str. // .iter() yields references TO each element, stacking another & on top: // // &&str = &(&str) a reference to a string-slice reference. // Read right-to-left. Not a slice reference // (those look like &[T]) — two & layers stacked. // // candidates &[&str] // .iter() iter of &&str // .min_by_key(|c| levenshtein(query, c)) Option<&&str> ← two & layers // .copied() Option<&str> ← one stripped // The siblings: // .copied() → requires Copy. Pure bit-copy; never allocates. // .cloned() → requires Clone. May allocate (e.g. cloning a String). let strings: Vec<String> = vec!["a".into(), "b".into()]; let owned: Vec<String> = strings.iter().cloned().collect(); // allocates each clone // let owned: Vec<String> = strings.iter().copied().collect(); // error: String: !Copy // Short version: // Copy → a trait; a property of a type. // .copied() → a method; uses the Copy trait to turn &T into T. // They live next to each other and they are not the same thing.