Introduction

I used Claude Opus 4.7 to write this book.

Claude is a large language model. A large language model is a trained reader that writes back. I asked, it drafted, I edited. Every word. I read every word. I kept what was mine. I cut what wasn’t. The sentences here are sentences I approve of. Not because I typed them first. Because I agree with them enough to put my name on the cover.

That is the strongest honest claim I can make. It is not a strong enough claim.

A mechanical typewriter hammers the key into the ribbon and the ribbon into the paper. On the back of the page there is an indentation. A mark. A dent. You can run your finger across it. You can feel that a person was here, at a specific moment, at a specific pressure. The paper remembers.

Digital text does not have a back side. Pixels change color. Electrons rearrange. The computer’s filing cabinet — what engineers call a version-control system — rolls the rearrangement forward or back and leaves no mark. No dent. I can replace this sentence with a better one in a second. I can replace it again in the next second. There is no indentation. The screen does not remember anything.

So when I tell you I read every word, understand what I am and am not claiming. I am claiming intent. I am not claiming the kind of physical, slow-hand accountability a typewriter once extracted from every writer by the friction of the medium. The friction is gone. The accountability is now something I supply on purpose or it does not exist.


There is a newer physics problem here, and it worries me more than the missing indentation.

A person reads at about two hundred and fifty words a minute. A person types at about forty. A person handwrites at about twenty-five. Reading has always been several times faster than writing. The gap is not a coincidence. The gap is the human body. The muscles that move the hand are slower than the eye, and the slower thing is the filter. Writing is the place where thought was forced to pass through the hand, and in that passage, most of what would have been said got rejected before it became a sentence.

That filter is being dismantled.

Brain-computer interfaces decode imagined sentences now. BrainGate types at twenty-two words a minute from hand-motor signals. Stanford's speech prosthesis decodes attempted speech at sixty-two. Meta's non-invasive system sits at forty. The interface is moving from the fingers to the cortex. Every year, the scientists push the sensor closer to the brain.

The muscle delay was not just a bottleneck. It was a jury. For as long as humans have written, the slower thing — the hand — has been where the brain’s drafts were corrected, revised, or quietly dropped. When that goes, anything anyone imagines becomes publishable at the speed of thought. I am not ready for that. I do not think most writers are. I am afraid of what we lose when authorship stops being a physical act.

This book is about that choice.

There is another admission. I had this tool because I could afford it, and because the company I worked for granted me the rest. The subscription costs real money. The account at the tier I used — what engineers call an API key, what a business person calls a paid seat — costs more. And the setup that lets me run not one question at a time but many — what engineers call an agentic workflow, what a manager would recognize as running a small office of specialists handing work to each other — costs most of all. For most of the people who might benefit most from what this book argues, that setup is out of reach.

I cannot fix that in these pages. I can name it. When I tell you an engineer wrote a generic interface and handed it to the business owner, I am describing a workflow that presupposes the engineer had the technical account, the business owner had the browser tab the company paid for, and the company had enough budget to let them both experiment for a week without producing anything shippable. Most organizations do not have that. Most individuals do not either.

The second cost is time. Nobody talks about this one honestly. You do not sit down at an agentic workflow — at your small office of specialists — and produce the book you are reading. You sit down at one question. You type. You get an answer. You feel clever. You get the satisfaction of publishing your thoughts the same day you have them.

And then you start to notice that one question is not enough for the actual work. The good stuff happens when the first answer hands off to a second. The second hands to a third. The third writes a note that a fourth reads and critiques. Engineers call this agentic. A receptionist calls it a team with a good routing process. They are the same thing.

The transition is awkward. You go from asking the chatbot to running the small office. It feels like learning stick after years of automatic. You produce worse output for a while. You feel slower. You want to close the laptop and write the next paragraph by hand just to feel competent again.

That discomfort is the tax you pay to move from tourist to operator. Like any tax on learning, it falls hardest on people who don’t have slack in their schedule to pay it.

Here is a concrete example of what the small office does not do for you.

Early in this book is a chapter about AlphaFold. The model I worked with wanted to open that chapter with the statement that a single protein can fold into ten to the three hundredth power configurations. That is the correct number. It is also a useless number. Nobody can feel ten to the three hundredth power.

Turning that number into a sentence a reader can feel is architecting — what an engineer calls system design, what an editor calls making it land. I did it by reaching for Making Numbers Count by Chip Heath and Karla Starr, a book on how to communicate statistics in a way human beings can actually hold. The version that ended up in the chapter: take every token every AI has ever processed — roughly a quadrillion — and multiply it by itself twenty times. That is one protein.

The model wrote ten to the three hundredth. The human who read books on how numbers land wrote multiply every AI token ever by itself twenty times.

That is the part of authorship the model does not do. You can automate a draft. You cannot automate the translation from a fact that is technically true to a sentence that carries the weight of the fact into a reader who does not come to the page with a PhD.

So when an autodidact with hypergraphia finds a utility this good, they write.

So I am starting from privilege. Money. Access. Time. Every argument that follows has to be read against that. The claim is not that everyone can do what I did tomorrow. The claim is that the pattern is ethically preferable to the old way if the access gap closes. And that closing the access gap is one of the more important “AI for good” projects any of us could take on.


There is one more admission. The one I least want to write.

Some of my best friends are authors in active litigation against AI companies. I cannot share the details. The cases are ongoing. The filings are not mine to discuss. The people involved are not characters for me to use to make a point. But I will not pretend they aren’t there.

Their harm is personal. It is not abstract. These are people I love. Their books — the books we talked about when I came over to hang out with their son, the books that influenced me to join the Rust project and write technical documentation — were used without their consent to train the systems that now compete with them in their own markets.

What engineers call pretraining. What an author calls my life’s work, scraped. Same act. Two names. Two grief registers.

The company whose model wrote paragraphs of this book alongside me is not, in every respect, on the same side of this fight as my friends are.

I do not know how to reconcile that. I am suspicious of anyone who says they do.


What I can say is this. The ethical case this book makes is not a defense of how the training data was acquired. It is a case about what specific uses of these systems are better than the alternatives humans were forced to ship before. Two different arguments. They have to stay separate if we are going to think clearly. The author whose book was scraped is owed something. Real money. Real acknowledgment. Real say over future training. That debt does not go away because I describe an address-shortening function that the tool happens to be good at. Both things can be true. The harm to the writers is real. The uses in this book are still, in my judgment, ethically preferable to the old way. I will argue the second at length. I will not pretend the first does not exist.

Here is the pattern the book is about.

An engineer who refused to write the combinatorial hell of an address-shortener — what engineers call a normalizer, what an HR clerk calls a thing that won’t break my payroll file — and instead wrote a generic interface and handed the specification to the person who owned the problem. A hiring manager facing a hundred thousand applications and twelve openings, deciding whether it is more ethical to ghost ninety-nine percent of them or to give every one of them twenty minutes with a machine. A district manager responsible for four hundred people who cannot read four hundred dashboards. A state employee trying to query their own government’s data without paying a vendor for a custom report. A billion people in China being offered AI as a utility, like water, like electricity. A Nobel Prize for folding proteins no human could fold by hand in a hundred lifetimes.

Every case is the same shape. The old way required a person to pretend they could brute-force something that cannot be brute-forced. The new way requires a person to write down, in plain language, what they actually want.

Engineers call this specification. A manager calls it knowing what you’re asking for. The tool just collapses the distance between the two.

From writing the rules to writing the intent.

That is the shift. The tool that makes the shift possible is the one that wrote these paragraphs with me. I would be dishonest not to say so. I would be naive to pretend that saying so is enough.

I used Claude Opus 4.7. I read every word. I approve the message. I know it is not enough.

The rest of the book is my attempt to make it enough.

AlphaFold

Count every token every AI has ever processed. Every ChatGPT query. Every Claude call. Every code suggestion, every agent request, every training pass, every inference request at every lab since the first GPT shipped.

Multiply that number by itself. Twenty times.

That is the number of ways a single protein can fold.

One protein. In your body. Right now.

A biochemist named Cyrus Levinthal worked it out in 1969. He was making a point. A protein cannot actually search that many configurations. If it did, folding would take longer than the age of the universe, and proteins fold in milliseconds. So either the universe is lying or the search is not what we think it is.

That is the problem. For fifty years, that was the problem. The grand challenge. The one biochemists would shake their heads about at conferences and say this one will take another hundred years. Because how do you search a space you cannot even represent?

The answer, before 2020, was that you didn’t.

You grew a crystal. You put it in an X-ray beam. You stared at the diffraction pattern for months. You guessed. Or you froze the protein at cryogenic temperatures and shot it with electrons — what scientists call cryo-EM, what a funder calls a five-million-dollar microscope on the second floor that needs its own power feed. Either way, one structure. Sometimes two. Over a career.

The Protein Data Bank is the global archive of everything we ever solved by those methods. The PDB. Every crystallographer, every electron microscopist, every NMR spectroscopist — what a chemist calls their life’s work, what an accountant calls the depreciation schedule on that microscope — each one submitting their structures to the same shared library, year after year.

In 2020, fifty years after Levinthal posed the paradox, the PDB had about two hundred thousand structures.

Two hundred thousand sounds small. It isn’t.

Each structure is a PhD. A microscope running six months at a stretch. A graduate student growing a crystal at three in the morning because the temperature was finally right. A paper with seven authors and two years of peer review. A career.

Two hundred thousand careers in a shared archive.

Microsoft has that many employees. Apple has fewer. IBM once had more. Imagine an entire Fortune 100 company where every single person has a PhD in protein chemistry. Imagine none of them ever leaving. Imagine all of them, for fifty years, growing crystals and shooting them with X-rays.

That is the PDB.

Then AlphaFold 2 showed up.

CASP

The way the field tested itself was called CASP. The Critical Assessment of Structure Prediction. A biennial blind test. The organizers took proteins whose experimental structures had been solved but not yet published, gave teams only the amino-acid sequence, and asked the teams to predict the shape. Then they scored the predictions against the truth.

CASP is where ambition went to die. You thought your algorithm was good. Then you ran it against a target the organizers had kept secret, and the target would fall somewhere between garbage and gesture. The scores hovered in the region that scientists call encouraging and everybody else calls not working.

AlphaFold 1 entered CASP13 in 2018. Placed first. Still produced coarse predictions.

AlphaFold 2 entered CASP14 in 2020.

In the assessors’ summed z-score ranking — a statistician’s way of saying how many standard deviations above the field — AlphaFold 2 scored 244.0. The next best group scored 90.8. On 87 of 92 domains, it produced structures accurate enough to publish. On 58 of those, the predictions were as good as the experimental structures. Median accuracy: a GDT score of 92.4 out of 100.

The old way surrendered.

John Moult, the organizer of CASP, who had been running this competition for twenty-six years watching incremental improvement, said the problem had been solved. Not solved enough. Solved. The community agreed. There were still edge cases. There would always be edge cases. But the thing that had been a fifty-year grand challenge was, in some working sense, finished.

Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021). https://www.nature.com/articles/s41586-021-03819-2. The paper has been cited more than any AI paper in history by working biologists. Not theoreticians. Working biologists. The people with the microscopes on the second floor.

The ethical move

Here is where most tech-hype stories end. Company wins. Stock goes up. Conference talks get louder.

DeepMind did something else.

They partnered with the European Molecular Biology Laboratory — what a researcher calls EMBL-EBI, what a taxpayer calls a public-sector lab in Hinxton, UK that anyone can use. Together they built the AlphaFold Protein Structure Database. AFDB.

Then they predicted two hundred million protein structures. Effectively every protein in UniProt. Every protein that anyone had ever named.

Then they released all of it. Free. Under a license called CC BY 4.0, which is a legal document that says use this however you want, commercially, academically, forever, just say where it came from.

Two hundred million structures. Fifty years of global effort had produced two hundred thousand. AlphaFold produced two hundred million. One thousand times more. Free.

Two hundred million is more than the number of people holding a job in the United States. Every factory, every farm, every office, every hospital. Every paycheck. That is about one hundred sixty million workers. AlphaFold made more predictions than that.

Still less than China’s workforce. Seven hundred forty million. But bigger than America’s.

A predicted protein for every working person in America. Then add the United Kingdom. Then Ireland. Then a couple of smaller nations before you run out.

As of the Nobel announcement, more than two million scientists from 190 countries had used it.

That is the claim. That is the unimpeachable part. That is why AlphaFold opens this book.

The Nobel

October 2024. The Royal Swedish Academy of Sciences. Chemistry.

Half to David Baker, at the University of Washington, for computational protein design. Baker had spent twenty years using computers to build proteins that had never existed in nature. He designed pharmaceuticals. He designed vaccines. He designed tiny sensors and nanomaterials. He did it with a tool called Rosetta, an open-source codebase that has outlasted three university presidents.

Half to Demis Hassabis and John Jumper, at Google DeepMind, in London, for protein structure prediction.

Eleven million Swedish kronor. Split fifty-twenty-five-twenty-five. What a banker calls liquidity, what a grad student calls never having to write another grant as long as I live.

The citation was careful. Baker won for design. Hassabis and Jumper won for prediction. The two halves together close a loop. Predict any structure. Design a new one. Two halves of the same tool.

And the Royal Swedish Academy, which had spent a century handing the chemistry Nobel to people who did chemistry with their hands, gave half of it to a model.

The honest part

This book is not going to pretend. Every chapter has this section. The honest part.

AlphaFold does not predict intrinsically disordered proteins. It flags them. It marks the region low-confidence and moves on. If your protein is a floppy signaling peptide with no stable structure at all, AlphaFold will tell you it doesn’t know. Which is better than most tools do. But it is not prediction. It is admission.

AlphaFold does not predict conformational ensembles. A protein in a real cell samples many shapes. It breathes. It flexes. AlphaFold gives you one snapshot. For drug binding, that snapshot is often the wrong snapshot.

AlphaFold does not predict novel folds. It was trained on the PDB. Proteins whose shapes fall outside the distribution of what’s already been solved — new folds, engineered proteins, proteins from life that never made it into the training set — are predicted worse than proteins the model has seen many times.

AlphaFold 3, released in 2024, extended the system to complexes and nucleic acids and ligands. The community cheered the science. Then looked at the license. The weights were closed. Server access only. The open-science community, which had built its entire scaffolding around AF2’s openness, went public with the grievance. DeepMind eventually relented. Partly.

The honest part, every chapter: a tool this powerful, released this widely, will be misused. Papers will be published with AlphaFold structures as ground truth when the confidence scores did not warrant it. PhD committees will accept theses that depended on a prediction the author did not verify. Grants will be awarded on the strength of hallucinated drug targets.

That is not an argument against the tool. It is the cost of the tool. The cost is there. The benefit is larger. Both things are true. I will not pretend otherwise.

There is one more honest thing to name, and it is about the scope of the claim this book is making.

AlphaFold is not an LLM. It is a transformer-based system trained on protein structures, not a language model trained on text. Same architectural family — the transformer, the attention mechanism, the scaling era — but a cousin of Claude, not an instance of it.

AlphaFold opens this book because it is the proof — settled at the level of a Nobel committee — that AI at scale can be deployed for public good. The claim this book extends, at smaller scales, with LLMs, is the same claim. The tool is not the same tool.

The Rust part

AlphaFold itself is written in JAX. JAX is a Python library for machine learning. Python is the language of machine learning because Python is what the researchers know. Fine.

The question is what happens downstream.

Two hundred million structure predictions are two hundred million files in a format called mmCIF. If you want to query them — find all proteins with a particular fold, all binding sites near a particular residue, all structures where a particular motif shows up — you need infrastructure. Parsers that do not leak memory. Spatial indexes that do not stall. Iterators that actually use the cores the researcher paid for.

That is where Rust earns its keep.

pdbtbx is a pure-Rust parser for PDB and mmCIF files. It uses Rayon for multithreaded iteration. It uses rstar for spatial lookup. It does not surprise anyone. https://crates.io/crates/pdbtbx

rust-bio is a general-purpose bioinformatics library. https://rust-bio.github.io/

BioForge is a pure-Rust toolkit for preparing macromolecular systems for simulation. https://github.com/TKanX/bio-forge

None of these are AlphaFold. All of them are the infrastructure that lets AlphaFold be useful. A model that predicts two hundred million structures is only a public good if there is a public-good tooling layer underneath it — what an engineer calls the query layer, what a scientist calls the part that lets me actually find my protein in the database.
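
For the Rust audience, a minimal sketch of that query layer. It assumes the pdbtbx 0.11 API, where open takes a strictness level (newer versions move this into a builder), and it builds its own rstar index from the atom coordinates rather than relying on any built-in integration. The file name and site coordinates are illustrative.

```rust
use pdbtbx::{open, StrictnessLevel};
use rstar::RTree;

fn main() {
    // Parse one AFDB prediction; pdbtbx reads both PDB and mmCIF.
    // (File name is illustrative; API shown is pdbtbx 0.11.)
    let (pdb, _warnings) =
        open("AF-P12345-F1-model_v4.cif", StrictnessLevel::Medium).expect("parse failed");

    // Load every atom's coordinates into a spatial index.
    let points: Vec<[f64; 3]> = pdb.atoms().map(|a| [a.x(), a.y(), a.z()]).collect();
    let tree = RTree::bulk_load(points);

    // All atoms within 5 Å of a site of interest.
    // locate_within_distance takes the squared radius.
    let site = [12.0, 4.5, -7.3];
    let nearby = tree.locate_within_distance(site, 25.0).count();
    println!("{nearby} atoms within 5 Å of the site");
}
```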

The ethical argument runs both ways. DeepMind made the science free. It took the Rust community — and the Python community, and the C community, and everyone else — to make the science reachable. The infrastructure is where the generosity becomes usable.

The pattern

Every chapter of this book is the same shape. AlphaFold is its biggest version.

A combinatorial problem. A human who had been told to brute-force it. A tool that swaps rules for intent.

The rest is smaller. An address field. A hiring funnel. A district manager. A state employee. A billion citizens handed AI as a utility.

AlphaFold is the proof of concept.

The rest is the receipts.

The Address

An integration engineer’s job, stripped of jargon: map every system in the world to ours.

Every customer already has an HR system. Sometimes five, from old acquisitions. Workday here, UKG there, ADP in the regional unit that has been on ADP since 1994. The product has to push data into and pull data out of all of them. Every field is a negotiation between two systems that never agreed on what the field was.

Here is the ticket.

The hotel onboarding is failing for new hires whose addresses are longer than twenty-nine characters. Can we truncate?

The twenty-nine-character limit is one rule. There are thousands like it. None documented. All discovered in production. The payroll vendor uses a 1997 flat-file format with fixed-width byte columns, so UTF-8 accents silently shift the truncation point. The consultant knows the symptom. Not the rule.

A week of debugging. One fix. The fix breaks a different customer with the same limit for a different reason.

One ticket. Two new tickets. Always.

This is the integration engineer’s job. The specification is infinite. The code is finite. You are paid to lose.

The shift

Treat the LLM like a deterministic function. Pin the model. Supply the input. Validate the output. Inside that envelope, it behaves like any other function. Input goes in. Output comes out. Non-determinism under the hood is not your problem. The interface is your problem.

Write the instruction in the shape of a requirement.

Given this input, summarize with preference for deliverability, and try to keep it under twenty-nine characters.

Email subject line under sixty? Same shape. Twitter post under two-eighty? Same shape. Copy. Paste. Edit. Deploy.

When a consultant surfaces a new rule, you do not rewrite code. You add a sentence.

…and for Puerto Rico urbanizations, keep the urbanization name before the house number.

One more clause. No dependency update. No release cycle. No rewrite of the state-abbreviation table.
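
Here is the envelope as a sketch, in Rust since that is this book's audience. The client call is abstracted into a closure because the point is the shape, not any particular SDK. The model ID and instruction text are illustrative, not anyone's real configuration.

```rust
// Model pinned, not "latest". The ID is a placeholder.
const MODEL: &str = "claude-pinned-model-id";

// The instruction is the requirement, in English, owned by the domain expert.
const INSTRUCTION: &str = "Given this address, shorten it with preference for \
deliverability, and try to keep it under 29 characters. For Puerto Rico \
urbanizations, keep the urbanization name before the house number.";

/// `complete` abstracts whatever LLM client is in use:
/// (model, instruction, input) -> output.
fn shorten_address(
    complete: &dyn Fn(&str, &str, &str) -> String,
    input: &str,
) -> Result<String, String> {
    let candidate = complete(MODEL, INSTRUCTION, input);
    // Validate the output like any other function's return value.
    if candidate.chars().count() <= 29 {
        Ok(candidate)
    } else {
        Err(candidate) // retry with a tightened prompt, or escalate to a human
    }
}
```

When the consultant surfaces a new rule, the change is one more clause in INSTRUCTION. The function does not move.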

Then hand the instruction to the person who actually knew what it should say. The HR operations lead who has been filing tickets like these for twenty-two years, against somebody else’s product. Twenty-minute training. From then on, the tickets go to her. She maintains the sentence.

The engineer stops being the pipeline. The person who owns the problem owns the fix.

The honest part

LLMs hallucinate. They will decide Strt is a fine abbreviation for Street. A human carrier will probably still deliver — they will just think your data-entry clerk needs glasses. But Strt is not USPS-recognized. It will not round-trip through CASS. The automated sort snags. The payroll integration rejects the record. You are back to a ticket.

So you validate. libpostal. A real address-standardization library. Ten years old, deterministic, boring. If the LLM’s answer fails the check, you retry with a tightened prompt or escalate to a human.

The LLM generates. Deterministic code verifies. Both layers matter.

Not free. Every call costs. For a real-time form, affordable. For a batch of ten million records, you cache, pre-validate, and only call the model when the cheap parser fails.
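
A sketch of that batch path. The three closures stand in for the pieces named above: a libpostal-style deterministic parser, the model envelope from the previous sketch, and the validator. The names are mine, not any library's.

```rust
use std::collections::HashMap;

// Cache first, cheap deterministic parser second, the model only when both
// fail, a validator after every model answer.
fn normalize_batch(
    records: &[String],
    cheap_normalize: impl Fn(&str) -> Option<String>, // libpostal-style parser
    ask_model: impl Fn(&str) -> String,               // the pinned-model envelope
    validate: impl Fn(&str) -> bool,                  // deterministic check
) -> Vec<Result<String, String>> {
    let mut cache: HashMap<&str, Result<String, String>> = HashMap::new();
    records
        .iter()
        .map(|raw| {
            cache
                .entry(raw.as_str())
                .or_insert_with(|| {
                    if let Some(ok) = cheap_normalize(raw) {
                        return Ok(ok); // never paid for a model call
                    }
                    let answer = ask_model(raw);
                    if validate(&answer) {
                        Ok(answer)
                    } else {
                        Err(raw.clone()) // escalate to a human queue
                    }
                })
                .clone()
        })
        .collect()
}
```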

Auditable. You log prompt, input, output, validation. When the compliance officer asks why did this address change, you show the prompt. The prompt is in English. She can read it.

Auditability is what the pattern buys you that the 1997 flat-file format never did.

LLMs cannot reliably count to twenty-nine. They are worse still with dates. The soft guidance is enough anyway. You are not asking the model to count. You are asking it to try. The validator catches the rest.

The arc

The function predated this ticket. I had built it for a hype-driven feature my company had rejected on ethical grounds. The code was reusable. The ethics, my company had to teach me.

I am not the hero of this chapter. I am the integration engineer who stopped writing rules and started writing intent.

You will not map every address in the world to every HRIS in the world by writing every rule.

You will map it by writing one deterministic function that takes an instruction, then maintaining the instruction.

Writing the rules is infinite. Writing the intent is a paragraph.

The paragraph is maintained by the person who already knew what it should say.

The Hundred and Fifty Thousand

Somewhere in a pile of 150,000 applications, there’s a person who would have been an extraordinary flight attendant.

They didn’t make it to an interview. Not because they weren’t good enough, but because no one had time to get to them.

— Sean Behr, CEO, Fountain

One airline. Six hundred open flight attendant jobs. A hundred and fifty thousand people applied.

The entire US flight attendant workforce — every attendant on every flight of every carrier — is about a hundred and thirty thousand. This one job posting drew more applicants than there are flight attendants currently working in the country.

A hundred and fifty thousand applications at five minutes each is twelve thousand five hundred hours. Six work-years of one recruiter’s life. The recruiters had two weeks.

So the applications got filtered. Keyword match, pedigree rank, the top couple hundred to a human. Everyone else got ghosted. Not deleted. Never contacted. The rejection is the absence.

This is not a failure of effort. It is physics. No company on earth has enough recruiters to talk to a hundred and fifty thousand people. The math does not bend.

Everybody calls this the new way because it has an LLM in it. It is the old way with new plumbing.

A quick note. I work at Fountain, where Sean is the CEO. We build an AI recruiter for frontline hiring. Numbers in this chapter come from our customers. The argument is not neutral. Read it with that in mind.

The resume problem

Resumes are broken in two directions.

An LLM will now generate a resume that matches a job description to four decimal places. Work history the candidate never had. Technologies the candidate has never touched. Keywords that are not lying, exactly, but not evidence either. The top of the funnel is flooded with synthetic signal.

Underneath the synthetic resumes, the real problem is older. The people most likely to get overlooked have always been the ones who did not know how to package themselves on paper. The ones who went to a school nobody has heard of. The ones whose career path looks nonlinear on a form and makes complete sense the second you talk to them.

The application was never a perfect proxy for talent. It was the only tool we had at scale.

The filter at the top of the funnel has gone to noise in one direction and bias in the other.

The shift

Skip the resume filter. Interview everyone.

Not a screened subset. Not the top five percent. Everyone.

An AI interview. Rubric. Same questions for everybody. Not video. Not facial analysis. Not the thing HireVue got sued for. Structured conversation. Open-ended questions. Transcripts scored against the same criteria whether the candidate is in New York or Manila.

Two things have to be true.

First, a deterministic screen at the door. Hard requirements only. Are you authorized to work in this country? Can you start by the date we need? Are you willing to work on-site? Right answers and wrong answers. The wrong answers are rejected immediately, in front of the candidate, with the reason shown. Nobody is ghosted on a hard requirement. You told them why. They can go elsewhere.

Second, the interview is opt-in. If the candidate prefers to wait for a human, they wait.

They overwhelmingly do not wait. They take the interview.
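
The deterministic screen at the door — the first condition — is worth seeing as code, because it is the part with no model in it at all. A sketch, with illustrative field names and requirements drawn from the examples above.

```rust
// Every check is a plain rule with a plain-language reason. No model involved.
struct Application {
    work_authorized: bool,
    can_start_by_needed_date: bool,
    willing_to_work_on_site: bool,
}

/// Returns Ok(()) to advance, or the exact reason shown to the candidate.
fn hard_screen(app: &Application) -> Result<(), &'static str> {
    if !app.work_authorized {
        return Err("This role requires authorization to work in this country.");
    }
    if !app.can_start_by_needed_date {
        return Err("This role requires a start date we could not match.");
    }
    if !app.willing_to_work_on_site {
        return Err("This role is on-site.");
    }
    Ok(()) // everyone who passes is offered the interview, opt-in
}
```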

The time of day

The recruiter works nine to five. So does the candidate.

If the candidate works the same hours as the recruiter, every phone screen is at a bad time. The candidate is hiding in a conference room at the current job. Or taking PTO. Or doing it at the kitchen table at six in the evening after a long day, tired, trying not to sound tired.

An AI interview runs when the candidate is free. Eleven at night when the kids are asleep. Sunday morning before church. Six in the morning before the shift starts. The model does not have business hours. The candidate picks the hour.

This is not a small thing. It is the thing that disproportionately hurts hourly workers, parents, caregivers, people in different time zones. The old pipeline optimized for the schedules of the people already in the building. The new pipeline optimizes for the schedule of the person trying to get in.

The rubric

Same questions. Same scoring. Every applicant.

A recruiter looking at a resume for five seconds sees the name of the school, the last employer, the whitespace. A rubric applied to a twenty-minute structured interview sees what the candidate actually said about a problem. Different signals. The second is harder to fake and easier to audit.

You audit. You publish the rubric. You publish the pass rate by demographic. When NYC Local Law 144 or the EEOC asks, you have an answer.
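
What publishing the pass rate looks like in code — a sketch with an illustrative schema, not any vendor's real one.

```rust
use std::collections::HashMap;

// Same rubric threshold for everyone; pass rates aggregated per
// self-reported demographic group, ready to publish and to audit.
struct ScoredInterview {
    demographic_group: String, // self-reported, optional in practice
    rubric_score: f64,         // same rubric, same scale, every applicant
}

fn pass_rates(interviews: &[ScoredInterview], threshold: f64) -> HashMap<String, f64> {
    let mut totals: HashMap<String, (u32, u32)> = HashMap::new(); // (passed, seen)
    for i in interviews {
        let e = totals.entry(i.demographic_group.clone()).or_insert((0, 0));
        e.1 += 1;
        if i.rubric_score >= threshold {
            e.0 += 1;
        }
    }
    totals
        .into_iter()
        .map(|(group, (passed, seen))| (group, passed as f64 / seen as f64))
        .collect()
}
```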

The data

Applicants prefer this. Not a little. A lot.

Seventy-four percent of frontline workers in Fountain’s 2025 Frontline Report said they prefer AI-driven interviews over waiting for a human scheduler. That is not a thin margin. That is an overwhelming majority of the people the product is built for choosing the AI path over the human one.

Their logic is simple. A possible human interview next week is not a job. A structured interview tonight, on their schedule, scored against a rubric — that is a real shot. They take the real shot three out of four times.

Recruiters prefer it too. They stop sitting through two-hour Zoom calls with candidates who never matched the hard requirements. The candidates who reach the human round are already in the rubric’s top quartile.

Both sides save time. People get hired faster.

The honest part

AI hiring has a bad reputation. It has earned some of it.

HireVue’s facial analysis. Amazon’s scrapped screener that learned to downgrade anything with the word women’s in it. Every vendor that claimed to predict personality from voice.

These failures are real. They are not arguments against structured interviewing. They are arguments against the specific thing those tools did — score humans on signals the humans did not know were being recorded.

The pattern here is the opposite. The rubric is public. The questions are the same. The scoring is deterministic at the hard requirements and auditable at the soft ones. The candidate knows what is being measured. The regulator knows what is being measured. If the model’s answer fails the audit, you do not get to deploy it.

You do not get to skip the audit.

The pattern

The old way scaled exclusion. Keyword filter, ninety-nine percent ghosted, a hundred human calls to a pedigreed subset.

The new way scales dignity. Every applicant gets a real shot at a fair rubric, on their own schedule. The ones who fail a hard requirement learn why at the door. The ones who reach the interview know the rubric they are being scored against. The ones who do not fit still get a response.

Nobody is ghosted.

Letting the invisible be heard is one of the more human things AI can do.

Somewhere in that pile of a hundred and fifty thousand applications is a person who would have been an extraordinary flight attendant. The old way never found her. The new way calls her tonight.

You do not need to believe AI is better than a human recruiter to see why it clears this specific bar. You only need to notice that the alternative was silence.

And the status quo is silence.

Cue

A district manager walks into Monday morning with four hundred people on her roster.

Eighty of them missed a shift last week. Twelve are in onboarding and halfway through their paperwork. Six properties are behind on interview scheduling. Two locations are flagged red in a dashboard she does not have time to open.

What she has time to do is text.

The dashboard trap

A dashboard shows you the business. It does not run the business.

You stare at the red tile. You screenshot it into Slack. You tab into the HR system to pull a list of names. You tab into the scheduling tool to reassign shifts. You tab into the hiring system to open new job reqs. You tab into the onboarding tracker to send reminders. Five tools, four tabs, one text to the regional VP.

Then you do it again tomorrow.

Dashboards were supposed to make this faster. They made the picture smarter. They did not make the work fewer steps. The manager who has four hundred reports is still the one opening five tools to fix one problem.

The shift

The chat presents the findings. It recommends the actions. You click the button.

Not a dashboard. Not a chart. Not a link to another tool. A plain-language summary and a set of buttons written in the manager’s own words.

Dallas is short on Saturday.

Cue reads the current staffing numbers. It checks the hot-lead pool. It looks at tomorrow’s calendar. Then it replies in the manager’s own language:

Saturday at Dallas is short by four. Twenty-seven candidates in the hot-lead pool match the role. Interviews can run tonight and tomorrow morning.

Below the reply, three buttons.

[Post 5 Jobs at Dallas]

[Invite 27 Candidates]

[Schedule Tonight’s Interviews]

The buttons are written the way the manager speaks. Not POST_REQ_BATCH(5). Not ATS.invite_candidates(27). Not submit form 5b. Post five jobs at Dallas. Invite twenty-seven candidates. The label for a PTO approval is Approve Time Off. The label for a calendar hold is Make New Event. Whatever the action is, described the way the person actually talks about it.

The manager taps. Cue does the work. It reports back in the thread.

The agent proposes. The human signs.
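
The proposal shape, sketched in Rust. The variants and labels are illustrative, not Cue's actual API. The point is structural: the agent produces labeled drafts, and execution is a separate step that waits for the human.

```rust
// Illustrative only: not Cue's real API.
enum Action {
    PostJobs { location: String, count: u32 },
    InviteCandidates { count: u32 },
    ScheduleInterviews { when: String },
}

struct Proposal {
    label: String, // written the way the manager speaks: "Post 5 Jobs at Dallas"
    action: Action,
}

/// Drafting and doing are separate steps. Nothing runs without the tap.
fn execute(p: Proposal, tapped: bool) -> Result<Action, Proposal> {
    if tapped {
        Ok(p.action) // hand off to the underlying system, then report back
    } else {
        Err(p) // the draft stays a draft
    }
}
```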

This is what Fountain launched in April 2026. They call it Cue. The phrase inside the company, roughly: run your business with the intelligence, not just see it.

That one phrase is the whole distinction this chapter is about.

Zoomed in: Hire Go

Cue is the district manager’s tool. Hire Go is the same pattern at the property.

A hotel general manager at an Aimbridge property is not a recruiter. She runs the front desk, the housekeeping team, the night shift, the banquet crew. When she loses a housekeeper on Friday, she needs a new one on Monday. The old way says file a req with corporate recruiting, wait, and hope. The new way is conversation.

The GM opens the app on her phone. She says I need a housekeeper for a Monday start. Hire Go reads the last batch of leads she reviewed, checks her Tuesday calendar, and replies with a draft posting, three candidates, and two interview slots that clear her all-hands. Three buttons, in her words: Post the housekeeper role. Invite these three. Schedule interviews at 2 and 4. She taps. The job is posted. The interviews are on her calendar. She never opens a separate hiring tool. She never talks to a recruiter.

Aimbridge runs hiring through this interface at every one of its fifteen hundred properties.

Anthropic’s 2026 Agentic Coding Trends Report profiles the implementation on page eight. Fifty percent faster screening. Forty percent faster onboarding. Candidate conversion doubled. The architecture underneath is Claude-driven multi-agent orchestration — several specialized agents, each with one job, coordinated by an orchestrator that reads the manager’s text.

Why this is not BI

BI shows you the number. BI does not change the number.

Every text-to-SQL product, every conversational dashboard, every just ask your data tool built on top of Snowflake or Looker or Power BI — they all stop at the same place. They show you the answer. Then you go do something with it.

Cue does not stop there. Hire Go does not stop there.

The manager tells it what she needs. The system drafts the actions, labels them in her own words, and waits for the tap. She taps. The work is done. The distinction matters because a frontline manager does not have bandwidth for an extra step that says now open these five tools. The bandwidth tax is where the operational leverage evaporates.

You do not solve the frontline manager’s problem by giving her a better window. You solve it by giving her a pair of hands that wait for her signal.

The honest part

A pair of hands is a dangerous thing to hand to an LLM.

So you gate. Actions that write to the system are logged. Actions that cost money require a budget envelope the manager has already set. Actions that affect a person — firing, discipline, reducing hours — require human approval at every step, every time, with no default-yes. The model drafts. The human signs.

You publish the list of actions the agent can take without approval, the list it requires approval for, and the list it is flat-out not allowed to touch. The manager knows. The compliance officer knows. The employee knows.

You audit. You log the prompt, the actions taken, the outcome, the approvals given. When HR asks why was this shift reassigned, you show them the prompt and the approval chain. In English.
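
The gates, sketched the same way. The policy fields and the audit record are assumptions drawn from the paragraphs above, not a real schema.

```rust
// Assumptions drawn from the text above; not a real schema.
enum Gate {
    AutoAllowed,      // executes immediately, still logged
    RequiresApproval, // a human signs, every time, no default-yes
    Forbidden,        // not available to the agent at all
}

struct ActionPolicy {
    affects_person: bool,         // firing, discipline, reducing hours
    spends_money: bool,
    within_budget_envelope: bool, // an envelope the manager already set
}

fn gate(p: &ActionPolicy) -> Gate {
    if p.affects_person {
        return Gate::RequiresApproval; // every step, every time
    }
    if p.spends_money && !p.within_budget_envelope {
        return Gate::Forbidden; // no envelope, no action
    }
    if p.spends_money {
        return Gate::RequiresApproval;
    }
    Gate::AutoAllowed
}

/// The record the compliance officer reads. In English.
struct AuditRecord {
    prompt: String,
    actions_taken: Vec<String>,
    approvals: Vec<String>,
    outcome: String,
}
```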

Agentic tooling that bypasses these gates is the failure mode that gives all of this a bad name. Do not build that. Do not ship that. If you work at a company that is shipping that, leave.

The pattern

Writing the rules is infinite. Writing the intent is a paragraph. The paragraph is maintained by the person who already knew what it should say.

At this scale, the person who knows is the manager. The district manager has been running a region for eleven years. The hotel GM has been running that property for three. They know what the data means. They do not need a better dashboard. They need a system that executes their judgment when they tell it to.

That is what Cue is. That is what Hire Go is.

Same pattern as the address field. Same pattern as the interview. The intent is in English. The execution is in software. The domain owner owns the fix.

This time, the domain owner is running a region. Or a hotel. Or fifteen hundred hotels from a phone on a Friday afternoon.

Oregon

Oregon told its state employees to learn how to prompt.

The Enterprise Information Services office rolled out an AI Risk Management Framework aligned with NIST, pre-approved Microsoft Copilot for agency use, and began offering AI training courses in 2025. The stated goal was an informed public-service workforce. The unstated goal was harder.

The unstated goal is that the state of Oregon runs on Tyler Technologies.

The stranglehold

Tyler Technologies is a two-point-one-billion-dollar company whose market is mostly other governments. Forty-two thousand government entities. Hundreds of thousands of municipal workers logged in every morning. The Odyssey court-management system runs every circuit court and the tax court in Oregon. Munis runs ERP for cities. EnerGov runs permitting. Eagle runs land records.

When a reporter in Illinois asked where all the money went, the answer was two hundred and fifty million dollars in Tyler contracts with cost overruns and a court system that still did not work.

The math is not unique to Illinois. The math is the product. Eighty-four percent of Tyler’s revenue is recurring. The renewal rate is ninety-five percent. Those numbers do not describe a software company. They describe a utility, a tax, a toll road.

The data inside Tyler’s systems is public data. Case records. Property records. Permit applications. The things citizens pay taxes to keep. The governments pay Tyler to hold it, and Tyler charges them to get any of it back.

Prompt literacy is not enough

Teaching an Oregon state employee to prompt Copilot does not break the lock.

The employee sits at her desk. The data she needs is inside a Tyler system. Tyler’s system is a black box with an API that is not hers. She can ask Copilot whatever she wants. Copilot will write her a beautiful summary of the data Copilot can see. None of the data Copilot can see is the Tyler data.

Prompt literacy with no access to the data is training on a steering wheel that is not connected to the car.

The access gap is the whole problem.

The move

Agentic coding tools changed one number.

Rewriting a two-point-one-billion-dollar company’s product suite used to be insane. It was insane when dozens of govtech startups tried to take on Tyler and sold out or shut down. The activation energy was a decade of engineering and a hundred million dollars in venture capital before you shipped anything that worked.

The activation energy has collapsed.

An agentic coding environment — the kind of thing Cursor and Claude Code produce in a thirty-minute session — will now scaffold an open-source alternative to Tyler’s simpler products in weeks, not years. A case-management system. A permit-tracking system. A cemetery-records system. A dog-licensing system. Every one of Tyler’s long-tail offerings is a well-defined problem that a small team with agentic tooling can rewrite.

Not the whole suite. Not yet. But the first product. Then the next.

The proposal

A public-good company. Not a startup. Not venture-backed. A non-profit incorporated in Oregon, funded entirely by the contracts it earns and the grants it qualifies for. No investor dollars. No equity. No exits. Local money, local people, local code.

It starts small.

The first contracts are small on both sides. A dog-licensing portal for one city. A park-permit form for another. A cemetery-records tool for a rural county. The kind of job Tyler would quote at two hundred thousand dollars and take nine months to deliver. The non-profit delivers it in six weeks. Billed hourly. At cost.

It hires two populations.

Students. Portland State, University of Oregon, Oregon State, Reed, the community colleges. Paid, full-time or part-time, with real stipends. Not unpaid internships. Paid jobs.

And the software engineers already here who want the work and cannot find it. Every one of them is qualified. Every one of them spent fifteen or twenty years shipping software for companies that then shipped the next round of work overseas. The jobs left Oregon. The talent did not. A civic-tech non-profit with agentic tooling hires the talent back.

The non-profit teaches both groups the same thing — agentic software development. Not how to write code. The seniors already know how. The students who want to write code have Stack Overflow, whose servers share a colo with the disaster-recovery-as-a-service product I built, and GitHub. What agentic development is, in this century, is how to describe a system in plain language, let a pipeline of agents produce a draft, steer the draft, validate it with deterministic testing, and ship. The curriculum is how to supervise a small office of intelligences. A senior with that curriculum and a team of students behind her can ship enterprise code at a cost the state can actually afford.

The two populations sit at the same table. The seniors mentor. The students do the volume. Both get paid.

The team ships small things. Every month, another piece of open-source tooling under an Apache or MIT license. Every month, another Oregon entity on the migration list. Each piece replaces one Tyler long-tail product. Dog licensing. Cemetery records. Park permits. Meeting minutes. Vendor onboarding.

The ten-year arc

Oregon public entities together are estimated to spend two to four hundred million dollars on Tyler Technologies and adjacent govtech over the next decade. At fifty dollars an hour — the blended rate for a team of Oregon students mentored by senior engineers — that is four to eight million engineering hours. Four hundred engineers, paid a real wage, for ten years.

The non-profit does not claim that entire market. The non-profit starts by claiming a cemetery-records tool in a county of twelve thousand people. Then the next. Then the next.

Ten years of small wins is how a funded model gets built. Small contracts on the buy side, hourly work on the labor side, one piece of open-source tooling shipped at a time. By year ten, the open-source stack covers enough of Tyler’s long tail that Oregon can renegotiate every renewal on favorable terms. That is when the alternative becomes a funded model that other states can adopt.

The endgame is a funded, open-source alternative for governments. The startgame is one cemetery.

This is not a fantasy. It is the cheapest version of something the state already pays nine-figure sums for every decade.

The honest part

Rewriting Tyler is not just writing code.

The data migration is the hard part. Tyler will not help. The schema will have to be reverse-engineered. A cooperative exit clause will have to be negotiated up front — because signing a ten-year Tyler renewal is legally binding and Tyler’s lawyers are better at contract than your student engineers are at anything.

The procurement is the second-hard part. State procurement does not know how to buy software from a public-good non-profit run by students. I have managed these contracts — the labor, the billing, the preferred-vendor process. I know the shape of the problem from the inside. That is a regulatory problem with a regulatory solution. The solution requires a state legislature, not an engineering team.

The testing is the third-hard part. Agentic coding is fast but not free of bugs. Anything touching court records or tax records has to be audited, not just tested. A hybrid of deterministic validation and, where needed, professional software auditors.

None of this is free. All of this is cheaper than the status quo.

The pattern

Every chapter of this book is the same move at a different scale.

The address field had one domain owner. Cue had a district manager. The interview had a hundred and fifty thousand applicants. This chapter has the state of Oregon.

Prompt literacy is necessary. It is not sufficient. The access gap is what breaks the pattern. You do not give the person a steering wheel and lock the car. You give them the car.

Open-source the car.

Hire the students who will build it. Pay them. Teach them the pattern. Ship the open-source car into one municipality, then the next. Watch the five-hundred-dollar custom-report fees stop arriving.

The state of Oregon already told its employees to learn how to prompt. The next move is to give them something worth prompting on top of.

I hope other states do this too. Each with their own talent. Each under their own roof. No outside capital. The model is not proprietary. That is the point.

The Utility

In 2020, China’s State Council added a phrase to its official development lexicon: 新基建, new infrastructure.

The old infrastructure was roads, bridges, ports. The new infrastructure was compute, data, and AI. The State Council declared them equivalent — things the country would build, subsidize, and make available at scale, the way earlier generations built the national grid.

That is the premise this chapter is about. China decided AI is a utility.

The voucher

The core mechanism is 算力券 — the compute voucher. A municipality issues vouchers worth a fixed amount, good against AI compute at specified data centers. Small and medium enterprises redeem them. Startups, research labs, university teams redeem them. The subsidy pays down the rental cost of GPU time.

As of 2024 and 2025:

  • Shanghai: about 600 million RMB ($84M) in vouchers, covering up to 80% of AI rental fees, plus 100 million RMB for data and model training.
  • Shenzhen: 500 million RMB annual program. 50% subsidy for general applicants, 60% for startups.
  • Chengdu: 100 million RMB, expanded from an earlier pilot.
  • Shandong: 30 million RMB initial, another 1 billion queued.
  • Beijing: a parallel program in the Economic-Technological Development Area.

Together they create a second market beneath the commercial AI market — where access is subsidized the way public transit is subsidized alongside private cars.

The open weights

The utility framing runs deeper than vouchers. The major Chinese labs — DeepSeek, Alibaba’s Qwen, Zhipu’s GLM, ByteDance’s Doubao, Baidu’s Ernie, 01.AI’s Yi — have made open weights the norm. Not open in the sense of we wrote a paper. Open in the sense of the weights are on Hugging Face, under a permissive license, anyone can download them.

Qwen 3.5’s 397-billion-parameter mixture-of-experts model ships under Apache 2.0, with 256,000-token context and 201-language support. DeepSeek V4, released in early 2026, runs inference at about thirty cents per million input tokens. Claude Sonnet at the same moment was three dollars. GPT-5 was a dollar and a quarter.

The pricing gap is not a discount. It is a different theory of what the model is for.

A closed-weight frontier lab is a company. A company sells a product.

An open-weight model under Apache 2.0 is closer to a public-works project. You do not sell Apache 2.0. You subsidize it, deploy it, and measure the economic activity it enables. DeepSeek and Qwen together went from about one percent of global AI usage in early 2025 to roughly fifteen percent a year later. The strategy worked.

What utility means

A utility has three properties.

  • It is ubiquitous. Everyone has access, at prices calibrated to public good, not to market optimization.
  • It is metered. You pay for what you use, at regulated rates, with defined tiers for low-income users.
  • It is accountable to the public. Not a private monopoly.

China is working the ubiquity problem. The vouchers extend access to the populations private pricing would exclude. The open weights let small firms self-host without paying rent to a US provider.

The metering is real but state-bound. Chinese models file with the Cyberspace Administration — 346 generative-AI services as of April 2025 — and content moderation is enforced by law.

Accountability is the hardest word. A utility accountable to the public is accountable to the public. A utility accountable to the state is accountable to the state. Those are not the same condition.

The honest part

Every chapter has the honest part. This one’s is larger.

The utility framing is not clean. The same state that subsidizes the compute also stands behind Hikvision and Dahua. The Cyberspace Administration’s Interim Measures for Generative AI Services, effective August 2023, require providers to prevent content that “endangers national security” or “undermines social stability.” Those phrases are not neutral.

Export controls matter too. US restrictions on HBM memory and advanced lithography mean that the Chinese AI stack is built under asymmetric constraints. Some of the efficiency in DeepSeek and Qwen is real engineering, and some of it is a forced response to a hardware gap.

None of this disqualifies the utility framing. All of it complicates it. Saying China treats AI as a utility and leaving it there is not honest. Saying therefore it cannot be a model for anything is not honest either. The utility framing is a useful idea that arrived wrapped in a governance model the US would not adopt.

The question for the reader of this book is whether the idea can be separated from the wrapper.

The Rust part

A note for the audience of this book.

Some of the most important database and infrastructure software of the last five years came out of China, in Rust. TiKV, the distributed storage engine beneath TiDB, from PingCAP. GreptimeDB. RisingWave. The ByteDance teams behind Feishu were early Rust adopters. RustChinaConf has run in Shenzhen, Shanghai, and Hangzhou. The Rust community in China is large, active, and shipping at major scale.

If AI becomes a utility, the infrastructure underneath it has to be fast, cheap, and memory-safe. That is a Rust problem. The Chinese teams were early. They were not the only ones, but they were early, and the work is genuinely impressive. The Rust audience for this book should know.

The pattern

Every chapter of this book is the same move at a different scale.

AlphaFold made a tool available to two million scientists for free. The address chapter handed one generic transform to one domain owner. The interview chapter gave every applicant a real shot. Cue put a pair of hands in the manager’s thread. Oregon proposed a public-good alternative to Tyler, built locally.

China’s bet is bigger than any of those. The bet is that AI is more like electricity than like software — that the right governing layer is a utility regulator, not a market.

That bet may or may not be the right bet. It is the bet most directly aimed at the access gap named in the introduction to this book. The US has not made any equivalent bet. The American frontier labs are private, closed, and priced by the market. The public has no voucher. The student, the non-profit, the rural school district, the immigrant worker at the address the payroll system could not fit — they pay the market rate or they do without.

That is a choice. It is worth calling a choice.

The utility framing is the alternative. The rest of this book is what a person with access to the utility can do.

The Calculator

Every generation has panicked about a new cognitive tool destroying a human skill.

Calculators were going to destroy arithmetic. Word processors were going to destroy grammar. Spell-check was going to destroy spelling. Wikipedia was going to destroy research. GPS was going to destroy wayfinding.

Now LLMs are going to destroy thinking.

Maybe.

What I notice in my own case is closer to the opposite. My grammar and spelling are better than they have been in years. I take time to construct coherent sentences even when I am slacking, because I know the model on the other end will carry what I write forward. The keys I hit most are the ones that make words — not the ctrl-shift-whatever shortcuts I used to run through to manipulate applications I barely read. The interaction stopped being about commanding the machine. It started being about writing to it.

That is when I realized how much of the writing I had missed, compared to the syntax lookups.

That is one person’s observation. The calculator argument — the same shape of tool at a different layer of the cognitive stack — was settled forty years ago by a meta-analysis that almost nobody cites in the LLM debate.

Hembree and Dessart

In 1986, Ray Hembree and Donald Dessart published Effects of Hand-Held Calculators in Precollege Mathematics Education: A Meta-Analysis in the Journal for Research in Mathematics Education. They integrated seventy-nine research studies on what happens when K-through-12 students use calculators alongside traditional instruction.

The data was unambiguous in two directions.

First: at every grade except one, students who used calculators alongside paper-and-pencil instruction ended up with better paper-and-pencil skills than students who did not. Not worse. Better. The tool did not replace the skill. It freed enough cognitive budget for problem-solving that fluency in the mechanical arithmetic improved as a consequence.

Second: across every grade and every ability level, calculator-using students had better attitudes toward math and a better self-concept in math. Math phobia is a documented driver of underperformance. A tool that removes the humiliation of arithmetic errors lets students stay in the game long enough to learn the game.

Two hundred and thirty citations later, this is the settled evidence on the calculator question.

The exception

The exception was Grade 4.

At that specific developmental window, sustained calculator use appeared to hinder basic-skills development in average students. The interpretation matters. Grade 4 is roughly when arithmetic moves from counted sums to memorized facts. If you offload the effort of computing at precisely the moment the student needs the effort to form the memory, you break the formation.

This is not an argument against calculators. It is an argument against deploying them too early in the learning arc, before the skill has formed at all.

Hembree and Dessart did not say tools are always fine. They said tools are fine after the skill is formed. Before the skill is formed, they can harm it.

The LLM case

Move the same argument forward forty years.

An LLM is a calculator for language, for code, for summarization, for reasoning. The debate is exactly the calculator debate. One side says the tool will destroy writing, coding, thinking. The other side says the tool will free people to do higher-order work.

Hembree and Dessart are on the second side. So is every subsequent meta-analysis on educational technology that has asked the question carefully. The tool, used after skills have formed, tends to improve the underlying skills. The tool, used as a substitute before skills have formed, can erode them.

The question is not whether to use LLMs. The question is when in the learning arc to introduce them, and what to free the student to do with the cognitive budget you just gave them back.

The honest part

The calculator analog is not perfect.

A calculator has a bounded output space. Ten digits, a handful of operations, one correct answer. An LLM has an unbounded output space, can hallucinate, and can be wrong in ways the user will not notice. The LLM is closer to a word processor that will auto-write the essay if you let it than to a device that adds up numbers.
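For readers who think in code, the difference fits in a few lines. This is a toy sketch, not any real model API: calculator and toy_llm are illustrative names, and the three canned continuations stand in for an open-ended distribution. The point is structural. The calculator's answer is checkable by construction; the model's answer is fluent whether or not it is true.

```python
import random

def calculator(a: float, b: float) -> float:
    """Bounded output space: one operation, one correct answer."""
    return a + b

def toy_llm(prompt: str) -> str:
    """Unbounded output space: one of many plausible continuations.

    Hypothetical stand-in for a language model. The canned strings
    below represent an open-ended distribution; nothing in the
    sampling mechanism distinguishes the true one from the false ones.
    """
    continuations = [
        "Hembree and Dessart integrated seventy-nine studies.",  # true
        "Hembree and Dessart integrated ninety-one studies.",    # false, but fluent
        "Hembree and Dessart retracted their meta-analysis.",    # false, but fluent
    ]
    return random.choice(continuations)  # samples fluency, not correctness

print(calculator(2, 2))                        # always 4, verifiable by hand
print(toy_llm("What did the 1986 paper do?"))  # varies run to run, unverified
```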

That is a real difference. It does not invalidate the meta-analysis. It does mean the Grade-4 exception is bigger. The developmental window during which LLM use risks short-circuiting skill formation is longer and fuzzier. Where a Grade-4 kid can work arithmetic until fluency and then safely pick up the calculator, a first-year writer or a first-year coder does not have a clean fluency threshold after which the tool is safe.

There is a second way the analog breaks. No one anthropomorphizes a calculator. People anthropomorphize language models, especially when writing about personal details, and a model that answers in fluent first person can form an empathetic bond that no machine should hold over a person. Weizenbaum warned about this after watching people confide in ELIZA, and the warning reads as current today.

That is the educator’s problem, and the writer’s problem, and the reader’s problem. A problem worth naming. Not a problem that disqualifies the pattern.

The pattern

The rest of this book describes specific adult work — address normalization, interviewing, operations, governance, infrastructure — where the skill has already been formed by the people using the tool. The HR operations lead has twenty-two years. The district manager has eleven. The integration engineer has ten. The tool works because the judgment underneath it is already good.

That is the calculator pattern at adult scale. You introduce the tool after the judgment. You pay the cost at the front of the career, and the tool lets you spend the rest of the career doing the work the judgment was actually for.

The book’s thesis is that a great many adults in the workforce have formed their judgment and do not have access to tools that match it. Giving them the tool is Hembree and Dessart, forty years later, at higher stakes.

Settle the calculator argument once. Then carry the answer forward.

Daniela

The standard Silicon Valley founding pattern is that the CEO’s word is final.

The cofounder arrangement, the board structure, the option pool, the go-to-market strategy — all of it eventually answers to a single person whose name is on the pitch deck and whose face is on the magazine cover. The COO is a hired executive. The board is composed to support the thesis, not to challenge it. The safety policy, when one exists, is drafted by the same office that is responsible for hitting the growth number.

Anthropic is the clearest example of a frontier lab that broke the pattern.

Not because of its papers. Its papers are good, but other labs publish good papers. Not because of its Responsible Scaling Policy — every lab publishes a policy now. The pattern is broken because the President of the company is the CEO’s sister, and she is not a hired executive.

The second seat

Dario Amodei is Anthropic’s CEO. He writes the papers, gives the interviews, briefs the Senate. He is the ML researcher in the first seat.

Daniela Amodei is the President. She runs revenue, commercial strategy, partnerships, hiring, operations, policy execution. She is the second seat. The second seat is where safety frameworks either get enforced or quietly drift.

Her resume does not look like an ML resume. English literature, UC Santa Cruz. Classical flute scholarship. A political campaign in Pennsylvania. Communications for a US House Representative. Stripe starting in 2013. OpenAI starting in 2018, where she managed the team during GPT-2’s release and then became VP of Safety and Policy. Anthropic in 2021, as co-founder.

The entire career is humanities and operations and policy. That is the point.

The person in the second seat who is going to enforce the safety framework cannot be the same kind of person as the person in the first seat. The first seat wants to ship. The second seat has to be able to say not yet.

The sister clause

The Responsible Scaling Policy, published by Anthropic in September 2023, is a document. A document is not an enforcement mechanism. An enforcement mechanism is a person with authority who is willing to use it.

A hired Chief Operating Officer cannot use that authority against a founding CEO. Not meaningfully. The COO serves at the pleasure of the CEO. If the COO says we cannot ship this model until the evaluations pass, and the CEO says ship it anyway, the COO either ships it, quits, or gets fired. None of those outcomes enforce the policy.

A younger sister who co-founded the company can use that authority. The family relationship predates the company and outlasts every funding round. The trust is older than the equity. The disagreement, when it happens, is a disagreement between two people who cannot fire each other.

That is the clause that makes the Responsible Scaling Policy enforceable. It is not written anywhere in the document. It is structural. It is human. It is the reason the document is worth the paper it is printed on.

The corporate form

The sister clause is the human layer. There is also a legal layer.

Anthropic is incorporated as a Delaware Public Benefit Corporation, a corporate form that did not exist until 2013. Unlike a standard C-corp, a PBC is not obligated to chase the bottom line above all else. Directors are allowed to weigh a stated public mission against shareholder returns, and they cannot be sued for choosing the mission.

Anthropic’s stated public benefit, written into the certificate of incorporation, is the responsible development and maintenance of advanced AI for the long-term benefit of humanity. Those words bind the directors. They are not marketing copy. They are law.

On top of the PBC structure, Anthropic has a Long-Term Benefit Trust that appoints board members independent of the financial investors. Another refusal point. Another check on the quarterly pressure before it reaches the product.

None of this is a guarantee. A PBC can still drift. A trust can still be captured. But the corporate form is the legal equivalent of the sister clause: it lowers the cost of refusal by making refusal something the directors are allowed to do.

Why this matters for this book

Every chapter of this book has recommended a tool. The tool, in most cases, is Claude. Claude is made by a company. The book has asked the reader to rely on that company.

I owe the reader an account of why I am willing to make that ask.

I am willing to make the ask because the company has a President whose job is to refuse, and whose authority to refuse does not depend on the CEO’s mood. That is unusual. It is rare enough that I do not know another frontier lab where the structure is comparable. The OpenAI board fight of November 2023 is the cautionary tale. The board tried to exercise refusal, the refusal mechanism turned out to be weaker than the growth mechanism, and everybody involved had to renegotiate who had what authority after the fact.

Anthropic has not had that fight in public yet. It may someday. If it does, the sister is the clause.

One smaller thing, worth saying. Anthropic waited to release its 2026 Agentic Coding Trends Report — the one that profiles Fountain on page eight — to match our Cue GA. Most partnerships fit the smaller company into the larger company’s calendar. This one waited. I was honored. That shows real class.

What it would take to replicate

Most companies cannot replicate the Anthropic governance dynamic directly. Very few CEOs have a co-founding sibling. What any company can replicate is the structural principle.

The principle is that the second seat has to have authority that is not contingent on the first seat’s approval. That authority can come from many places — a co-founder relationship, an independent board with genuine power, a union, a regulator, a contract with real teeth. What it cannot come from is the first seat’s generosity. Generosity is not governance.

The pattern

This book argues that AI can be deployed well, but only when a specific pattern is in place: the tool, the auditor, the domain owner, and the refusal point. The refusal point is the piece most often missing.

AlphaFold had one — DeepMind’s decision to release AFDB under CC-BY-4.0 was a refusal of the standard commercial enclosure. The address chapter had one — a company that told me no when my first use case failed the ethical bar. The interview chapter has NYC Local Law 144 and the EEOC. Cue has its own gating architecture. Oregon, in the proposal, has public-good governance. China has the Cyberspace Administration, which is a refusal point, even if not the one the US would choose.

Every case works because someone, somewhere, can say no, not yet, and their no has teeth.

Daniela is the model. The second seat with real refusal power. The book’s whole thesis depends on that seat existing, somewhere, in some form, at every link in the chain. Without it, everything else in this book is just marketing copy for a tool.

With it, the tool is worth using.