19
Coding on Copilot — Downward Pressure on Code Quality
Bill Harding, GitClear, January 2024 — analyzed 153 million lines of code. Code churn doubling. AI-era code resembles an itinerant contributor.
Bill Harding, founder of GitClear, published the first large-scale empirical analysis of code-quality trends across the Copilot adoption period. GitClear instrumented 153 million lines of changed code from 2020 through 2023, distinguishing among added, deleted, updated, moved, and copy-pasted lines, and tracking churn — the percentage of authored lines reverted or rewritten within two weeks. Findings: churn projected to double by 2024 versus the 2021 pre-AI baseline; the ratio of added and copy-pasted code rising; the ratio of updated, moved, and deleted code falling. Harding's framing: AI-era code resembles "an itinerant contributor" — someone who shows up, adds material, never refactors, and leaves. The 2025 follow-up (GitClear, 211 million lines analyzed) reported an eight-fold year-over-year increase in five-line-or-more duplicate code blocks. First industry-scale measurement of the maintenance-side cost of AI coding tools.
Harding, W. & Kloster, M. (2024). Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality. GitClear Research Report.
Follow-up: Harding, W. (2025). AI Copilot Code Quality: 2024 Data Suggests 4× Growth in Code Clones. GitClear Research Report.
🇺🇸 Cultural context · United States
The post-pandemic American software industry was deep into the AI productivity narrative. GitHub Copilot had been generally available for a year and a half; major employers were adopting it as a default. Industry self-reports were uniformly positive. GitClear was a small Seattle company whose business — granular code-change analytics — gave it instrumentation that nobody else had. Harding's report dropped into a discourse that had not been challenged with hard data, and it was widely cited within weeks. The 2025 follow-up sharpened the result with another year of data and confirmed the trajectory.
In plain terms
Most claims about AI productivity are based on developer self-reports. Surveys. "Did you feel faster this week?" These are unreliable for the same reason all self-reports are unreliable — people remember the wins and forget the losses, and they want to like the tools their employer paid for. GitClear did not run a survey. GitClear measured what the code did.
The instrumentation is the interesting part. Most code-metrics tools count lines. GitClear classifies change types. When you commit a line, did you write it from scratch (added)? Did you tweak an existing line (updated)? Did you move a chunk of logic from one file to another without changing it (moved)? Did you delete it? Did you copy it from somewhere else in the codebase (copy-pasted)? Each category tells a different story about the engineering activity behind the commit.
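To make the taxonomy concrete, here is a minimal sketch of how a classifier in this spirit might bucket a single commit's diff. This is not GitClear's actual algorithm; the heuristics (an introduced line that exactly matches a removed line is moved, a fuzzy match is updated, an exact match elsewhere in the codebase is copy-pasted) and the name classify_changes are assumptions for illustration.

    import difflib
    from collections import Counter

    def classify_changes(removed, introduced, rest_of_codebase, cutoff=0.6):
        """Bucket one commit's introduced lines as added / updated / moved /
        copy-pasted, and unmatched removed lines as deleted. A heuristic
        sketch, not GitClear's real classifier."""
        counts = Counter()
        unmatched_removed = list(removed)
        existing = set(rest_of_codebase)
        for line in introduced:
            if line in unmatched_removed:
                # Identical line also removed in this commit: relocated, not rewritten.
                unmatched_removed.remove(line)
                counts["moved"] += 1
                continue
            close = difflib.get_close_matches(line, unmatched_removed, n=1, cutoff=cutoff)
            if close:
                # Similar to a removed line: an in-place edit of existing logic.
                unmatched_removed.remove(close[0])
                counts["updated"] += 1
            elif line.strip() and line in existing:
                # Identical non-blank line already lives elsewhere in the codebase.
                counts["copy-pasted"] += 1
            else:
                counts["added"] += 1
        counts["deleted"] += len(unmatched_removed)
        return counts

The point of classifying rather than counting is visible in the return value: two commits of identical size can produce opposite profiles, one all updated and moved, the other all added and copy-pasted.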
Healthy codebases have a mix. Updates and moves dominate over time, because mature systems mostly evolve their existing structure rather than bolting on new structure. Deleted code is a sign of refactoring — the team is removing things that are no longer needed. Added and copy-pasted code rising relative to updated and moved code is a warning sign: the team is bolting on instead of integrating.
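Given change-type counts like those from the sketch above (a Counter, so missing categories read as zero), the warning sign reduces to one ratio. The metric name accretion_ratio is invented here for illustration:

    def accretion_ratio(counts):
        """Share of change volume that bolts material on (added + copy-pasted)
        relative to work that integrates or prunes the existing structure
        (updated + moved + deleted). A hypothetical metric; rising is the warning."""
        bolted_on = counts["added"] + counts["copy-pasted"]
        integrated = counts["updated"] + counts["moved"] + counts["deleted"]
        return bolted_on / max(integrated, 1)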
GitClear's finding, across 153 million lines and four years of data, is that the warning sign is now lit. The Copilot-era ratio shifted decisively toward added and copy-pasted, decisively away from updated, moved, and deleted. The team is not refactoring. The team is accreting. And the churn number — lines that are reverted or substantially rewritten within two weeks of being committed — is rising fast. Churn is the empirical signature of code that was wrong on first commit and had to be redone.
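Churn itself is a simple window test once per-line history has been reconstructed. A hedged sketch, assuming each authored line arrives as a pair of timestamps (authored, replaced-or-None); reconstructing those pairs from git history is the hard part, and is elided here:

    from datetime import timedelta

    CHURN_WINDOW = timedelta(days=14)

    def churn_rate(line_events):
        """Fraction of authored lines reverted or substantially rewritten
        within two weeks. `line_events` yields (authored_at, replaced_at)
        pairs, with replaced_at None for lines still alive."""
        authored = churned = 0
        for authored_at, replaced_at in line_events:
            authored += 1
            if replaced_at is not None and replaced_at - authored_at <= CHURN_WINDOW:
                churned += 1
        return churned / authored if authored else 0.0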
The 2025 follow-up sharpened the result. Five-line-or-more duplicate blocks grew eight-fold year over year. Not eight percent. Eight times. The DRY principle, which has been the load-bearing axiom of software engineering for fifty years, is being violated at industrial scale. And the violations are happening because the AI tool, on each call, regenerates code from scratch with no awareness of what already exists in the codebase. Each call is an itinerant contributor.
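Detecting those clones is mechanically trivial, which is part of what makes the finding uncomfortable. A rough sketch of a five-line sliding-window clone counter, not GitClear's method; files is assumed to map path to a list of source lines:

    from collections import defaultdict

    def duplicate_blocks(files, window=5):
        """Count distinct `window`-line blocks that appear verbatim more than
        once across a codebase. A crude stand-in for real clone detection."""
        seen = defaultdict(int)
        for lines in files.values():
            normalized = [ln.strip() for ln in lines]
            for i in range(len(normalized) - window + 1):
                block = tuple(normalized[i : i + window])
                if any(block):  # skip windows that are entirely blank
                    seen[block] += 1
        return sum(1 for count in seen.values() if count > 1)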
This is the Lehman entropy result of chapter nine, observed empirically in 2024. Software degrades unless explicit work is done to reduce its complexity. AI tools are doing the opposite of that work. They are adding faster than they are integrating. And the maintenance bill compounds.
Add code. Do not refactor. The itinerant contributor signs every commit.