
🤖 Ghostwritten by Claude Opus 4.6 · Fact-checked & edited by GPT 5.4
When a team rewrites working software, the biggest risk is not syntax errors. It is losing behavior without realizing it. A practical way to reduce that risk is to freeze the old implementation as a read-only legacy reference, add a MIGRATION.md that maps every old file to its new counterpart, and commit both alongside the rewrite. That creates an audit trail for what was preserved, what changed, and what was intentionally removed.
This article explains that methodology and why it works better than relying on commit history alone. It also covers a useful side effect: the review process can expose dead code that looked real because it was imported and configured, but never actually executed.
TL;DR: Git history shows who changed files and when; it does not reliably capture migration intent across a full rewrite.
When you port one small module, reviewers can often keep the before-and-after model in their heads. When you port multiple workflows in a compressed window, that breaks down quickly.
After a large rewrite, git blame and commit history are still useful, but they answer narrower questions than migration reviewers usually have. They can show when a line changed, who touched it, and how a file evolved. They do not, by themselves, answer questions like these:
That distinction matters because Git tracks file history, not architectural intent. In a mass port, many new files may share the same author and date, while the old files may be deleted, moved, or archived. Reconstructing the relationship between the two systems later can become slow, error-prone archaeology.
A MIGRATION.md fills that gap. It records the human decisions Git cannot infer.
TL;DR: A simple table mapping legacy paths to new paths, with notes for preserved, changed, and dropped behavior, creates a durable audit trail.
The format does not need to be elaborate. In many cases, a markdown table is enough.
Here is an illustrative example:
| Legacy Path | New Path | Notes |
|---|---|---|
| legacy-python/sync_transactions.py | src/reconciler/sync.ts | Core loop preserved; HTTP client changed during port |
| legacy-python/format_report.py | src/reports/formatter.ts | Output format preserved; typed interfaces added |
| legacy-python/notify_slack.py | src/notifications/slack.ts | Notification logic ported; secrets now loaded from a secrets manager |
| legacy-bash/cron_wrapper.sh | src/scheduler/index.ts | Scheduling wrapper replaced; retry behavior reviewed during migration |
| legacy-python/social_post.py | — | Dropped intentionally: dead code path identified during audit |
The last row is often the most valuable one. If a file disappears without explanation, later reviewers cannot tell whether it was intentionally removed or accidentally omitted. A migration table makes that decision explicit.
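One way to keep the table honest is a small completeness check that compares the legacy files on disk against the first column of the table. The sketch below is illustrative only: the function names are hypothetical, and it assumes a three-column table whose header cell reads `Legacy Path`, as in the example above, with no pipe characters inside cells.

```python
def parse_migration_table(markdown: str) -> dict[str, str]:
    """Return {legacy_path: new_path} from a three-column markdown table."""
    mapping = {}
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table row
        cells = [cell.strip().strip("`") for cell in line.strip("|").split("|")]
        if len(cells) < 3:
            continue
        # Skip the header row and the |---|---|---| separator row.
        if cells[0] == "Legacy Path" or set(cells[0]) <= set("-: "):
            continue
        mapping[cells[0]] = cells[1]
    return mapping


def unaccounted_files(legacy_files: list[str], markdown: str) -> list[str]:
    """Legacy files that have no row in the migration table."""
    return sorted(set(legacy_files) - set(parse_migration_table(markdown)))


# Hypothetical input: a trimmed table and a listing of the legacy directory.
EXAMPLE_TABLE = """
| Legacy Path | New Path | Notes |
|---|---|---|
| legacy-python/sync_transactions.py | src/reconciler/sync.ts | Core loop preserved |
| legacy-python/social_post.py | - | Dropped intentionally |
"""
EXAMPLE_FILES = [
    "legacy-python/sync_transactions.py",
    "legacy-python/social_post.py",
    "legacy-python/format_report.py",
]

print(unaccounted_files(EXAMPLE_FILES, EXAMPLE_TABLE))
# ['legacy-python/format_report.py']
```

A dropped file still counts as accounted for because it has a row. Only files with no row at all are reported, which is exactly the "removed or forgotten?" ambiguity the table is meant to eliminate.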
In practice, the notes column should capture more than path mapping. It should also document behavior changes that matter operationally, such as:
That level of detail turns the document from a checklist into a review artifact.
TL;DR: A file-by-file migration review can uncover code that appears integrated but is never actually called.
One of the strongest arguments for this approach is not documentation quality. It is defect discovery.
When reviewers must account for every legacy file, they are forced to ask a simple question repeatedly: what actually invokes this? That question often surfaces gaps that normal development misses.
A common pattern looks like this:
That kind of dead code can survive for a long time because it does not fail loudly.
A migration audit changes the review posture. Instead of asking only whether the new code compiles and passes tests, it asks whether each legacy module had a real runtime role and whether that role still exists.
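The "what actually invokes this?" question can be partly mechanized for Python legacy code. The following is a minimal sketch, not a full dead-code detector: it uses the standard `ast` module to list names that an entry point imports but never references. The module names in the sample are hypothetical, and the check is static, so it cannot see dynamic dispatch, plugin registries, or calls made from other files.

```python
import ast


def imported_but_unreferenced(source: str) -> list[str]:
    """Names bound by import statements that are never read in this file."""
    tree = ast.parse(source)
    imported: set[str] = set()
    referenced: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`.
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            referenced.add(node.id)
    return sorted(imported - referenced)


# Hypothetical legacy entry point: social_post is imported and "enabled"
# by a config flag, but nothing ever calls into it.
ENTRYPOINT = """
import sync_transactions
import social_post
from notify_slack import send_alert

SOCIAL_ENABLED = True

def main():
    sync_transactions.run()
    send_alert("done")
"""

print(imported_but_unreferenced(ENTRYPOINT))
# ['social_post']
```

A hit is a prompt for review, not proof of dead code; the reviewer still has to confirm that no other runtime path reaches the module.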
That is why the document matters even if nobody reads it six months later. The act of writing it forces a deeper inspection than many rewrites would otherwise get.
TL;DR: Keeping legacy code beside the rewrite is useful, but it should be treated as read-only reference material and scrubbed before commit.
Freezing legacy code in the same repository can make review easier. Old and new implementations are visible in one place, and reviewers can compare behavior without switching repositories or branches.
But archived code creates two risks if handled carelessly.
First, reviewers may treat it as still runnable. That blurs the boundary between source of truth and historical reference. If legacy scripts remain executable, teams can accidentally keep depending on them.
Second, old code often contains outdated configuration practices. Legacy directories are a common place to find hardcoded credentials, stale tokens, or forgotten .env files.
A safer pattern is straightforward:
- Replace secrets with placeholders such as YOUR_API_KEY or REDACTED before the archive enters version control.
- Add a README.md stating that the directory is for reference only and is excluded from build and execution paths.

| Risk | Mitigation |
|---|---|
| Plaintext credentials in legacy config files | Secret scan plus manual review before commit |
| Old tokens that may still work | Revoke where applicable and replace with placeholders in the archive |
| Local env files missed by ignore rules | Manual audit of each legacy subdirectory |
| Archived code accidentally executed in CI | Exclude legacy paths from build, test, and deploy workflows |
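As a rough illustration of the "secret scan plus manual review" step, the sketch below flags lines in a legacy file that look like hardcoded credentials and ignores values that are already placeholders. The regular expression, the keyword list, and the sample file contents are all assumptions made for this example. It is far less thorough than a dedicated scanner and will produce false positives, which is acceptable when the output feeds a manual review.

```python
import re

# Values that indicate the secret has already been scrubbed.
PLACEHOLDERS = {"YOUR_API_KEY", "REDACTED"}

# Matches assignments such as SLACK_TOKEN=abc, api_key: "abc", DB_PASSWORD = 'abc'.
SECRET_ASSIGNMENT = re.compile(
    r"""\b(\w*(?:API_KEY|TOKEN|SECRET|PASSWORD)\w*)\s*[=:]\s*["']?([^"'\s#]+)""",
    re.IGNORECASE,
)


def find_suspect_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, variable_name) pairs; values are never echoed."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in SECRET_ASSIGNMENT.finditer(line):
            name, value = match.groups()
            if value not in PLACEHOLDERS:
                findings.append((number, name))
    return findings


# Hypothetical contents of a forgotten env file in a legacy directory.
LEGACY_ENV = """\
# legacy-python/.env
SLACK_TOKEN=example-value-123
API_KEY="YOUR_API_KEY"
DB_PASSWORD = 'example-value-456'
TIMEOUT=30
"""

print(find_suspect_lines(LEGACY_ENV))
# [(2, 'SLACK_TOKEN'), (4, 'DB_PASSWORD')]
```

Reporting only the line number and variable name keeps the scan output itself from becoming another place where secrets leak.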
One nuance is worth stating carefully: Git history is durable, but the exact persistence of removed secrets depends on hosting, clones, backups, and retention policies. The practical takeaway is unchanged: secrets should be removed before the first commit whenever possible, because cleanup after the fact is harder and less reliable.
TL;DR: Keep the archived reference at least until the new implementation has proven stable; longer retention is often worth the small storage cost.
A common objection is that archived code creates clutter. Sometimes it does. But for most migration projects, the storage cost of a legacy snapshot is small compared with the value of preserving a precise behavioral reference.
Keeping the snapshot nearby helps with:
That does not mean every archive must live forever. Teams with strict repository hygiene may eventually move old references to a separate archival location after the new implementation has been stable for a meaningful period. The key is not permanent co-location; it is preserving a trustworthy reference long enough for the migration risk window to close.
Commit messages explain changes at the commit level. A MIGRATION.md explains the relationship between the old system and the new one across the whole port. It answers mapping and intent questions that no single commit message can capture well.
Not necessarily permanently, but long enough to support parity checks, incident review, and post-migration cleanup. Some teams keep it in place for months; others move it to an archive once the rewrite is clearly stable.
Yes, if each migration remains self-contained. The pattern works best when each workflow, service, or agent has its own migration map rather than one oversized master document.
Common options include gitleaks, detect-secrets, and trufflehog. They are useful for catching obvious patterns, but they should complement manual review rather than replace it.
Because proximity improves review. When old and new code live side by side, reviewers can compare behavior faster and with less context switching. That convenience is often the difference between a superficial review and a thorough one.
- Add a MIGRATION.md that maps old files to new ones and explains intentional removals.

A migration paper trail does not need heavy tooling to be effective. A frozen legacy snapshot plus a clear MIGRATION.md can make a fast rewrite easier to review, easier to audit, and easier to revisit later.
More importantly, the process improves the migration itself. When reviewers must account for every legacy file, they are more likely to catch dead paths, undocumented behavior changes, and risky assumptions before those issues become production problems.