Refactoring a 200 kLOC Astro site
The decisions that age badly and those that don't, told through one slow migration to content collections.
Two years ago I shipped a content-heavy site with files laid out by
language: src/pages/en/, src/pages/jp/, and so on. Each language had
its own hand-translated routes and its own copy of the directory
structure.
It worked. Then I added a fourth language and the cracks showed.
The cost of duplicated routes
Every new page meant four edits. Every typo fix meant four edits. The schema for a “post” lived implicitly in the file structure, which meant contributors had to read existing files to learn the pattern.
Worst part: linking between languages required string interpolation in every component that pointed at a page in another language.
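To make the linking problem concrete, here is a minimal sketch of the kind of helper that could replace the scattered string interpolation. The function name, the locale list, and the assumption that every path starts with a locale segment are illustrative, not taken from the original site.

```typescript
// Hypothetical locale list: the post names en and jp; a real site would list all four.
const LOCALES = ['en', 'jp'] as const;
type Locale = (typeof LOCALES)[number];

// Swap the leading locale segment of an absolute path such as "/en/about/".
function switchLocale(path: string, target: Locale): string {
  const segments = path.split('/');
  // For an absolute path, segments[0] is '' and segments[1] is the locale.
  if ((LOCALES as readonly string[]).indexOf(segments[1]) === -1) {
    throw new Error(`Path does not start with a known locale: ${path}`);
  }
  segments[1] = target;
  return segments.join('/');
}
```

With one helper like this, a component asks for switchLocale(currentPath, 'jp') instead of rebuilding the URL by hand, so a change to the URL shape touches one function rather than every component.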
What content collections fix
Moving to Astro’s content collections meant declaring the schema once,
in TypeScript, and authoring posts as data instead of pages. Routes
became thin shells that read from the collection. Each post became one
file shared across languages, with a translations map for per-locale
strings.
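import { defineCollection, z } from 'astro:content';
import { glob } from 'astro/loaders';

// One schema for every post, validated at build time.
const posts = defineCollection({
  // Load every Markdown file under src/content/posts.
  loader: glob({ pattern: '**/*.md', base: './src/content/posts' }),
  schema: z.object({
    title: z.string(),
    pubDate: z.coerce.date(), // accepts date strings from frontmatter
    pillar: z.enum(['llm', 'software', 'dance']),
  }),
});

export const collections = { posts };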
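The schema above does not show the translations map, so here is a rough sketch of how a per-locale lookup over such a map might work. Everything in it is an assumption for illustration: the PostEntry shape, the localizedTitle helper, the fallback to English, and the sample data. It uses plain TypeScript types rather than the collection schema so it can run on its own.

```typescript
type Locale = 'en' | 'jp';

// Hypothetical shape: one entry per post, with per-locale strings in a map.
interface PostEntry {
  slug: string;
  translations: Partial<Record<Locale, { title: string }>>;
}

// Return the title for the requested locale, falling back to a default locale.
function localizedTitle(
  post: PostEntry,
  locale: Locale,
  fallback: Locale = 'en',
): string {
  const hit = post.translations[locale] ?? post.translations[fallback];
  if (!hit) {
    throw new Error(`No title for ${post.slug} in ${locale} or ${fallback}`);
  }
  return hit.title;
}

// Made-up sample entry, only here to exercise the lookup.
const samplePost: PostEntry = {
  slug: 'hello',
  translations: { en: { title: 'Hello' } },
};
```

Here localizedTitle(samplePost, 'jp') falls back to the English title because no jp entry exists. Whether a missing translation should fall back or fail the build is a design choice; failing loudly is safer if every post is expected to exist in every language.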
The migration took two weeks. Three of those days were spent on URL redirects so old links didn’t break.
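Much of the redirect work can be generated instead of typed by hand. The sketch below builds an old-to-new path map; both URL shapes in it are assumptions made up for the example (the post does not say what the old or new URLs looked like), as are the function and parameter names.

```typescript
// Build a map of old path -> new path for every locale and slug.
function buildRedirects(
  locales: readonly string[],
  slugs: readonly string[],
): Record<string, string> {
  const redirects: Record<string, string> = {};
  for (const locale of locales) {
    for (const slug of slugs) {
      // Assumed old shape: /<locale>/<slug>
      // Assumed new shape: /<locale>/posts/<slug>
      redirects[`/${locale}/${slug}`] = `/${locale}/posts/${slug}`;
    }
  }
  return redirects;
}
```

A map in this form matches what the redirects option in the Astro config expects (an object of old paths to new paths), so the result could in principle be passed there directly. Check it against a crawl of the live site first, since any hand-written route that does not follow the pattern will be missed.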
What I’d do differently
Pick the schema first. I changed mine twice mid-migration and had to touch every post each time. Two hours of upfront design would have saved two evenings of grep-and-edit.