Software · 1 min read

Refactoring a 200kloc Astro site

The decisions that age badly and those that don't, told through one slow migration to content collections.

Two years ago I shipped a content-heavy site with files laid out by language: src/pages/en/, src/pages/jp/, and so on. Each language had its own hand-translated routes, with the same directory structure duplicated under each one.

It worked. Then I added a fourth language and the cracks showed.

The cost of duplicated routes

Every new page meant four edits. Every typo fix meant four edits. The schema for a “post” lived implicitly in the file structure, which meant contributors had to read existing files to learn the pattern.

Worst part: linking between languages required string interpolation in every component that crossed a boundary.
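For illustration, the kind of interpolation involved looks roughly like the sketch below. The helper name switchLocale, the two-entry locale list, and the /locale/... URL shape are assumptions made for the example, not code from the site.

```typescript
// Hypothetical locale list; only 'en' and 'jp' are named in the post.
type Locale = 'en' | 'jp';

// Swap the leading locale segment of an absolute path,
// e.g. '/en/posts/foo' -> '/jp/posts/foo'.
// Assumes every path starts with '/<locale>/'.
function switchLocale(path: string, target: Locale): string {
  const segments = path.split('/'); // ['', 'en', 'posts', 'foo']
  segments[1] = target;             // replace the locale segment
  return segments.join('/');
}
```

Even when this lives in one helper, every language switcher, hreflang tag, and cross-link has to call it, and it breaks quietly the moment one language's directory layout drifts from the others.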

What content collections fix

Moving to Astro’s content collections meant declaring the schema once, in TypeScript, and authoring posts as data instead of pages. Routes became thin shells that read from the collection. Each post became one file shared across languages, with a translations map for per-locale strings.

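import { defineCollection, z } from 'astro:content';
import { glob } from 'astro/loaders';

// One collection, one schema: every post is validated against it at build time.
const posts = defineCollection({
  // Load every Markdown file under src/content/posts, at any depth.
  loader: glob({ pattern: '**/*.md', base: './src/content/posts' }),
  schema: z.object({
    title: z.string(),
    pubDate: z.coerce.date(), // accepts date strings from frontmatter
    pillar: z.enum(['llm', 'software', 'dance']),
  }),
});

// Astro discovers collections through this named export.
export const collections = { posts };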
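The schema above leaves out the translations map. As a rough sketch of how a per-locale lookup with a fallback could work: the PostData shape, the field names, and the stringsFor helper are all hypothetical, and it is written in plain TypeScript rather than Zod so it runs on its own.

```typescript
// Hypothetical locale list; only 'en' and 'jp' are named in the post.
type Locale = 'en' | 'jp';

// Strings that differ per language (assumed fields).
interface PostStrings {
  title: string;
  summary?: string;
}

// Assumed shape of one post entry: shared data plus a per-locale map.
interface PostData {
  defaultLocale: Locale;
  translations: Partial<Record<Locale, PostStrings>>;
}

// Return the strings for the requested locale, falling back to the
// post's default locale when a translation is missing.
function stringsFor(post: PostData, locale: Locale): PostStrings {
  const hit = post.translations[locale] ?? post.translations[post.defaultLocale];
  if (!hit) {
    throw new Error(`No strings for ${locale} or ${post.defaultLocale}`);
  }
  return hit;
}
```

A route shell can then call stringsFor(entry.data, locale) and render the result, so an untranslated post degrades to the default language instead of a missing page.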

The migration took two weeks. Three of those days were spent on URL redirects so old links didn’t break.
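Most of that redirect work is mechanical and can be generated rather than typed. A minimal sketch, assuming old URLs looked like /en/my-post and new ones like /en/posts/my-post; both shapes, and the buildRedirects name, are invented for the example.

```typescript
// Build an old-path -> new-path map for every locale/slug pair.
// The URL shapes on both sides are assumptions, not the site's real routes.
function buildRedirects(
  slugs: readonly string[],
  locales: readonly string[],
): Record<string, string> {
  const redirects: Record<string, string> = {};
  for (const locale of locales) {
    for (const slug of slugs) {
      redirects[`/${locale}/${slug}`] = `/${locale}/posts/${slug}`;
    }
  }
  return redirects;
}
```

Astro's config accepts a redirects object of this shape, so the generated map can be passed to defineConfig instead of being maintained by hand.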

What I’d do differently

Pick the schema first. I changed mine twice mid-migration and had to touch every post each time. Two hours of upfront design would have saved two evenings of grep-and-edit.

Tags #astro #refactoring #content-collections