An MDX content pipeline grows three kinds of complexity

The first MDX tutorial you read tells you how to render one file. The second one tells you how to render a folder. By the third post on your site, you’ve started noticing that the rendering bit was the easy part. There’s a layer of complexity the tutorials don’t cover, because the tutorials never have three posts to think about.

I noticed it the day my homepage stopped linking correctly to my own tag pages. The /writing page’s tag filter built URLs like /writing/tag/build log/, with a literal space. The static export only knew about /writing/tag/build-log/. Every multi-word tag link on the site was dead, and had been since the day I shipped the tag system.

Reading back through docs/devlog.md, I noticed the same shape of bug surfacing in three different blocks of the production audit across two days. The pattern was bigger than one bug. An MDX content pipeline grows three kinds of complexity as it scales, and the bugs in each layer look superficially similar but have fundamentally different fixes.

1. Routing complexity

The first wall. Most MDX tutorials show you pages/[slug].tsx and call it a day. In practice, every additional pivot you add to the content (tags, dates, featured-state, language, author, draft status) multiplies the route surface area.

On captainrandom.co.uk the route surface today is:

/writing/ (index, sorted featured-first then newest)
/writing/[slug]/ (one per post)
/writing/tag/[tag]/ (one per tag, including tags that don’t have any posts yet)

That third route was the source of the multi-word-tag bug. On 2026-05-19 I caught two related defects in the same block of the audit:

2026-05-19 — fix(writing-index): broken tag slugs, broken active-pill class, trailing slashes, soften copy, pin featured. Tag-filter URLs were being built via ${tag.toLowerCase()}, so the display tag "Build Log" produced /writing/tag/build log/ with a literal space, while the static export generates /writing/tag/build-log/ (hyphenated). Every multi-word tag link on the index was dead.

The fix wasn’t one line; it was the existence of a slugifyTag() helper that the whole codebase agreed on. Once one file does its own slugification, you have a routing complexity problem. Two files doing slightly different slugification is a bug waiting to ship.

The high-leverage fix at this layer: a single tag-slug function in src/lib/mdx.ts that every route, every chrome surface, and every component imports. Lowercase, non-alphanumerics to hyphens, trim. Five lines of code that prevent the same bug from existing in fifty places.
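As a minimal sketch of that rule (lowercase, non-alphanumerics to hyphens, trim), the helper could look something like the following. The body is my illustration rather than a copy of the real slugifyTag(), and tagHref() is a hypothetical convenience wrapper added only to show the call site.

```typescript
// Illustrative sketch: the real slugifyTag() in src/lib/mdx.ts may differ in detail.
function slugifyTag(tag: string): string {
  return tag
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // each run of non-alphanumerics becomes one hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
}

// Hypothetical wrapper: build the tag-page URL from a display tag.
// The trailing slash matches the statically exported route.
function tagHref(tag: string): string {
  return `/writing/tag/${slugifyTag(tag)}/`;
}
```

With this in place, "Build Log" resolves to /writing/tag/build-log/ wherever the link is built. One caveat of this particular regex: non-ASCII characters are treated like punctuation and collapse into hyphens, which is fine for English tags but would need revisiting otherwise.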

The deeper insight: anticipated slugs. The Footer + Features sections on this site link to tag pages like /writing/tag/retail/ that don’t have any posts yet. A naïve generateStaticParams that only returns tags-with-posts would 404 those links. The fix is to return the union of (tags-actually-in-posts) and a hand-curated list of anticipated slugs, and have the route render a graceful empty state for the latter:

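export async function generateStaticParams() {
  // Slugs of tags that appear in at least one post.
  const existing = getAllTagSlugs()
  // Union with the hand-curated list; the Set removes duplicates.
  return Array.from(new Set([...existing, ...ANTICIPATED_TAG_SLUGS]))
    .map((tag) => ({ tag }))
}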

That single pattern unblocked four downstream features (Footer links, Features cards, Workshop tag links, post-cover tag chips) without any of them needing to coordinate.
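The other half of the pattern is the route deciding what to render for a given slug. Here is a rough sketch of that decision as a pure function. The names resolveTagPage and TagPageState are hypothetical, and the explicit "not-found" branch is my assumption about how an unknown slug should be treated, not something lifted from the site's code.

```typescript
// Hypothetical result type: what the tag route should render for a slug.
type TagPageState<P> =
  | { kind: "posts"; posts: P[] } // at least one post carries this tag
  | { kind: "empty" }             // anticipated slug, no posts yet
  | { kind: "not-found" };        // neither known nor anticipated

// Sketch only: the post lookup and the anticipated-slug set are passed in,
// so the function makes no assumption about how posts are loaded.
function resolveTagPage<P>(
  slug: string,
  postsForSlug: (slug: string) => P[],
  anticipated: ReadonlySet<string>,
): TagPageState<P> {
  const posts = postsForSlug(slug);
  if (posts.length > 0) return { kind: "posts", posts };
  if (anticipated.has(slug)) return { kind: "empty" };
  return { kind: "not-found" };
}
```

Keeping the decision separate from the JSX makes the three outcomes easy to test without rendering anything.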

2. Authoring complexity

The second wall. With three posts, you can keep their relative importance in your head. With ten, you can’t. With thirty, you need a system.

Three sub-problems hide inside “authoring complexity,” and they look similar but want different solutions:

Featured-post ordering. Pure newest-first is wrong, because you have a flagship post that should sit at position one regardless of date. Pure manual ordering is wrong, because you don’t want to re-rank every post every time you publish. The middle path is a boolean featured field in the MDX frontmatter, and a two-pass sort: featured-first (newest-first within), then non-featured (newest-first within). Six lines of code in the index page.

Tag taxonomy. Frontmatter accepts strings; strings are typos waiting to happen. By post ten you’ll have AI Tools in some files and AI tools in others and ai-tools in one rogue file that’s now invisible to the tag index. The fix isn’t runtime validation. It’s a display-label override table in src/data/tags.ts that maps every slug to a canonical display form. Slugify on the way in (write the post however you want); resolve to canonical labels on the way out (chrome always reads from the table).

Per-tag empty-state pages. A reader who clicks a Footer link to /writing/tag/retail/ shouldn’t see a 404 just because no post is tagged Retail yet. The empty state needs its own component: a single “no posts under this tag yet, here’s the index” surface that the dynamic route renders when getPostsByTagSlug(slug) returns []. This is the same fix as the anticipated-slugs pattern from layer one; they reinforce each other.
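Going back to the first of those sub-problems, the two-pass featured sort can be sketched roughly as below. The PostMeta shape and the sortForIndex name are assumptions for illustration; the only things taken from the site are the boolean featured field and the newest-first ordering within each group. It also assumes dates are stored as ISO strings, which sort correctly as plain text.

```typescript
// Hypothetical frontmatter shape, reduced to the fields the sort needs.
interface PostMeta {
  slug: string;
  date: string;       // ISO date such as "2026-05-19"
  featured?: boolean; // absent is treated as not featured
}

// Featured posts first, newest-first within each group.
function sortForIndex<T extends PostMeta>(posts: readonly T[]): T[] {
  // ISO dates compare correctly as strings, so no Date parsing is needed.
  const newestFirst = (a: T, b: T): number =>
    a.date < b.date ? 1 : a.date > b.date ? -1 : 0;
  // filter() returns a new array, so the in-place sort never touches the input.
  const featured = posts.filter((p) => p.featured === true).sort(newestFirst);
  const rest = posts.filter((p) => p.featured !== true).sort(newestFirst);
  return [...featured, ...rest];
}
```

Publishing a new post never requires re-ranking anything: it lands at the top of whichever group its frontmatter puts it in.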

The high-leverage move at this layer: treat frontmatter as user input you don’t fully trust. Normalise slugs on read, canonicalise labels on render, and never let the absence of content become a 404.
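A rough sketch of the read-side and render-side halves of that idea follows. The table contents, readTagSlugs, and tagLabel are all hypothetical; the real override table lives in src/data/tags.ts and I'm only assuming it maps slugs to display labels. The local toSlug helper repeats the lowercase, hyphenate, trim rule so the example stands on its own.

```typescript
// Hypothetical contents of the display-label override table.
const TAG_LABELS: Record<string, string> = {
  "ai-tools": "AI Tools",
  "build-log": "Build Log",
};

// Same rule as the shared slug helper: lowercase, hyphenate, trim.
const toSlug = (tag: string): string =>
  tag.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");

// On read: accept whatever the frontmatter contains and reduce it to unique slugs.
function readTagSlugs(rawTags: unknown): string[] {
  if (!Array.isArray(rawTags)) return []; // missing or malformed field
  const slugs = rawTags
    .filter((t): t is string => typeof t === "string")
    .map(toSlug)
    .filter((s) => s.length > 0);
  return Array.from(new Set(slugs));
}

// On render: chrome reads the canonical label, falling back to the slug itself.
function tagLabel(slug: string): string {
  return TAG_LABELS[slug] ?? slug;
}
```

Under these assumptions, AI Tools, AI tools and ai-tools all collapse to the single slug ai-tools on the way in and all display as "AI Tools" on the way out.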

3. Citation complexity

The third wall, and one that doesn’t exist for most blogs. It appears only on sites where the articles reference each other or an external source of truth such as a build log.

By the time captainrandom.co.uk had published two long-form articles, both of which cited specific dated entries in docs/devlog.md, I had three questions I couldn’t easily answer: “Which entries does each article cite? Which articles already cover X? What’s an honest candidate for the next post?” Skimming the article markdown for 2026-05- references worked for two articles. It wouldn’t work for twenty.

The solution shipped as the DVLAW skill (DevLog Article Workflow). The relevant piece for this article is the citation graph: a SQLite table that records, per published article, which devlog entries it backlinks to. The table is populated automatically on every /dvlaw_ship: the Python code parses the article’s MDX, extracts every docs/devlog.md reference containing an ISO date, joins those to entries by date, and inserts the results into article_citations.
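The real extraction step is Python; the sketch below restates the idea in TypeScript purely to keep this article's examples in one language. It makes a simplifying assumption that every ISO date appearing in the article text counts as a devlog reference, which is looser than matching actual docs/devlog.md links. The function names and the DevlogEntry shape are hypothetical; only the article_id and entry_id column names come from the table itself.

```typescript
// Hypothetical minimal view of a devlog entry row.
interface DevlogEntry {
  id: number;
  date: string; // ISO date, "YYYY-MM-DD"
}

// Collect the unique, valid ISO dates mentioned anywhere in the article text.
function extractCitedDates(mdx: string): string[] {
  const isoDate = /\b(\d{4})-(\d{2})-(\d{2})\b/g;
  const found = new Set<string>();
  let m: RegExpExecArray | null;
  while ((m = isoDate.exec(mdx)) !== null) {
    const y = Number(m[1]);
    const mo = Number(m[2]);
    const d = Number(m[3]);
    // Round-trip through Date to reject impossible dates such as 2026-13-40.
    const parsed = new Date(Date.UTC(y, mo - 1, d));
    const valid =
      parsed.getUTCFullYear() === y &&
      parsed.getUTCMonth() === mo - 1 &&
      parsed.getUTCDate() === d;
    if (valid) found.add(m[0]);
  }
  return Array.from(found).sort();
}

// Join the cited dates to devlog entries by date, producing rows for article_citations.
function citationRows(
  articleId: number,
  mdx: string,
  entries: readonly DevlogEntry[],
): Array<{ article_id: number; entry_id: number }> {
  const cited = new Set(extractCitedDates(mdx));
  return entries
    .filter((e) => cited.has(e.date))
    .map((e) => ({ article_id: articleId, entry_id: e.id }));
}
```

Joining by date means an article that cites one entry from a busy day is recorded as citing every entry from that day. That is a property of the join as described, not a quirk of this sketch.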

The result is a query I run before every new article’s thesis-clearing pass:

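-- Count, per tag, the devlog entries that no published article cites.
SELECT t.tag, COUNT(DISTINCT e.id) AS uncited
FROM entry_tags t JOIN entries e ON e.id = t.entry_id
LEFT JOIN article_citations ac ON ac.entry_id = e.id
WHERE ac.article_id IS NULL  -- anti-join: keep only entries with no citation row
GROUP BY t.tag
ORDER BY uncited DESC;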

That’s /dvlaw_inventory. It tells me which devlog tags have the most entries that no published article has cited: the fertile ground for the next post. Without it, I’d be re-litigating decisions about which topics to write about. With it, the corpus tells me.
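For readers who think more easily in application code than in SQL, here is a rough in-memory equivalent of that anti-join. The row shapes mirror the entry_tags and article_citations columns used in the query; the function name is hypothetical, and the sketch assumes every entry_tags row points at an existing entry, so the join to entries is left out.

```typescript
interface EntryTagRow {
  entry_id: number;
  tag: string;
}

interface CitationRow {
  article_id: number;
  entry_id: number;
}

// Per tag, count the distinct entries that no article cites, highest count first.
function uncitedByTag(
  entryTags: readonly EntryTagRow[],
  citations: readonly CitationRow[],
): Array<{ tag: string; uncited: number }> {
  const cited = new Set(citations.map((c) => c.entry_id));
  const perTag = new Map<string, Set<number>>();
  entryTags.forEach((row) => {
    if (cited.has(row.entry_id)) return; // cited somewhere, so skip it
    const ids = perTag.get(row.tag) ?? new Set<number>();
    ids.add(row.entry_id); // a Set gives the COUNT(DISTINCT ...) behaviour
    perTag.set(row.tag, ids);
  });
  return Array.from(perTag, ([tag, ids]) => ({ tag, uncited: ids.size }))
    .sort((a, b) => b.uncited - a.uncited);
}
```

An entry cited by any article drops out of every tag's count, which is exactly what the IS NULL filter does in the query.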

The high-leverage move at this layer: bidirectional links between source (devlog) and surface (articles), enforced by tooling rather than by author discipline. A SQLite index is overkill for ten articles. It earns its keep at the point where you can’t remember which post covered which decision.

What this leaves out

This isn’t the complete story of MDX at scale. I’ve deliberately skipped:

Build performance. Static generation gets slower as posts multiply. There are real fixes (incremental builds, ISR, build-time data fetching). I haven’t hit the wall yet on this site, so I haven’t written the post.
Custom MDX components. The mdx-components.tsx mapping that turns <pre> into the macOS-window code-block chrome on this site is its own essay. It belongs in a separate post about design-system integration, not this one about content architecture.
Multi-author / multi-language. Captain Random is single-author and single-language. The complexity story is different again when those constraints lift.

What I’ve described is the common path: a single-author build-in-public site that adds tags, then a featured-post mechanic, then cross-references between articles and a build log. Three walls, in approximately that order.

What this teaches

The walls grow non-linearly. Each adds more complexity than the last, but they appear in roughly fixed order, and each has a single high-leverage fix that prevents a class of bugs from existing in your codebase rather than just fixing the one you noticed.

The mechanical record of building each of these fixes lives in docs/devlog.md: routing complexity in the 2026-05-18 to 2026-05-19 entries, authoring complexity in the same window, citation complexity from 2026-05-21. This essay is the synthesis. The devlog is the audit trail.