llms.txt: What It Is, Why Your Static Version Is Already Wrong, and How to Fix It
I already had an llms.txt. It was stale the moment I added the second blog post. Here's what I replaced it with — and what actually matters for AI search visibility.
Before I understood what llms.txt was actually for, I added one to this site. It was a static text file sitting in public/, manually written, listing a few URLs. It had no description format, no blog posts, and it was already wrong by the time I published my next article.
That's the version most sites have right now. Here's why it matters, what the spec actually says, and how I replaced it with something that stays in sync automatically.
Why traditional SEO isn't enough anymore
Googlebot crawls, indexes, and ranks. AI answer engines — Perplexity, ChatGPT, Google AI Overviews — synthesize. They don't return ten blue links; they return one answer, assembled from passages they've decided are credible enough to quote.
The implication: you can rank #1 on Google and still be invisible to AI search. Relevance no longer lives at the page level — it lives at the sentence level. A page that clearly states a fact in its first paragraph gets cited. A page that buries its answer in keyword-padded prose gets skipped.
llms.txt is one response to this shift. It's not a magic ranking signal. It's a structured map you give AI crawlers so they don't have to guess what your site is about and where the useful content lives.
What llms.txt actually is
The spec was proposed by Jeremy Howard (founder of fast.ai and Answer.AI) in late 2024. It's a markdown file at yourdomain.com/llms.txt — similar to robots.txt in concept, but written for large language models rather than traditional crawlers.
The format is deliberately simple:
# Your Name or Site
> One-sentence description of who you are and what you publish.
## Blog
- [Post Title](/en/blog/slug): Short description of this post.
- [Another Post](/en/blog/other-slug): What this one covers.
## Projects
- [Project Name](https://project.com): What it does.
The > blockquote at the top is the canonical description LLMs may use verbatim when introducing you. The ## sections group your content. Each line is a markdown link with a description. That's the whole spec.
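Because the format is so small, it is also trivially machine-readable. As a rough illustration (the parseLlmsTxt helper and its return shape are mine, not part of the spec or any library), here is how a consumer could split a file like the one above into its name, description, and sections:

```typescript
interface LlmsLink {
  title: string;
  url: string;
  description?: string;
}

// Sketch of a parser for the llms.txt shape shown above.
// Assumes one "# " title, one "> " description, and "## " sections of links.
function parseLlmsTxt(text: string) {
  let name = "";
  let description = "";
  const sections: Record<string, LlmsLink[]> = {};
  let current = "";

  for (const line of text.split("\n")) {
    if (line.startsWith("# ")) {
      name = line.slice(2).trim();
    } else if (line.startsWith("> ")) {
      description = line.slice(2).trim();
    } else if (line.startsWith("## ")) {
      current = line.slice(3).trim();
      sections[current] = [];
    } else {
      // Matches "- [Title](url)" with an optional ": description" tail.
      const m = line.match(/^- \[([^\]]+)\]\(([^)]+)\)(?::\s*(.*))?$/);
      if (m && current) {
        sections[current].push({ title: m[1], url: m[2], description: m[3] });
      }
    }
  }
  return { name, description, sections };
}
```

If a ten-line regex loop can consume your file reliably, an LLM crawler certainly can — which is the whole point of keeping the format this strict.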
What my original file was doing wrong
My public/llms.txt looked like this:
# llms.txt for wonsukchoi.co
Canonical: https://wonsukchoi.co
Owner: Wonsuk Choi
Type: Personal website and portfolio
Updated: 2026-04-21
## Priority URLs
- Home: https://wonsukchoi.co/en
- Blog index: https://wonsukchoi.co/en/blog
Three problems:
- Wrong format. It used key-value pairs instead of the spec's blockquote description and markdown link structure. An LLM scanning this gets metadata, not an understanding of what the site is about.
- No content links. Pointing to /en/blog tells a crawler "there's a blog." Listing individual posts with descriptions tells it "here's what I've written and why each piece exists." Those are very different signals.
- Static. Every new blog post I published meant this file was immediately out of date. And I was never going to remember to update it manually.
Replacing it with a dynamic route
In Next.js App Router, public/ files are served as static assets and take priority over route handlers at the same path. So the first step was deleting public/llms.txt.
Then I created app/llms.txt/route.ts:
import { createClient } from "@supabase/supabase-js";

export const revalidate = 3600;

const BASE = "https://wonsukchoi.co";

export async function GET() {
  const supabase = createClient(
    process.env.NEXT_PUBLIC_SUPABASE_URL!,
    process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!,
  );

  const { data: posts } = await supabase
    .from("blog_posts")
    .select("title, slug, excerpt")
    .eq("published", true)
    .order("created_at", { ascending: false })
    .limit(30);

  // One spec-format link line per post, description truncated to 120 chars.
  const blogLines = (posts ?? [])
    .map((p) => {
      const desc = p.excerpt ? `: ${p.excerpt.slice(0, 120)}` : "";
      return `- [${p.title}](${BASE}/en/blog/${p.slug})${desc}`;
    })
    .join("\n");

  const body = [
    "# Wonsuk Choi",
    "> Personal website and portfolio of Wonsuk Choi.",
    "",
    "## Blog",
    blogLines,
    "",
  ].join("\n");

  return new Response(body, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "Cache-Control": "public, max-age=3600",
    },
  });
}
About 40 lines. It rebuilds every hour via ISR, pulls every published post with its excerpt, and formats it to spec. Every new post I publish is in the file within an hour, automatically.
The revalidate = 3600 is the key detail — without it, the file would be generated once at build time and go stale again. With it, Next.js refreshes it in the background on the first request after the TTL expires.
What else matters for AI citability
llms.txt is one signal. These matter more:
Passage-level clarity. LLMs extract sentences, not pages. Put the answer in the first paragraph, not paragraph four. "This reduced LCP from 4.2s to 1.8s" gets cited. "This can potentially improve certain performance characteristics" does not.
Structured data. Schema.org markup — Article, FAQPage, HowTo — helps AI systems understand what type of content they're reading and extract structured answers. Add it if you haven't.
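For a blog post, that markup can live next to the content it describes. A minimal sketch of an Article JSON-LD object for a post on this site (the articleJsonLd helper and the post field names are illustrative, not this site's actual schema):

```typescript
interface PostMeta {
  title: string;
  slug: string;
  excerpt: string;
  createdAt: string; // ISO date
}

// Builds a Schema.org Article object for a blog post.
// The field names on PostMeta are assumptions, not the real table shape.
function articleJsonLd(post: PostMeta) {
  return {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: post.title,
    description: post.excerpt,
    datePublished: post.createdAt,
    mainEntityOfPage: `https://wonsukchoi.co/en/blog/${post.slug}`,
  };
}
```

In a Next.js server component you would serialize this into a script tag of type application/ld+json alongside the rendered post, so crawlers see the markup in the initial HTML.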
Topical depth. A site with 20 posts on one subject gets cited before a general site with one post on the same subject. Consistent publication in a narrow area builds authority that compounds.
Server-rendered HTML. JavaScript-heavy SPAs that need rendering to produce content are harder to crawl reliably. If your content is in the DOM on first load, you're easier to index.
The llms-full.txt extension
The spec also defines llms-full.txt — a single file containing the actual content of your key pages concatenated as markdown. Useful for sites where content is gated, JavaScript-rendered, or otherwise hard to crawl.
For a public blog with server-rendered HTML, it's optional. But for a SaaS docs site or a knowledge base, generating one programmatically from your content source is worth doing.
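Generating it follows the same pattern as the llms.txt route above: fetch your content, format it, serve it as text. A hedged sketch of the formatting step (the buildLlmsFullTxt helper is mine, and it assumes a full markdown content field that the llms.txt query above does not select):

```typescript
interface FullPost {
  title: string;
  slug: string;
  content: string; // full markdown body — an assumed column, not in the earlier query
}

// Concatenates full post bodies into one llms-full.txt markdown document,
// with a Source line so each section stays attributable to its canonical URL.
function buildLlmsFullTxt(siteName: string, base: string, posts: FullPost[]): string {
  const sections = posts.map(
    (p) => `## ${p.title}\nSource: ${base}/en/blog/${p.slug}\n\n${p.content.trim()}`,
  );
  return [`# ${siteName}`, ...sections].join("\n\n");
}
```

An app/llms-full.txt/route.ts would call this with the fetched posts and return the result with the same text/plain headers and revalidate window as the llms.txt route.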
Does any of this actually move the needle?
Honest answer: it's early.
There's no public study showing a direct correlation between llms.txt compliance and AI citation rates. The major LLM providers haven't confirmed they actively use the file (though Perplexity has indicated awareness of the spec).
What I can say: implementing a proper llms.txt took me less than an hour, including deleting the broken static version. The mindset shift it represents — writing for passage extraction, not keyword density — is the more significant change. And that shift is already measurable in traditional SEO too.
If you're checking your AI referral traffic in GA4 (filter for sources containing "perplexity", "chatgpt", "bing"), you may already see it growing. The sites that adapt their content structure now will have a head start that's hard to close later.
What to do this week
- Check if you have llms.txt at yourdomain.com/llms.txt. If not, add one. If you have a static version, check whether it follows the spec format.
- If your site has dynamic content (a blog, a changelog, a docs site), replace the static file with a dynamic route that stays in sync automatically.
- Audit your top 5 posts: does each one state its main claim in the first paragraph?
- Check your structured data with Google's Rich Results Test. Fix anything broken.
The rules of search are being rewritten. The spec exists, the tooling is trivial, and a stale static file is almost worse than nothing — it signals to AI systems that your site's content map doesn't match reality.