Nobody's Reading Your llms.txt (And That Might Be Fine)
The file everyone deployed this month isn't doing what you think. The real AI discoverability game is somewhere else entirely.
A Hacker News thread dropped this week with server-side data that should give every indie builder pause: no major LLM provider is requesting llms.txt files.
Tirreno, a server analytics tool, tracked every request to their llms.txt and agents.md endpoints. The only visitors? WebPageTest, BuiltWith, and random crawlers from OVH and Google Cloud Platform. Zero ChatGPT. Zero Claude. Zero Perplexity. Zero anything that matters.
If you spent time this month deploying llms.txt files across your projects (and a lot of us did), this feels like a gut punch. But the HN commenters surfaced a disconnect worth understanding, because it changes how you think about the entire AI discoverability stack.
The misconception
llms.txt was never designed for training crawlers. Those are dumb scrapers — requests.get(), regex for links, recurse. No reasoning, no file interpretation. They don’t read llms.txt any more than Googlebot reads your README.
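That scraper loop really is about as simple as it sounds. A minimal sketch (the `fetch` callable stands in for `requests.get(url).text`; all names here are illustrative), showing why such a crawler never touches llms.txt — it only ever follows links it regexes out of pages:

```python
import re

# A training scraper's "parsing": one regex over raw HTML.
LINK_RE = re.compile(r'href="(https?://[^"]+)"')

def extract_links(html):
    # No reasoning, no file interpretation -- just pattern matching.
    return LINK_RE.findall(html)

def crawl(fetch, start_url, max_pages=100):
    # fetch: url -> html text, e.g. lambda u: requests.get(u).text
    seen, queue = set(), [start_url]
    while queue and len(seen) < max_pages:
        url = queue.pop()
        if url in seen:
            continue
        seen.add(url)
        queue.extend(extract_links(fetch(url)))
    # The crawler visits only URLs it found links to. Unless a page
    # happens to link to /llms.txt, the file is never requested.
    return seen
```

Nothing in that loop knows conventions exist. It discovers llms.txt only if a hyperlink points at it, which is exactly what the server logs show.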
The file is designed for client-side agents — tools like Claude Code, MCP-connected development environments, and AI-powered browsing agents that fetch it during live sessions. One HN commenter: “I have my clients set up to always use them when they’re available, and since I did that they’ve been way faster and more token efficient.”
That’s a real audience. It’s just not the audience most people assumed.
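For that audience, the file is cheap to write. A minimal example following the llms.txt spec's markdown format — H1 title, blockquote summary, sections of annotated links (the project name, URLs, and sections below are placeholders, not a prescription):

```markdown
# Example MCP Tool

> One-sentence summary of what this project does and who it's for.

## Docs

- [Quickstart](https://example.dev/quickstart.md): install and first run
- [API reference](https://example.dev/api.md): endpoints and parameters

## Optional

- [Changelog](https://example.dev/changelog.md): release history
```

An agent that fetches this gets a curated map of your docs instead of burning tokens crawling navigation chrome.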
What the data actually says
Here’s where it gets interesting. A Position.digital analysis of 100+ AI SEO statistics found that domain authority dwarfs llms.txt for AI citation likelihood. Sites with 32K+ referring domains are 3.5x more likely to be cited by ChatGPT. For a solo creator with a handful of projects, no llms.txt file is going to overcome a domain authority gap of that magnitude.
But SearchEngineLand published something more useful: a 13-month study (January 2025 through February 2026) tracking LLM referral traffic across multiple sites. The numbers:
LLM referral traffic is still under 2% of total traffic — tiny in absolute terms
But it grew 80% from H1 to H2 2025, with some sites seeing 300% growth
18% conversion rate from LLM referrals — the highest of any channel, beating paid shopping, organic search, and PPC
That 18% conversion rate is the headline nobody’s talking about. LLM-referred visitors arrive pre-qualified. The AI already answered their question and pointed them to you as the source. They’re not browsing — they’re acting on a recommendation.
The citation window
Here’s a stat that should change how you write everything: 44.2% of all LLM citations come from the first 30% of a page’s text.
Intros are everything now. Every blog post, every landing page, every README — front-load the value. If an AI is going to cite you, it’s citing your opening, not your conclusion. Write your first two sentences as if they’re the only thing a machine will ever read, because statistically, they might be.
So what actually works?
A dev.to article titled “SEO Is Not Dead — It’s Being Replaced by Agent Operability” frames the shift cleanly: the question isn’t “can you be found?” but “can you be used?” Structured metadata, OpenAPI endpoints, semantic HTML, agents.json — these matter more than keywords.
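One concrete piece of that operability stack is structured metadata. A hedged sketch of a schema.org JSON-LD block you might embed in a page's head (the name, URL, and description are placeholders — adapt the `@type` to your content):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "Example MCP Tool",
  "url": "https://example.dev",
  "description": "One-line statement of what the tool does and who it's for.",
  "applicationCategory": "DeveloperApplication"
}
</script>
```

The point isn't this exact schema — it's that a machine-readable entity description answers "can you be used?" in a way keywords never will.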
Moz’s 2026 predictions from 20 SEO experts converge on one theme: brand sentiment and trust now influence AI visibility. Earned media, reviews, and third-party mentions outweigh keyword density. Digital PR is “non-negotiable” for AI search visibility.
For indie builders specifically, this translates to a few concrete moves:
1. Multiple focused surfaces beat one big site. If you have distinct projects, give them distinct homes. Each domain creates a clean entity association for AI models. “makemcp.dev = MCP tools” is a clearer signal than “johndoe.com/projects/mcp-tools.” The specificity is the point.
2. Cross-posting is infrastructure, not laziness. High domain-authority platforms like dev.to get indexed fast by both Google and AI training pipelines. Cross-posting with canonical URLs preserves your SEO while creating discoverability surfaces you can’t build alone. Every cross-post is a backlink and a training signal.
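On dev.to specifically, the canonical link lives in the post's front matter (a sketch — the URL is a placeholder), so the cross-post points search engines back at your original:

```markdown
---
title: Nobody's Reading Your llms.txt
published: true
canonical_url: https://yourblog.example/llms-txt-post
---
```

With `canonical_url` set, the dev.to copy contributes discoverability and backlinks without splitting your search ranking across two URLs.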
3. Front-load every piece of content. Given the 44.2% citation stat, your intro paragraph is your AI pitch. Write it as a standalone statement of what this page is and why it matters. Save the nuance for paragraph three.
4. Keep your llms.txt files. They cost nothing to maintain, and the client-agent audience (developers using AI coding tools) is growing fast. It’s just not the whole strategy — it’s one signal among many.
5. Watch the platform wars. Beehiiv just landed a deal with the Washington Post for creator-led newsletters. Their CEO on AI crawling: “Some publishers don’t want their content crawled and indexed, others want to be crawled by everyone possible, because it’s top of funnel.” If newsletter platforms start competing on AI crawlability — offering llms.txt, robots.txt customization, structured data controls — that becomes a real differentiator. Substack doesn’t offer any of this today.
The uncomfortable truth
The uncomfortable truth about AI discoverability in 2026 is that there’s no shortcut file you can deploy. No single technical fix. The llms.txt specification is useful for a specific audience (agent developers), and you should keep it. But the actual game is the same boring, compounding work it’s always been: build genuine authority in a tight niche, create content worth citing, distribute it across surfaces that matter, and make your first paragraph count.
The 18% conversion rate on LLM referrals tells you the prize is real. You just can’t hack your way to it with a text file.
Sources:
HN: Nobody’s crawling llms.txt — Hacker News
100+ AI SEO Statistics — Position.digital
13 Months of LLM Traffic Data — Search Engine Land
SEO Is Not Dead — It’s Being Replaced by Agent Operability — Dev.to
Beehiiv × Washington Post — Axios