When AI Can't See You: A Case Study in LLM Access Failure
A fully public website. No authentication. 200 OK on every test. ClaudeBot user-agent accepted. And yet: Claude.ai returned "WebFetch blocked." This is a documented case, run on this domain, and it reveals something important about how AI visibility actually works — and how easily it can be misdiagnosed. Here is what happened, why it happened, and what it means for AEO.
What happened
During AEO validation tests on aeostudio.io, Claude.ai failed to retrieve content from the site. The error was unambiguous: "WebFetch blocked." The site was publicly accessible, with no login wall, no CDN filtering, no user-agent restrictions. The failure looked like a server-side block. It wasn't.
A systematic elimination process confirmed the following:
- Browser access: success, full page load
- HTTP request with standard user-agent (Mozilla/5.0): 200 OK, full HTML returned
- HTTP request with ClaudeBot user-agent: 200 OK, full HTML returned
- Vercel firewall: Bot Protection inactive, zero custom rules
- Cloudflare: DNS-only mode, no proxy, no WAF active
- robots.txt: explicit Allow for all major AI crawlers, including ClaudeBot and anthropic-ai
Nothing on the server was blocking the request. The failure was elsewhere.
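The robots.txt check in that elimination process can be reproduced offline with Python's standard-library robot parser. The robots.txt content below is a hypothetical stand-in mirroring the policy described above, not the live file:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt mirroring the policy described above
# (the real file lives on the domain itself).
ROBOTS_TXT = """\
User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: *
Allow: /
"""

def is_allowed(agent: str, url: str = "https://example.com/page") -> bool:
    """Check whether `agent` may fetch `url` under the policy above."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)

for agent in ("ClaudeBot", "anthropic-ai", "GPTBot"):
    print(f"{agent}: {'allowed' if is_allowed(agent) else 'blocked'}")
```

With an explicit Allow for every relevant crawler, all three agents come back allowed — which is exactly why this layer could be ruled out.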
Where the failure actually lived
Anthropic support confirmed the diagnosis:
"The issue is likely our internal filtering rather than your server blocking us."
The domain was being blocked inside Claude.ai's WebFetch layer by an internal filtering rule — likely automated, likely a false positive in domain categorisation. Not a crawling block. Not an indexing block. A retrieval layer block specific to Claude.ai's interactive WebFetch tool.
That distinction matters more than it looks.
What this case does and does not prove
It is tempting to read "WebFetch blocked" as a comprehensive AI visibility failure. It isn't.
What is proven: Claude.ai's interactive WebFetch could not access the domain. Anthropic's internal filtering was the cause.
What is not proven: whether ClaudeBot was actively crawling the domain; whether the domain was indexed internally; whether the domain was eligible for citation in generated answers; whether the same block applied to other Anthropic systems beyond the interactive WebFetch tool.
The WebFetch tool in Claude.ai is one surface among many. A failure there does not automatically propagate to crawling, to indexing, or to answer generation. Treating it as a total visibility failure would lead to the wrong diagnosis — and the wrong fix.
What it does reveal about LLM architecture
LLMs do not operate as neutral web readers. They actively decide what to access, what to ignore, and what to trust — through filtering layers that are opaque, platform-specific, and often automated.
The practical consequences are non-trivial:
- A domain can pass every standard web accessibility check and still be blocked at the model layer
- The same domain can behave differently across bot testing, interactive chat, and answer generation
- An access failure in one surface does not automatically propagate to others — but it signals structural risk
For SEO, the assumption is: if a crawler can access it, it can index it. For AEO, that assumption is wrong. Access is conditional, platform-specific, and often opaque.
The failure taxonomy
Not all AI visibility failures are the same. The biggest diagnostic mistake is treating them as if they are. Different failure modes require different interventions.
Blocked. The model or retrieval layer refuses to access the URL. Signals: "WebFetch blocked", domain rejected before content retrieval. This is a vendor-layer problem, not a content problem. Response: escalate to the platform, request manual review, build external-source redundancy so the brand appears even if the domain is inaccessible.
Ignored. The URL is accessible, but the model does not use it. Signals: no error, no citation, competitor sources used instead. Response: improve entity clarity, strengthen third-party corroboration, restructure the source.
Hallucinated. The model mentions the brand but inaccurately. Signals: wrong positioning, invented services, false claims. Response: tighten entity definition, publish explicit factual anchors, increase cross-source consistency.
Partially parsed. The model accesses the page but extracts only fragments. Signals: title understood, body ignored. Response: improve server-rendered HTML, simplify content structure, ensure critical information appears early in the document without JavaScript dependency.
The taxonomy determines whether the solution is technical remediation, source redesign, entity clarification, citation building, or vendor escalation. Skipping the diagnosis step and jumping to fixes is how organisations waste months on the wrong problem.
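The taxonomy works best as an explicit lookup, so that a diagnosed failure mode maps directly to an intervention. The mode names and responses below come from the taxonomy itself; the lookup function is an illustrative sketch, not a prescribed tool:

```python
# Failure-mode → intervention mapping, following the taxonomy above.
TAXONOMY = {
    "blocked": {
        "signal": "retrieval layer refuses the URL, e.g. 'WebFetch blocked'",
        "response": "escalate to the platform, request manual review, build external-source redundancy",
    },
    "ignored": {
        "signal": "URL accessible but never cited; competitor sources used instead",
        "response": "improve entity clarity, strengthen third-party corroboration, restructure the source",
    },
    "hallucinated": {
        "signal": "brand mentioned but inaccurately",
        "response": "tighten entity definition, publish factual anchors, increase cross-source consistency",
    },
    "partially_parsed": {
        "signal": "page fetched but only fragments extracted",
        "response": "improve server-rendered HTML, simplify structure, surface key facts early",
    },
}

def recommend(mode: str) -> str:
    """Return the intervention for a diagnosed failure mode."""
    return TAXONOMY[mode]["response"]
```

The point of encoding it is discipline: no fix gets chosen before a mode gets named.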
What to do when you hit this error
If Claude.ai returns "WebFetch blocked" for your domain, two tracks run in parallel.
Track A — Platform escalation. Contact Anthropic support. Document the discrepancy: show that the domain returns 200 OK, that ClaudeBot is not blocked, that robots.txt explicitly allows access. Request manual review and ask which internal rule triggered the block. The resolution is vendor-side, not server-side. No amount of technical optimisation on your end will remove a filtering rule inside Anthropic's systems.
Track B — Independent visibility validation. While the escalation is open, test whether the brand appears in AI answers regardless of the WebFetch block. Run structured prompts across ChatGPT, Perplexity, and Gemini. Test whether the domain is cited directly or whether a third-party source carries the citation. A WebFetch block in Claude.ai does not necessarily mean invisible across the AI ecosystem.
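Track B is easiest to keep honest as a structured log: one entry per prompt per platform, recording whether the brand appeared and which domain carried the citation. The entries and domain names below are placeholder data for illustration, not real test results:

```python
# Placeholder visibility log — each entry is one prompt run on one platform.
# 'citation' is the domain the answer credited, or None if the brand was absent.
# "brand.example" stands in for the brand's own domain (hypothetical).
runs = [
    {"platform": "ChatGPT", "prompt": "category query", "brand_mentioned": True, "citation": "thirdparty.example"},
    {"platform": "Perplexity", "prompt": "category query", "brand_mentioned": True, "citation": "brand.example"},
    {"platform": "Gemini", "prompt": "category query", "brand_mentioned": False, "citation": None},
]

def summarize(runs):
    """Count, per platform, brand mentions and direct (own-domain) citations."""
    summary = {}
    for r in runs:
        s = summary.setdefault(r["platform"], {"mentions": 0, "direct_citations": 0})
        s["mentions"] += r["brand_mentioned"]
        s["direct_citations"] += r["citation"] == "brand.example"
    return summary

print(summarize(runs))
```

A log like this is what distinguishes "Claude.ai cannot fetch the page" from "the brand is invisible": a platform can mention the brand via a third-party citation even while a direct fetch fails.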
A practical 8-layer access checklist
If you suspect your domain has an LLM access problem, work through these layers in order. Each one can produce a false negative if tested in isolation.
Layer 1 — Raw access. Browser and HTTP test. Confirm 200 OK. Rule out server errors, redirects, and authentication walls.
Layer 2 — Bot user-agent access. Test with ClaudeBot, GPTBot, PerplexityBot user-agents via curl. Check whether the server returns different content or headers for bot requests. Confirm robots.txt allows access for each relevant crawler.
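A minimal sketch of the Layer 1–2 comparison: probe the same URL with several user-agents and flag any that diverge from the browser baseline. The `probe` helper hits the network, so the demo compares pre-recorded probe results instead; the agent names and byte counts in the demo are illustrative:

```python
import urllib.request
from dataclasses import dataclass

@dataclass
class Probe:
    user_agent: str
    status: int
    body_bytes: int

def probe(url: str, user_agent: str) -> Probe:
    """Fetch `url` with a custom User-Agent (network call; not run in the demo)."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        return Probe(user_agent, resp.status, len(resp.read()))

def divergent(baseline: Probe, candidates: list) -> list:
    """Flag user-agents whose status or body size diverges >10% from baseline."""
    flagged = []
    for p in candidates:
        size_drift = abs(p.body_bytes - baseline.body_bytes) / max(baseline.body_bytes, 1)
        if p.status != baseline.status or size_drift > 0.10:
            flagged.append(p.user_agent)
    return flagged

# Pre-recorded example: the bot agents match the browser baseline, while a
# hypothetical filtered agent gets a 403 with a stripped body.
browser = Probe("Mozilla/5.0", 200, 48_000)
bots = [Probe("ClaudeBot", 200, 48_000), Probe("GPTBot", 200, 47_500), Probe("SomeBot", 403, 500)]
print(divergent(browser, bots))  # → ['SomeBot']
```

The case in this article is precisely the one this layer cannot catch: every probe matched the baseline, and the block lived further up the stack.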
Layer 3 — HTML readability. Confirm that headings and descriptive paragraphs are present in source HTML without JavaScript execution. Critical entity-defining information should be visible near the top of the document. A model that cannot retrieve JS-rendered content gets an empty page.
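Layer 3 can be approximated offline: parse the raw HTML with the standard-library parser and confirm that a heading and descriptive text survive without any JavaScript execution. The sample HTML below is illustrative:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text from raw HTML, skipping script/style bodies."""
    def __init__(self):
        super().__init__()
        self.chunks, self._skip = [], 0
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

# Illustrative sample: entity-defining content present in source HTML,
# alongside a script that a non-JS fetcher would never execute.
SAMPLE = """<html><body>
<script>renderApp()</script>
<h1>AEO Studio</h1>
<p>AEO Studio helps brands stay visible in AI-generated answers.</p>
</body></html>"""

extractor = TextExtractor()
extractor.feed(SAMPLE)
text = " ".join(extractor.chunks)
print(text)  # heading and description recovered without running any JS
```

If the same extraction on your own source HTML returns little more than a title, a model that cannot execute JavaScript is seeing the same near-empty page.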
Layer 4 — In-product retrieval. Test inside the actual LLM interface. Does the model fetch the URL when asked? Does it cite the page? Does it reflect the correct positioning? This is where WebFetch failures become visible.
Layer 5 — Answer-level visibility. Test whether the brand appears in generated answers even if a direct fetch fails. Run prompts about the category, the problem the brand solves, and the brand name directly. A citation can come from training data or indexed third-party sources, not only from live fetch.
Layer 6 — Comparative platform testing. Run the same prompts across ChatGPT, Claude, Perplexity, and Gemini. Which systems access, cite, or distort the source? Platform-specific failures narrow the diagnosis.
Layer 7 — Context sensitivity. Test whether access depends on conversation framing. Mention the URL earlier in the conversation before asking the model to fetch it. In some cases, prior mention of the URL reduces blocking.
Layer 8 — Escalation readiness. Collect raw HTTP results, bot user-agent results, in-product failure screenshots, and vendor responses in one place. If you need to escalate, the evidence needs to show clearly that the issue is not server-side.
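Layer 8 is mostly bookkeeping, and a single structured record per incident keeps the evidence escalation-ready. The field names and values below are a suggested sketch, not a required schema:

```python
import json

# Suggested (not mandated) schema for an escalation evidence record;
# every value here is a placeholder.
evidence = {
    "domain": "example.com",
    "raw_http": {"status": 200, "user_agent": "Mozilla/5.0"},
    "bot_http": {"status": 200, "user_agent": "ClaudeBot"},
    "robots_txt_allows": ["ClaudeBot", "anthropic-ai"],
    "in_product_error": "WebFetch blocked",
    "vendor_response": "likely our internal filtering rather than your server blocking us",
}

print(json.dumps(evidence, indent=2))
```

One record like this answers the vendor's first question — "is your server blocking us?" — before it gets asked.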
The structural shift
This case is not an edge case. It is a demonstration of a structural property of how AI systems work. Visibility is gated at multiple points — HTTP access, bot crawling, internal indexing, retrieval layer filtering, answer generation weighting — and a pass at one layer does not guarantee a pass at the next.
Serious AEO work treats each layer as a separate diagnostic question. The brands that manage AI visibility well are not just the ones with clean HTML and good structured data. They are the ones that know which layer their problem lives in — and fix the right thing.