Agent Readiness
The attribute nobody has — and everybody's going to need
I've spent most of my career inside B2B data products — databases with hundreds of millions of company records and thousands of attributes per record. Firmographics, technographics, intent signals, org charts. The works.
None of them have this one.
Agent Readiness scores how ready a business's website is for AI agent consumption. Can an agent read it? Is there structured data? An API? A manifest that says "hey, I'm here, here's what I can do"? It turns out most businesses score terribly — and the ones that don't are going to have a real advantage as the web shifts from human-first to human-and-agent.
The gap in every database
Traditional B2B databases are built for human researchers and sales reps. They tell you what a company does, who works there, what technology they use. That's valuable. But they don't tell you anything about whether that company's digital presence is actually consumable by the AI agents that are increasingly doing the research, the outreach, and the buying.
This matters if you're building AI workflows, selling AI tools, or just trying to figure out which prospects are sophisticated enough to care about what you're selling. A company with a robots.txt, a sitemap.xml, JSON-LD structured data, and an OpenAPI spec is a very different prospect than one with a Squarespace splash page and no meta tags.
And here's the thing — you don't need to run this on 150 million records. You run it on the companies you actually care about. Enrich on demand, score what matters, skip what doesn't. That's the whole point of building attributes that are fast and cheap enough to run in real time.
What we measure
Six categories, each scored 0–10. No LLM required: it's all deterministic HTTP checks. Takes 2–5 seconds per domain.
Crawlability (10 pts)
Can agents discover the site? robots.txt, sitemap.xml, AI bot access rules.
Machine Readability (10 pts)
Is the content structured for machines? JSON-LD, Schema.org, semantic HTML.
API Readiness (10 pts)
Can agents query programmatically? .well-known/ai-plugin.json, OpenAPI spec, API docs.
Agentic Commerce (10 pts)
Can agents transact? MCP manifests, payment APIs, product feeds, structured pricing.
Content Access (10 pts)
Can agents actually read it? SSR, clean text, reasonable page size, no bot blocks.
Agent Signals (10 pts)
Does the site explicitly support AI? llms.txt, ai-plugin.json, MCP manifest.
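To make one of these checks concrete, here is a minimal sketch of the "AI bot access rules" part of Crawlability, using Python's standard-library robots.txt parser. The list of crawler names is an assumption for illustration (the scanner's actual list isn't covered here), and `ai_bot_access` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed set of AI crawler user agents; swap in whichever bots you care about.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def ai_bot_access(robots_txt: str, site: str = "https://example.com") -> dict[str, bool]:
    """Return, per AI crawler, whether robots.txt lets it fetch the homepage."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, site + "/") for bot in AI_BOTS}


# A robots.txt that blocks one AI crawler and allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(ai_bot_access(sample))
# {'GPTBot': False, 'ClaudeBot': True, 'PerplexityBot': True}
```

Because it works on the text of robots.txt rather than a live request, the same input always produces the same answer.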
We scanned 2,681 businesses
Every Boise business in our dataset that has a website. Here's where things actually stand.
[Chart: Signal adoption. 2,681 Boise businesses with websites, full scan, April 2026]
What this actually tells you
The basics are covered. Most businesses have robots.txt and server-rendered HTML — that's table stakes and has been for a decade. About half have JSON-LD, mostly because their website builder adds it automatically.
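For a sense of what "has JSON-LD" means in practice, here is a rough sketch of detecting it in a page's HTML with only the standard library. It is an illustration, not the scanner's actual code: `JsonLdExtractor` and `schema_types` are hypothetical names, and the sample page is made up.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped, not counted


def schema_types(html: str) -> list[str]:
    """Return the @type of every top-level JSON-LD object on the page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    types = []
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if isinstance(item, dict) and "@type" in item:
                types.append(str(item["@type"]))
    return types


page = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "LocalBusiness", "name": "Example Cafe"}'
    "</script></head><body><h1>Example Cafe</h1></body></html>"
)
print(schema_types(page))  # ['LocalBusiness']
```

A non-empty result is the kind of thing a website builder hands you for free, which is why this signal shows up so often.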
The drop-off starts at the AI-specific signals. Only 21% have llms.txt. Under 12% have anything resembling API documentation. And fewer than 10% have an OpenAPI spec, MCP manifest, or ai-plugin.json.
That bottom tier is where it gets interesting for GTM. If you're selling AI tooling, integration platforms, or agent-based workflows, the 3.7% of businesses that score an A are your early adopters. They've already done the work. The 44% in the C range are reachable but need education. The 20% at F — they're not ready for what you're selling yet.
That's a segmentation you can't get from any existing database.
Enrich what you need, when you need it
The old approach to business data is: collect everything on every company, store it, keep it updated. That works for stable attributes like address, employee count, industry code. It doesn't work for attributes that change constantly or only matter in specific contexts.
Agent readiness changes every time a company updates their website. Running it on 150 million records weekly would be expensive and pointless. But running it on the 500 companies in your pipeline? That takes about 40 minutes and costs nothing.
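The 40-minute figure is roughly the top of the 2–5 seconds-per-domain range, assuming the domains are scanned one after another with no concurrency. A quick back-of-the-envelope check (`scan_minutes` is just a helper for this arithmetic):

```python
def scan_minutes(domains: int, seconds_per_domain: float) -> float:
    """Wall-clock minutes for a sequential scan (assumes no parallelism)."""
    return domains * seconds_per_domain / 60


low = scan_minutes(500, 2)
high = scan_minutes(500, 5)
print(f"{low:.0f} to {high:.0f} minutes")  # 17 to 42 minutes
```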
This is how we think about new attributes in general. You have your base layer — the stuff a big database does well. Then you add context-specific scoring on top, on demand, for the companies you're actually working. Agent readiness is one of those attributes. We've built others — buyer persona, regulatory exposure, revenue model, seasonality — that run on local LLMs at $0 cost.
Agent readiness is different in one important way: it doesn't need a model at all. It's deterministic HTTP checks. That means it's fast, cheap, perfectly reproducible, and you can run it at scale without worrying about inference costs or model drift.
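Here is a minimal sketch of what "deterministic HTTP checks" can look like for a single domain. Treat it as an illustration under stated assumptions rather than our scanner: the path list is partial and partly guessed (`/openapi.json` in particular is an assumed location), the equal weighting in `score` is made up, and the function names are hypothetical.

```python
import http.client
import urllib.request
from typing import Callable

# Hypothetical signal list. robots.txt, sitemap.xml, llms.txt and
# .well-known/ai-plugin.json are named in the post; /openapi.json is a guess.
SIGNAL_PATHS = {
    "robots_txt": "/robots.txt",
    "sitemap_xml": "/sitemap.xml",
    "llms_txt": "/llms.txt",
    "ai_plugin": "/.well-known/ai-plugin.json",
    "openapi": "/openapi.json",
}


def http_exists(url: str, timeout: float = 5.0) -> bool:
    """True if a GET on the URL comes back with a 2xx status."""
    request = urllib.request.Request(url, headers={"User-Agent": "agent-readiness-sketch/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        return False  # DNS failure, timeout, 4xx/5xx, bad URL: treat as absent


def scan_domain(domain: str, exists: Callable[[str], bool] = http_exists) -> dict[str, bool]:
    """Probe each known path on the domain and record whether it exists."""
    return {name: exists(f"https://{domain}{path}") for name, path in SIGNAL_PATHS.items()}


def score(signals: dict[str, bool]) -> float:
    """Share of signals found, scaled to 0-10 (assumed equal weighting)."""
    return round(10 * sum(signals.values()) / len(signals), 1)


# Offline usage example: a stand-in for the network that only "serves" two files.
def fake_exists(url: str) -> bool:
    return url.endswith(("/robots.txt", "/llms.txt"))


signals = scan_domain("example.com", exists=fake_exists)
print(score(signals))  # 4.0
```

Passing the existence check in as a function keeps the scoring logic testable without touching the network. One caveat for real use: some sites return 200 for any path, so a production check would likely want to look at the response body too.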
Fixing your own score
We ran it on our own site first and scored a 6 out of 10. Had robots.txt and llms.txt but nothing else. So we added:
Took about 30 minutes. Most of it is boilerplate once you know what to add.
So what
The web is quietly growing a second audience. Humans still browse, but AI agents are increasingly the ones doing research, comparing options, pulling data, and making recommendations. The businesses that are set up for both audiences will get found. The ones that aren't, won't.
It reminds me of the early SEO days — not in a hype-y "you need to optimize for AI!" way, but in a practical one. In 2005, if you didn't have a sitemap and decent meta tags, Google couldn't index you properly. It wasn't complicated, it just wasn't on anyone's radar yet. That's where we are with agent readiness right now. The fixes are straightforward. Most people just haven't thought about it.
We built a scanner. You can run it on any domain. If you're interested in adding agent readiness as an attribute to your own data, reach out.
How we built this
This whole thing runs on a Mac Studio M4 Max sitting under my desk. We call it Stu. 36GB of unified memory, running Ollama with a handful of open-source models — gemma4, llama3.1, gemma2. Total cloud cost for the enrichment pipeline: $0.
The agent readiness scanner itself is just Python making HTTP requests. No model needed. It checks ~15 endpoints per domain and scores what it finds. We ran 2,681 domains in about 7 hours overnight. The scanner ran alongside our LLM benchmarks: while Stu was classifying 3,197 businesses across buyer persona, regulatory exposure, and four other attributes, a laptop was scanning websites for agent signals.
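If you want to run a batch like this yourself, a thread pool is one straightforward way to fan out over domains, since the work is almost entirely waiting on the network. This is a sketch, not a description of how our overnight run was set up: `scan_many` is a hypothetical helper, `max_workers=8` is an arbitrary choice, and `scan_one` stands for whatever per-domain check you plug in (it is assumed not to raise).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def scan_many(
    domains: list[str],
    scan_one: Callable[[str], dict],
    max_workers: int = 8,
) -> dict[str, dict]:
    """Run scan_one over every domain on a thread pool; results keyed by domain."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input list.
        results = pool.map(scan_one, domains)
        return dict(zip(domains, results))


# Offline usage example with a stand-in scanner instead of real HTTP calls.
def fake_scan(domain: str) -> dict:
    return {"robots_txt": domain.endswith(".com")}


batch = scan_many(["a.com", "b.org"], fake_scan, max_workers=2)
print(batch)
# {'a.com': {'robots_txt': True}, 'b.org': {'robots_txt': False}}
```

Keep the worker count modest when pointing this at real sites, so you aren't hammering small business web hosts.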
The whole project — defining the attribute, building the scanner, running 2,681 scans, writing this post — happened in one night. My co-founder Jody and I have been building Product Hacker this way for months: pick a problem, ship something that works, measure it against real data. We've shipped 9 apps this way. The B2B data enrichment pipeline is one of the more interesting ones.
I spend my days working in B2B data at scale. Product Hacker is where I get to experiment with the stuff that's too new or too weird for a large organization to try yet. The ideas cross-pollinate. That's the whole point.