How AI Agents Use Your Site
What Happens When AI Visits Your Site
When an AI agent — Claude, ChatGPT, Perplexity, Google AI — needs to learn about your organization, it follows a structured journey. With the AI Discovery plugin installed, every step is handled by dedicated endpoints instead of HTML scraping.
Here’s the five-step journey, with real examples from this site.
Step 1: Discover
The AI agent checks /.well-known/ai — the standard discovery endpoint.
What it gets:
- Organization name, domain, legal name, sector, tagline
- AI summary (2-3 sentences explaining who you are)
- Core concepts glossary (your industry terms defined for AI)
- List of every page on the site with SHA-256 content hashes
- Available tools and endpoints
- Contact information for operator and AI support
- Cryptographic signature proving this data came from the site owner
Try it live: discover.rootz.global/.well-known/ai
What AI thinks: “This is AI Discovery Lab, operated by Rootz Corp. It has 19 pages, 9 tools, and its manifest is signed by wallet 0xD089… I can trust this data because the signature is verifiable.”
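To make this concrete, here is a sketch of the first parse an agent might do on the discovery manifest. The miniature manifest below is made up for illustration, and the field names are assumptions rather than the plugin's exact schema:

```python
import json

# Hypothetical miniature manifest, shaped like the /.well-known/ai
# response described above. Field names are illustrative, not the
# plugin's exact schema.
manifest_json = """
{
  "organization": {"name": "AI Discovery Lab", "operator": "Rootz Corp"},
  "aiSummary": "AI Discovery Lab demonstrates the AI Discovery Standard.",
  "pages": [
    {"path": "/about/", "hash": "sha256:abc123"},
    {"path": "/blog/", "hash": "sha256:def456"}
  ],
  "tools": ["searchContent", "getPage", "verifyPageHash"],
  "signature": {"signer": "0xD089..."}
}
"""

manifest = json.loads(manifest_json)
print(manifest["organization"]["name"])   # structured identity
print(len(manifest["pages"]), "pages")    # site inventory with content hashes
print(manifest["signature"]["signer"])    # wallet that signed the manifest
```

One fetch, one parse: the agent gets identity, inventory, tools, and a signer address without touching a single HTML page.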
Step 2: Understand
For deeper context, the AI fetches the knowledge base and feed.
Knowledge endpoint (/.well-known/ai/knowledge):
- About page content (auto-extracted from WordPress)
- Product and service descriptions
- Category-based glossary
Feed endpoint (/.well-known/ai/feed):
- The 20 most recent blog posts, each with a ~60-word AI summary
- Categories, tags, and content license per item
- Publication dates in ISO 8601
What AI thinks: “Now I understand the company’s mission, products, and recent news. The feed tells me they just released v2.3.0 with Conversation Mode features.”
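Because feed timestamps are ISO 8601, an agent can parse them with nothing but the standard library. The two-item feed below is invented for illustration, and its field names are assumptions, not the plugin's exact schema:

```python
import json
from datetime import datetime

# Hypothetical two-item feed, shaped like the /.well-known/ai/feed
# response described above. Field names are illustrative.
feed_json = """
{
  "items": [
    {"title": "v2.3.0 Released", "published": "2025-06-10T14:00:00+00:00",
     "license": "CC-BY-4.0"},
    {"title": "Why Signed Manifests Matter", "published": "2025-05-02T09:30:00+00:00",
     "license": "CC-BY-4.0"}
  ]
}
"""

feed = json.loads(feed_json)
for item in feed["items"]:
    # ISO 8601 dates parse directly -- no guessing at formats
    published = datetime.fromisoformat(item["published"])
    print(item["title"], "|", published.date(), "|", item["license"])
```

Every item carries its own license, so the agent never has to infer reuse permissions from context.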
Step 3: Search
When the AI needs something specific, it searches.
The searchContent tool:
GET /wp-json/rootz/v1/search?q=freshness&limit=5
Returns matched pages and posts with titles, URLs, excerpts, dates, and pagination metadata (totalFound, hasMore, nextOffset).
Try it live: Search for “AI discovery”
What AI thinks: “Found 8 results for ‘freshness’. The blog post about v2.3.0 is the most relevant. Let me read it.”
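The pagination metadata makes exhaustive search a simple loop. Here is a sketch, where `fetch_page` is a stand-in for an HTTP GET against the endpoint and the response fields (`totalFound`, `hasMore`, `nextOffset`) follow the names given above:

```python
# fetch_page fakes the searchContent endpoint for illustration:
# in a real agent this would be an HTTP GET.
def fetch_page(query, offset, limit=5):
    all_results = [f"result-{i}" for i in range(8)]  # pretend 8 matches
    chunk = all_results[offset:offset + limit]
    return {
        "results": chunk,
        "totalFound": len(all_results),
        "hasMore": offset + limit < len(all_results),
        "nextOffset": offset + limit,
    }

def search_all(query):
    """Page through results until hasMore is false."""
    results, offset = [], 0
    while True:
        page = fetch_page(query, offset)
        results.extend(page["results"])
        if not page["hasMore"]:
            return results
        offset = page["nextOffset"]

print(len(search_all("freshness")))  # 8
```

Contrast this with `site:` queries through a search engine, where the agent can never know whether it has seen everything.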
Step 4: Read
This is the breakthrough in v2.3.0. The AI can now read any page as clean, structured markdown.
The getPage tool:
GET /wp-json/rootz/v1/page?path=/about/
Returns:
- Content as clean markdown (headings, lists, links preserved)
- Metadata: title, author, word count, assertion type (factual vs editorial)
- Freshness: adaptive TTL telling the AI when to come back
- Origin: domain, publication date, modification date, signer wallet
- Policies: content license, quoting permission, training permission
- Integrity: SHA-256 content hash + ECDSA signature
Try it live: Read the About page
What AI thinks: “I have the full About page as markdown. The content hash matches the manifest. The freshness says this is stable content — I can cache it for 30 days. The origin stamp confirms this came from discover.rootz.global and was signed by wallet 0xD089…”
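The freshness metadata turns caching into a trivial comparison. A sketch, assuming the TTL arrives as a day count alongside a fetch timestamp (field names are illustrative):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness block from a getPage response.
freshness = {"fetchedAt": "2025-06-01T00:00:00+00:00", "ttlDays": 30}

fetched = datetime.fromisoformat(freshness["fetchedAt"])
expires = fetched + timedelta(days=freshness["ttlDays"])

# Fixed "now" so the example is deterministic.
now = datetime(2025, 6, 15, tzinfo=timezone.utc)

print("refetch" if now >= expires else "use cache")  # use cache
```

Stable pages advertise a long TTL; frequently updated ones a short one, so agents revisit exactly as often as the content warrants.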
Step 5: Verify
Before citing your content, a careful AI agent verifies it.
The verifyPageHash tool:
GET /wp-json/rootz/v1/verify?page=/about/
Compares the SHA-256 hash from the signed manifest with a fresh hash computed from the live database. Returns match: true/false, both hashes, the signer address, and timestamps.
Try it live: Verify the About page
What AI thinks: “Hash match confirmed. The content I’m about to cite is identical to what the site owner signed and approved. I can quote this with confidence.”
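The core of the verification step can be sketched in a few lines: hash the content you received and compare it with the hash from the signed manifest. (The plugin's exact canonicalization rules before hashing may differ; this shows the principle.)

```python
import hashlib

# Content as fetched via getPage, and the hash the site owner signed.
page_markdown = "# About\n\nAI Discovery Lab is a demo site."
manifest_hash = hashlib.sha256(page_markdown.encode("utf-8")).hexdigest()

# Recompute from the live content and compare.
fresh_hash = hashlib.sha256(page_markdown.encode("utf-8")).hexdigest()
print("match:", fresh_hash == manifest_hash)  # match: True

# Any tampering, even one character, breaks the match.
tampered = page_markdown + " (edited)"
tampered_hash = hashlib.sha256(tampered.encode("utf-8")).hexdigest()
print("match:", tampered_hash == manifest_hash)  # match: False
```

The ECDSA signature then ties that manifest hash to the owner's wallet, so a matching hash means the content is byte-for-byte what the owner approved.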
Without the Plugin vs. With the Plugin
| What AI Needs | Without Plugin (Scraping) | With AI Discovery |
|---|---|---|
| Who runs this site? | Guess from footer text, meta tags | Structured identity with legal name, sector, contact |
| What does the company do? | Parse HTML, infer from page titles | AI Summary + Core Concepts glossary |
| Can I quote this content? | Unknown — check robots.txt (no answer there) | Explicit: license type, quoting yes/no, training yes/no |
| Is this content current? | Check page modification date in HTML (unreliable) | Freshness metadata with TTL and exact timestamps |
| Has this page been tampered with? | No way to verify | SHA-256 content hash + ECDSA signature |
| Who published this? | Domain owner, maybe | Wallet address + origin stamps embedded in response |
| Read a specific page | Fetch HTML, strip tags, guess at structure | Clean markdown via getPage with full metadata |
| Search the site | Use Google site:example.com (indirect) | Native searchContent with pagination |
| Confidence level | LOW — AI is interpreting unstructured data | HIGH — structured, signed, verifiable |
The Bottom Line
Every day, AI agents are making decisions about your organization based on whatever they can find online. Without the AI Discovery Standard, they’re guessing. With it, they’re reading — structured data, signed by you, verified by them.
The question isn’t whether AI will represent your business. It’s whether you’ll have any say in how.