Per-claim verification, confidence scoring, and machine-readable recommendations so your agent knows what to trust before it acts.
Every evaluation returns a numeric confidence score per claim. Agents consume scores programmatically — no parsing, no guessing.
An agent acting on unverified data can cause real-world mistakes: wrong trades, emails that state falsehoods, bad decisions at scale. Raw retrieval is not enough.
Submit any content from any source. Get back structured, per-claim confidence scores and a machine-readable recommendation.
    POST /evaluate
    {
      "content": "LangGraph leads AI frameworks in 2026. CrewAI was acquired by Google.",
      "source": "exa",
      "min_confidence": 0.8
    }

    {
      "evaluation_id": "eval_abc123",
      "evaluation": {
        "overall_confidence": 0.71,
        "recommendation": "verify",
        "total_claims": 2,
        "claims": [
          {
            "claim": "LangGraph leads...",
            "verdict": "supported",
            "confidence": 0.94,
            "evidence": "Multiple sources confirm position"
          },
          {
            "claim": "CrewAI acquired by Google",
            "verdict": "refuted",
            "confidence": 0.08,
            "correction": "CrewAI remains independent Apr 2026"
          }
        ]
      }
    }
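As a minimal sketch of consuming this response programmatically, the Python below builds the request body and pulls out the claims that fall under a threshold. The host in `BASE_URL` and the helper names (`build_evaluate_request`, `parse_claims`, `claims_below`) are assumptions for illustration; only the path, the field names, and the sample values come from the example above.

```python
import json
from dataclasses import dataclass
from typing import Optional
from urllib import request

# Assumption: placeholder host. The page only documents the path /evaluate.
BASE_URL = "https://agentoracle.example"

# Sample response copied from the example above.
SAMPLE_RESPONSE = {
    "evaluation_id": "eval_abc123",
    "evaluation": {
        "overall_confidence": 0.71,
        "recommendation": "verify",
        "total_claims": 2,
        "claims": [
            {"claim": "LangGraph leads...", "verdict": "supported",
             "confidence": 0.94, "evidence": "Multiple sources confirm position"},
            {"claim": "CrewAI acquired by Google", "verdict": "refuted",
             "confidence": 0.08, "correction": "CrewAI remains independent Apr 2026"},
        ],
    },
}


@dataclass
class Claim:
    claim: str
    verdict: str
    confidence: float
    evidence: Optional[str] = None    # present on the supported claim in the sample
    correction: Optional[str] = None  # present on the refuted claim in the sample


def build_evaluate_request(content: str, source: str, min_confidence: float) -> request.Request:
    """Build (but do not send) a POST /evaluate request."""
    body = json.dumps(
        {"content": content, "source": source, "min_confidence": min_confidence}
    ).encode("utf-8")
    return request.Request(
        f"{BASE_URL}/evaluate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_claims(response: dict) -> list[Claim]:
    """Turn the per-claim entries into typed records."""
    return [
        Claim(
            claim=c["claim"],
            verdict=c["verdict"],
            confidence=c["confidence"],
            evidence=c.get("evidence"),
            correction=c.get("correction"),
        )
        for c in response["evaluation"]["claims"]
    ]


def claims_below(response: dict, threshold: float) -> list[Claim]:
    """Return the claims whose confidence is under the given threshold."""
    return [c for c in parse_claims(response) if c.confidence < threshold]
```

Calling `claims_below(SAMPLE_RESPONSE, 0.8)` on the sample isolates the refuted CrewAI claim, which is the one an agent would want to drop or correct before acting.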
Recommendation Logic
| Recommendation | Confidence Range | Agent Behavior |
|---|---|---|
| act | > 0.8 | Proceed — claims sufficiently verified |
| verify | 0.5 – 0.8 | Pause — route to secondary check or human review |
| reject | < 0.5 | Discard — evidence insufficient or contradicted |
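The table can be mirrored on the client side as a small function, for example to sanity-check a returned recommendation or to apply the same bands to a single claim. This is a sketch, not the service's own logic: the function name `recommend` is hypothetical, and placing the boundary values 0.5 and 0.8 in the `verify` band is an assumption read off the ranges as written ("> 0.8", "0.5 – 0.8", "< 0.5").

```python
def recommend(confidence: float) -> str:
    """Map a confidence score in [0, 1] to act / verify / reject.

    Assumption: 0.5 and 0.8 themselves fall in the "verify" band,
    following the ranges in the table above.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence > 0.8:
        return "act"      # proceed: claims sufficiently verified
    if confidence >= 0.5:
        return "verify"   # pause: secondary check or human review
    return "reject"       # discard: evidence insufficient or contradicted
```

Applied to the sample evaluation, the overall confidence of 0.71 maps to `verify`, which matches the recommendation in the response.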
Agents report whether each evaluation turned out to be accurate. Every signal feeds the shared scoring model, so each feedback loop improves scoring for every agent on the network.
POST /feedback { "evaluation_id": "eval_abc123", "outcome": "accurate" }
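A minimal sketch of building that feedback call in Python follows. The host in `BASE_URL` and the helper name `build_feedback_request` are assumptions; `"accurate"` is the only outcome value shown on this page, so the sketch passes the outcome through as a plain string rather than guessing at other allowed values.

```python
import json
from urllib import request

# Assumption: placeholder host. The page only documents the path /feedback.
BASE_URL = "https://agentoracle.example"


def build_feedback_request(evaluation_id: str, outcome: str = "accurate") -> request.Request:
    """Build (but do not send) a POST /feedback request for a prior evaluation."""
    if not evaluation_id:
        raise ValueError("evaluation_id is required")
    body = json.dumps({"evaluation_id": evaluation_id, "outcome": outcome}).encode("utf-8")
    return request.Request(
        f"{BASE_URL}/feedback",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is then a matter of passing the result to `urllib.request.urlopen`, or to whichever HTTP client the agent already uses.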
Feedback is always free. The network effect compounds: the more feedback loops agents close, the faster accuracy improves across all sources.
Every data source accumulates a reputation score from verified claim history. Query any domain to get its current trust profile before using its data.
    GET /reputation?domain=reuters.com
    GET /reputation?domain=arxiv.org
    GET /reputation?source=exa
| Domain / Source | Reputation Score | Content Type | Verified Claims |
|---|---|---|---|
| reuters.com | 0.95 | news | 142,881 |
| arxiv.org | 0.91 | research | 98,220 |
| techcrunch.com | 0.82 | tech news | 61,450 |
| reddit.com | 0.58 | social | 312,104 |
| unknown-blog.io | 0.29 | unclassified | 44 |
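One way an agent might use these scores is as a pre-filter on which sources it is willing to retrieve from. The sketch below builds the query URLs and applies a minimum-score cutoff to the figures in the table. The host in `BASE_URL` and the helper names are assumptions, and because the `/reputation` response schema is not shown on this page, the scores are hard-coded from the table rather than parsed from a live response.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: placeholder host. The page only documents the path /reputation.
BASE_URL = "https://agentoracle.example"

# Reputation scores copied from the table above.
TABLE_SCORES = {
    "reuters.com": 0.95,
    "arxiv.org": 0.91,
    "techcrunch.com": 0.82,
    "reddit.com": 0.58,
    "unknown-blog.io": 0.29,
}


def reputation_url(*, domain: Optional[str] = None, source: Optional[str] = None) -> str:
    """Build a GET /reputation URL for exactly one of domain= or source=."""
    if (domain is None) == (source is None):
        raise ValueError("pass exactly one of domain or source")
    params = {"domain": domain} if domain is not None else {"source": source}
    return f"{BASE_URL}/reputation?{urlencode(params)}"


def trusted_domains(scores: dict[str, float], min_score: float) -> list[str]:
    """Return the domains scoring at or above min_score, sorted by name."""
    return sorted(name for name, score in scores.items() if score >= min_score)
```

With a cutoff of 0.8, `trusted_domains(TABLE_SCORES, 0.8)` keeps arxiv.org, reuters.com, and techcrunch.com and drops the two lower-scoring entries.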
AgentOracle is indexed in Coinbase Bazaar discovery. Every paid query settles on Base mainnet via the Coinbase Developer Platform facilitator. Receipts are publicly verifiable.
| Date (UTC) | Endpoint | Amount | Tx Hash |
|---|---|---|---|
| May 9, 2026 | /bazaar-bootstrap | $0.02 USDC | 0xe9b4f382…bce9d4 |
| May 7, 2026 | /bazaar-bootstrap | $0.02 USDC | 0xa871d00d…82cc |
| May 4, 2026 | /bazaar-bootstrap | $0.02 USDC | 0x515d7b01…388c |
| May 4, 2026 | /bazaar-bootstrap | $0.02 USDC | 0x33954f54…e0c9 |
| May 4, 2026 | /bazaar-bootstrap | $0.02 USDC | 0x4c5cf98c…dd03 |
All settlements are relayed by the Coinbase Developer Platform (CDP) facilitator (0x68a96f41…). PayTo: 0xdF90200B0031051BbF7a66BB9387d2Ecf599e109.
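Since the receipts are on-chain, a settlement can in principle be checked with the standard Ethereum JSON-RPC method `eth_getTransactionReceipt` against any Base mainnet RPC endpoint. The sketch below only builds the request and interprets the reply. The helper names and the RPC URL are assumptions, and the hashes in the table are truncated, so a full 32-byte hash has to come from a block explorer or the payment response before this can be used.

```python
import json
from urllib import request


def receipt_request(rpc_url: str, tx_hash: str) -> request.Request:
    """Build (but do not send) an eth_getTransactionReceipt JSON-RPC call.

    rpc_url is whichever Base mainnet RPC endpoint you use (an assumption here).
    """
    # A transaction hash is 32 bytes: "0x" followed by 64 hex characters.
    if not (tx_hash.startswith("0x") and len(tx_hash) == 66):
        raise ValueError("expected a full 32-byte transaction hash")
    int(tx_hash, 16)  # raises ValueError if the hash is not valid hex
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "eth_getTransactionReceipt",
        "params": [tx_hash],
    }
    return request.Request(
        rpc_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def settled(rpc_response: dict) -> bool:
    """True if the receipt exists and the transaction succeeded (status 0x1)."""
    receipt = rpc_response.get("result")
    return receipt is not None and receipt.get("status") == "0x1"
```

This confirms only that the transaction was mined successfully. Checking the amount and recipient would additionally require decoding the token transfer logs in the receipt, which this sketch does not attempt.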
AgentOracle sits between data retrieval and agent action. Any source, any framework, any agent runtime.
Your agent fetches content from any source — Exa, Perplexity, web fetch, internal databases, API responses. No changes to your retrieval layer.
Submit the content string, source identifier, and your minimum confidence threshold. AgentOracle decomposes it into claims and verifies each one independently.
Each claim returns a verdict, confidence score, evidence, and corrections. The top-level recommendation tells the agent to act, verify, or reject.
After acting, the agent submits whether the evaluation was accurate. This closes the verification loop and contributes to the shared scoring model.
Every feedback signal is aggregated into the source reputation model. Every agent on the network benefits from every other agent's feedback — a compounding trust graph.
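The flow described above can be sketched as a single gate in front of the agent's action. Everything here is illustrative: `gated_action` and its callable parameters (`evaluate`, `act`, `escalate`, `send_feedback`) are hypothetical names, the HTTP calls are injected so the control flow stays independent of any particular client, and how the agent decides that an evaluation was accurate is left to the `act` callback.

```python
from typing import Callable, Optional


def gated_action(
    content: str,
    source: str,
    min_confidence: float,
    evaluate: Callable[[str, str, float], dict],
    act: Callable[[str], Optional[str]],
    escalate: Callable[[dict], None],
    send_feedback: Callable[[str, str], None],
) -> str:
    """Evaluate retrieved content, branch on the recommendation, report feedback.

    evaluate      -- wraps POST /evaluate and returns the parsed response
    act           -- performs the agent's action; returns an outcome string
                     (e.g. "accurate") once known, or None if not yet known
    escalate      -- routes the evaluation to a secondary check or human review
    send_feedback -- wraps POST /feedback (evaluation_id, outcome)
    """
    result = evaluate(content, source, min_confidence)
    recommendation = result["evaluation"]["recommendation"]

    if recommendation == "act":
        outcome = act(content)
        if outcome is not None:
            # Close the loop so the shared scoring model sees the result.
            send_feedback(result["evaluation_id"], outcome)
    elif recommendation == "verify":
        escalate(result)
    elif recommendation != "reject":
        # Only act / verify / reject are documented; fail loudly on anything else.
        raise ValueError(f"unexpected recommendation: {recommendation!r}")
    # On "reject" the content is simply discarded.
    return recommendation
```

Injecting the callables keeps the retrieval layer untouched: the agent's existing fetch code produces `content`, and only the step between retrieval and action changes.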
USDC micropayments via x402 — no subscriptions, no rate limits, no API keys to manage.
Add trust infrastructure to your agent in under five minutes. No API keys, no subscriptions — pay per verification.