Pricing

Pay for what you save.

Start with Shadow Mode. It logs every request and shows what you would save before any cached response is served. No credit card.

Choose a tier

Free

$0

10K requests/mo

  • Shadow Mode shows potential savings
  • Identical-match cache
  • Cost analytics dashboard
  • Request tracing and logging
Get started
Popular

Pro

$49/mo

50K included, then $0.50/1K

  • Full multi-layer caching
  • Advanced pattern matching
  • Advanced analytics + projections
  • Up to 500K requests/mo
Start free, upgrade later

Enterprise

15%

of documented savings

  • $500/mo minimum commitment
  • Unlimited requests
  • We win when you save
  • AWS/GCP marketplace billing

Billed monthly on the dollars cached, with a full audit log on every invoice. If the cache doesn't deliver, you pay only the $500 floor.

Talk to sales

How we compare to native provider caching

OpenAI and Anthropic both ship caching primitives. They're free and well-built within their own ecosystems. The trade-offs below are about what each cache covers, not whether one is cheaper than the other.

SemanticGuardOpenAI prompt cacheAnthropic prompt caching
Pricing modelFree tier + $49/mo Pro + 15% of savings EnterpriseFree, applied automatically on exact-match cache hits90% discount on cached input tokens (5-min TTL)
What gets cachedSame wording, different wording, and multi-turn context, caught by multi-layer verificationExact prompt prefix onlyExplicitly-marked cache blocks only
Cross-providerYes. Cache a GPT-4o response, serve it to a Claude-Opus paraphraseNo (OpenAI only)No (Anthropic only)
Cache correctness audit100% on our public benchmark, AI-judgedNo public correctness numberNo public correctness number
ObservabilityDashboard + Prometheus + per-request tracingToken counters in API responseToken counters in API response
Per-tenant kill switch + audit logYesN/A (provider-side)N/A (provider-side)

Pricing FAQ

What counts as a request on the free tier?
Every proxy call to /api/proxy/*, regardless of cache hit or miss. Shadow-mode requests count too. That's why the free tier is sized for 10K/month rather than 1K.
What happens at 50K requests on Pro?
Overage kicks in at $0.50 per additional 1K requests. Hard cap at 500K/month; above that we route you to Enterprise. You get a usage alert at 80% of any threshold.
How is 'documented savings' calculated for Enterprise?
Sum of upstream costs we avoided based on cache hits times the published per-million-token price of the model that would otherwise have served the request. The 15% comes out monthly; full audit log and raw numbers are exported with the invoice.
Can I self-host?
Yes, with a license key (Pro or above). The Terraform module deploys the gateway into your own Vercel, AWS, or GCP account, so your data stays in your VPC. Contact enterprise@semanticguard.dev.
Annual discount?
10% off Pro paid annually. Enterprise contracts are custom.

Not sure which tier fits?

Run Shadow Mode for a week. The dashboard shows exactly what you'd save by enabling each tier. No credit card, no commitment.

Start free