Documentation

Integration guides and API reference.

Quick Start

SemanticGuard is an OpenAI-compatible proxy. Point your client at SemanticGuard instead of the provider, add your SemanticGuard API key, and all requests are cached, logged, and tracked.

curl https://semanticguard.dev/api/proxy/v1/chat/completions \
  -H "Authorization: Bearer your-openai-api-key" \
  -H "x-sg-api-key: sg-your-key-here" \
  -H "x-sg-project: my-project" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

AI SDK Integration

Using the Vercel AI SDK? Add a fetch wrapper to any provider. Works with OpenAI, Anthropic, Vertex AI, and any provider that accepts a custom fetch function.

import { generateText } from "ai";
import { createOpenAI } from "@ai-sdk/openai";
import { withSemanticGuard } from "@semanticguard/ai-sdk";

const openai = createOpenAI({
  apiKey: "your-openai-key",
  fetch: withSemanticGuard({
    gatewayUrl: "https://semanticguard.dev",
    apiKey: "sg-your-key-here",
    projectId: "my-project",  // optional
  }),
});

const result = await generateText({
  model: openai("gpt-4o-mini"),
  prompt: "Hello",
});

Authentication

Every request needs two keys:

Your LLM provider API key, sent in the provider's own auth header (Authorization: Bearer or x-api-key) and passed through to the upstream provider.
Your SemanticGuard key, sent in the x-sg-api-key header. Generate one from the API Keys page in the dashboard.

Supported Providers

Provider	Auth Header	Models
OpenAI	`Authorization: Bearer sk-...`	gpt-4o, gpt-4o-mini, gpt-4.1-*, o3, o4-mini
Anthropic	`x-api-key: sk-ant-...`	claude-sonnet-4, claude-opus-4, claude-haiku-4
Google	`Authorization: Bearer ...`	gemini-2.5-flash, gemini-2.5-pro
Azure OpenAI	`Authorization: Bearer <azure-key>`	gpt-4o, gpt-4o-mini (via x-sg-provider: azure)
AWS Bedrock	`x-sg-aws-access-key + x-sg-aws-secret-key`	amazon.titan-*, meta.llama3-*, cohere.command-r-*

Azure requires x-sg-provider: azure, x-sg-azure-resource, and x-sg-azure-deployment headers. Bedrock requires x-sg-aws-access-key and x-sg-aws-secret-key. Other providers (Mistral, etc.) work via the passthrough proxy.
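
The header requirements above can be collected into one small helper. This is a minimal sketch: the function name buildProxyHeaders and the ProviderAuth type are hypothetical and not part of any SemanticGuard package; only the header names come from the table and note above.

```typescript
// Hypothetical helper (not a SemanticGuard API): assembles proxy request
// headers using only the header names documented above.
type ProviderAuth =
  | { provider: "openai" | "google"; apiKey: string }
  | { provider: "anthropic"; apiKey: string }
  | { provider: "azure"; apiKey: string; resource: string; deployment: string }
  | { provider: "bedrock"; accessKey: string; secretKey: string };

function buildProxyHeaders(
  sgApiKey: string,
  auth: ProviderAuth,
  project?: string,
): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    "x-sg-api-key": sgApiKey,
  };
  // x-sg-project appears in the Quick Start request; treated as optional here.
  if (project !== undefined) headers["x-sg-project"] = project;

  switch (auth.provider) {
    case "openai":
    case "google":
      headers["Authorization"] = `Bearer ${auth.apiKey}`;
      break;
    case "anthropic":
      headers["x-api-key"] = auth.apiKey;
      break;
    case "azure":
      headers["Authorization"] = `Bearer ${auth.apiKey}`;
      headers["x-sg-provider"] = "azure";
      headers["x-sg-azure-resource"] = auth.resource;
      headers["x-sg-azure-deployment"] = auth.deployment;
      break;
    case "bedrock":
      headers["x-sg-aws-access-key"] = auth.accessKey;
      headers["x-sg-aws-secret-key"] = auth.secretKey;
      break;
  }
  return headers;
}
```

The returned object can be passed as the headers option of fetch. Keeping the provider cases in a discriminated union means the compiler flags a missing Azure resource or deployment before the request is sent.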

Response Headers

Header	Example	Description
`x-sg-cache`	hit-exact, hit-semantic, miss	Cache result. Includes the layer that matched.
`x-sg-latency`	12ms	Total proxy processing time
`x-sg-provider`	openai, anthropic, google, azure, bedrock	Detected upstream provider
`x-sg-score`	0.97	Similarity score (semantic hits only)
`x-sg-confidence`	0.872	Confidence score (0-1). Factors: similarity, age, template completeness, model recency.
`x-sg-prompt-category`	factual, code, creative, extraction, instruction, general	Auto-classified prompt category. Code and creative prompts use stricter matching thresholds.
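
A client can read these headers into a typed object for logging or metrics. The sketch below assumes that hit values always take the form hit-<layer>, as in the two examples in the table, and that x-sg-latency is a number followed by "ms"; parseCacheHeaders and CacheInfo are illustrative names, not part of a SemanticGuard SDK.

```typescript
// Structural type: satisfied by fetch's Headers or any object with get().
interface HeaderSource {
  get(name: string): string | null;
}

interface CacheInfo {
  hit: boolean;
  layer: string | null; // e.g. "exact" or "semantic"; null on a miss
  latencyMs: number | null;
  provider: string | null;
  score: number | null; // documented as present on semantic hits only
  confidence: number | null;
  category: string | null;
}

// Parses a leading number and ignores any unit suffix such as "ms".
function toNumber(value: string | null): number | null {
  if (value === null) return null;
  const parsed = Number.parseFloat(value);
  return Number.isNaN(parsed) ? null : parsed;
}

function parseCacheHeaders(headers: HeaderSource): CacheInfo {
  // Assumption: a missing x-sg-cache header is treated as a miss.
  const cache = headers.get("x-sg-cache") ?? "miss";
  const hit = cache.startsWith("hit");
  const dash = cache.indexOf("-");
  return {
    hit,
    layer: hit && dash >= 0 ? cache.slice(dash + 1) : null,
    latencyMs: toNumber(headers.get("x-sg-latency")),
    provider: headers.get("x-sg-provider"),
    score: toNumber(headers.get("x-sg-score")),
    confidence: toNumber(headers.get("x-sg-confidence")),
    category: headers.get("x-sg-prompt-category"),
  };
}
```

With fetch, the call would look like parseCacheHeaders(response.headers).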

Cache Pipeline

Requests pass through multiple cache layers in order. The first match wins.

Exact match - SHA-256 hash of the normalized, lowercased prompt, looked up in Redis. Fastest layer.
Conversation match - For multi-turn chats: tries full history hash, then a sliding window (last 4 messages), then system prompt + last message. Enables partial-match hits for long conversations.
Template match - Extracts entities (emails, names, prices, orgs, places) via regex + NER and looks up the skeleton hash. Catches prompts that differ only in entity values.
Template substitution - If the skeleton matches a verified response template, replaces entity placeholders with new values. Subject to confidence scoring.
Semantic match - Vector similarity search on the skeleton text. Threshold adapts by prompt category (stricter for code/creative). Subject to entity-hash guard and confidence scoring.

On miss, responses are stored across all layers. Entity extraction uses both regex patterns and compromise.js NER (people, organizations, places). Per-tenant custom entities are learned automatically from sampled misses.
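
To make the first three layers concrete, here is a rough sketch of how their lookup keys could be derived. Everything in it is an assumption for illustration: the normalization rules, the message serialization, the placeholder names, and the two regex patterns are not taken from SemanticGuard, and the NER step is omitted entirely.

```typescript
import { createHash } from "node:crypto";

interface Message {
  role: string;
  content: string;
}

function sha256(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Assumed normalization: trim, collapse whitespace, lowercase.
function normalize(text: string): string {
  return text.trim().replace(/\s+/g, " ").toLowerCase();
}

// Exact-match layer: hash of the normalized prompt.
function exactKey(prompt: string): string {
  return sha256(normalize(prompt));
}

// Assumed serialization of a conversation into one string.
function serialize(messages: Message[]): string {
  return messages.map((m) => `${m.role}:${normalize(m.content)}`).join("\n");
}

// Conversation layer: candidate keys in the order described above:
// full history, last 4 messages, then system prompt + last message.
function conversationKeys(messages: Message[]): string[] {
  if (messages.length === 0) return [];
  const system = messages.filter((m) => m.role === "system");
  return [
    sha256(serialize(messages)),
    sha256(serialize(messages.slice(-4))),
    sha256(serialize([...system, ...messages.slice(-1)])),
  ];
}

// Template layer, regex part only: replace entity values with
// placeholders so prompts that differ only in entities share a skeleton.
function skeleton(prompt: string): string {
  return prompt
    .replace(/[\w.+-]+@[\w-]+(\.[\w-]+)+/g, "{EMAIL}")
    .replace(/\$\d+(\.\d{2})?/g, "{PRICE}");
}

function templateKey(prompt: string): string {
  return sha256(normalize(skeleton(prompt)));
}
```

Under these assumptions, "Email bob@example.com about $19.99" and "Email amy@example.org about $5" produce the same templateKey but different exactKey values, which is the case the template layer is meant to catch.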

Safety mechanisms

Entity-hash guard: semantic matches with different entities are rejected
Confidence gate: rejects stale entries, incomplete substitutions, and cross-generation model mismatches
Category-adaptive thresholds: code prompts require 0.97 similarity; creative prompts require 1.0, which effectively disables semantic matching for them
Vector TTL: entries expire and are lazily evicted on lookup or via scheduled garbage collection
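
The entity-hash guard and the category thresholds can be read as a single acceptance check on a semantic candidate. The sketch below is illustrative only: the 0.97 and 1.0 values come from the list above, while the threshold for the remaining categories is not documented here and is therefore passed in by the caller. The names acceptSemanticHit and Candidate are hypothetical.

```typescript
type Category =
  | "factual"
  | "code"
  | "creative"
  | "extraction"
  | "instruction"
  | "general";

// Only the code and creative values are documented; the default for
// other categories is an assumption supplied by the caller.
function thresholdFor(category: Category, defaultThreshold: number): number {
  if (category === "code") return 0.97;
  if (category === "creative") return 1.0;
  return defaultThreshold;
}

interface Candidate {
  score: number; // vector similarity in [0, 1]
  entityHash: string; // hash of the entities extracted from the cached prompt
}

function acceptSemanticHit(
  candidate: Candidate,
  requestEntityHash: string,
  category: Category,
  defaultThreshold: number,
): boolean {
  // Entity-hash guard: similar wording with different entities is rejected.
  if (candidate.entityHash !== requestEntityHash) return false;
  // Category-adaptive threshold.
  return candidate.score >= thresholdFor(category, defaultThreshold);
}
```

The confidence gate and TTL checks would run as further conditions; they are left out because their inputs and weights are not specified here.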