Documentation

Integration guides and API reference.

Quick Start

SemanticGuard is an OpenAI-compatible proxy. Point your client at SemanticGuard instead of the provider, add your SG API key, and all requests are cached, logged, and tracked.

curl https://semanticguard.dev/api/proxy/v1/chat/completions \
  -H "Authorization: Bearer your-openai-api-key" \
  -H "x-sg-api-key: sg-your-key-here" \
  -H "x-sg-project: my-project" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

AI SDK Integration

Using the Vercel AI SDK? Add a fetch wrapper to any provider. Works with OpenAI, Anthropic, Vertex AI, and any provider that accepts a custom fetch function.

import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";
import { withSemanticGuard } from "@semanticguard/ai-sdk";

const openai = createOpenAI({
  apiKey: "your-openai-key",
  fetch: withSemanticGuard({
    gatewayUrl: "https://semanticguard.dev",
    apiKey: "sg-your-key-here",
    projectId: "my-project", // optional
  }),
});

const result = await generateText({
  model: openai("gpt-4o-mini"),
  prompt: "Hello",
});

Authentication

Every request needs two keys:

  • Your LLM API key (passed to the upstream provider via Authorization: Bearer or x-api-key)
  • Your SemanticGuard key (via x-sg-api-key header). Generate one from the API Keys page in the dashboard.
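The two keys travel together on every call. A minimal sketch of a raw fetch request carrying both, using the endpoint and header names from the Quick Start above; the key values and the helper name `chat` are placeholders, not part of the API:

```typescript
// Both keys every proxied request carries. Header names are from the
// Quick Start; the values here are placeholders.
const headers = {
  Authorization: "Bearer your-openai-api-key", // upstream LLM key
  "x-sg-api-key": "sg-your-key-here",          // SemanticGuard key
  "x-sg-project": "my-project",                // optional project tag
  "Content-Type": "application/json",
};

// Hypothetical helper: POST a single-turn chat through the proxy.
async function chat(prompt: string): Promise<unknown> {
  const res = await fetch(
    "https://semanticguard.dev/api/proxy/v1/chat/completions",
    {
      method: "POST",
      headers,
      body: JSON.stringify({
        model: "gpt-4o-mini",
        messages: [{ role: "user", content: prompt }],
      }),
    },
  );
  return res.json();
}
```

If either header is missing, the proxy cannot authenticate the request against both the upstream provider and your SemanticGuard project.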

Supported Providers

Provider | Auth Header | Models
OpenAI | Authorization: Bearer sk-... | gpt-4o, gpt-4o-mini, gpt-4.1-*, o3, o4-mini
Anthropic | x-api-key: sk-ant-... | claude-sonnet-4, claude-opus-4, claude-haiku-4
Google | Authorization: Bearer ... | gemini-2.5-flash, gemini-2.5-pro
Azure OpenAI | Authorization: Bearer <azure-key> | gpt-4o, gpt-4o-mini (via x-sg-provider: azure)
AWS Bedrock | x-sg-aws-access-key + x-sg-aws-secret-key | amazon.titan-*, meta.llama3-*, cohere.command-r-*

Azure requires x-sg-provider: azure, x-sg-azure-resource, and x-sg-azure-deployment headers. Bedrock requires x-sg-aws-access-key and x-sg-aws-secret-key. Other providers (Mistral, etc.) work via the passthrough proxy.
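The extra routing headers can be assembled per provider. A sketch using the header names listed above; the function name and all values are illustrative placeholders:

```typescript
// Hypothetical helper assembling provider-specific routing headers.
// Header names come from the docs above; values are placeholders.
function providerHeaders(provider: "azure" | "bedrock"): Record<string, string> {
  const base = { "x-sg-api-key": "sg-your-key-here" };
  if (provider === "azure") {
    return {
      ...base,
      Authorization: "Bearer <azure-key>",
      "x-sg-provider": "azure",          // route to Azure OpenAI
      "x-sg-azure-resource": "my-resource",
      "x-sg-azure-deployment": "my-deployment",
    };
  }
  // Bedrock authenticates with AWS credentials instead of a bearer token.
  return {
    ...base,
    "x-sg-aws-access-key": "<aws-access-key-id>",
    "x-sg-aws-secret-key": "<aws-secret-access-key>",
  };
}
```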

Response Headers

Header | Example | Description
x-sg-cache | hit-exact, hit-semantic, miss | Cache result, including the layer that matched
x-sg-latency | 12ms | Total proxy processing time
x-sg-provider | openai, anthropic, google, azure, bedrock | Detected upstream provider
x-sg-score | 0.97 | Similarity score (semantic hits only)
x-sg-confidence | 0.872 | Confidence score (0-1); factors: similarity, age, template completeness, model recency
x-sg-prompt-category | factual, code, creative, extraction, instruction, general | Auto-classified prompt category; code and creative prompts use stricter matching thresholds
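These headers can be read off any fetch Response. A sketch of inspecting them, using the header names from the table above; the `cacheInfo` helper is illustrative, and the Headers object is mocked rather than taken from a live call:

```typescript
// Hypothetical helper: summarize SemanticGuard cache headers.
// Header names are from the response-header table; everything else is a sketch.
function cacheInfo(h: Headers) {
  const cache = h.get("x-sg-cache"); // hit-exact | hit-semantic | miss
  const score = h.get("x-sg-score"); // present only on semantic hits
  return {
    hit: cache !== null && cache !== "miss",
    semantic: cache === "hit-semantic",
    score: score !== null ? Number(score) : undefined,
  };
}

// Mocked response headers standing in for a real proxied call:
const example = new Headers({
  "x-sg-cache": "hit-semantic",
  "x-sg-score": "0.97",
  "x-sg-latency": "12ms",
});
const info = cacheInfo(example);
// info → { hit: true, semantic: true, score: 0.97 }
```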

Cache Pipeline

Requests pass through multiple cache layers in order. The first match wins.

  1. Exact match - Normalized, lowercased SHA-256 hash lookup in Redis. Fastest layer.
  2. Conversation match - For multi-turn chats: tries full history hash, then a sliding window (last 4 messages), then system prompt + last message. Enables partial-match hits for long conversations.
  3. Template match - Extracts entities (emails, names, prices, orgs, places) via regex + NER and looks up the skeleton hash. Catches prompts that differ only in entity values.
  4. Template substitution - If the skeleton matches a verified response template, replaces entity placeholders with new values. Subject to confidence scoring.
  5. Semantic match - Vector similarity search on the skeleton text. Threshold adapts by prompt category (stricter for code/creative). Subject to entity-hash guard and confidence scoring.
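Layer 1 can be sketched in a few lines. The exact normalization rules are internal; collapsing whitespace before lowercasing and hashing is an assumption here, made only to show why trivially different prompts land on the same Redis key:

```typescript
import { createHash } from "node:crypto";

// Sketch of the exact-match layer: normalize, then SHA-256 the prompt.
// The whitespace-collapsing rule is an illustrative assumption.
function exactKey(prompt: string): string {
  const normalized = prompt.trim().toLowerCase().replace(/\s+/g, " ");
  return createHash("sha256").update(normalized).digest("hex");
}

// Prompts differing only in case or spacing map to the same cache key:
const same = exactKey("Hello  World") === exactKey("hello world"); // true
```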

On miss, responses are stored across all layers. Entity extraction uses both regex patterns and compromise.js NER (people, organizations, places). Per-tenant custom entities are learned automatically from sampled misses.
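The regex half of entity extraction can be sketched as follows. Only email and price patterns are shown, the compromise.js NER half is omitted, and the patterns and placeholder tokens are illustrative assumptions rather than the proxy's actual rules:

```typescript
// Sketch of regex-based skeleton extraction (template-match layer).
// Patterns and {EMAIL}/{PRICE} placeholder tokens are assumptions.
function skeletonize(prompt: string): { skeleton: string; entities: string[] } {
  const entities: string[] = [];
  const skeleton = prompt
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, (m) => {
      entities.push(m);
      return "{EMAIL}";
    })
    .replace(/\$\d+(\.\d{2})?/g, (m) => {
      entities.push(m);
      return "{PRICE}";
    });
  return { skeleton, entities };
}

const r = skeletonize("Refund $19.99 to jo@example.com");
// r.skeleton === "Refund {PRICE} to {EMAIL}"
// r.entities  === ["jo@example.com", "$19.99"]
```

Two prompts that differ only in the extracted values share a skeleton, so they share a skeleton hash, which is what the template layers key on.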

Safety Mechanisms

  • Entity-hash guard: semantic matches with different entities are rejected
  • Confidence gate: rejects stale entries, incomplete substitutions, and cross-generation model mismatches
  • Category-adaptive thresholds: code prompts require 0.97 similarity, creative prompts 1.0 (effectively disabled)
  • Vector TTL: entries expire and are lazily evicted on lookup or via scheduled garbage collection
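The category-adaptive gate reduces to a threshold lookup. The 0.97 (code) and 1.0 (creative) values come from this section; the thresholds for the other categories are illustrative assumptions, marked as such below:

```typescript
// Sketch of category-adaptive semantic-match thresholds.
type Category =
  | "factual" | "code" | "creative" | "extraction" | "instruction" | "general";

const THRESHOLDS: Record<Category, number> = {
  code: 0.97,      // from the docs: code prompts require 0.97 similarity
  creative: 1.0,   // from the docs: effectively disables semantic serving
  factual: 0.9,    // assumed value
  extraction: 0.93, // assumed value
  instruction: 0.92, // assumed value
  general: 0.9,    // assumed value
};

function acceptSemanticHit(category: Category, similarity: number): boolean {
  return similarity >= THRESHOLDS[category];
}

const codeHit = acceptSemanticHit("code", 0.95);     // false: below 0.97
const creativeHit = acceptSemanticHit("creative", 0.99); // false: below 1.0
```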