AI Features

Velocity includes built-in AI capabilities powered by a provider-agnostic LLM gateway. Features range from smart search to autonomous agents, with usage tracking and per-workspace budgets.

Overview

AI features in Velocity are plan-gated and can be enabled or disabled per workspace. Every AI call is tracked for token usage and cost, and workspaces have monthly spending budgets determined by their plan tier.

Available Features

| Feature | Description | Min Plan |
|---|---|---|
| AI Search | Semantic search across issues and documents using vector embeddings. Results are merged with keyword matches in a hybrid approach. See AI Search. | Pro |
| AI Triage / Smart Create | Parse natural language into structured issue fields (title, description, priority, labels, team, project). Constrained to valid workspace values. See Smart Issue Creation. | Pro |
| AI Description Writer | Generate detailed issue descriptions from a brief title or summary. | Pro |
| AI Comment Drafting | Draft responses and updates for issue comments. | Pro |
| AI Summarization | Summarize long issue threads, projects, or documents into concise overviews. | Pro |
| AI Document Drafting | Generate full document drafts from a topic or outline. | Business |
| AI Release Notes | Auto-generate release notes from completed issues in a cycle or project. | Business |
| AI Agent | Autonomous agent that can research, plan, and execute multi-step tasks on your behalf. | Business |
| Bring Your Own Key | Use your own API keys for AI providers instead of Velocity’s built-in credits. | Enterprise |

Plan Tiers & AI Budgets

Each plan includes a monthly AI spending budget. Usage is tracked in real time and visible in Settings → Billing.

| Plan | AI Enabled | Monthly Budget | Monthly Requests | Concurrent Streams |
|---|---|---|---|---|
| Free | No | — | — | 1 |
| Pro | Yes | $5 | 500 | 3 |
| Business | Yes | $20 | 2,000 | 5 |
| Enterprise | Yes | $100 | Unlimited | 10 |

LLM Gateway

Velocity uses a provider-agnostic LLM gateway that supports multiple AI providers. The gateway automatically routes requests through available providers and handles failover if one is unavailable.

Supported Providers

| Provider | Default Model | Used For |
|---|---|---|
| MiniMax | MiniMax-M1 | Primary completions and streaming |
| Kimi (Moonshot) | moonshot-v1-auto | Fallback completions |
| Anthropic | claude-sonnet-4-5-20250929 | AI Agent, complex tasks |
| OpenAI | gpt-4o-mini | Embeddings (text-embedding-3-small), fallback |

Gateway Features

  • Streaming responses — real-time token-by-token output with a typing cursor
  • Concurrent stream limits — controlled per plan tier to prevent abuse
  • Automatic retry — retries with exponential backoff on transient failures (429, 5xx)
  • Provider fallback — if the primary provider fails, automatically tries the next
  • Workspace overrides — workspaces can configure a preferred provider and model
  • BYOK support — Enterprise workspaces can use their own API keys
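The retry and fallback behavior above can be sketched roughly as follows. This is a minimal illustration, not Velocity's actual gateway code; the `Provider` shape, `completeWithFailover` name, and backoff constants are assumptions:

```typescript
// Sketch: retry with exponential backoff, then fall back to the next provider.
type Provider = {
  name: string;
  complete: (prompt: string) => Promise<string>;
};

// Transient statuses worth retrying (429 and 5xx, per the list above).
const RETRYABLE = new Set([429, 500, 502, 503, 504]);

async function completeWithFailover(
  providers: Provider[],
  prompt: string,
  maxRetries = 3,
): Promise<string> {
  for (const provider of providers) {
    for (let attempt = 0; attempt < maxRetries; attempt++) {
      try {
        return await provider.complete(prompt);
      } catch (err: any) {
        // Non-retryable error: give up on this provider, try the next one.
        if (!RETRYABLE.has(err.status)) break;
        // Exponential backoff before the next attempt: 250ms, 500ms, 1s, ...
        await new Promise((r) => setTimeout(r, 2 ** attempt * 250));
      }
    }
  }
  throw new Error("all providers exhausted");
}
```

Ordering the `providers` array by preference (e.g., a workspace's configured provider first) gives the workspace-override behavior described above for free.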

Embedding Pipeline

Velocity generates vector embeddings for issues and documents to power semantic search. Embeddings use OpenAI’s text-embedding-3-small model (1536 dimensions) and are stored in PostgreSQL via the pgvector extension with HNSW indexing.

Embedding generation is fire-and-forget — when an issue or document is created or updated, an async request is dispatched to the embedding service. This means entity creation is never blocked by the embedding process. The embedding service deduplicates requests using content hashing to avoid unnecessary API calls.
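The content-hash deduplication step can be sketched like this. The in-memory map, function names, and SHA-256 choice are illustrative assumptions; the real service presumably persists hashes alongside the stored embeddings:

```typescript
import { createHash } from "node:crypto";

// entityId -> hash of the content last embedded for that entity
const lastEmbeddedHash = new Map<string, string>();

function contentHash(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Returns true only when the content actually changed, so an update
// that doesn't touch the text never triggers a new embedding API call.
function shouldEmbed(entityId: string, content: string): boolean {
  const hash = contentHash(content);
  if (lastEmbeddedHash.get(entityId) === hash) return false;
  lastEmbeddedHash.set(entityId, hash);
  return true;
}
```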

Usage Tracking

Every AI API call is logged with:

  • Feature name (search, triage, agent, embedding, etc.)
  • Model and provider used
  • Input and output token counts
  • Cost in cents (calculated from per-model pricing)
  • Duration in milliseconds
  • User who triggered the request

Monthly aggregates are available via the aiUsage GraphQL query with per-feature breakdowns. Budget status is available via aiBudgetStatus.
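A query against these endpoints might be shaped roughly as follows. The query names come from this page, but the selected fields are illustrative assumptions, not Velocity's actual schema:

```typescript
// Hypothetical field selection for the usage and budget queries.
const AI_USAGE_QUERY = `
  query {
    aiUsage {
      totalCostCents
      byFeature {
        feature
        requests
        costCents
      }
    }
    aiBudgetStatus {
      budgetCents
      usedCents
      percentUsed
    }
  }
`;
```

Check your workspace's GraphQL schema (e.g., via introspection) for the real field names before relying on this shape.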

Budget Enforcement

When a workspace approaches its monthly AI budget:

  • 80% used — a soft warning is shown to workspace admins
  • 100% used — AI features are blocked until the next billing cycle or an upgrade
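The two thresholds above amount to a simple check; this sketch is illustrative (the action names are assumptions), but the 80%/100% cutoffs match the behavior described:

```typescript
type BudgetAction = "allow" | "warn" | "block";

function checkBudget(usedCents: number, budgetCents: number): BudgetAction {
  const ratio = usedCents / budgetCents;
  if (ratio >= 1.0) return "block"; // AI disabled until next cycle or upgrade
  if (ratio >= 0.8) return "warn";  // soft warning shown to workspace admins
  return "allow";
}
```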

Enterprise workspaces can set custom budget overrides via the setAiBudgetOverride mutation.

Streaming & Concurrent Limits

AI features that use streaming (like the AI agent and description writer) deliver responses in real time with a typing cursor animation. Each user has a concurrent stream limit based on their workspace’s plan tier. If you exceed the limit, the request is rejected with a 429 status — wait for an existing stream to finish before starting a new one.
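Per-user stream limiting can be modeled as a simple counter, as in this sketch (the class and method names are assumptions; the 429 rejection is modeled as a thrown error):

```typescript
class StreamLimiter {
  private active = new Map<string, number>();

  constructor(private limit: number) {}

  // Call before opening a stream; throws when the user is at the limit.
  acquire(userId: string): void {
    const current = this.active.get(userId) ?? 0;
    if (current >= this.limit) {
      throw new Error("429: concurrent stream limit reached");
    }
    this.active.set(userId, current + 1);
  }

  // Call when a stream finishes (success or failure) to free a slot.
  release(userId: string): void {
    const current = this.active.get(userId) ?? 0;
    this.active.set(userId, Math.max(0, current - 1));
  }
}
```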

Feature Flag Overrides

Workspace administrators can override plan-tier AI access using feature flags. Feature flags use the ai:<feature_name> key convention (e.g., ai:ai_search) and can be managed in the admin dashboard under Feature Flags.

Tip

Feature flag overrides take precedence over plan tier defaults, allowing you to grant specific AI features to individual workspaces regardless of their plan.
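The precedence rule can be sketched as follows. The plan-to-feature map and function name are illustrative assumptions; only the `ai:<feature_name>` key convention comes from this page:

```typescript
// Hypothetical minimum-plan map (see the Available Features table).
const PLAN_FEATURES: Record<string, string[]> = {
  pro: ["ai_search", "ai_triage"],
  business: ["ai_search", "ai_triage", "ai_agent"],
};

function hasAiFeature(
  feature: string,
  plan: string,
  flags: Record<string, boolean>, // keys like "ai:ai_search"
): boolean {
  const override = flags[`ai:${feature}`];
  // An explicit flag (true or false) always wins over the plan tier.
  if (override !== undefined) return override;
  return (PLAN_FEATURES[plan] ?? []).includes(feature);
}
```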