AI Features

Velocity includes built-in AI capabilities powered by a provider-agnostic LLM gateway. Features range from smart search to autonomous agents, with usage tracking and per-workspace budgets.

Overview

AI features in Velocity are plan-gated and can be enabled or disabled per workspace. Every AI call is tracked for token usage and cost, and workspaces have monthly spending budgets determined by their plan tier.

Available Features

| Feature | Description | Min Plan |
|---|---|---|
| AI Search | Semantic search across issues and documents using vector embeddings. Results are merged with keyword matches in a hybrid approach. See AI Search. | Pro |
| AI Triage / Smart Create | Parse natural language into structured issue fields (title, description, priority, labels, team, project). Constrained to valid workspace values. See Smart Issue Creation. | Pro |
| AI Description Writer | Generate detailed issue descriptions from a brief title or summary. | Pro |
| AI Comment Drafting | Draft responses and updates for issue comments. | Pro |
| AI Summarization | Summarize long issue threads, projects, or documents into concise overviews. | Pro |
| AI Document Drafting | Generate full document drafts from a topic or outline. | Business |
| AI Release Notes | Auto-generate release notes from completed issues in a cycle or project. | Business |
| AI Agent | Autonomous agent that can research, plan, and execute multi-step tasks on your behalf. | Business |
| Bring Your Own Key | Use your own API keys for AI providers instead of Velocity’s built-in credits. | Enterprise |

Plan Tiers & AI Budgets

Each plan includes a monthly AI spending budget. Usage is tracked in real time and visible in Settings → Billing.

| Plan | AI Enabled | Monthly Budget | Monthly Requests | Concurrent Streams |
|---|---|---|---|---|
| Free | No | — | — | 1 |
| Pro | Yes | $5 | 500 | 3 |
| Business | Yes | $20 | 2,000 | 5 |
| Enterprise | Yes | $100 | Unlimited | 10 |

LLM Gateway

Velocity uses a provider-agnostic LLM gateway that supports multiple AI providers. The gateway automatically routes requests through available providers and handles failover if one is unavailable.

Supported Providers

| Provider | Default Model | Used For |
|---|---|---|
| MiniMax | MiniMax-M1 | Primary completions and streaming |
| Kimi (Moonshot) | moonshot-v1-auto | Fallback completions |
| Anthropic | claude-sonnet-4-5-20250929 | AI Agent, complex tasks |
| OpenAI | gpt-4o-mini | Embeddings (text-embedding-3-small), fallback |

Gateway Features

  • Streaming responses — real-time token-by-token output with a typing cursor
  • Concurrent stream limits — controlled per plan tier to prevent abuse
  • Automatic retry — retries with exponential backoff on transient failures (429, 5xx)
  • Provider fallback — if the primary provider fails, automatically tries the next
  • Workspace overrides — workspaces can configure a preferred provider and model
  • BYOK support — Enterprise workspaces can use their own API keys
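The retry and fallback behavior above can be sketched roughly as follows. This is a minimal illustration, not Velocity's actual gateway code; the `Provider` shape, `completeWithFailover` name, and backoff constants are assumptions:

```typescript
// Sketch: retry with exponential backoff, then fall back to the next provider.
type Provider = {
  name: string;
  complete: (prompt: string) => Promise<string>;
};

// Transient statuses worth retrying (429 and 5xx, per the list above).
const RETRYABLE = new Set([429, 500, 502, 503, 504]);

async function completeWithFailover(
  providers: Provider[],
  prompt: string,
  maxRetries = 3,
): Promise<string> {
  for (const provider of providers) {
    for (let attempt = 0; attempt < maxRetries; attempt++) {
      try {
        return await provider.complete(prompt);
      } catch (err: any) {
        // Non-retryable error: give up on this provider, try the next one.
        if (!RETRYABLE.has(err.status)) break;
        // Exponential backoff before the next attempt: 250ms, 500ms, 1s, ...
        await new Promise((r) => setTimeout(r, 2 ** attempt * 250));
      }
    }
  }
  throw new Error("all providers exhausted");
}
```

Ordering the `providers` array by preference (e.g., a workspace's configured provider first) gives the workspace-override behavior described above for free.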

Embedding Pipeline

Velocity generates vector embeddings for issues and documents to power semantic search. Embeddings use OpenAI’s text-embedding-3-small model (1536 dimensions) and are stored in PostgreSQL via the pgvector extension with HNSW indexing.

Embedding generation is fire-and-forget — when an issue or document is created or updated, an async request is dispatched to the embedding service. This means entity creation is never blocked by the embedding process. The embedding service deduplicates requests using content hashing to avoid unnecessary API calls.
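The content-hash deduplication step can be sketched like this. The in-memory map, function names, and SHA-256 choice are illustrative assumptions; the real service presumably persists hashes alongside the stored embeddings:

```typescript
import { createHash } from "node:crypto";

// entityId -> hash of the content last embedded for that entity
const lastEmbeddedHash = new Map<string, string>();

function contentHash(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Returns true only when the content actually changed, so an update
// that doesn't touch the text never triggers a new embedding API call.
function shouldEmbed(entityId: string, content: string): boolean {
  const hash = contentHash(content);
  if (lastEmbeddedHash.get(entityId) === hash) return false;
  lastEmbeddedHash.set(entityId, hash);
  return true;
}
```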

Usage Tracking

Every AI API call is logged with:

  • Feature name (search, triage, agent, embedding, etc.)
  • Model and provider used
  • Input and output token counts
  • Cost in cents (calculated from per-model pricing)
  • Duration in milliseconds
  • User who triggered the request

Monthly aggregates are available via the aiUsage GraphQL query with per-feature breakdowns. Budget status is available via aiBudgetStatus.
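A query against these endpoints might be shaped roughly as follows. The query names come from this page, but the selected fields are illustrative assumptions, not Velocity's actual schema:

```typescript
// Hypothetical field selection for the usage and budget queries.
const AI_USAGE_QUERY = `
  query {
    aiUsage {
      totalCostCents
      byFeature {
        feature
        requests
        costCents
      }
    }
    aiBudgetStatus {
      budgetCents
      usedCents
      percentUsed
    }
  }
`;
```

Check your workspace's GraphQL schema (e.g., via introspection) for the real field names before relying on this shape.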

Budget Enforcement

When a workspace approaches its monthly AI budget:

  • 80% used — a soft warning is shown to workspace admins
  • 100% used — AI features are blocked until the next billing cycle or an upgrade
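The two thresholds above amount to a simple check; this sketch is illustrative (the action names are assumptions), but the 80%/100% cutoffs match the behavior described:

```typescript
type BudgetAction = "allow" | "warn" | "block";

function checkBudget(usedCents: number, budgetCents: number): BudgetAction {
  const ratio = usedCents / budgetCents;
  if (ratio >= 1.0) return "block"; // AI disabled until next cycle or upgrade
  if (ratio >= 0.8) return "warn";  // soft warning shown to workspace admins
  return "allow";
}
```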

Enterprise workspaces can set custom budget overrides via the setAiBudgetOverride mutation.

Streaming & Concurrent Limits

AI features that use streaming (like the AI agent and description writer) deliver responses in real time with a typing cursor animation. Each user has a concurrent stream limit based on their workspace’s plan tier. If you exceed the limit, the request is rejected with a 429 status — wait for an existing stream to finish before starting a new one.
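Per-user stream limiting can be modeled as a simple counter, as in this sketch (the class and method names are assumptions; the 429 rejection is modeled as a thrown error):

```typescript
class StreamLimiter {
  private active = new Map<string, number>();

  constructor(private limit: number) {}

  // Call before opening a stream; throws when the user is at the limit.
  acquire(userId: string): void {
    const current = this.active.get(userId) ?? 0;
    if (current >= this.limit) {
      throw new Error("429: concurrent stream limit reached");
    }
    this.active.set(userId, current + 1);
  }

  // Call when a stream finishes (success or failure) to free a slot.
  release(userId: string): void {
    const current = this.active.get(userId) ?? 0;
    this.active.set(userId, Math.max(0, current - 1));
  }
}
```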

Feature Flag Overrides

Workspace administrators can override plan-tier AI access using feature flags. Feature flags use the ai:<feature_name> key convention (e.g., ai:ai_search) and can be managed in the admin dashboard under Feature Flags.

Tip

Feature flag overrides take precedence over plan tier defaults, allowing you to grant specific AI features to individual workspaces regardless of their plan.
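The precedence rule can be sketched as follows. The plan-to-feature map and function name are illustrative assumptions; only the `ai:<feature_name>` key convention comes from this page:

```typescript
// Hypothetical minimum-plan map (see the Available Features table).
const PLAN_FEATURES: Record<string, string[]> = {
  pro: ["ai_search", "ai_triage"],
  business: ["ai_search", "ai_triage", "ai_agent"],
};

function hasAiFeature(
  feature: string,
  plan: string,
  flags: Record<string, boolean>, // keys like "ai:ai_search"
): boolean {
  const override = flags[`ai:${feature}`];
  // An explicit flag (true or false) always wins over the plan tier.
  if (override !== undefined) return override;
  return (PLAN_FEATURES[plan] ?? []).includes(feature);
}
```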