Integrations
Your entire AI stack. One Quality Index.
15 connectors across LLM gateways, eval platforms, guardrails, observability, and experiment tracking. Plug in what you use, ignore what you don't.
How It Fits
qualityindex.ai in your infrastructure
Whether you run a simple RAG pipeline or a multi-agent system with 50 tools, we plug into every layer and give you one score.

Connect everything. Get one score.
qualityindex.ai plugs into every layer of your AI stack — LLM providers, eval tools, guardrails, and experiment trackers — and distills it all into a single Quality Index.

Before & After
Stop guessing. Go from scattered tools and unanswered questions to a single dashboard that tells you exactly where you stand — and what to do next.

How it works — in 3 steps
Connect your tools in 5 minutes. We analyze traces, evals, and guardrail events. You ship with confidence, knowing your Quality Index has your back.
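The roll-up itself is simple to reason about. The sketch below is purely illustrative: the pillar names, weights, and weighted-average formula are assumptions for this example, not qualityindex.ai's actual scoring algorithm. It shows how normalized per-pillar signals could combine into one number.

```python
# Illustrative only: pillar names, weights, and the weighted-average
# formula are assumptions, not qualityindex.ai's actual scoring.

def quality_index(pillars: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-pillar scores (0-100) into one weighted index."""
    total = sum(weights.values())
    return sum(pillars[p] * w for p, w in weights.items()) / total

# Signals gathered from connectors, normalized to 0-100 per pillar.
pillars = {"task_quality": 88.0, "safety": 95.0, "cost_latency": 72.0}
weights = {"task_quality": 0.5, "safety": 0.3, "cost_latency": 0.2}

score = quality_index(pillars, weights)
print(round(score, 1))  # 86.9
```

The point of a single weighted score is that any connector, from evals to guardrails, only has to contribute a normalized pillar signal.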
15 Connectors
Plug into where you already work
Each connector has a step-by-step guide with copy-paste configs.
Core Connectors
The foundation. These three cover 80% of teams on day one.
OpenTelemetry
Industry-standard observability protocol for traces, metrics, and logs.
Universal coverage — if your tool emits OTel, we ingest it automatically.
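For example, any OTel-instrumented app can be pointed at an OTLP endpoint through standard environment variables, with no code changes. The variable names below are standard OpenTelemetry SDK configuration; the endpoint URL and API-key header are placeholders, so check the connector guide for the real values.

```python
import os

# Standard OpenTelemetry SDK env vars. The endpoint and API-key header
# shown here are placeholders; the connector guide has the real values.
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://otlp.example-qualityindex.ai"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = "x-api-key=YOUR_API_KEY"
os.environ["OTEL_SERVICE_NAME"] = "my-rag-pipeline"
# Any OTel SDK initialized after this point exports traces to that endpoint.
```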
LangSmith
LLM tracing, dataset management, and evaluation by LangChain.
Sync eval scores and datasets directly into your Quality Index.
GitHub
Code hosting, pull requests, and CI/CD workflows.
Link deploys to quality changes and auto-generate remediation PRs.
Eval & Tracing
Plug in your evaluation framework. Scores flow straight into your Quality Index.
Langfuse
Open-source LLM observability with tracing and prompt management.
Forward traces and user feedback to build a complete quality picture.
Promptfoo
CLI-first eval framework for testing LLM outputs in CI/CD.
Run evals on every PR and feed pass/fail results into the Task Quality pillar.
Ragas
Automated evaluation metrics purpose-built for RAG pipelines.
Measure faithfulness, relevance, and context precision per retrieval.
Braintrust
Experiment tracking and eval platform for LLM applications.
Sync experiment results so every iteration is tracked in the Quality Index.
DeepEval
Unit-test style evaluation framework for LLM outputs.
Run 14+ metrics (hallucination, bias, toxicity) and report them as pillar scores.
Arize Phoenix
Open-source observability platform for LLM traces and embeddings.
Visualize embedding drift and annotation scores alongside your index.
Safety & Guardrails
Every blocked prompt and validated output feeds the Safety pillar.
Guardrails AI
Programmable guardrails with 50+ validators for LLM I/O.
Per-validator pass/fail rates feed directly into your Safety score.
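To make "per-validator pass/fail rates" concrete, here is a minimal sketch. The event shape and the aggregation are assumptions for illustration, not the connector's wire format; it shows the kind of per-validator rate a Safety pillar could consume.

```python
from collections import defaultdict

# Hypothetical guardrail events: (validator_name, passed) pairs.
events = [
    ("toxic-language", True),
    ("toxic-language", True),
    ("pii-detected", False),
    ("pii-detected", True),
    ("toxic-language", False),
]

def pass_rates(events):
    """Per-validator pass rate: passes / total, as a 0-1 fraction."""
    totals, passes = defaultdict(int), defaultdict(int)
    for name, ok in events:
        totals[name] += 1
        passes[name] += ok
    return {name: passes[name] / totals[name] for name in totals}

rates = pass_rates(events)
# toxic-language passed 2 of 3 runs; pii-detected passed 1 of 2.
```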
Lakera Guard
Real-time prompt injection and data leakage detection.
Every blocked attack attempt feeds your Safety score automatically.
NeMo Guardrails
NVIDIA's toolkit for controllable and safe LLM conversations.
Track rail activations and topical guardrail compliance over time.
Gateways & Experiment Tracking
Route through any gateway. Track experiments across any platform.
LiteLLM
Unified API gateway supporting 100+ LLM providers.
One integration captures cost, latency, and errors across all your models.
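As a sketch of what that capture enables, the snippet below rolls per-call gateway records up into per-model cost, latency, and error rate. The record shape is assumed for illustration; a gateway log supplies fields like these per request.

```python
# Hypothetical gateway log records; the field names are assumptions.
calls = [
    {"model": "gpt-4o", "cost_usd": 0.012, "latency_ms": 840, "error": False},
    {"model": "gpt-4o", "cost_usd": 0.015, "latency_ms": 910, "error": True},
    {"model": "claude-3-5-sonnet", "cost_usd": 0.009, "latency_ms": 620, "error": False},
]

def summarize(calls):
    """Roll per-call records up into per-model cost, latency, and error rate."""
    acc = {}
    for c in calls:
        s = acc.setdefault(c["model"], {"cost": 0.0, "lat": 0, "errs": 0, "n": 0})
        s["cost"] += c["cost_usd"]
        s["lat"] += c["latency_ms"]
        s["errs"] += c["error"]  # bool counts as 0/1
        s["n"] += 1
    return {
        m: {
            "total_cost_usd": round(s["cost"], 4),
            "avg_latency_ms": s["lat"] / s["n"],
            "error_rate": s["errs"] / s["n"],
        }
        for m, s in acc.items()
    }

summary = summarize(calls)
```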
Weights & Biases
ML experiment tracking, model registry, and dataset versioning.
Sync run metrics so model upgrades are reflected in your Quality Index.
MLflow
Open-source platform for ML lifecycle management.
Track experiment runs and model versions alongside quality trends.
Works with your entire stack
Any tool that has a REST API or emits telemetry works with qualityindex.ai — even without a dedicated connector.
OpenAI
GPT-4o, o1, embeddings
Anthropic
Claude 3.5 Sonnet & 3 Opus
AWS Bedrock
Managed LLM gateway
Pinecone
Vector database
Hugging Face
Open-source model hub
LangChain
LLM app framework
LlamaIndex
Data framework for RAG
Qdrant
Vector similarity search
Weaviate
AI-native vector DB
Chroma
Embedding database
CrewAI
Multi-agent orchestration
LangGraph
Stateful agent workflows
DSPy
Programmatic prompt optimization
pgvector
Postgres vector extension
Haystack
NLP pipeline framework
AutoGen
Multi-agent conversations
Don't see your tool? If it has an API, we can connect it. View all integration guides
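As a sketch of what "if it has an API" could look like in practice, the snippet below builds a quality-signal payload and posts it with stdlib urllib. The endpoint URL, field names, and auth header are hypothetical; the integration guides document the actual API.

```python
import json
import urllib.request

# Hypothetical ingest endpoint and payload shape; illustrative only.
INGEST_URL = "https://api.example-qualityindex.ai/v1/events"

def build_event(source: str, metric: str, value: float) -> bytes:
    """Serialize one quality signal as a JSON request body."""
    return json.dumps({"source": source, "metric": metric, "value": value}).encode()

def send_event(body: bytes, api_key: str) -> None:
    """POST one event; requires a valid API key and network access."""
    req = urllib.request.Request(
        INGEST_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    urllib.request.urlopen(req)

body = build_event("my-custom-tool", "eval_pass_rate", 0.93)
# send_event(body, api_key="YOUR_KEY")  # uncomment once configured
```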
Connect in minutes. Score in 30.
Pick your connectors, follow the guides, and see your first Quality Index today.