MLflow AI Gateway
Note: MLflow AI Gateway does not support Windows.
MLflow AI Gateway provides a unified interface for deploying and managing multiple LLM providers within your organization. It simplifies interactions with services like OpenAI, Anthropic, and others through a single, secure endpoint.
The gateway server is designed for production environments where organizations need to manage multiple LLM providers securely without sacrificing operational flexibility or developer productivity.
Unified Interface
Access multiple LLM providers through a single endpoint, eliminating the need to integrate with each provider individually.
Centralized Security
Store API keys in one secure location with request/response logging for audit trails and compliance.
Provider Abstraction
Switch between OpenAI, Anthropic, Azure OpenAI, and other providers without changing your application code (see the configuration sketch after this list).
Zero-Downtime Updates
Add, remove, or modify endpoints dynamically without restarting the server or disrupting running applications.
Cost Optimization
Monitor usage across providers and optimize costs by routing requests to the most efficient models.
Team Collaboration
Share endpoint configurations and standardize access patterns across development teams.
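To illustrate provider abstraction, here is a hypothetical sketch of repointing an existing endpoint from OpenAI to Anthropic by editing only its model block. The endpoint name, and therefore the route your application calls, stays the same; the model name and API-key field below follow the provider-specific pattern but are illustrative:

```yaml
# The "chat" endpoint previously used provider: openai with
# name: gpt-3.5-turbo. Only the model block changes; callers
# of /gateway/chat/invocations are unaffected.
endpoints:
  - name: chat
    endpoint_type: llm/v1/chat
    model:
      provider: anthropic
      name: claude-2                          # illustrative model name
      config:
        anthropic_api_key: $ANTHROPIC_API_KEY # provider-specific key field
```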
Getting Started
Choose your path to get up and running with MLflow AI Gateway:
Setup
Install MLflow, configure environment, and start your gateway server
Configuration
Configure providers, endpoints, and advanced gateway settings
Usage
Query endpoints with Python client and REST APIs
Integration
Integrate with applications, frameworks, and production systems
Quick Start
Get your AI Gateway running with OpenAI in under 5 minutes:
Install MLflow with gateway dependencies:
pip install 'mlflow[gateway]'
Set your OpenAI API key:
export OPENAI_API_KEY=your_api_key_here
Create a simple configuration file, config.yaml:
```yaml
endpoints:
  - name: chat                   # served at /gateway/chat/invocations
    endpoint_type: llm/v1/chat   # chat-style request/response schema
    model:
      provider: openai
      name: gpt-3.5-turbo
      config:
        openai_api_key: $OPENAI_API_KEY  # read from your environment
```
Start the gateway server:
mlflow gateway start --config-path config.yaml --port 5000
Your gateway is now running at http://localhost:5000
Test your endpoint:
```bash
curl -X POST http://localhost:5000/gateway/chat/invocations \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```
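You can query the same endpoint from Python. A minimal sketch using the MLflow deployments client, assuming the gateway from the quick start is running on localhost:5000:

```python
from mlflow.deployments import get_deploy_client

# Point the client at the running gateway server.
client = get_deploy_client("http://localhost:5000")

# Query the "chat" endpoint defined in config.yaml.
response = client.predict(
    endpoint="chat",
    inputs={"messages": [{"role": "user", "content": "Hello!"}]},
)
print(response)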
Supported Providers
MLflow AI Gateway supports a comprehensive range of LLM providers:
| Provider | Chat | Completions | Embeddings | Notes |
|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | GPT-3.5, GPT-4, and text-embedding models |
| Azure OpenAI | ✅ | ✅ | ✅ | Enterprise OpenAI with Azure integration |
| Anthropic | ✅ | ✅ | ❌ | Claude models via the Anthropic API |
| Cohere | ✅ | ✅ | ✅ | Command and embedding models |
| AWS Bedrock | ✅ | ✅ | ✅ | Claude, Titan, and other Bedrock models |
| PaLM | ✅ | ✅ | ✅ | Google's PaLM models |
| MosaicML | ✅ | ✅ | ❌ | MPT models and custom deployments |
| MLflow Models | ✅ | ✅ | ✅ | Your own deployed MLflow models |
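A single configuration file can mix providers and endpoint types. A hypothetical sketch pairing an Anthropic chat endpoint with an OpenAI embeddings endpoint; model names are illustrative, and each provider's config block takes its own API-key field:

```yaml
endpoints:
  - name: claude-chat
    endpoint_type: llm/v1/chat
    model:
      provider: anthropic
      name: claude-2
      config:
        anthropic_api_key: $ANTHROPIC_API_KEY

  - name: embeddings
    endpoint_type: llm/v1/embeddings
    model:
      provider: openai
      name: text-embedding-ada-002
      config:
        openai_api_key: $OPENAI_API_KEY
```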
Core Concepts
Understanding these key concepts will help you effectively use the AI Gateway:
Endpoints
Endpoints are named configurations that define how to access a specific model from a provider. Each endpoint specifies the model, provider settings, and access parameters.
Providers
Providers are the underlying LLM services (OpenAI, Anthropic, etc.) that actually serve the models. The gateway abstracts away provider-specific details.
Routes
Routes define the URL structure for accessing endpoints. The gateway automatically creates routes based on your endpoint configurations.
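Following the pattern from the quick start, each endpoint is exposed at a URL of the form:

```
POST http://<host>:<port>/gateway/<endpoint-name>/invocations
```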
Dynamic Updates
The gateway supports hot-reloading of configurations, allowing you to add, modify, or remove endpoints without restarting the server.
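For example, appending a new endpoint to the config.yaml of a running server is picked up automatically, with no restart required. The endpoint and model names below are illustrative:

```yaml
# Appended under the existing endpoints: key while the gateway is running.
  - name: completions
    endpoint_type: llm/v1/completions
    model:
      provider: openai
      name: gpt-3.5-turbo-instruct
      config:
        openai_api_key: $OPENAI_API_KEY
```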
Next Steps
Ready to dive deeper? Explore these resources: