
OpenAI Proxy Server

A simple, fast, and lightweight OpenAI-compatible server for calling 100+ LLM APIs in the OpenAI input/output format.

Endpoints:

  • /chat/completions - chat completions endpoint to call 100+ LLMs
  • /models - list the models available on the server (see the example below)
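
Once the server is running (see Usage below), you can hit these endpoints directly; a minimal sketch, assuming the default host and port from the Usage section:

curl http://0.0.0.0:8000/models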


info

We want to learn how we can make the proxy better! Meet the founders or join our Discord.

Usage

$ git clone https://github.com/BerriAI/litellm.git
$ cd ./litellm/openai-proxy
$ pip install -r requirements.txt  # install dependencies (assuming the directory provides a requirements.txt)
$ uvicorn main:app --host 0.0.0.0 --port 8000

Auth - LLM API keys

  • This server lets you store LLM API keys as environment variables, e.g. OPENAI_API_KEY, AZURE_API_KEY. See the required variables for each provider for more info.
  • Alternatively, pass auth params (api_key, api_base, api_version, etc.) in the body of requests to the /chat/completions endpoint, as in the sketch below.
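
For example, an Azure OpenAI deployment takes api_key, api_base, and api_version in the request body; a minimal sketch, where azure/my-azure-deployment, the resource URL, and the api_version value are placeholders to substitute with your own:

curl http://0.0.0.0:8000/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/my-azure-deployment",
    "messages": [{"role": "user", "content": "Hey!"}],
    "api_key": "your-azure-api-key",
    "api_base": "https://your-resource.openai.azure.com",
    "api_version": "2023-07-01-preview"
  }'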

Replace openai base

import openai

openai.api_base = "http://0.0.0.0:8000"  # proxy url
openai.api_key = "does-not-matter"  # placeholder; real provider keys are passed per-request or via env vars

# call cohere (uses the pre-1.0 openai SDK, which exposes openai.ChatCompletion)
response = openai.ChatCompletion.create(
    model="command-nightly",
    messages=[{"role": "user", "content": "Hey!"}],
    api_key="your-cohere-api-key",  # enter your key here
)

# call bedrock
response = openai.ChatCompletion.create(
    model="bedrock/anthropic.claude-instant-v1",
    messages=[{"role": "user", "content": "Hey!"}],
    aws_access_key_id="",
    aws_secret_access_key="",
    aws_region_name="us-west-2",
)

print(response)
info

Looking for the CLI tool/local proxy? It's here

Deploy on Google Cloud Run

Click the button to deploy to Google Cloud Run

Deploy

On a successful deploy, your Cloud Shell output will include the URL of the deployed service.
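
If you prefer the gcloud CLI to the button, a minimal sketch, assuming you run it from the openai-proxy directory (which needs a Dockerfile or buildable source) and use the service name litellm from the console steps below:

gcloud run deploy litellm --source . --region us-central1 --allow-unauthenticated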

Testing your deployed proxy

Assuming the required keys are set as environment variables (see Set LLM API Keys below):

https://litellm-7yjrj3ha2q-uc.a.run.app is our example proxy; substitute it with the URL of your deployed Cloud Run app.

curl https://litellm-7yjrj3ha2q-uc.a.run.app/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say this is a test!"}],
    "temperature": 0.7
  }'
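
You can also sanity-check the deployment by listing the models the server exposes, via the /models endpoint described above:

curl https://litellm-7yjrj3ha2q-uc.a.run.app/models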

Set LLM API Keys

Environment Variables

More info here

  1. In the Google Cloud console, go to Cloud Run

  2. Click on the litellm service

  3. Click Edit and Deploy New Revision

  4. Enter your environment variables, e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY (or set them from the CLI, as in the sketch below)
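
If you prefer the CLI, the same variables can be set with gcloud; a minimal sketch, assuming your service is named litellm (as in step 2) and deployed in us-central1 (the region suggested by the -uc- example URL):

gcloud run services update litellm \
  --region us-central1 \
  --update-env-vars OPENAI_API_KEY=sk-...,ANTHROPIC_API_KEY=sk-...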