# 🚅 LiteLLM Proxy

## Introduction
The LiteLLM Proxy is a service that provides access to a variety of large language models (LLMs) through a single, OpenAI-compatible endpoint, making it easy to switch between models without changing your code. The proxy works with a number of popular LLM libraries, including the OpenAI Python SDK, LlamaIndex, and Langchain.
Contact Trey Saddler for access to the LiteLLM Proxy. You will be given a key that will allow you to make requests to the proxy.
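Once you have a key, you can verify it by listing the models it can reach. The sketch below is a minimal example using the `requests` library against the proxy's OpenAI-compatible `/models` route; the `LITELLM_API_KEY` variable name is our own convention, not something the proxy requires.

```python
import os
import requests

# List the models your key can access via the OpenAI-compatible /models route.
# Assumes your key is exported as LITELLM_API_KEY (our naming convention).
resp = requests.get(
    "http://litellm.toxpipe.niehs.nih.gov/models",
    headers={"Authorization": f"Bearer {os.environ['LITELLM_API_KEY']}"},
)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])
```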
## Making Requests

There are a few different ways to make requests to the LiteLLM Proxy; the sections below walk through the OpenAI Python SDK, LlamaIndex, and Langchain.
### OpenAI Python SDK

```python
import openai

client = openai.OpenAI(
    api_key="sk-1234",  # Format should be 'sk-<your_key>'
    base_url="http://litellm.toxpipe.niehs.nih.gov",  # LiteLLM Proxy is OpenAI compatible. Read more: https://docs.litellm.ai/docs/proxy/user_keys
)

response = client.chat.completions.create(
    model="azure-gpt-4o",  # model to send to the proxy
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem",
        }
    ],
)

print(response)
```
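Because the proxy speaks the OpenAI API, streaming works through the same client. A minimal sketch, reusing the `client` defined above:

```python
# Stream tokens as they are generated instead of waiting for the full reply.
stream = client.chat.completions.create(
    model="azure-gpt-4o",
    messages=[{"role": "user", "content": "write a short poem"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```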
### LlamaIndex

```python
from llama_index.llms import AzureOpenAI
from llama_index.embeddings import AzureOpenAIEmbedding
from llama_index import VectorStoreIndex, SimpleDirectoryReader, ServiceContext

llm = AzureOpenAI(
    engine="azure-gpt-4o",  # model_name on litellm proxy
    temperature=0.0,
    azure_endpoint="https://litellm.toxpipe.niehs.nih.gov",  # litellm proxy endpoint
    api_key="sk-1234",  # litellm proxy API key
    api_version="2024-02-01",
)

embed_model = AzureOpenAIEmbedding(
    deployment_name="text-embedding-ada-002",
    azure_endpoint="http://litellm.toxpipe.niehs.nih.gov",
    api_key="sk-1234",
    api_version="2024-02-01",
)

# Build a simple vector index over local documents and query it.
documents = SimpleDirectoryReader("llama_index_data").load_data()
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
index = VectorStoreIndex.from_documents(documents, service_context=service_context)

query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
```
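If you only need embeddings rather than a full index, the `embed_model` above can be called directly. A small sketch, assuming the legacy llama_index API used in this example:

```python
# Embed a single string through the proxy's text-embedding-ada-002 deployment.
vector = embed_model.get_text_embedding("this is a test sentence")
print(len(vector))  # embedding dimensionality (1536 for ada-002)
```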
### Langchain

Langchain expects the API key to be set in the `OPENAI_API_KEY` environment variable. You can find more information about using this library here: https://python.langchain.com/v0.2/docs/tutorials/llm_chain/
```python
import os

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.output_parsers import StrOutputParser

# The key must already be set in your environment before this runs.
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

model = ChatOpenAI(
    model="azure-gpt-4o",  # a model name served by the proxy (see Models Available below)
    base_url="https://litellm.toxpipe.niehs.nih.gov",
)

messages = [
    SystemMessage(content="Translate the following from English into Italian"),
    HumanMessage(content="hi!"),
]

parser = StrOutputParser()
result = model.invoke(messages)
print(parser.invoke(result))
```
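The same pieces compose into a chain with LangChain's pipe operator. A short sketch, reusing the `model` and `parser` defined above (the prompt template is our own illustration):

```python
from langchain_core.prompts import ChatPromptTemplate

# Compose prompt -> model -> parser into a single runnable chain.
prompt = ChatPromptTemplate.from_messages([
    ("system", "Translate the following from English into {language}"),
    ("user", "{text}"),
])
chain = prompt | model | parser
print(chain.invoke({"language": "Italian", "text": "hi!"}))
```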
## Models Available

The following models are available on the LiteLLM Proxy (a sketch after the lists shows how to switch between them):

### OpenAI Models

- azure-gpt-4o
- azure-gpt-4-turbo-20240409
- azure-gpt-4-turbo-preview
- azure-gpt-4
- text-embedding-ada-002

### Anthropic Claude Models

- claude-3-5-sonnet
- claude-3-haiku
- claude-3-sonnet
- claude-3-opus
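Because every model sits behind the same OpenAI-compatible endpoint, switching models is just a string change. A minimal sketch, reusing the `client` from the OpenAI example above:

```python
# Ask the same question of several proxy models; only the model string changes.
for model_name in ["azure-gpt-4o", "claude-3-haiku", "claude-3-5-sonnet"]:
    reply = client.chat.completions.create(
        model=model_name,
        messages=[{"role": "user", "content": "In one sentence, what is toxicology?"}],
    )
    print(f"{model_name}: {reply.choices[0].message.content}")
```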