# Supported Models

Comprehensive guide to AI models and parameters supported by Julep.
## Overview
Julep leverages LiteLLM to seamlessly connect you to a wide array of Language Models (LLMs). This integration offers incredible flexibility, allowing you to tap into models from various providers with a straightforward, unified interface.
With our unified API, switching between different providers is a breeze, ensuring you maintain consistent functionality across the board.
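To illustrate what the unified interface means in practice, here is a minimal sketch using LiteLLM directly (Julep manages this layer for you, so in practice you just set a model name on an agent or session); the prompt and model names are examples only, and the relevant provider API keys are assumed to be set in the environment:

```python
from litellm import completion  # pip install litellm

messages = [{"role": "user", "content": "Summarize Hamlet in two sentences."}]

# The call shape stays identical across providers; only the model string changes.
# Requires ANTHROPIC_API_KEY / OPENAI_API_KEY in the environment.
anthropic_reply = completion(model="claude-3-5-sonnet-20241022", messages=messages)
openai_reply = completion(model="gpt-4o", messages=messages)

# Both responses come back in the same OpenAI-style schema.
print(anthropic_reply.choices[0].message.content)
print(openai_reply.choices[0].message.content)
```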
## Available Models
While we provide API keys for quick testing and development, you’ll need to use your own API keys when deploying to production. This ensures you have full control over your usage and billing.
Looking for top-notch quality? Our curated selection of models delivers excellent output across a wide range of use cases.
### Anthropic

Here are the Anthropic models supported by Julep:

| Model Name | Context Window | Best For |
|------------|----------------|----------|
| claude-3-opus | 200K tokens | Complex reasoning, analysis |
| claude-3-sonnet | 200K tokens | General-purpose tasks |
| claude-3-haiku | 200K tokens | Quick responses |
| claude-3.5-haiku | 200K tokens | Improved reasoning |
| claude-3.5-sonnet | 200K tokens | Improved reasoning |
| claude-3.5-sonnet-20240620 | 200K tokens | Enhanced reasoning capabilities |
| claude-3.5-sonnet-20241022 | 200K tokens | Computer use capabilities; one of the latest models |
| claude-3.7-sonnet | 200K tokens | Advanced reasoning; the latest Anthropic model |
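As a sketch of how a model from the table above is selected when creating an agent (the agent name and description are illustrative, and the exact SDK surface may differ from this assumed shape):

```python
from julep import Julep  # pip install julep

client = Julep(api_key="YOUR_JULEP_API_KEY")

# Assumed agent-creation call: pass any supported model name from the tables.
agent = client.agents.create(
    name="Research Assistant",
    model="claude-3.5-sonnet",
    about="Helps summarize and analyze long documents.",
)
```

Switching providers is then just a matter of changing the `model` string, e.g. to `gpt-4o` or `gemini-1.5-pro`.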
### Google

Here are the Google models supported by Julep:

| Model Name | Context Window | Best For |
|------------|----------------|----------|
| gemini-1.5-pro | 1M tokens | Complex tasks |
| gemini-1.5-pro-latest | 1M tokens | Cutting-edge performance |
### OpenAI

Here are the OpenAI models supported by Julep:

| Model Name | Context Window | Best For |
|------------|----------------|----------|
| gpt-4-turbo | 128K tokens | Advanced reasoning |
| gpt-4o-mini | 128K tokens | Balanced performance |
| gpt-4o | 128K tokens | Balanced performance |
| o1-mini | 200K tokens | Quick tasks |
| o1-preview | 200K tokens | Testing features |
| o1 | 200K tokens | General tasks |
| o3-mini | 200K tokens | Suited for reasoning tasks |
### Groq

Here are the Groq models supported by Julep:

| Model Name | Context Window | Best For |
|------------|----------------|----------|
| llama-3.1-70b | 8K tokens | Long-form content |
| llama-3.1-8b | 8K tokens | Quick processing |
### OpenRouter

Here are the OpenRouter models supported by Julep:

| Model Name | Context Window | Best For |
|------------|----------------|----------|
| mistral-large-2411 | 128K tokens | High performance |
| qwen-2.5-72b-instruct | 131K tokens | Complex instructions |
| eva-llama-3.33-70b | 128K tokens | Story writing and creative fiction |
| l3.1-euryale-70b | 128K tokens | Poetry and artistic writing |
| l3.3-euryale-70b | 128K tokens | Advanced creative writing and roleplay |
| magnum-v4-72b | 8K tokens | Content generation and brainstorming |
| eva-qwen-2.5-72b | 8K tokens | Creative problem solving and ideation |
| hermes-3-llama-3.1-70b | 8K tokens | Narrative design and worldbuilding |
| deepseek-chat | 32K tokens | Conversational AI |
### Cerebras

Here are the Cerebras models supported by Julep:

| Model Name | Context Window | Best For |
|------------|----------------|----------|
| cerebras/llama-3.1-8b | 8K tokens | Quick creative writing and basic text generation |
| cerebras/llama-3.3-70b | 8K tokens | Complex creative writing, storytelling, and detailed content generation |
### Embedding

Here are the embedding models supported by Julep:

| Model Name | Embedding Dimensions | Best For |
|------------|----------------------|----------|
| text-embedding-3-large | 1024 | High-quality vectors |
| voyage-multilingual-2 | 1024 | Cross-language tasks |
| voyage-3 | 1024 | Advanced embeddings |
| Alibaba-NLP/gte-large-en-v1.5 | 1024 | Cost-effective solutions |
| BAAI/bge-m3 | 1024 | Cost-effective solutions |
| vertex_ai/text-embedding-004 | 1024 | Google Cloud integration |

Though the models mentioned above natively support different embedding dimensions, Julep currently uses a fixed dimension of 1024 for all embedding models. We plan to support other dimensions in the future.
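For instance, text-embedding-3-large natively produces 3072-dimensional vectors, which are truncated to 1024 to match Julep's fixed vector size. Julep does this internally; the LiteLLM-style sketch below (input text is illustrative) shows the idea:

```python
from litellm import embedding  # pip install litellm

# The `dimensions` parameter truncates text-embedding-3-large's native
# 3072-dim output down to 1024, matching Julep's fixed vector size.
response = embedding(
    model="text-embedding-3-large",
    input=["What models does Julep support?"],
    dimensions=1024,
)
print(len(response.data[0]["embedding"]))  # -> 1024
```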
## Supported Parameters
A number of parameters can be used to control the behavior of the models, including temperature, top_p, max_tokens, and the frequency and presence penalties.
Not all parameters are supported by every model. Please refer to the LiteLLM documentation for more details.
Best Practices:
- Start with default values and adjust based on your needs
- Use temperature values between 0.0 and 1.0 for most cases
- Avoid setting multiple penalty parameters simultaneously
- Test different combinations for optimal results
Setting extreme values for multiple parameters may lead to unexpected behavior or poor quality outputs.
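As a rough sketch of a conservative setup following these practices (using LiteLLM's OpenAI-compatible call shape; the model, prompt, and values are illustrative only):

```python
from litellm import completion

response = completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku about autumn."}],
    temperature=0.7,      # moderate creativity, within the 0.0-1.0 range
    max_tokens=256,       # cap response length to control cost
    frequency_penalty=0.2,  # only one penalty parameter set, per the note above
)
print(response.choices[0].message.content)
```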
## Usage Guidelines
### Consider Model Selection Criteria

1. Your budget and cost constraints
2. How fast you need responses
3. The quality you're aiming for
4. The context window size you require
### Follow Best Practices

1. Start with smaller models for development and testing (see the sketch below)
2. Use larger context windows only when necessary
3. Keep an eye on token usage to manage costs
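One hypothetical pattern for the first point: select the model from configuration, so development uses a small, cheap model and production opts into a larger one. The model names and environment variable here are illustrative:

```python
import os

# Illustrative only: map the deployment stage to a model from the tables above.
STAGE_MODELS = {
    "dev": "gpt-4o-mini",         # cheap and fast for iteration
    "prod": "claude-3.5-sonnet",  # higher quality for real traffic
}

model = STAGE_MODELS.get(os.getenv("JULEP_ENV", "dev"), "gpt-4o-mini")
```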
For more information, please refer to the LiteLLM documentation.