Introduction

Julep’s Open Responses is a self-hosted, open-source implementation of OpenAI’s Responses API that works with any LLM backend. It provides a lightweight interface for generating content with Large Language Models (LLMs) without needing to create persistent agents or sessions.

To try it out, just run npx -y open-responses init (or uvx open-responses init) and that’s it! :)

What is Open Responses?

Julep’s Open Responses lets you run your own server that is compatible with OpenAI’s Responses API, while giving you the freedom to use alternative models like:

  • Anthropic’s Claude
  • Alibaba’s Qwen
  • Deepseek R1
  • and many others …

It’s essentially a drop-in replacement that you control, with a permissive Apache-2.0 license. As an early release, we welcome your feedback and contributions to help improve it.

Open Responses API Overview

Why Open Responses?

  • Model Flexibility: Use any LLM backend without vendor lock-in, including local model deployments
  • Self-hosted & Private: Maintain full control over your deployment on your own infrastructure (cloud or on-premise)
  • Drop-in Compatibility: Seamlessly integrates with the official Agents SDK by simply pointing it at your self-hosted URL (see the sketch after this list)
  • Easy Deployment: Quick setup via docker-compose or our CLI with minimal configuration
  • Built-in Tools: Automatic execution of tool calls (like web_search) using open & pluggable alternatives

A few things to note:

  • The Open Responses API requires self-hosting. See the installation guide below.
  • Being in alpha, the API is subject to change. Check back frequently for updates.
  • For more context, see the OpenAI Responses API documentation.
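
As a minimal sketch of the drop-in compatibility point above: the official OpenAI Agents SDK (the openai-agents Python package) can be pointed at a self-hosted server by overriding its default client. The URL and RESPONSE_API_KEY value below are assumptions taken from the installation steps later on this page, not fixed defaults:

import os

from agents import Agent, Runner, set_default_openai_client
from openai import AsyncOpenAI

# Assumption: Open Responses is listening on localhost:8080 and
# RESPONSE_API_KEY matches the key in your .env file.
client = AsyncOpenAI(
    base_url="http://localhost:8080/",
    api_key=os.getenv("RESPONSE_API_KEY"),
)

# use_for_tracing=False keeps the SDK from exporting traces with this key
set_default_openai_client(client, use_for_tracing=False)

agent = Agent(name="Assistant", instructions="You are a helpful assistant.")
result = Runner.run_sync(agent, "Write a haiku about self-hosting.")
print(result.final_output)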

Local Installation

This section will guide you through setting up Julep’s Open Responses API.

Prerequisites

Install Docker (including the Docker Compose plugin)

Installation

Julep’s Open Responses API has a fully microservice-based architecture. It is fully dockerized and can be deployed on any infrastructure that supports Docker. There are two ways to install the API:

Docker Installation

1. Create a directory for the project

mkdir julep-responses-api

2. Navigate to the project directory

cd julep-responses-api

3. Download and edit the environment variables

wget https://u.julep.ai/responses-env.example -O .env

Edit the .env file with your own values.
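
The exact variables come from the example file you just downloaded; as a rough sketch (names other than RESPONSE_API_KEY are illustrative assumptions, not guaranteed to match the example file):

# Key that clients will use to authenticate against your server
RESPONSE_API_KEY=your-secret-key

# Keys for the model providers you want to route to (illustrative)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...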

4. Download the Docker Compose file

wget https://u.julep.ai/responses-compose.yaml -O docker-compose.yml

This saves the Compose file as docker-compose.yml in the current directory; it defines the containers that Docker Compose will run.

5. Run the Docker containers

docker compose up --watch

This starts the containers in watch mode, so Compose automatically syncs or rebuilds services when watched files change.

6. Verify that the containers are running

docker ps

The Open Responses containers should appear in the list with a status of Up.

CLI Installation

The CLI is a lightweight alternative for those who prefer not to work with Docker Compose directly. Internally, it still uses Docker to run the containers.

1. Install the CLI

You can run the CLI directly with npx or install it globally with npm:

# Using npx directly
npx open-responses

# Or install globally
npm install -g open-responses

2. Set up the environment variables

npx open-responses setup

You must run the setup command before using any other commands.

3. Run the CLI

npx open-responses start

This will start the API in watch mode.

To learn more about the CLI, check out the CLI Documentation.

Quickstart Example

With Open Responses running, you can use the official OpenAI SDK, pointed at your self-hosted server, to generate content.

API Key Configuration

  • RESPONSE_API_KEY is the API key that you set in the .env file.

Model Selection

  • When using models from providers other than OpenAI, you may need to add the provider/ prefix to the model name (for example, a LiteLLM-style identifier such as anthropic/claude-3-5-sonnet-latest).
  • For supported providers, see the LiteLLM Providers documentation.

Environment Setup

  • Add the relevant provider keys to the .env file to use their respective models.

1. Install the OpenAI SDK

pip install openai

2. Initialize the OpenAI client

import os
from openai import OpenAI

# RESPONSE_API_KEY comes from your .env file
openai_client = OpenAI(base_url="http://localhost:8080/", api_key=os.getenv("RESPONSE_API_KEY"))

3. Generate a response

import os
from openai import OpenAI

# Client pointed at the self-hosted Open Responses server
openai_client = OpenAI(base_url="http://localhost:8080/", api_key=os.getenv("RESPONSE_API_KEY"))

# Create a response; the server routes the request to the selected model
response = openai_client.responses.create(
    model="gpt-4o-mini",
    input="How many people live in the world?"
)

# The generated text lives in the first content part of the first output item
print("Generated response:", response.output[0].content[0].text)
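
Because requests are routed through LiteLLM, switching providers is just a matter of changing the model name, as described under Model Selection above. A sketch, assuming you have added an Anthropic key to your .env file; the model identifier below is illustrative, so substitute one your deployment actually supports:

# Same client as above; only the model name changes.
response = openai_client.responses.create(
    model="anthropic/claude-3-5-sonnet-latest",
    input="Summarize the benefits of self-hosting an LLM gateway."
)
print("Generated response:", response.output[0].content[0].text)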

Next Steps

You’ve got Open Responses running. Here’s what to explore next: