# Concepts of Julep Open Responses API

The API generates a unique ID (`uuid7`) for each response.

## Request Options

Option | Type | Description | Default | Status
---|---|---|---|---
model | string | The language model to use (e.g., “claude-3.5-haiku”, “gpt-4o”). Check out the supported models for more information. | Required | Implemented |
input | string \| array | The prompt or structured input to send to the model | Required | Implemented
include | array \| null | Types of content to include in the response (e.g., “file_search_call.results”) | None | Partially Implemented
parallel_tool_calls | boolean | Whether to allow tools to be called in parallel | true | Implemented
store | boolean | Whether to store the response for later retrieval | true | Implemented
stream | boolean | Whether to stream the response as it’s generated | false | Planned
max_tokens | integer \| null | Maximum number of tokens to generate | None | Implemented
temperature | number | Controls randomness in response generation (0 to 1) | 1 | Implemented
top_p | number | Controls diversity in token selection (0 to 1) | 1 | Implemented
n | integer \| null | Number of responses to generate | None | Implemented
stop | string \| array \| null | Sequence(s) where the model should stop generating | None | Implemented
presence_penalty | number \| null | Penalty for new tokens based on presence in text so far | None | Implemented
frequency_penalty | number \| null | Penalty for new tokens based on frequency in text so far | None | Implemented
logit_bias | object \| null | Modify likelihood of specific tokens appearing | None | Implemented
user | string \| null | Unique identifier for the end-user | None | Implemented
instructions | string \| null | Additional instructions to guide the model’s response | None | Implemented
previous_response_id | string \| null | ID of a previous response for context continuity | None | Implemented
reasoning | object \| null | Controls reasoning effort (low/medium/high) | None | Implemented
text | object \| null | Configures text format (text or JSON object) | None | Implemented
tool_choice | "auto" \| "none" \| object \| null | Controls how the model chooses which tools to use | None | Implemented
tools | array \| null | List of tools the model can use for generating the response | None | Partially Implemented
truncation | "disabled" \| "auto" \| null | How to handle context overflow | None | Planned
metadata | object \| null | Additional metadata for the response | None | Implemented
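As a rough illustration of the options above, a request payload might be assembled like this. This is a sketch, not an official SDK call: the model name, input text, and metadata values are placeholders, and the field names simply mirror the options table.

```python
# Hypothetical request payload for the Open Responses API.
# Field names mirror the options table above; the model name and
# values are placeholders, not guaranteed defaults.
payload = {
    "model": "claude-3.5-haiku",  # required: which language model to use
    "input": "Summarize the plot of Hamlet in two sentences.",  # required
    "temperature": 0.7,           # 0 to 1, controls randomness
    "max_tokens": 200,            # cap on generated tokens
    "store": True,                # keep the response for later retrieval
    "stream": False,              # streaming is still planned, so leave it off
    "metadata": {"source": "docs-example"},
}

# Only `model` and `input` are required; everything else falls back
# to the defaults listed in the table.
required = {"model", "input"}
assert required <= payload.keys()
```

Omitted options take the defaults from the table, so a minimal request is just `{"model": ..., "input": ...}`.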

## Responses vs. Sessions

Feature | Sessions | Responses
---|---|---
State Management | Maintains conversation history | Stateless (with optional context from previous responses) |
Persistence | Long-lived, for ongoing conversations | Short-lived, for one-off interactions |
Agent Integration | Requires an agent | No agent needed |
Setup Complexity | Requires agent and session creation | Minimal setup (just model and input) |
Use Case | Multi-turn conversations, complex interactions | Quick content generation, processing, or reasoning |
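Although Responses are stateless, context can still carry across calls by passing the id of an earlier response as `previous_response_id`. A minimal sketch of chaining two requests (the response id shown is a fabricated, uuid7-style placeholder):

```python
# Illustrative only: chaining two stateless requests by passing the
# first response's id as previous_response_id in the second request.
first_request = {"model": "gpt-4o", "input": "Name a prime number."}

# Suppose the server returned this id for the first response:
first_response_id = "resp_0190c5a2-example"  # placeholder uuid7-style id

follow_up = {
    "model": "gpt-4o",
    "input": "Now double it.",
    "previous_response_id": first_response_id,  # links the two turns
}

assert follow_up["previous_response_id"] == first_response_id
```

The server resolves the linked response's context, so no agent or session object is needed for the follow-up turn.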
To carry context across calls, use the `previous_response_id` parameter to link responses together.

## Response Object Fields

Field | Type | Description
---|---|---
id | string | Unique identifier for the response |
object | string | Always “response” |
created_at | integer | Unix timestamp when the response was created |
status | string | Current status: “completed”, “failed”, “in_progress”, or “incomplete” |
error | object or null | Error information if the response failed |
incomplete_details | object or null | Details about why a response is incomplete |
instructions | string or null | Optional instructions provided to the model |
max_output_tokens | integer or null | Maximum number of tokens to generate |
model | string | The model used to generate the response |
output | array | List of output items (messages, tool calls, reasoning) |
parallel_tool_calls | boolean | Whether tools can be called in parallel |
previous_response_id | string or null | ID of a previous response for context |
reasoning | object or null | Reasoning steps if reasoning was requested |
store | boolean | Whether the response is stored for later retrieval |
temperature | number | Sampling temperature used (0-1) |
text | object or null | Text formatting options |
tool_choice | string or object | How tools are selected (“auto”, “none”, “required”) |
tools | array | List of tools available to the model |
top_p | number | Top-p sampling parameter (0-1) |
truncation | string | Truncation strategy (“disabled” or “auto”) |
usage | object | Token usage statistics |
user | string or null | Optional user identifier |
metadata | object | Custom metadata associated with the response |
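A response object shaped after the field table might look like the following. The values here are fabricated for illustration, and the `output` item shapes assume the OpenAI-compatible Responses convention:

```python
# A minimal response object following the field table above
# (all values are fabricated for illustration).
response = {
    "id": "resp_0190c5a2-example",   # unique identifier (uuid7-style placeholder)
    "object": "response",            # always "response"
    "created_at": 1735689600,        # Unix timestamp
    "status": "completed",           # or "failed", "in_progress", "incomplete"
    "error": None,                   # populated only on failure
    "model": "gpt-4o",
    "output": [
        {"type": "message", "role": "assistant",
         "content": [{"type": "output_text", "text": "Hello!"}]},
    ],
    "usage": {"input_tokens": 5, "output_tokens": 2, "total_tokens": 7},
}

assert response["status"] in {"completed", "failed", "in_progress", "incomplete"}
```

Checking `status` before reading `output` is a sensible pattern, since `error` and `incomplete_details` are only meaningful for failed or incomplete responses.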
The `output` array contains the actual content generated by the model, which can include text messages, tool calls (function, web search, file search, computer use), and reasoning items.
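Because `output` mixes item types, client code typically filters it by `type`. A small helper for pulling out the text fragments, assuming the OpenAI-compatible item shapes (`message` items with `output_text` content parts), might look like this:

```python
# Collect all text fragments from a response's output array.
# The item shapes follow the OpenAI-compatible Responses convention;
# treat them as an assumption, not a guarantee.
def extract_text(output):
    texts = []
    for item in output:
        if item.get("type") == "message":          # skip tool calls, reasoning, etc.
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    texts.append(part["text"])
    return texts

# Example: a reasoning item followed by a message item.
output = [
    {"type": "reasoning", "summary": []},
    {"type": "message", "role": "assistant",
     "content": [{"type": "output_text", "text": "42"}]},
]
assert extract_text(output) == ["42"]
```

The same filtering pattern extends to tool-call items: match on their `type` and dispatch accordingly.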