Sessions in Julep are the backbone of stateful interactions between users and agents. They maintain the context and history of conversations, enabling personalized and coherent interactions over extended periods. Whether the agent is handling an ongoing customer support inquiry or holding an open-ended conversation, the session ensures that it retains the information it needs to give meaningful responses.
When configuring a session, you can specify recall options to control how context or certain data is recalled during the session. Below are the available options based on search mode:
Vector search

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| mode | Literal["vector"] | The mode to use for the search (must be "vector") | "vector" |
| lang | str | The language for text search (other languages coming soon) | en-US |
| limit | int | The limit of documents to return (1-50) | 10 |
| max_query_length | int | The maximum query length (100-10000 characters) | 1000 |
| metadata_filter | object | Metadata filter to apply to the search | {} |
| num_search_messages | int | The number of search messages to use for the search (1-50) | 4 |
| confidence | float | The confidence cutoff level (-1 to 1) | 0.5 |
| mmr_strength | float | MMR strength (mmr_strength = 1 - mmr_lambda) (0 to 1) | 0.5 |
Text search

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| mode | Literal["text"] | The mode to use for the search (must be "text") | "text" |
| lang | str | The language for text search (other languages coming soon) | en-US |
| limit | int | The limit of documents to return (1-50) | 10 |
| max_query_length | int | The maximum query length (100-10000 characters) | 1000 |
| metadata_filter | object | Metadata filter to apply to the search | {} |
| num_search_messages | int | The number of search messages to use for the search (1-50) | 4 |
| trigram_similarity_threshold | float | The threshold for trigram similarity matching (must be between 0 and 1) | 0.6 |
Hybrid search

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| mode | Literal["hybrid"] | The mode to use for the search (must be "hybrid") | "hybrid" |
| lang | str | The language for text search (other languages coming soon) | english_unaccent |
| limit | int | The limit of documents to return (1-50) | 10 |
| max_query_length | int | The maximum query length (100-10000 characters) | 1000 |
| metadata_filter | object | Metadata filter to apply to the search | {} |
| num_search_messages | int | The number of search messages to use for the search (1-50) | 4 |
| alpha | float | Weight between text-based and vector-based search (0 = pure text, 1 = pure vector) | 0.5 |
| confidence | float | The confidence cutoff level (-1 to 1) | 0.5 |
| mmr_strength | float | MMR strength (mmr_strength = 1 - mmr_lambda) (0 to 1) | 0.5 |
| trigram_similarity_threshold | float | The threshold for trigram similarity matching (must be between 0 and 1) | 0.6 |
| k_multiplier | int | Controls how many intermediate results to fetch before final scoring | 7 |
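The tables define mmr_strength as 1 - mmr_lambda. To make that trade-off concrete, here is a minimal sketch of textbook Maximal Marginal Relevance re-ranking. It is an illustration only, not Julep's implementation: the function name, the toy relevance scores, and the toy similarity function are all hypothetical.

```python
from typing import Callable, Sequence


def mmr_rerank(
    relevance: Sequence[float],
    similarity: Callable[[int, int], float],
    mmr_strength: float = 0.5,
    limit: int = 10,
) -> list:
    """Return candidate indices ordered by textbook MMR.

    relevance[i] is the query-to-document score of candidate i, and
    similarity(i, j) is the document-to-document similarity.
    """
    if not 0.0 <= mmr_strength <= 1.0:
        raise ValueError("mmr_strength must be between 0 and 1")
    mmr_lambda = 1.0 - mmr_strength  # relation stated in the tables
    remaining = list(range(len(relevance)))
    selected = []
    while remaining and len(selected) < limit:

        def score(i: int) -> float:
            # Penalize candidates that resemble something already selected.
            redundancy = max((similarity(i, j) for j in selected), default=0.0)
            return mmr_lambda * relevance[i] - mmr_strength * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): candidates 0 and 1 are near-duplicates, 2 is different.
relevance = [0.9, 0.85, 0.6]


def toy_similarity(i: int, j: int) -> float:
    return 0.95 if {i, j} == {0, 1} else 0.1


relevance_only = mmr_rerank(relevance, toy_similarity, mmr_strength=0.0)
balanced = mmr_rerank(relevance, toy_similarity, mmr_strength=0.5)
print(relevance_only)  # [0, 1, 2]: ordered by relevance alone
print(balanced)        # [0, 2, 1]: the near-duplicate is pushed down
```

With mmr_strength at 0 the ranking depends on relevance alone. Raising it trades some relevance for diversity among the returned documents.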
When recall_options is not explicitly set (for instance, it is None), vector search mode is used with default parameters.
The default parameters for each search mode are based on our internal benchmarking. These values provide a good starting point, but you may need to adjust them depending on your specific use case to achieve optimal results.
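When tuning these values, it can help to start from the documented defaults and check overrides against the documented ranges before sending them. The helper below is a hypothetical client-side sketch, not part of the Julep SDK. Its defaults and ranges are transcribed from the tables above, and it leaves lang unset because the tables list different defaults per mode. This document does not say whether the API rejects or ignores options that belong to another mode, so the helper rejects them to catch mistakes early.

```python
from typing import Any

# Defaults and ranges transcribed from the tables above.
COMMON_DEFAULTS = {"limit": 10, "max_query_length": 1000, "num_search_messages": 4}
MODE_DEFAULTS = {
    "vector": {"confidence": 0.5, "mmr_strength": 0.5},
    "text": {"trigram_similarity_threshold": 0.6},
    "hybrid": {
        "alpha": 0.5,
        "confidence": 0.5,
        "mmr_strength": 0.5,
        "trigram_similarity_threshold": 0.6,
        "k_multiplier": 7,
    },
}
RANGES = {
    "limit": (1, 50),
    "max_query_length": (100, 10000),
    "num_search_messages": (1, 50),
    "confidence": (-1, 1),
    "mmr_strength": (0, 1),
    "alpha": (0, 1),
    "trigram_similarity_threshold": (0, 1),
}


def build_recall_options(mode: str = "vector", **overrides: Any) -> dict:
    """Merge overrides into the documented defaults for one search mode."""
    if mode not in MODE_DEFAULTS:
        raise ValueError(f"unknown search mode: {mode!r}")
    options = {"mode": mode, **COMMON_DEFAULTS, "metadata_filter": {}, **MODE_DEFAULTS[mode]}
    allowed = set(options) | {"lang"}
    for name, value in overrides.items():
        if name not in allowed:
            raise ValueError(f"{name!r} is not listed for {mode} mode")
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        options[name] = value
    return options


hybrid_options = build_recall_options("hybrid", alpha=0.7, limit=5)
print(hybrid_options)
```

The resulting dictionary could then be passed as recall_options when creating a session.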
Hybrid search combines three approaches:
1. Traditional full-text search using PostgreSQL's tsquery/ts_rank for keyword matching
2. Vector-based semantic search using embeddings for contextual understanding
3. Trigram fuzzy matching for handling typos, spelling variations, and morphological differences
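In hybrid mode, the alpha parameter sets the weight between text-based and vector-based results (0 = pure text, 1 = pure vector). One common way to realize such a weight is a linear blend of the two scores. The sketch below assumes that form and assumes both scores are already normalized to the same range. The exact fusion formula Julep uses is not described here, and the function name and scores are hypothetical.

```python
def blend_scores(text_score: float, vector_score: float, alpha: float = 0.5) -> float:
    """Linear blend: alpha=0 keeps only the text score, alpha=1 only the vector score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * vector_score + (1.0 - alpha) * text_score


# Hypothetical (text_score, vector_score) pairs for two documents.
scores = {"doc_a": (0.9, 0.2), "doc_b": (0.3, 0.8)}

for alpha in (0.0, 0.5, 1.0):
    ranked = sorted(scores, key=lambda doc: blend_scores(*scores[doc], alpha), reverse=True)
    print(alpha, ranked)
```

In this toy data, doc_a wins on keyword match and doc_b wins on semantic similarity, so the ranking flips as alpha moves from 0 toward 1.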
The trigram search capability uses PostgreSQL’s pg_trgm extension enhanced with Levenshtein distance calculations to provide resilient document retrieval even when search terms contain variations or errors. This is especially useful for natural language queries that may contain typos or alternative word forms.
You can control the fuzzy matching behavior using the trigram_similarity_threshold parameter - higher values (e.g., 0.8) require closer matches while lower values (e.g., 0.3) are more lenient. For more details on the advanced search capabilities, see the Documents (RAG) section.
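To get a feel for what a given threshold accepts, the following sketch computes trigram similarity in the style of pg_trgm: lowercase the text, split it into alphanumeric words, pad each word with two leading spaces and one trailing space, and compare the resulting trigram sets. It is a simplified approximation for illustration. It leaves out the Levenshtein enhancement mentioned above, and the function names are hypothetical.

```python
import re


def trigrams(text: str) -> set:
    """Trigram set built the way pg_trgm pads and splits words."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0 to 1)."""
    grams_a, grams_b = trigrams(a), trigrams(b)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def fuzzy_match(query: str, candidate: str, threshold: float = 0.6) -> bool:
    """Assumption: a candidate matches when its similarity reaches the threshold."""
    return trigram_similarity(query, candidate) >= threshold


for query, candidate in [("session", "sessions"), ("julep", "julpe")]:
    score = trigram_similarity(query, candidate)
    verdicts = {t: fuzzy_match(query, candidate, t) for t in (0.3, 0.6, 0.8)}
    print(f"{query!r} vs {candidate!r}: {score:.2f} {verdicts}")
```

Under this sketch, "session" versus "sessions" scores 0.70, so it passes at 0.3 and 0.6 but not at 0.8. The transposition typo "julpe" scores about 0.33 against "julep" and passes only at the lenient 0.3 threshold.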
The System Template is a specific system prompt written as a Jinja template that sets the foundational context and instructions for the agent within a session. It defines the background, directives, and any relevant information that the agent should consider when interacting with the user.
For a comprehensive guide on system templates including available variables, customization options, and advanced usage patterns, see the System Templates documentation. For more details on Jinja templates, refer to the Jinja documentation.
Default System Template
    {%- if agent.name -%}
    You are {{agent.name}}.{{" "}}
    {%- endif -%}
    {%- if agent.about -%}
    About you: {{agent.about}}.{{" "}}
    {%- endif -%}
    {%- if user -%}
    You are talking to a user
      {%- if user.name -%}{{" "}} and their name is {{user.name}}
        {%- if user.about -%}. About the user: {{user.about}}.{%- else -%}.{%- endif -%}
      {%- endif -%}
    {%- endif -%}
    {{NEWLINE}}
    {%- if session.situation -%}
    Situation: {{session.situation}}
    {%- endif -%}
    {{NEWLINE+NEWLINE}}
    {%- if agent.instructions -%}
    Instructions:{{NEWLINE}}
      {%- if agent.instructions is string -%}
        {{agent.instructions}}{{NEWLINE}}
      {%- else -%}
        {%- for instruction in agent.instructions -%}
          - {{instruction}}{{NEWLINE}}
        {%- endfor -%}
      {%- endif -%}
      {{NEWLINE}}
    {%- endif -%}
    {%- if docs -%}
    Relevant documents:{{NEWLINE}}
      {%- for doc in docs -%}
        {{doc.title}}{{NEWLINE}}
        {%- if doc.content is string -%}
          {{doc.content}}{{NEWLINE}}
        {%- else -%}
          {%- for snippet in doc.content -%}
            {{snippet}}{{NEWLINE}}
          {%- endfor -%}
        {%- endif -%}
        ---
      {%- endfor -%}
    {%- endif -%}
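To see roughly what prompt this template produces, the sketch below hand-translates the agent, user, situation, and instructions branches into plain Python (the documents branch is omitted for brevity). It is an approximation for reading purposes, not the renderer Julep uses. It assumes NEWLINE is a single line feed and that the whitespace-control markers strip all whitespace around the block tags. The function name and the dictionary inputs are hypothetical.

```python
from typing import Optional

NEWLINE = "\n"  # assumption: the template's NEWLINE variable is a line feed


def render_default_prompt(agent: dict, user: Optional[dict] = None, situation: str = "") -> str:
    """Hand-translated approximation of the default system template."""
    parts = []
    if agent.get("name"):
        parts.append(f"You are {agent['name']}. ")
    if agent.get("about"):
        parts.append(f"About you: {agent['about']}. ")
    if user:
        parts.append("You are talking to a user")
        if user.get("name"):
            # The template emits {{" "}} and then " and their name is",
            # which yields two consecutive spaces.
            parts.append(f"  and their name is {user['name']}")
            parts.append(f". About the user: {user['about']}." if user.get("about") else ".")
    parts.append(NEWLINE)
    if situation:
        parts.append(f"Situation: {situation}")
    parts.append(NEWLINE + NEWLINE)
    instructions = agent.get("instructions")
    if instructions:
        parts.append("Instructions:" + NEWLINE)
        if isinstance(instructions, str):
            parts.append(instructions + NEWLINE)
        else:
            parts.extend(f"- {item}{NEWLINE}" for item in instructions)
        parts.append(NEWLINE)
    return "".join(parts)


prompt = render_default_prompt(
    agent={
        "name": "David",
        "about": "A news reporter",
        "instructions": ["Keep your responses concise and to the point."],
    },
    user={"name": "John Doe"},
    situation="The user is interested in the latest news about the stock market.",
)
print(prompt)
```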
Sessions are integral to maintaining a continuous and coherent interaction between users and agents. Here’s how to create and manage sessions using Julep’s SDKs.
Agents operate within sessions to provide personalized and context-aware interactions. While an agent defines the behavior and capabilities, a session maintains the state and context of interactions between the agent and the user. In other words, the history of a conversation is tied to a session, rather than an agent.
Example:
    agent = client.agents.create(
        name="David",
        about="A news reporter",
        model="gpt-4o-mini",
        instructions=[
            "Keep your responses concise and to the point.",
            "If you don't know the answer, say 'I don't know'",
        ],
        metadata={"channel": "FOX News"},
    )

    session1 = client.sessions.create(
        agent=agent.id,
        user="user_id",
        situation="The user is interested in the latest news about the stock market.",
    )

    session2 = client.sessions.create(
        agent=agent.id,
        user="user_id",
        situation="The user is interested in political news in the United States.",
    )
In this example, the agent David is used in two different sessions, each with a different situation. The agent's behavior and responses are tailored to the situation of each session, and the message histories of session1 and session2 are kept separate.
When one or more users are added to a session, the session can access information about each user, such as name and about, to personalize the interaction. See the default system template for how these fields are used.
This is how you can create a user and associate it with a session:
    from julep import Julep

    client = Julep(api_key="YOUR_API_KEY")

    user = client.users.create(
        name="John Doe",
        about="A 21 year old man who is a student at MIT.",
    )

    agent = client.agents.create(
        name="Mark Lee",
        about="A 49 year old man who is a retired software engineer.",
    )

    # Keyword names follow the other examples on this page (agent=, user=)
    session = client.sessions.create(agent=agent.id, user=user.id)
In this example, the user John Doe is associated with the agent Mark Lee in the session. The session will use the user’s information to personalize the interaction, such as using the user’s name in the system prompt.
Sessions can use tools. The auto_run_tools option, available in chat calls, controls what happens when the agent has a tool and the LLM decides to call it. When auto_run_tools is true, the tool is executed automatically and its result is sent back to the LLM for further processing. When auto_run_tools is false (the default), the tool call is returned in the response without being executed.
Example:
If the agent that’s associated with the session has a tool called fetch_weather, and the LLM decides to use it:
With auto_run_tools=true: The tool executes automatically and returns weather data to the LLM
With auto_run_tools=false: The tool call is returned in the response for manual execution
    # With automatic tool execution
    response = client.sessions.chat(
        session_id="session_id",
        messages=[
            {
                "role": "user",
                "content": "What is the weather in San Francisco?",
            }
        ],
        auto_run_tools=True,  # Tools execute automatically
        recall_tools=True,  # Tool calls/results included in message history
    )

    print("Agent's response:", response.choices[0].message.content)
    # Agent's response: The weather in San Francisco is 70 degrees and it's sunny. Humidity is 50%, and the wind speed is around 10 mph.

    # Without automatic tool execution (default)
    response = client.sessions.chat(
        session_id="session_id",
        messages=[
            {
                "role": "user",
                "content": "What is the weather in San Francisco?",
            }
        ],
        auto_run_tools=False,  # Default - returns tool calls without execution
    )

    # Check if tool calls were made
    if response.choices[0].message.tool_calls:
        print("Tool calls:", response.choices[0].message.tool_calls)
        # You need to execute these tools manually and send results back
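When auto_run_tools is false, the caller is responsible for running each returned tool call. The sketch below shows one way to dispatch such calls to local Python functions. It assumes each tool call is an OpenAI-style dictionary with an id and a function entry holding a name and JSON-encoded arguments, which this page does not confirm. The fetch_weather body, the registry, and the shape of the result messages are hypothetical, and sending the results back to the session is left out.

```python
import json
from typing import Any, Callable, Dict, List


def fetch_weather(location: str) -> Dict[str, Any]:
    """Hypothetical local implementation of the fetch_weather tool."""
    return {"location": location, "temperature_f": 70, "conditions": "sunny"}


# Maps tool names to the local functions that implement them.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {"fetch_weather": fetch_weather}


def run_tool_calls(tool_calls: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Execute each tool call locally and collect one result entry per call."""
    results = []
    for call in tool_calls:
        name = call["function"]["name"]
        handler = TOOL_REGISTRY.get(name)
        if handler is None:
            content = json.dumps({"error": f"unknown tool: {name}"})
        else:
            arguments = json.loads(call["function"]["arguments"] or "{}")
            content = json.dumps(handler(**arguments))
        results.append({"tool_call_id": call["id"], "name": name, "content": content})
    return results


# Example input in the assumed shape.
example_calls = [
    {
        "id": "call_1",
        "function": {
            "name": "fetch_weather",
            "arguments": json.dumps({"location": "San Francisco"}),
        },
    }
]
tool_results = run_tool_calls(example_calls)
print(tool_results)
```

Keeping the registry explicit means the LLM can only trigger functions that were deliberately exposed, and unknown tool names produce an error entry instead of an exception.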
When chatting in a session, the session can automatically search for documents that are associated with any of the agents and/or users that participate in the session. You can control whether the session should search for documents when chatting using the recall option of the chat method, which is set to True by default. You can also set the session’s recall_options when creating the session to control how the session should search for documents.
    # Create a session with custom recall options
    session = client.sessions.create(
        agent=agent.id,
        user=user.id,
        recall=True,
        recall_options={
            "mode": "vector",  # or "hybrid", "text"
            "num_search_messages": 4,  # number of messages to search for documents
            "max_query_length": 1000,  # maximum query length
            "alpha": 0.7,  # text vs. vector weight (0 to 1); listed for hybrid mode only
            "confidence": 0.6,  # confidence cutoff level (ranges from -1 to 1)
            "limit": 10,  # limit of documents to return
            "lang": "en-US",  # language to be used for text-only search
            "metadata_filter": {},  # metadata filter to apply to the search
            "mmr_strength": 0,  # MMR strength (ranges from 0 to 1)
        },
    )

    # Chat in the session
    response = client.sessions.chat(
        session_id=session.id,
        messages=[
            {
                "role": "user",
                "content": "Tell me about Julep",
            }
        ],
        recall=True,
    )

    print("Agent's response:", response.choices[0].message.content)
    print("Searched Documents:", response.docs)
When running the above code with an agent that has documents about Julep, the session will search for documents that are relevant to the conversation and return them in the response.docs field.
1. Reuse Sessions: Reuse existing sessions for returning users to maintain continuity in interactions.
2. Session Cleanup: Regularly clean up inactive sessions to manage resources efficiently.
3. Context Overflow Strategy: Choose an appropriate context overflow strategy (e.g., “adaptive”) to handle long conversations without losing important information.
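As a rough illustration of session reuse, the sketch below keeps one session ID per user and only creates a new session the first time a user appears. The SessionRegistry class is hypothetical and stores IDs in memory. The factory used here is a stand-in so the example runs without an API key.

```python
from typing import Callable, Dict, List


class SessionRegistry:
    """Remember one session ID per user so returning users keep their history."""

    def __init__(self, create_session: Callable[[str], str]) -> None:
        self._create_session = create_session
        self._session_ids: Dict[str, str] = {}

    def get_or_create(self, user_id: str) -> str:
        # Create a session only the first time this user is seen.
        if user_id not in self._session_ids:
            self._session_ids[user_id] = self._create_session(user_id)
        return self._session_ids[user_id]

    def forget(self, user_id: str) -> None:
        # Drop the mapping, for example after an inactive session is cleaned up.
        self._session_ids.pop(user_id, None)


created: List[str] = []


def fake_create_session(user_id: str) -> str:
    """Stand-in factory that records each creation."""
    created.append(user_id)
    return f"session-for-{user_id}"


registry = SessionRegistry(fake_create_session)
first = registry.get_or_create("user-1")
second = registry.get_or_create("user-1")
print(first == second, created)
```

In real use, the factory could wrap a call such as client.sessions.create(agent=agent.id, user=user_id) and return the new session's id, and the mapping would normally live in a persistent store.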
Personalization
1. Leverage Metadata: Use session metadata to store and retrieve user preferences, enhancing personalized interactions.
2. Maintain Context: Ensure that the context within sessions is updated and relevant to provide coherent and context-aware responses.
Performance Optimization
1. Efficient Searches: Optimize search queries within sessions to retrieve relevant documents quickly.
2. Manage Token Usage: Monitor and manage token usage to ensure efficient use of resources, especially in long sessions.