input_schema: type: object properties: goal: type: string agent_id: type: string description: The id of the agent to use for the browser automation required: - goal - agent_id
This schema specifies that our task expects a goal string describing what the browser automation should accomplish.
- tool: create_julep_session arguments: agent: $ str(agent.id) situation: "The environment is a browser" recall: 'False'
This step initializes a new Julep session for the AI agent. The session serves as a container for the conversation history and enables the agent to maintain context throughout the interaction.
2
Store Session ID
- evaluate: julep_session_id: $ _.id
After creating the session, we store its unique identifier for future reference.
This step navigates to Google’s homepage to avoid sending a blank screenshot when computer use starts.
8
Start Browser Workflow
- workflow: run_browser arguments: julep_session_id: $ steps[1].output.julep_session_id cdp_url: $ steps[3].output.connect_url messages: - role: "user" content: |- $ f""" <SYSTEM_CAPABILITY> * You are utilising a headless chrome browser to interact with the internet. * You can use the computer tool to interact with the browser. * You have access to only the browser. * You are already inside the browser. * You can't open new tabs or windows. * For now, rely on screenshots as the only way to see the browser. * You can't don't have access to the browser's UI. * YOU CANNOT WRITE TO THE SEARCH BAR OF THE BROWSER. </SYSTEM_CAPABILITY> <GOAL> * + {steps[0].input.goal} + NEWLINE + </GOAL> """
Finally, we initiate the interactive browser workflow with system capabilities and user goal.
Checks if there are any messages to process (len(_.messages) > 0)
If messages exist, recursively calls the run_browser workflow
Passes along the current session context and connection details
Maintains the conversation flow until the goal is achieved
Automatically terminates when no more messages need processing
This recursive pattern ensures that the browser automation continues until either:
The goal is successfully achieved
No more actions are needed
An error occurs that prevents further progress
YAML
# yaml-language-server: $schema=https://raw.githubusercontent.com/julep-ai/julep/refs/heads/dev/schemas/create_task_request.jsonname: Julep Browser Use Taskdescription: A Julep agent that can use the computer tool to interact with the browser.########################################################################### INPUT SCHEMA ###############################################################################input_schema: type: object properties: goal: type: string required: - goal############################################################################### TOOLS ##################################################################################tools:- name: create_browserbase_session type: integration integration: provider: browserbase method: create_session setup: api_key: "YOUR_BROWSERBASE_API_KEY" project_id: "YOUR_BROWSERBASE_PROJECT_ID"- name: get_session_view_urls type: integration integration: provider: browserbase method: get_live_urls setup: api_key: "YOUR_BROWSERBASE_API_KEY" project_id: "YOUR_BROWSERBASE_PROJECT_ID"- name: perform_browser_action type: integration integration: provider: remote_browser method: perform_action setup: width: 1024 height: 768- name: create_julep_session type: system system: resource: session operation: create- name: session_chat type: system system: resource: session operation: chat########################################################################### MAIN WORKFLOW ##############################################################################main:# Step #0 - Create Julep Session- tool: create_julep_session arguments: agent: $ str(agent.id) situation: "Juelp Browser Use Agent" recall: 'False'# Step #1 - Store Julep Session ID- evaluate: julep_session_id: $ _.id# Step #2 - Create Browserbase Session- tool: create_browserbase_session arguments: project_id: "c35ee022-883e-4070-9f3c-89607393214b"# Step #3 - Store Browserbase Session Info- evaluate: browser_session_id: $ _.id connect_url: $ _.connect_url# Step #4 - Get Session View URLs- tool: get_session_view_urls arguments: id: $ _.browser_session_id# Step #5 - Store Debugger URL- evaluate: debugger_url: $ _.urls.debuggerUrl# Step #6 - Navigate to Google# Navigate to google to avoid sending a blank # screenshot when computer use starts- tool: perform_browser_action arguments: connect_url: $ steps[3].output.connect_url action: "navigate" text: "https://www.google.com"# Step #7 - Run Browser Workflow- workflow: run_browser arguments: julep_session_id: $ steps[1].output.julep_session_id cdp_url: $ steps[3].output.connect_url messages: - role: "user" content: |- $ f""" <SYSTEM_CAPABILITY> * You are utilising a headless chrome browser to interact with the internet. * You can use the computer tool to interact with the browser. * You have access to only the browser. * You are already inside the browser. * You can't open new tabs or windows. * For now, rely on screenshots as the only way to see the browser. * You don't have access to the browser's UI. * YOU CANNOT WRITE TO THE SEARCH BAR OF THE BROWSER. </SYSTEM_CAPABILITY> <GOAL> * + {steps[0].input.goal} + NEWLINE + </GOAL> """######################################################################### RUN BROWSER SUBWORKFLOW #########################################################################run_browser:# Step #0 - Agent Interaction- tool: session_chat arguments: session_id: $ _.julep_session_id messages: $ _.messages recall: $ False# Step #1 - Evaluate the response from the agent- evaluate: content: $ _.choices[0].message.content tool_calls: |- $ [ { 'tool_call_id': tool_call.id, 'action': load_json(tool_call.function.arguments)['action'], 'text': load_json(tool_call.function.arguments).get('text'), 'coordinate': load_json(tool_call.function.arguments).get('coordinate') } for tool_call in _.choices[0].message.tool_calls or [] if tool_call.type == 'function']# Step #2 - Perform the actions requested by the agent- foreach: in: $ _.tool_calls do: tool: perform_browser_action arguments: connect_url: $ steps[0].input.cdp_url action: $ _.action if not (str(_.get('text', '')).startswith('http') and _.action == 'type') else 'navigate' text: $ _.get('text') coordinate: $ _.get('coordinate')# Step #3 - Convert the result of the actions into a chat message- evaluate: contents: >- $ [ \ { \ 'type': 'image_url', \ 'image_url': { \ 'url': result['base64_image'], \ } \ } if result['base64_image'] is not None else \ { \ 'type': 'text', \ 'text': result['output'] if result['output'] is not None else 'done' \ } \ for result in _]# Step #4 - Convert the result of the actions into a chat message- evaluate: messages: "$ [{'content': [_.contents[i]], 'role': 'tool', 'name': 'computer', 'tool_call_id': steps[1].output.tool_calls[i].tool_call_id} for i in range(len(_.contents))]"# Step #5 - Check if the goal is achieved and recursively run the browser- workflow: check_goal_status arguments: messages: $ _.messages julep_session_id: $ steps[0].input.julep_session_id cdp_url: $ steps[0].input.cdp_url###################################################################### CHECK GOAL STATUS SUBWORKFLOW ######################################################################check_goal_status:# Step #0 - Check if the goal is achieved and recursively run the browser- if: $ len(_.messages) > 0 then: workflow: run_browser arguments: messages: $ _.messages julep_session_id: $ _.julep_session_id cdp_url: $ _.cdp_url
from julep import Clientimport yamlimport time# Initialize the clientclient = Client(api_key=JULEP_API_KEY)# Create the agentagent = client.agents.create( name="Julep Browser Use Agent", about="A Julep agent that can use the computer tool to interact with the browser.",)# Load the task definitionwith open('browser_task.yaml', 'r') as file: task_definition = yaml.safe_load(file)# Create the tasktask = client.tasks.create( agent_id=agent.id, **task_definition)# Create the executionexecution = client.executions.create( task_id=task.id, input={ "agent_id": agent.id, "goal": "Search for recent news about artificial intelligence" })# Wait for the execution to completewhile (result := client.executions.get(execution.id)).status not in ['succeeded', 'failed']: print(result.status) time.sleep(1)# Print the resultif result.status == "succeeded": print(result.output)else: print(f"Error: {result.error}")