Overview
This tutorial demonstrates how to:
- Set up browser automation with Julep
- Navigate web pages programmatically
- Execute browser actions like clicking and typing
- Process visual feedback through screenshots
- Create goal-oriented browser automation tasks
Task Structure
Let's break down the task into its core components:
1. Input Schema
First, we define what inputs our task expects:
2. Tools Configuration
Next, we define the external tools our task will use:
3. Main Workflow Steps
1. Create Julep Session
2. Store Session ID
3. Create Browser Session
4. Store Browser Session Info
5. Get Session View URLs
6. Store Debugger URL
7. Initial Navigation
8. Start Browser Workflow
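The eight steps above alternate between calling a tool and storing part of its output for later steps. The following is a minimal Python sketch of that pattern, not the task's actual definition: every function, identifier, and URL in it is a hypothetical stub, and no real Julep or browser service is contacted.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Output = Dict[str, Any]
Step = Callable[[State, Output], Output]

# All functions below are hypothetical stubs. Each receives the shared state
# and the previous step's output, and returns its own output.

def create_julep_session(state: State, prev: Output) -> Output:
    return {"id": "julep-123"}  # placeholder session id

def store_session_id(state: State, prev: Output) -> Output:
    state["julep_session_id"] = prev["id"]
    return prev

def create_browser_session(state: State, prev: Output) -> Output:
    return {"id": "browser-456", "connect_url": "wss://example.invalid/connect"}

def store_browser_session_info(state: State, prev: Output) -> Output:
    state["browser_session_id"] = prev["id"]
    state["connect_url"] = prev["connect_url"]
    return prev

def get_session_view_urls(state: State, prev: Output) -> Output:
    url = "https://example.invalid/debug/" + state["browser_session_id"]
    return {"debugger_url": url}

def store_debugger_url(state: State, prev: Output) -> Output:
    state["debugger_url"] = prev["debugger_url"]
    return prev

def initial_navigation(state: State, prev: Output) -> Output:
    state["current_url"] = "https://example.invalid/start"  # assumed start page
    return {"navigated": True}

def start_browser_workflow(state: State, prev: Output) -> Output:
    state["workflow_started"] = True
    return {"goal": state["goal"]}

MAIN_STEPS: List[Tuple[str, Step]] = [
    ("Create Julep Session", create_julep_session),
    ("Store Session ID", store_session_id),
    ("Create Browser Session", create_browser_session),
    ("Store Browser Session Info", store_browser_session_info),
    ("Get Session View URLs", get_session_view_urls),
    ("Store Debugger URL", store_debugger_url),
    ("Initial Navigation", initial_navigation),
    ("Start Browser Workflow", start_browser_workflow),
]

def run_main_workflow(goal: str) -> State:
    """Run the steps in order, feeding each step the previous step's output."""
    state: State = {"goal": goal}
    prev: Output = {}
    for _name, step in MAIN_STEPS:
        prev = step(state, prev)
    return state

if __name__ == "__main__":
    print(run_main_workflow("open the docs"))
```

Keeping the "store" steps separate from the "call" steps mirrors the list above: values such as the session ID and debugger URL are saved once and remain available to every later step.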
4. Run Browser Subworkflow
The run_browser subworkflow is a crucial component that handles the interactive browser automation. It consists of three main parts:
1. Agent Interaction
- Process and understand the user's goal
- Plan appropriate browser actions
- Generate responses based on the current browser state
- Make decisions about next steps
2. Action Execution
- Iterates through planned actions sequentially
- Executes browser commands (navigation, clicking, typing)
- Handles different types of interactions (text input, mouse clicks)
- Captures screenshots for visual feedback
3. Goal Evaluation
- Assesses progress toward the user's goal
- Determines if additional actions are needed
- Maintains conversation context
- Decides whether to continue or conclude the workflow
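The three parts above form a plan, act, evaluate loop. The sketch below illustrates that loop in Python under stated assumptions: `FakeBrowser`, `plan_actions`, and `goal_reached` are hypothetical stand-ins for the real browser session, the agent's response, and the goal check, and the screenshot is just a counter string rather than an image.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

@dataclass
class Action:
    kind: str                 # "navigate", "type", or "click"
    payload: Dict[str, Any]

@dataclass
class FakeBrowser:
    """Hypothetical in-memory browser; records actions instead of performing them."""
    url: str = ""
    typed: List[str] = field(default_factory=list)
    clicks: List[Tuple[int, int]] = field(default_factory=list)
    shots: int = 0

    def perform(self, action: Action) -> None:
        if action.kind == "navigate":
            self.url = action.payload["url"]
        elif action.kind == "type":
            self.typed.append(action.payload["text"])
        elif action.kind == "click":
            self.clicks.append((action.payload["x"], action.payload["y"]))
        else:
            raise ValueError(f"unsupported action: {action.kind}")

    def screenshot(self) -> str:
        self.shots += 1
        return f"screenshot-{self.shots}"

def plan_actions(goal: str, history: List[str]) -> List[Action]:
    """Stub agent: plans a fixed search on the first round, nothing afterwards."""
    if not history:
        return [
            Action("navigate", {"url": "https://example.invalid/search"}),
            Action("type", {"text": goal}),
            Action("click", {"x": 10, "y": 20}),
        ]
    return []

def goal_reached(browser: FakeBrowser, goal: str) -> bool:
    """Stub evaluation: done once the goal text was typed and something was clicked."""
    return goal in browser.typed and len(browser.clicks) > 0

def run_browser(goal: str, browser: FakeBrowser, max_rounds: int = 5) -> List[str]:
    history: List[str] = []
    for _ in range(max_rounds):
        actions = plan_actions(goal, history)      # 1. agent interaction
        for action in actions:                     # 2. action execution
            browser.perform(action)
            history.append(browser.screenshot())   # visual feedback per action
        if goal_reached(browser, goal):            # 3. goal evaluation
            break
    return history

if __name__ == "__main__":
    print(run_browser("julep docs", FakeBrowser()))
```

The `max_rounds` cap is an addition of this sketch so that a goal that is never reached cannot loop forever.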
5. Check Goal Status Subworkflow
The check_goal_status subworkflow is a recursive component that ensures continuous operation until the goal is achieved:
1. Check Goal Status
- Checks if there are any messages to process (len(_.messages) > 0)
- If messages exist, recursively calls the run_browser workflow
- Passes along the current session context and connection details
- Maintains the conversation flow until the goal is achieved
- Automatically terminates when no more messages need processing
The recursion continues until one of the following occurs:
- The goal is successfully achieved
- No more actions are needed
- An error occurs that prevents further progress
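The recursive hand-off between check_goal_status and run_browser can be sketched as two functions that call each other. This is an illustration only: the way the stub `run_browser` consumes one message per call, and the `context` dictionary's keys, are assumptions made so that the example terminates, not the task's real behavior.

```python
from typing import Any, Dict, List

Context = Dict[str, Any]

def run_browser(messages: List[str], context: Context) -> Context:
    """Hypothetical stub: handle the first message, then re-check the goal."""
    context["processed"].append(messages[0])
    remaining = messages[1:]
    return check_goal_status(remaining, context)

def check_goal_status(messages: List[str], context: Context) -> Context:
    """Recurse into run_browser while messages remain; otherwise stop."""
    if len(messages) > 0:
        # Pass the same session context along on every recursive call.
        return run_browser(messages, context)
    return context

if __name__ == "__main__":
    start = {"session_id": "julep-123", "processed": []}
    print(check_goal_status(["open page", "click link"], start))
```

Because the same `context` object is passed through each call, session and connection details stay available for the whole run, and the recursion ends as soon as the message list is empty.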
Complete Task YAML
Usage
Here's how to use this task with the Julep SDK:
Key Features
- Browser Automation: Performs web interactions like navigation, clicking, and typing
- Visual Feedback: Captures screenshots to verify actions and understand page state
- Goal-Oriented: Continues executing actions until the user's goal is achieved
- Secure Sessions: Uses BrowserBase for isolated browser instances
- Interactive Workflow: Uses run_browser subworkflow for continuous interaction
Next Steps
- To try this task yourself, see the full example in the browser-use cookbook.
- To learn more about the integrations used in this task, check out the integrations page.