Overview

Julep tasks support Python expressions for dynamic value computation and data manipulation. This guide explains how to use them effectively.

The Special _ Variable

The underscore _ is a special variable that serves three different purposes depending on where it’s used:

  1. First Step Input: In the first step of a task, _ contains the execution input:
# If task is executed with input `{"topic": "AI"}`
- evaluate:
    topic: _.topic  # Accesses the input "AI"
  1. Previous Step Output: In any subsequent step, _ contains the output from the previous step:
- evaluate:
    results: _.split('\n')  # Splits previous step's output into lines
  1. Iterator Value: In foreach and map steps, _ represents the current item being iterated.
# If the task is executed with input `{"questions": ["What is AI?", "What is Julep?"]}`
- foreach:
    in: _.questions
    do:
      - wait_for_input:
          info:
            "message": _  # _ is each question

Where Python Expressions Are Used

Python expressions can be used in various task steps. For a complete list of step types and their syntax, refer to the Step Types table in the README.

Common places include:

  • evaluate steps
  • Tool arguments
  • if conditions
  • foreach and map iterations

Available Functions and Libraries

The following Python functions and libraries are available for use in expressions:

Basic Python Builtins

  • abs, all, any, bool, dict, enumerate
  • float, int, len, list, map, max, min
  • round, set, str, sum, tuple, zip, reduce

Safe Versions of Functions

  • range: def safe_range(*args)

    Safely creates a range object with size limits (max 1,000,000 elements).

  • load_json: def safe_json_loads(s: str) -> Any (Deprecated in favor of json.loads)

    Safely parses a JSON string with size limits.

  • load_yaml: def safe_yaml_load(s: str) -> Any (Deprecated in favor of yaml.safe_load)

    Safely parses a YAML string with size limits.

  • dump_json: def dump_json(obj: Any, *, **kwargs) -> str (Deprecated in favor of json.dumps)

    Safely serializes an object to a JSON string.

  • dump_yaml: def dump_yaml(obj: Any, **kwargs) -> str (Deprecated in favor of yaml.dumps)

    Safely serializes an object to a YAML string.

  • extract_json: def safe_extract_json(string: str) -> Any

    Safely extracts and parses JSON from text.

Regex and NLP Functions

  • search_regex: def search_regex(pattern: str, string: str) -> Optional[re2.Match]

    Searches for a regex pattern in a string.

  • match_regex: def match_regex(pattern: str, string: str) -> bool

    Checks if a regex pattern matches a string.

  • chunk_doc: def chunk_doc(string: str) -> list[str]

    Chunks a string into sentences.

  • nlp pipelines.

Example using these functions in an evaluate step:

- evaluate:
    # Parse JSON string with size limit of 1MB
    data: json.loads(_.json_string)
    
    # Parse YAML string with size limit of 1MB
    config: yaml.safe_load(_.yaml_string)
    
    # Extract JSON from text that might contain markdown code blocks
    extracted_content: extract_json('Here is some JSON: ```json\n{"key": "value"}\n```')

Standard Library Modules

  • re: Regular expressions (using re2)
  • json: JSON operations
  • yaml: YAML operations
  • string: String constants and operations
  • datetime: Date and time operations
  • math: Mathematical functions
  • statistics: Statistical operations
  • base64: Base64 encoding/decoding
  • urllib.parse: URL parsing operations
  • random: Random number generation
  • time: Time operations

For the complete list of available functions and their safe implementations, refer to the utils.py file in the source code.

Example Usage

Here’s a practical example combining different aspects of Python expressions:

name: Data Processing Task

main:
  # Using _ as input
  - evaluate:
      topics: _.topics  # Access input topics

  # Using _ as previous output
  - evaluate:
      filtered_topics: [t for t in _.topics if len(t) > 3]

  # Using _ in foreach
  - foreach:
      in: _.filtered_topics
      do:
        - tool: web_search
          arguments:
            query: "'Latest news about ' + _"  # _ is each topic

Security

All Python expressions are executed in a sandboxed environment with:

Function Limits

Limited available functions to prevent unsafe operations

String Limits

Maximum string length restrictions to prevent memory issues

Collection Limits

Collection size limits to prevent resource exhaustion

Time Limits

Execution time limits to prevent infinite loops

This ensures safe execution while providing necessary functionality for task workflows.

Support

If you need help with further questions in Julep: