Overview

Document integration in Julep lets you use the document store from within tasks and agent workflows. This guide shows how to search, create, and update documents from task steps, and how to combine those operations into larger workflows.

Using Documents in Tasks

Document Access

Search an agent's documents from within a task and pass the results to a prompt:

tools:
  - name: search_docs
    system:
      resource: agent
      subresource: doc
      operation: search

main:
  # Search for relevant documents
  - tool: search_docs
    arguments:
      text: "{{inputs.query}}"
      metadata_filter:
        type: "documentation"
  
  # Use search results in prompt
  - prompt:
      - role: system
        content: "Use these documents to answer: {{_.documents}}"
      - role: user
        content: "{{inputs.query}}"
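
The prompt step above interpolates the search results directly into the system message. As a rough illustration of how such results might be condensed before they reach a prompt, here is a minimal plain-Python sketch. It does not call Julep; the helper name `format_docs_for_prompt` and the `title`/`content` keys are assumptions made for this example.

```python
from typing import Iterable, Mapping


def format_docs_for_prompt(docs: Iterable[Mapping], max_chars: int = 500) -> str:
    """Render search hits as a numbered list suitable for a system prompt."""
    lines = []
    for i, doc in enumerate(docs, start=1):
        title = doc.get("title", "untitled")
        # Truncate long bodies so the prompt stays a predictable size.
        snippet = str(doc.get("content", ""))[:max_chars]
        lines.append(f"[{i}] {title}: {snippet}")
    return "\n".join(lines) if lines else "No matching documents were found."


# Made-up search hits; the key names are assumptions for illustration.
hits = [
    {"title": "Install guide", "content": "Run the installer, then restart."},
    {"title": "FAQ", "content": "Q: Is it free? A: Yes."},
]
context = format_docs_for_prompt(hits)
print(context)
```

Returning an explicit "no documents" message for an empty result keeps the prompt meaningful when the search finds nothing.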

Document Creation

Create documents from task results:

tools:
  - name: create_doc
    system:
      resource: agent
      subresource: doc
      operation: create

main:
  # Generate content
  - prompt:
      - role: user
        content: "Generate documentation for {{inputs.topic}}"
  
  # Save as document
  - tool: create_doc
    arguments:
      title: "Documentation: {{inputs.topic}}"
      content: _
      metadata:
        type: "documentation"
        topic: "{{inputs.topic}}"
        generated_at: "datetime.now().isoformat()"
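
The same title, content, and metadata structure can also be assembled on the client side, for example when preparing arguments before starting a task. The sketch below is plain Python and assumes nothing about the Julep SDK; `build_doc_arguments` is a hypothetical helper. It computes the timestamp explicitly so that `generated_at` holds an actual ISO 8601 value.

```python
from datetime import datetime, timezone


def build_doc_arguments(topic: str, content: str) -> dict:
    """Assemble title, content, and metadata for a generated document."""
    return {
        "title": f"Documentation: {topic}",
        "content": content,
        "metadata": {
            "type": "documentation",
            "topic": topic,
            # Timezone-aware UTC timestamp in ISO 8601 format.
            "generated_at": datetime.now(timezone.utc).isoformat(),
        },
    }


args = build_doc_arguments("Rate limiting", "Requests are limited per API key.")
print(args["title"])
print(args["metadata"]["generated_at"])
```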

Document Processing Workflows

Content Generation

Generate and store documentation:

Python
# Create a documentation generator task
task = client.tasks.create(
    agent_id=agent.id,
    yaml="""
    name: Documentation Generator
    description: Generate and store documentation

    tools:
      - name: create_doc
        system:
          resource: agent
          subresource: doc
          operation: create
      
      - name: search_docs
        system:
          resource: agent
          subresource: doc
          operation: search

    main:
      # Check existing documentation
      - tool: search_docs
        arguments:
          text: "{{inputs.topic}}"
          metadata_filter:
            type: "documentation"
      
      # Generate if not exists
      - if: "len(_.documents) == 0"
        then:
          - prompt:
              - role: system
                content: "Generate comprehensive documentation for {{inputs.topic}}"
          
          - tool: create_doc
            arguments:
              title: "{{inputs.topic}} Documentation"
              content: _
              metadata:
                type: "documentation"
                topic: "{{inputs.topic}}"
                status: "draft"
        else:
          - return:
              message: "Documentation already exists"
              documents: _.documents
    """
)
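
The control flow of this task (search first, generate only when nothing is found) can be mirrored in a few lines of plain Python. This is a simplified sketch, not Julep code: it uses an in-memory list instead of the document store, matches on exact topic rather than text search, and the names `ensure_documentation` and `generate` are hypothetical.

```python
from typing import Callable


def ensure_documentation(topic: str, store: list, generate: Callable[[str], str]) -> dict:
    """Return existing documentation for a topic, or create a draft if none exists."""
    existing = [
        d for d in store
        if d["metadata"].get("type") == "documentation"
        and d["metadata"].get("topic") == topic
    ]
    if existing:
        return {"message": "Documentation already exists", "documents": existing}

    doc = {
        "title": f"{topic} Documentation",
        "content": generate(topic),
        "metadata": {"type": "documentation", "topic": topic, "status": "draft"},
    }
    store.append(doc)
    return {"message": "Documentation created", "documents": [doc]}


store = []
first = ensure_documentation("Caching", store, lambda t: f"Notes about {t}")
second = ensure_documentation("Caching", store, lambda t: f"Notes about {t}")
print(first["message"])   # Documentation created
print(second["message"])  # Documentation already exists
print(len(store))         # 1
```

Running the function twice for the same topic leaves a single document in the store, which is the property the `if`/`else` branch in the task is meant to provide.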

Document Updates

Update documents based on new information:

Python
# Create a document updater task
task = client.tasks.create(
    agent_id=agent.id,
    yaml="""
    name: Document Updater
    description: Update existing documentation

    tools:
      - name: search_docs
        system:
          resource: agent
          subresource: doc
          operation: search
      
      - name: update_doc
        system:
          resource: agent
          subresource: doc
          operation: update

    main:
      # Find document to update
      - tool: search_docs
        arguments:
          text: "{{inputs.topic}}"
          metadata_filter:
            type: "documentation"
            status: "published"
      
      # Update content
      - if: "len(_.documents) > 0"
        then:
          - prompt:
              - role: system
                content: >
                  Update this documentation with new information:
                  {{_.documents[0].content}}
              - role: user
                content: "{{inputs.updates}}"
          
          - tool: update_doc
            arguments:
              document_id: "{{_.documents[0].id}}"
              content: _
              metadata:
                last_updated: "datetime.now().isoformat()"
                update_type: "{{inputs.update_type}}"
        else:
          - error: "Document not found"
    """
)
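
The update step above sends only `last_updated` and `update_type` as metadata. Whether an update merges with or replaces the stored metadata is not stated here, so it may be safer to merge explicitly. The following plain-Python sketch shows one way to do that; `merged_metadata` is a hypothetical helper and not part of Julep.

```python
from datetime import datetime, timezone


def merged_metadata(existing: dict, update_type: str) -> dict:
    """Return a copy of the existing metadata with update fields added."""
    updated = dict(existing)  # shallow copy so the caller's dict is not mutated
    updated["last_updated"] = datetime.now(timezone.utc).isoformat()
    updated["update_type"] = update_type
    return updated


current = {"type": "documentation", "status": "published"}
new_meta = merged_metadata(current, "correction")
print(sorted(new_meta))
```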

Integration Patterns

Document-Based Responses

Use documents to enhance responses:

main:
  # Search relevant documents
  - tool: search_docs
    arguments:
      text: "{{inputs.query}}"
      metadata_filter:
        status: "published"
  
  # Generate response
  - prompt:
      - role: system
        content: >
          You are a helpful assistant with access to the following documents:
          {{_.documents}}
          
          Use this information to provide accurate answers.
      - role: user
        content: "{{inputs.query}}"
  
  # Save response
  - tool: create_doc
    arguments:
      title: "Response: {{inputs.query}}"
      content: _
      metadata:
        type: "response"
        query: "{{inputs.query}}"
        sources: "{{[doc.id for doc in _.documents]}}"

Document Chains

Chain document operations:

main:
  # Initial search
  - tool: search_docs
    arguments:
      text: "{{inputs.topic}}"
  
  # Process each document
  - foreach:
      in: _.documents
      do:
        # Analyze document
        - prompt:
            - role: system
              content: "Analyze this document: {{_}}"
        
        # Create analysis document
        - tool: create_doc
          arguments:
            title: "Analysis: {{_.title}}"
            content: _
            metadata:
              type: "analysis"
              source_doc: "{{_.id}}"
  
  # Summarize analyses
  - tool: search_docs
    arguments:
      metadata_filter:
        type: "analysis"
        source_doc: {"$in": "{{[doc.id for doc in _.documents]}}"}
  
  # Create summary document
  - prompt:
      - role: system
        content: "Summarize these analyses: {{_.documents}}"
  
  - tool: create_doc
    arguments:
      title: "Summary: {{inputs.topic}}"
      content: _
      metadata:
        type: "summary"
        topic: "{{inputs.topic}}"
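
To make the data flow of this chain easier to follow, here is a plain-Python sketch of the same three stages: analyze each document, collect the analyses, then summarize them. It is an illustration under stated assumptions rather than Julep code; `run_chain`, `analyze`, and `summarize` are hypothetical names, and documents are modeled as dictionaries with `id`, `title`, and `content` keys. Note that it keeps a reference to each source document next to its analysis, so the title and id remain available after the analysis text has been produced.

```python
from typing import Callable


def run_chain(docs: list, analyze: Callable[[dict], str],
              summarize: Callable[[list], str], topic: str) -> dict:
    """Analyze each document, then build one summary from all analyses."""
    analyses = []
    for doc in docs:
        text = analyze(doc)
        # `doc` is still in scope here, so its title and id can be recorded.
        analyses.append({
            "title": f"Analysis: {doc['title']}",
            "content": text,
            "metadata": {"type": "analysis", "source_doc": doc["id"]},
        })

    summary = {
        "title": f"Summary: {topic}",
        "content": summarize(analyses),
        "metadata": {
            "type": "summary",
            "topic": topic,
            "sources": [a["metadata"]["source_doc"] for a in analyses],
        },
    }
    return {"analyses": analyses, "summary": summary}


docs = [
    {"id": "d1", "title": "Threat models", "content": "Enumerate assets and attackers"},
    {"id": "d2", "title": "Prompt injection", "content": "Untrusted input alters instructions"},
]
result = run_chain(
    docs,
    analyze=lambda d: f"{len(d['content'].split())} words",
    summarize=lambda items: f"{len(items)} analyses",
    topic="AI Security",
)
print(result["summary"]["content"])  # 2 analyses
```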

Best Practices

  1. Document Integration

    • Use appropriate document types
    • Maintain document relationships
    • Track document sources

  2. Workflow Design

    • Chain operations efficiently
    • Handle missing documents
    • Validate document content

  3. Performance

    • Cache frequently used documents
    • Use batch operations
    • Implement proper error handling
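
Two of the practices above, caching frequently used documents and handling missing ones, can be sketched with the standard library alone. The example below uses a dictionary as a stand-in for the document store; `FAKE_STORE` and `get_document` are hypothetical names, and a real lookup would call the document API instead.

```python
from functools import lru_cache

# Stand-in for a remote document store (assumption for illustration).
FAKE_STORE = {"doc-1": "Install guide", "doc-2": "FAQ"}
fetch_count = 0


@lru_cache(maxsize=128)
def get_document(doc_id: str) -> str:
    """Fetch a document, caching successful lookups by id."""
    global fetch_count
    fetch_count += 1  # counts real fetches, not cache hits
    try:
        return FAKE_STORE[doc_id]
    except KeyError:
        # Fail with a clear error rather than returning None silently.
        raise LookupError(f"Document not found: {doc_id}") from None


get_document("doc-1")
get_document("doc-1")  # served from the cache
print(fetch_count)  # 1
```

`lru_cache` stores only successful results, so a missing document is looked up again on the next call. Cached entries can go stale after an update; calling `get_document.cache_clear()` after writes is one simple way to handle that.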

Example: Complex Document Integration

The following example combines search, creation, analysis, and summarization in a single task, then runs it and waits for the result:

Python
import asyncio


class DocumentWorkflow:
    def __init__(self, client, agent_id):
        self.client = client
        self.agent_id = agent_id
    
    async def process_topic(self, topic):
        # Create processing task
        task = await self.client.tasks.create(
            agent_id=self.agent_id,
            yaml="""
            name: Document Processor
            description: Process and analyze topic documents

            tools:
              - name: search_docs
                system:
                  resource: agent
                  subresource: doc
                  operation: search
              
              - name: create_doc
                system:
                  resource: agent
                  subresource: doc
                  operation: create
              
              - name: update_doc
                system:
                  resource: agent
                  subresource: doc
                  operation: update

            main:
              # Initial research
              - parallel:
                  - tool: search_docs
                    arguments:
                      text: "{{inputs.topic}}"
                      metadata_filter:
                        type: "primary"
                  
                  - tool: search_docs
                    arguments:
                      text: "{{inputs.topic}} analysis"
                      metadata_filter:
                        type: "analysis"
              
              # Process results
              - evaluate:
                  primary_docs: _[0]
                  analysis_docs: _[1]
              
              # Generate new content if needed
              - if: "len(_.primary_docs) == 0"
                then:
                  - prompt:
                      - role: system
                        content: "Generate primary content for {{inputs.topic}}"
                  
                  - tool: create_doc
                    arguments:
                      title: "{{inputs.topic}}"
                      content: _
                      metadata:
                        type: "primary"
                        topic: "{{inputs.topic}}"
                        status: "draft"
                  
                  - evaluate:
                      primary_docs: [_]
              
              # Analyze documents
              - foreach:
                  in: _.primary_docs
                  do:
                    - prompt:
                        - role: system
                          content: "Analyze this document: {{_}}"
                    
                    - tool: create_doc
                      arguments:
                        title: "Analysis: {{_.title}}"
                        content: _
                        metadata:
                          type: "analysis"
                          source_doc: "{{_.id}}"
                          topic: "{{inputs.topic}}"
              
              # Generate summary
              - tool: search_docs
                arguments:
                  metadata_filter:
                    type: "analysis"
                    topic: "{{inputs.topic}}"
              
              - prompt:
                  - role: system
                    content: "Create a summary from these analyses: {{_.documents}}"
              
              - tool: create_doc
                arguments:
                  title: "Summary: {{inputs.topic}}"
                  content: _
                  metadata:
                    type: "summary"
                    topic: "{{inputs.topic}}"
                    source_analyses: "{{[doc.id for doc in _.documents]}}"
              
              # Return results
              - return:
                  primary_docs: "{{_.primary_docs}}"
                  analyses: "{{_.documents}}"
                  summary: _
            """
        )
        
        # Execute task
        execution = await self.client.executions.create(
            task_id=task.id,
            input={"topic": topic}
        )
        
        # Poll until the execution reaches a terminal state
        while True:
            result = await self.client.executions.get(execution.id)
            if result.status == "succeeded":
                return result.output
            elif result.status in ("failed", "cancelled"):
                # Stop polling on any unsuccessful terminal state
                raise RuntimeError(f"Task {result.status}: {result.error}")
            await asyncio.sleep(1)

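# Use the workflow (run this inside an async function or a notebook,
# with `client` being an async Julep client, since the calls are awaited)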
workflow = DocumentWorkflow(client, agent.id)
result = await workflow.process_topic("AI Security")

print("Primary Documents:", result["primary_docs"])
print("Analyses:", result["analyses"])
print("Summary:", result["summary"])

Next Steps

  1. Learn about document management
  2. Explore vector search
  3. Understand task basics