> ## Documentation Index
> Fetch the complete documentation index at: https://docs.julep.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Video Processing

> Learn how to process and analyze videos using Julep

## Overview

This tutorial demonstrates how to:

* Upload and process videos using Cloudinary integration
* Extract and analyze video content
* Add overlays and transformations
* Process video subtitles and speaker information

## Task Structure

Let's break down the task into its core components:

### 1. Input Schema

First, we define what inputs our task expects:

```yaml theme={"dark"}
input_schema:
  type: object
  properties:
    upload_file:
      type: string
      description: The url of the file to upload
    public_id:
      type: string
      description: The public id of the file to upload
    transformation_prompt:
      type: string
      description: The prompt for the transformations to apply to the file
    subtitle_vtt:
      type: string
      description: The vtt file content to add subtitles to the video
```

This schema specifies that our task expects:

* A video file URL
* A public ID for the video
* A transformation prompt describing desired changes
* VTT subtitle content (optional)

### 2. Tools Configuration

Next, we define the external tools our task will use:

```yaml theme={"dark"}
- name: cloudinary_upload
  type: integration
  integration:
    provider: cloudinary
    method: media_upload
    setup:
      cloudinary_api_key: YOUR_CLOUDINARY_API_KEY
      cloudinary_api_secret: YOUR_CLOUDINARY_API_SECRET
      cloudinary_cloud_name: YOUR_CLOUDINARY_CLOUD_NAME

- name: cloudinary_edit
  type: integration
  integration:
    provider: cloudinary
    method: media_edit

- name: ffmpeg_edit
  type: integration
  integration:
    provider: ffmpeg
```

We're using three main integrations:

* Cloudinary for video uploads and transformations
* FFmpeg for additional video processing capabilities

### 3. Main Workflow Steps

<Steps>
  <Step title="Initial Video Upload">
    ```yaml theme={"dark"}
    - tool: cloudinary_upload
      arguments:
        file: $ steps[0].input.video_url
        public_id: $ steps[0].input.public_id
        upload_params:
          resource_type: video
    ```

    <Accordion title="Understanding the use of the steps[0].input variable">
      The `steps[0].input` variable refers to the initial input object passed to the task. It's used to access the input parameters defined in the input schema.
    </Accordion>

    This step:

    * Takes the input video URL
    * Uploads it to Cloudinary
    * Specifies the resource type as video
  </Step>

  <Step title="Create Video Preview">
    ```yaml theme={"dark"}
    - tool: cloudinary_upload
      arguments:
        file: $ steps[0].input.upload_file
        public_id: $ steps[0].input.public_id
        upload_params:
          resource_type: video
          transformation:
            - start_offset: 0
              end_offset: 30
    ```

    This step:

    * Creates a 30-second preview of the video
    * Useful for quick analysis and processing
  </Step>

  <Step title="Analyze Video Content">
    ```yaml theme={"dark"}
    - prompt:
      - role: user
        content: 
          - type: image_url
            image_url:
              url: trimmed_video_url
          - type: text
            text: |-
              Which speakers are speaking in the video? And where does each of them sit?
    ```

    This step:

    * Analyzes the video content
    * Identifies speakers and their positions
    * Uses VTT subtitles for additional context
  </Step>

  <Step title="Generate Speaker Transformations">
    ```yaml theme={"dark"}
    - evaluate:
        speakers_transformations: |-
          $ [
          transform
          for speaker in _.speakers_json
            for transform in [
            {
              "overlay": {"font_family": "Arial", "font_size": 32, "text": speaker.speaker},
              "color": "white"
            },
            {
              "duration": 5,
              "flags": "layer_apply",
              "gravity": "south_east" if speaker.position == "right" else "south_west",
              "start_offset": speaker.timestamps[0].start,
              "y": 80,
              "x": 80
            }
            ]
          ]
    ```

    This step:

    * Creates transformations for each speaker
    * Adds speaker labels with proper positioning
    * Sets timing for each overlay
  </Step>

  <Step title="Apply Transformations">
    ```yaml theme={"dark"}
    - tool: cloudinary_upload
      arguments:
        file: $ steps[0].input.upload_file
        public_id: $ steps[0].input.public_id
        upload_params:
          resource_type: video
          transformation: _.speakers_transformations
    ```

    This step:

    * Uses the Cloudinary upload tool to apply the generated transformations
    * Processes the video with speaker labels and positioning
    * Returns a URL to the transformed video with all overlays applied
  </Step>
</Steps>

<Accordion title="Complete Task YAML" icon="code">
  ````yaml YAML [expandable] theme={"dark"}
  # yaml-language-server: $schema=https://raw.githubusercontent.com/julep-ai/julep/refs/heads/dev/src/schemas/create_task_request.json
  name: Julep Video Processing Task
  description: A Julep agent that can process and analyze videos using Cloudinary

  ########################################################
  ################### INPUT SCHEMA #######################
  ########################################################

  input_schema:
    type: object
    properties:
      video_url:
        type: string
        description: The url of the file to upload
      public_id:
        type: string
        description: The public id of the file to upload
      transformation_prompt:
        type: string
        description: The prompt for the transformations to apply to the file

  ########################################################
  ################### TOOLS ##############################
  ########################################################

  tools:
  - name: cloudinary_upload
    type: integration
    integration:
      provider: cloudinary
      method: media_upload
      setup:
        cloudinary_api_key: "YOUR_CLOUDINARY_API_KEY"
        cloudinary_api_secret: "YOUR_CLOUDINARY_API_SECRET"
        cloudinary_cloud_name: "YOUR_CLOUDINARY_CLOUD_NAME"

  ########################################################
  ################### MAIN WORKFLOW ######################
  ########################################################

  main:
  # Step #0 - Upload the video to cloudinary
  - tool: cloudinary_upload
    arguments:
      file: $ steps[0].input.video_url
      public_id: $ steps[0].input.public_id
      upload_params:
        resource_type: video

  # Step #1 - Analyze the video content with a prompt
  - prompt:
    - role: user
      content:

        - type: text
          text: |-
            You are a Cloudinary expert. You are given a medial url. it might be an image or a video.
            You need to come up with a json of transformations to apply to the given media.
            Overall the json could have multiple transformation json objects.
            Each transformation json object can have the multiple key value pairs.
            Each key value pair should have the key as the transformation name like "aspect_ratio", "crop", "width" etc and the value as the transformation parameter value.
            Given below is an example of a transformation json list. Don't provide explanations and/or comments in the json.
            ``json
            [
              {{
                "aspect_ratio": "1.0",
                "width": 250,
              }},
              {{
                "fetch_format": "auto"
              }},
              {{
                "overlay":
                {{
                  "url": "<image_url>"
                }}
              }},
              {{
                "flags": "layer_apply"
              }}
            ]
            ``
        - type: image_url
          image_url:
            url: $ _.url

        - type: text
          text: |-
            $ f'''Hey, check the video above, I need to apply the following transformations using cloudinary.
            {steps[0].input.transformation_prompt}'''

    unwrap: true
    settings:
      model: gemini/gemini-1.5-pro

  # Step #2 - Extract the json from the model's response
  - evaluate:
      model_transformation: >-
        $ load_json(
          _[_.find("```json")+7:][:_[_.find("```json")+7:].find("```")])

  # Step #3 - Upload the video to cloudinary
  - tool: cloudinary_upload
    arguments:
      file: $ steps[0].input.video_url
      public_id: $ steps[0].input.public_id
      upload_params:
        transformation: $ _.model_transformation
        resource_type: video

  # Step #4 - Evaluate the transformed video url
  - evaluate:
      transformed_video_url: $ _.url
  ````
</Accordion>

## Usage

Here's how to use this task with the Julep SDK:

<CodeGroup>
  ```python Python [expandable] theme={"dark"}
  import time
  import yaml
  from julep import Client

  # Initialize the client
  client = Client(api_key=JULEP_API_KEY)

  transformation_prompt = """
  1- I want to add an overlay an the following image to the video, and apply a layer apply flag also. Here's the image url:
  https://res.cloudinary.com/demo/image/upload/logos/cloudinary_icon_white.png

  2- I also want you to to blur the video, and add a fade in and fade out effect to the video with a duration of 3 seconds each.
  """
  # Create the agent
  agent = client.agents.create(
    name="Julep Video Processing Agent",
    about="A Julep agent that can process and analyze videos using Cloudinary and FFmpeg.",
  )

  # Load the task definition
  with open('video_processing_task.yaml', 'r') as file:
    task_definition = yaml.safe_load(file)

  # Create the task
  task = client.tasks.create(
    agent_id=agent.id,
    **task_definition
  )

  # Create the execution
  execution = client.executions.create(
      task_id=task.id,
      input={
          "video_url":  "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerMeltdowns.mp4",
          "public_id": "video_test",
          "transformation_prompt": transformation_prompt,
      }
  )
  # Wait for the execution to complete
  while (result := client.executions.get(execution.id)).status not in ['succeeded', 'failed']:
      print(result.status)
      time.sleep(1)

  # Print the result
  if result.status == "succeeded":
      print(result.output)
  else:
      print(f"Error: {result.error}")
  ```

  ```js Node.js [expandable] theme={"dark"}
  import { Julep } from '@julep/sdk';
  import yaml from 'yaml';
  import fs from 'fs';

  const transformation_prompt = `
  1- I want to add an overlay an the following image to the video, and apply a layer apply flag also. Here's the image url:
  https://res.cloudinary.com/demo/image/upload/logos/cloudinary_icon_white.png

  2- I also want you to to blur the video, and add a fade in and fade out effect to the video with a duration of 3 seconds each.
  `;

  const client = new Julep({
    apiKey: 'your_julep_api_key'
  });

  // Create the agent
  const agent = await client.agents.create({
    name: "Julep Video Processing Agent",
    about: "A Julep agent that can process and analyze videos using Cloudinary and FFmpeg.",
  });

  // Load the task definition
  const taskDefinition = yaml.parse(fs.readFileSync('video_processing_task.yaml', 'utf8'));

  // Create the task
  const task = await client.tasks.create(
    agent.id,
    taskDefinition
  );

  // Create the execution
  const execution = await client.executions.create(
    task.id,
    {
      input: { 
        "video_url": "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerMeltdowns.mp4",
        "public_id": "video_test",
        "transformation_prompt": transformation_prompt,
      }
    }
  );

  // Wait for the execution to complete
  let result;
  while (true) {
    result = await client.executions.get(execution.id);
    if (result.status === 'succeeded' || result.status === 'failed') break;
    console.log(result.status);
    await new Promise(resolve => setTimeout(resolve, 1000));
  }

  // Print the result
  if (result.status === 'succeeded') {
    console.log(result.output);
  } else {
    console.error(`Error: ${result.error}`);
  }
  ```
</CodeGroup>

## Example Output

This is an example output when the task is run over the sample video input.

<Accordion title="Sample Video Input">
  <video src="http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerMeltdowns.mp4" controls />
</Accordion>

<Accordion title="Sample Video Output">
  <video src="https://res.cloudinary.com/dpnjjk8mb/video/upload/v1737201857/video_test2.mp4" controls />
</Accordion>

## Next Steps

* Try this task yourself, check out the full example, see the [video-processing-with-natural-language cookbook](https://github.com/julep-ai/julep/blob/main/cookbooks/advanced/05-video-processing-with-natural-language.ipynb).
* To learn more about the integrations used in this task, check out the [integrations](/integrations/supported-integrations) page.

## Related Concepts

* [Agents](/concepts/agents)
* [Tasks](/concepts/tasks)
* [Tools](/concepts/tools)
