Overview

This tutorial demonstrates how to:

  • Upload and process videos using Cloudinary integration
  • Extract and analyze video content
  • Add overlays and transformations
  • Process video subtitles and speaker information

Task Structure

Let’s break down the task into its core components:

1. Input Schema

First, we define what inputs our task expects:

input_schema:
  type: object
  properties:
    upload_file:
      type: string
      description: The url of the file to upload
    public_id:
      type: string
      description: The public id of the file to upload
    transformation_prompt:
      type: string
      description: The prompt for the transformations to apply to the file
    subtitle_vtt:
      type: string
      description: The vtt file content to add subtitles to the video

This schema specifies that our task expects:

  • A video file URL
  • A public ID for the video
  • A transformation prompt describing desired changes
  • VTT subtitle content (optional)

2. Tools Configuration

Next, we define the external tools our task will use:

- name: cloudinary_upload
  type: integration
  integration:
    provider: cloudinary
    method: media_upload
    setup:
      cloudinary_api_key: YOUR_CLOUDINARY_API_KEY
      cloudinary_api_secret: YOUR_CLOUDINARY_API_SECRET
      cloudinary_cloud_name: YOUR_CLOUDINARY_CLOUD_NAME

- name: cloudinary_edit
  type: integration
  integration:
    provider: cloudinary
    method: media_edit

- name: ffmpeg_edit
  type: integration
  integration:
    provider: ffmpeg

We’re using three main integrations:

  • Cloudinary for video uploads and transformations
  • FFmpeg for additional video processing capabilities

3. Main Workflow Steps

1

Initial Video Upload

- tool: cloudinary_upload
  arguments:
    file: '_0.upload_file'
    public_id: '_0.public_id'
    upload_params:
      resource_type: "'video'"

This step:

  • Takes the input video URL
  • Uploads it to Cloudinary
  • Specifies the resource type as video
2

Create Video Preview

- tool: cloudinary_upload
  arguments:
    file: '_0.upload_file'
    public_id: '_0.public_id'
    upload_params:
      resource_type: "'video'"
      transformation:
        - start_offset: "0"
          end_offset: "30"

This step:

  • Creates a 30-second preview of the video
  • Useful for quick analysis and processing
3

Analyze Video Content

- prompt:
  - role: user
    content: 
      - type: image_url
        image_url:
          url: "{{trimmed_video_url}}"
      - type: text
        text: |-
          Which speakers are speaking in the video? And where does each of them sit?

This step:

  • Analyzes the video content
  • Identifies speakers and their positions
  • Uses VTT subtitles for additional context
4

Generate Speaker Transformations

- evaluate:
    speakers_transformations: |-
      [
      transform
      for speaker in _.speakers_json
        for transform in [
        {
          "overlay": {"font_family": "Arial", "font_size": 32, "text": speaker.speaker},
          "color": "white"
        },
        {
          "duration": 5,
          "flags": "layer_apply",
          "gravity": "south_east" if speaker.position == "right" else "south_west",
          "start_offset": speaker.timestamps[0].start,
          "y": 80,
          "x": 80
        }
        ]
      ]

This step:

  • Creates transformations for each speaker
  • Adds speaker labels with proper positioning
  • Sets timing for each overlay
5

Apply Transformations

- tool: cloudinary_upload
  arguments:
    file: '_0.upload_file'
    public_id: '_0.public_id'
    upload_params:
      resource_type: "'video'"
      transformation: _.speakers_transformations

This step:

  • Uses the Cloudinary upload tool to apply the generated transformations
  • Processes the video with speaker labels and positioning
  • Returns a URL to the transformed video with all overlays applied

Example Usage

Here’s how to use this task with the Julep SDK:

transformation_prompt = """
1- I want to add an overlay an the following image to the video, and apply a layer apply flag also. Here's the image url:
https://res.cloudinary.com/demo/image/upload/logos/cloudinary_icon_white.png

2- I also want you to to blur the video, and add a fade in and fade out effect to the video with a duration of 3 seconds each.
"""

execution = client.executions.create(
    task_id=TASK_UUID,
    input={
        "video_url":  "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerMeltdowns.mp4",
        "public_id": "video_test",
        "transformation_prompt": transformation_prompt,
    }
)

Sample Video Input

Sample Video Output

Next Steps

Try this task yourself, check out the full example, see the video-processing-with-natural-language cookbook.