# Spider Crawler

> Learn how to use the Spider Crawler integration with Julep

## Overview

Welcome to the Spider Crawler integration guide for Julep! This integration lets a Julep task crawl websites and extract their content through Spider, so you can build workflows that depend on web scraping, such as gathering data for analysis or monitoring web content. This guide walks you through setup and usage.

## Prerequisites

<Info type="info" title="API Key Required">
  To use the Spider integration, you need an API key. You can obtain this key by signing up at [Spider](https://spider.cloud/).
</Info>

## How to Use the Integration

To get started with the Spider integration, follow these steps to configure and create a task:

<Steps>
  <Step title="Configure Your API Key">
    Add your API key to the tool's `setup` block (as `spider_api_key`) in the `tools` section of your task. This allows Julep to authenticate requests to Spider on your behalf.
  </Step>

  <Step title="Create Task Definition">
    Use the following YAML configuration to define your web crawling task:

    ```yaml Spider Example theme={"dark"}
    name: Spider Task
    tools:
    - name: spider_tool
      type: integration
      integration:
        provider: spider
        method: crawl
        setup:
          spider_api_key: "SPIDER_API_KEY"
    main:
    - tool: spider_tool
      method: crawl
      arguments:
        url: $ _.url
        params: # Optional parameters
          key1: value1 # this is a placeholder for the actual parameters
        content_type: application/json
    ```
  </Step>
</Steps>

### YAML Explanation

<AccordionGroup>
  <Accordion title="Basic Configuration">
    * ***name***: A descriptive name for the task, in this case, "Spider Task".
    * ***tools***: This section lists the tools or integrations being used. Here, `spider_tool` is defined as an integration tool.
  </Accordion>

  <Accordion title="Tool Configuration">
    * ***type***: Specifies the type of tool, which is `integration` in this context.
    * ***integration***: Details the provider and setup for the integration.
      * ***provider***: Indicates the service provider, which is `spider` for Spider.
      * ***method***: Specifies the method to use, such as `crawl`, `links`, `screenshot`, or `search`. Defaults to `crawl` if not specified.
      * ***setup***: Contains configuration details, such as the API key (`spider_api_key`) required for authentication.
  </Accordion>

  <Accordion title="Workflow Configuration">
    * ***main***: Defines the main execution steps.
      * ***tool***: Refers to the tool defined earlier (`spider_tool`).
      * ***arguments***: Specifies the input parameters for the tool:
        * ***url***: The URL for which to fetch data.
        * ***params***: (optional) The parameters for the Spider API. Defaults to None.
        * ***content\_type***: (optional) The content type to return. Default is "application/json". Other options: "text/csv", "application/xml", "application/jsonl".
  </Accordion>
</AccordionGroup>
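
The task above reads the target URL from the task input through `$ _.url`. Here is a minimal sketch of how that input might be declared. It assumes the task definition accepts a JSON-Schema-style `input_schema` block; the schema shape and the `description` text are illustrative and not taken from this guide.

```yaml
# Sketch only: assumes the task definition accepts an `input_schema` block.
name: Spider Task
input_schema:
  type: object
  properties:
    url:
      type: string
      description: The page to crawl  # illustrative description
  required:
  - url
tools:
- name: spider_tool
  type: integration
  integration:
    provider: spider
    method: crawl
    setup:
      spider_api_key: "SPIDER_API_KEY"
main:
- tool: spider_tool
  method: crawl
  arguments:
    url: $ _.url  # resolved from the execution input
```

Under that assumption, an execution would be started with an input such as `{"url": "https://example.com"}`.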

<Note>
  Remember to replace `SPIDER_API_KEY` with your actual API key. Customize the `url`, `params`, and `content_type` parameters to suit your specific needs.
</Note>

<Note>
  The parameters available for each method of the Spider integration are listed in the [Spider API documentation](https://spider.cloud/api).
</Note>
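
As an illustration of method-specific parameters, the fragment below replaces the `key1: value1` placeholder with two concrete options for the `crawl` method. The names `limit` and `return_format`, and their values, are assumptions based on Spider's general request options. Confirm them against the Spider API documentation before relying on them.

```yaml
# Fragment of the `main` section only; the `tools` section is unchanged.
main:
- tool: spider_tool
  method: crawl
  arguments:
    url: $ _.url
    params:
      limit: 5                 # assumed Spider option: maximum number of pages to crawl
      return_format: markdown  # assumed Spider option: format of the returned page content
    content_type: application/json
```

Keeping `params` small while testing should make it easier to see which option changes the response.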

## Conclusion

With the Spider integration, a Julep task can call Spider's `crawl`, `links`, `screenshot`, and `search` methods and receive the results in the content type you choose.
This gives your workflows web scraping capabilities without leaving the task definition.

<Tip>
  For more information, please refer to the [Spider API documentation](https://spider.cloud/api).
</Tip>
