Jobs

A job in Snoweaver is a configurable unit that defines the metadata and instructions necessary to execute specific tasks. Jobs can be instantiated as executable entities, which can then be triggered programmatically or integrated into Snowflake Tasks for scheduled execution alongside other data pipelines.

The Jobs page in Snoweaver provides a comprehensive interface for creating, editing, testing, and building instances of jobs. When creating a job, you’ll need to define various properties, which are categorized into common properties applicable to all job types and type-specific properties. To enhance flexibility and reusability, many job properties support Jinja templating.

This documentation will provide a detailed explanation of each job property, helping you understand how to configure and optimize your Snoweaver jobs effectively.

Common Properties

Name

Specify a unique name for the job within the project.

Job Type

Snoweaver currently supports four job types:

Web Request Job

Initiates an HTTP request to a specified endpoint and processes the response, which can be in the form of a message or a file. This job type is versatile for interacting with external APIs or web services.

Generic Job

Executes a Jinja template and returns the rendered output. This flexible job type can be used for various tasks that require dynamic content generation or complex logic processing.

SSH Job

Establishes a secure connection to a remote server via SSH and executes specified shell commands. This job type is ideal for remote system administration or running scripts on distant machines.

SFTP Job

Connects to an SFTP (Secure File Transfer Protocol) server to either retrieve or upload files. This job type is crucial for secure file transfers between Snowflake and external systems, supporting both download and upload operations.

Instance Type

Snoweaver jobs can be deployed as executable job instances in Snowflake, offering three distinct types:

Procedure

  • Executed using the CALL command

  • Ideal for operations requiring procedural execution

  • When used as a web request job, returns results in a single-column format

  • Exclusive access to the call_proc function

  • Best for complex, multi-step operations or when side effects are needed

Function

  • Executed within the SELECT clause of a query

  • Returns results in a single-column format

  • Suitable for simple, atomic operations that return a single value

  • Useful for data transformations or calculations within queries

Table Function

  • Executed within the FROM clause of a query

  • Returns multiple columns, including:
    • Start and end times of execution

    • Output data

    • For web request jobs: HTTP status code and headers

  • Useful for complex data retrieval or processing tasks

Each instance type has its unique characteristics and use cases, allowing for flexible job design and execution based on specific requirements.

Macro Libraries

In Jinja, a macro is a reusable code snippet that defines a block of template code with parameters. Macro libraries allow you to consolidate and manage these macros, which can then be imported into your job as needed.

Functions

This property allows you to import Python functions directly into your job’s Jinja environment. Use the Functions page for the development and management of these functions.

Variables

Job variables are accessible throughout all Jinja-compatible components within a job. These variables act as input parameters for job instances, including procedures and functions.

Variables adhere to Snowflake’s data type model. When used in a Jinja template, they are mapped to Python data types as follows:

Snowflake SQL Type

Python Type

Text

str

Boolean

bool

Number

int or decimal.Decimal

Object

dict

Array

list

For detailed information on data type mappings between Snowflake and Python, refer to the Snowflake: SQL-Python Data Type Mappings

External Access Integration

Available for web request or SFTP jobs.

To grant access to specific external network locations, configure an external access integration. This setup involves defining network rules to specify allowed external locations and managing a list of secrets for secure connections.

Endpoint

Available for Web Request, SSH or SFTP jobs.

This property specifies the URL endpoint that the job will connect to. It supports Jinja templating, allowing the use of predefined variables, macro libraries, and functions. Note that it does not have access to pre-hook outputs or secrets.

Note

Avoid including query strings directly in this section. Instead, use the URL parameter template for such cases.

Filepath

Required for SFTP jobs or Web Request jobs when the Payload Format or Response Format properties are set to File. This defines the relative file path for uploading or downloading files from the DATA stage of the project.

This property supports Jinja templating and can utilize pre-hook outputs, predefined variables, macro libraries, and job functions. However, it does not have access to secrets.

For file uploads, Snoweaver prefixes the path with the job name. For instance, if the path is /dir1/file.csv, the full path will be @DATA/[job_name]/dir1/file.csv.

Note

The file path should be relative to the root of the DATA stage. Do not include the stage name “@DATA/” in the value

File Operations Settings

Available for web request or SFTP jobs.

Configure Snowflake file operation settings to manage the downloading or uploading of files from the DATA stage of the project.

Number of Threads

Determine the number of threads to use for file transfers to and from the DATA stage of the project. The file upload process handles data in batches based on size:

  • Small Files (< 64 MB, compressed or uncompressed): Staged in parallel as individual files.

  • Large Files: Automatically divided into chunks, staged concurrently, and reassembled at the target stage. A single thread can handle multiple chunks.

Increasing the number of threads can enhance performance, especially for large file uploads. Supported values range from 1 (no parallelism) to 99 (using up to 99 threads for uploads).

Auto Compress

This setting controls whether Snowflake applies gzip compression during the upload process. If the source file is already compressed, Snowflake will detect the compression method and decompress the file before proceeding.

Auto Decompress

Specifies whether gzip decompression is applied to files after they are downloaded from the DATA stage.

Overwrite

Indicates whether Snowflake should overwrite existing files with the same name during the upload process.

Authenticating Type

Required for SSH and SFTP jobs. Select the authentication method for connecting to the remote SSH server:

  • Password: Authenticate using a username and password combination. This method is simpler but potentially less secure than key-based authentication.

  • Key: Authenticate using a private key file. This method offers enhanced security through public-key cryptography.

Account Secret

Required for SSH and SFTP jobs. Specify the secret based on the chosen authentication method:

  • Password: Provide a basic secret containing the SSH/SFTP account’s username and password.

  • Key: Supply a basic string secret that includes the SSH/SFTP account’s username and the corresponding private key.

Passphrase Secret

Required for SSH and SFTP jobs when using key-based authentication with a passphrase-protected private key. Specify the generic string secret containing the passphrase used to decrypt the private key.

Port

Required for SSH and SFTP jobs. Specify the port number for the remote server connection. By default, SSH and SFTP typically use port 22. However, if the remote server is configured to use a different port, you need to specify it here.

Save Job Results

When enabled, Snoweaver saves the job run results directly to the RESULTS table within the project. This eliminates the need for additional steps to save the results to a staging table for downstream processing.

For implementation guidelines, please refer to this tutorial Tutorial: Save results and orchestrate tasks.

Note

Review your data retention strategies for the RESULTS table if you plan to store a large volume of result data.

Save Scope

The scope column value for records produced by this job in the RESULTS table.

For implementation guidelines, please refer to this tutorial Tutorial: Save results and orchestrate tasks.

Web Request Job Properties

Request Type

HTTP request types define the actions that can be performed on web resources. The supported methods include:

  • GET: Retrieves data from a specified resource.

  • POST: Submits data to a specified resource for processing.

  • PUT: Replaces a resource at a specified URI with a new representation.

  • DELETE: Removes the specified resource.

  • PATCH: Partially updates a resource

Payload Format

The HTTP payload, also known as the HTTP body, carries the data transmitted in an HTTP request. This data can take various forms, such as:

  • Text: Suitable for request bodies containing free text or XML data.

  • JSON: Used for interfacing with a REST API.

  • File: Used for uploading files from the project’s DATA stage.

  • Multipart: Used for multipart requests, allowing files specified from the DATA stage to be included in the payload.

Response Format

The response format defines the structure of the data returned from a job. The available response formats are:

Response Format

Return Data Type

Description

Text

Text

For request bodies containing free text or XML data.

JSON

Variant

For interfacing with a REST API.

File

Text

Used for downloading files or saving responses as files.
The job instance uploads the file to the DATA stage and returns its relative path within that stage.

Secrets

You can define a set of secrets along with their aliases, which can be referenced in any Jinja-compatible properties, excluding endpoint, filepath, and posthook.

For Generic String and OAuth secrets, use the following syntax to refer to their string or token values:

{{ _secrets.[alias] }}

For Basic secrets, use the syntax below to access their username or password values:

{{ _secrets.[alias].username }}
{{ _secrets.[alias].password }}

Pre-hook

Pre-hooks are used to perform essential prerequisites, such as invoking other procedures within the project or setting global variables. This property is compatible with Jinja, allowing the use of predefined variables, macro libraries, secrets, and job functions.

Other Jinja-compatible components can access the output of pre-hooks using the following variable:

{{ _prehook }}

Query Parameters

URL parameters, also known as query strings, are segments of a URL that convey data to web applications. They are commonly used to transmit parameters to a server-side script or application.

This property is Jinja compatible and can utilize pre-hook output, predefined variables, secrets, macro libraries, and functions.

The output must be in JSON format.

Headers

HTTP headers are key-value pairs sent in HTTP requests and responses between clients and servers. They provide essential metadata about the request or response, such as content type, content length, server type, client type, caching policies, and authentication credentials.

This property is Jinja-compatible and can utilize prehook output, predefined variables, secrets, macro libraries, and functions.

The output must be in JSON format.

Payload

The payload refers to the data segment of an HTTP request that contains the actual information being sent to the remote server.

This property is Jinja-compatible and can utilize prehook output, predefined variables, secrets, macro libraries, and functions.

The payload will be formatted based on the specified Payload Format upon verification or transmission.

Post-hook

Post-hooks are triggered after a response is received from a web request, allowing for additional processing or actions based on the response data.

This feature is fully compatible with Jinja, allowing you to use predefined variables, macro libraries, functions, status codes, and response data. However, it does not have access to the secrets.

In a post-hook, you can access the HTTP status code, headers, and payload of the response using the following variables:

{{ _status_code }}
{{ _headers }}
{{ _response }}

Replace The Response Payload

When enabled, this feature allows the job to substitute the original web response payload with the output produced by the post-hook. The generated output is subsequently transformed into the designated response format.

Note

If Save job results is enabled, the post-hook output will be stored in the RESULTS table instead.

Generic Job Properties

Output Template

This property defines the Jinja template used by the generic job. It supports the use of predefined variables, macro libraries, and job functions.

Output Format

Specify the format of the job’s output data.

Output Format

Return Data Type

Description

Text

Text

Use this format if the output contains free text data.

JSON

Variant

Use this format if the output is a JSON document.

SSH Job Properties

Shell Command

The Jinja template for the command to be executed on a remote Linux server. You can use predefined variables, macro libraries, and job functions to create complex and flexible SSH commands. The rendered template will be the actual command executed on the remote server.

SFTP Job Properties

Operation Type

Select whether the job downloads from or uploads to the SFTP server.

SFTP Service File Path

Specify the file path for uploading to or downloading from the SFTP server.

This property is compatible with Jinja and supports predefined variables, macro libraries, and functions. Note that it does not have access to secrets.