Jobs
A job in Snoweaver is a configurable unit that defines the metadata and instructions necessary to execute specific tasks. Jobs can be instantiated as executable entities, which can then be triggered programmatically or integrated into Snowflake Tasks for scheduled execution alongside other data pipelines.
The Jobs page in Snoweaver provides a comprehensive interface for creating, editing, testing, and building instances of jobs. When creating a job, you’ll need to define various properties, which are categorized into common properties applicable to all job types and type-specific properties. To enhance flexibility and reusability, many job properties support Jinja templating.
This documentation will provide a detailed explanation of each job property, helping you understand how to configure and optimize your Snoweaver jobs effectively.
Common Properties
Name
Specify a unique name for the job within the project.
Job Type
Snoweaver currently supports four job types:
Web Request Job
Initiates an HTTP request to a specified endpoint and processes the response, which can be in the form of a message or a file. This job type is versatile for interacting with external APIs or web services.
Generic Job
Executes a Jinja template and returns the rendered output. This flexible job type can be used for various tasks that require dynamic content generation or complex logic processing.
SSH Job
Establishes a secure connection to a remote server via SSH and executes specified shell commands. This job type is ideal for remote system administration or running scripts on distant machines.
SFTP Job
Connects to an SFTP (Secure File Transfer Protocol) server to either retrieve or upload files. This job type is crucial for secure file transfers between Snowflake and external systems, supporting both download and upload operations.
Instance Type
Snoweaver jobs can be deployed as executable job instances in Snowflake, offering three distinct types:
Procedure
Executed using the CALL command
Ideal for operations requiring procedural execution
When used as a web request job, returns results in a single-column format
Exclusive access to the call_proc function
Best for complex, multi-step operations or when side effects are needed
Function
Executed within the SELECT clause of a query
Returns results in a single-column format
Suitable for simple, atomic operations that return a single value
Useful for data transformations or calculations within queries
Table Function
Executed within the FROM clause of a query
- Returns multiple columns, including:
Start and end times of execution
Output data
For web request jobs: HTTP status code and headers
Useful for complex data retrieval or processing tasks
Each instance type has its unique characteristics and use cases, allowing for flexible job design and execution based on specific requirements.
Macro Libraries
In Jinja, a macro is a reusable code snippet that defines a block of template code with parameters. Macro libraries allow you to consolidate and manage these macros, which can then be imported into your job as needed.
Functions
This property allows you to import Python functions directly into your job’s Jinja environment. Use the Functions page for the development and management of these functions.
Variables
Job variables are accessible throughout all Jinja-compatible components within a job. These variables act as input parameters for job instances, including procedures and functions.
Variables adhere to Snowflake’s data type model. When used in a Jinja template, they are mapped to Python data types as follows:
Snowflake SQL Type |
Python Type |
---|---|
Text |
str |
Boolean |
bool |
Number |
int or decimal.Decimal |
Object |
dict |
Array |
list |
For detailed information on data type mappings between Snowflake and Python, refer to the Snowflake: SQL-Python Data Type Mappings
External Access Integration
Available for web request or SFTP jobs.
To grant access to specific external network locations, configure an external access integration. This setup involves defining network rules to specify allowed external locations and managing a list of secrets for secure connections.
Endpoint
Available for Web Request, SSH or SFTP jobs.
This property specifies the URL endpoint that the job will connect to. It supports Jinja templating, allowing the use of predefined variables, macro libraries, and functions. Note that it does not have access to pre-hook outputs or secrets.
Note
Avoid including query strings directly in this section. Instead, use the URL parameter template for such cases.
Filepath
Required for SFTP jobs or Web Request jobs when the Payload Format or Response Format properties are set to File. This defines the relative file path for uploading or downloading files from the DATA stage of the project.
This property supports Jinja templating and can utilize pre-hook outputs, predefined variables, macro libraries, and job functions. However, it does not have access to secrets.
For file uploads, Snoweaver prefixes the path with the job name. For instance, if the path is /dir1/file.csv, the full path will be @DATA/[job_name]/dir1/file.csv.
Note
The file path should be relative to the root of the DATA stage. Do not include the stage name “@DATA/” in the value
File Operations Settings
Available for web request or SFTP jobs.
Configure Snowflake file operation settings to manage the downloading or uploading of files from the DATA stage of the project.
Number of Threads
Determine the number of threads to use for file transfers to and from the DATA stage of the project. The file upload process handles data in batches based on size:
Small Files (< 64 MB, compressed or uncompressed): Staged in parallel as individual files.
Large Files: Automatically divided into chunks, staged concurrently, and reassembled at the target stage. A single thread can handle multiple chunks.
Increasing the number of threads can enhance performance, especially for large file uploads. Supported values range from 1 (no parallelism) to 99 (using up to 99 threads for uploads).
Auto Compress
This setting controls whether Snowflake applies gzip compression during the upload process. If the source file is already compressed, Snowflake will detect the compression method and decompress the file before proceeding.
Auto Decompress
Specifies whether gzip decompression is applied to files after they are downloaded from the DATA stage.
Overwrite
Indicates whether Snowflake should overwrite existing files with the same name during the upload process.
Authenticating Type
Required for SSH and SFTP jobs. Select the authentication method for connecting to the remote SSH server:
Password: Authenticate using a username and password combination. This method is simpler but potentially less secure than key-based authentication.
Key: Authenticate using a private key file. This method offers enhanced security through public-key cryptography.
Account Secret
Required for SSH and SFTP jobs. Specify the secret based on the chosen authentication method:
Password: Provide a basic secret containing the SSH/SFTP account’s username and password.
Key: Supply a basic string secret that includes the SSH/SFTP account’s username and the corresponding private key.
Passphrase Secret
Required for SSH and SFTP jobs when using key-based authentication with a passphrase-protected private key. Specify the generic string secret containing the passphrase used to decrypt the private key.
Port
Required for SSH and SFTP jobs. Specify the port number for the remote server connection. By default, SSH and SFTP typically use port 22. However, if the remote server is configured to use a different port, you need to specify it here.
Save Job Results
When enabled, Snoweaver saves the job run results directly to the RESULTS table within the project. This eliminates the need for additional steps to save the results to a staging table for downstream processing.
For implementation guidelines, please refer to this tutorial Tutorial: Save results and orchestrate tasks.
Note
Review your data retention strategies for the RESULTS table if you plan to store a large volume of result data.
Save Scope
The scope column value for records produced by this job in the RESULTS table.
For implementation guidelines, please refer to this tutorial Tutorial: Save results and orchestrate tasks.
Web Request Job Properties
Request Type
HTTP request types define the actions that can be performed on web resources. The supported methods include:
GET: Retrieves data from a specified resource.
POST: Submits data to a specified resource for processing.
PUT: Replaces a resource at a specified URI with a new representation.
DELETE: Removes the specified resource.
PATCH: Partially updates a resource
Payload Format
The HTTP payload, also known as the HTTP body, carries the data transmitted in an HTTP request. This data can take various forms, such as:
Text: Suitable for request bodies containing free text or XML data.
JSON: Used for interfacing with a REST API.
File: Used for uploading files from the project’s DATA stage.
Multipart: Used for multipart requests, allowing files specified from the DATA stage to be included in the payload.
Response Format
The response format defines the structure of the data returned from a job. The available response formats are:
Response Format |
Return Data Type |
Description |
---|---|---|
Text |
Text |
For request bodies containing free text or XML data. |
JSON |
Variant |
For interfacing with a REST API. |
File |
Text |
Used for downloading files or saving responses as files.
The job instance uploads the file to the DATA stage and returns its relative path within that stage.
|
Secrets
You can define a set of secrets along with their aliases, which can be referenced in any Jinja-compatible properties, excluding endpoint, filepath, and posthook.
For Generic String and OAuth secrets, use the following syntax to refer to their string or token values:
{{ _secrets.[alias] }}
For Basic secrets, use the syntax below to access their username or password values:
{{ _secrets.[alias].username }}
{{ _secrets.[alias].password }}
Pre-hook
Pre-hooks are used to perform essential prerequisites, such as invoking other procedures within the project or setting global variables. This property is compatible with Jinja, allowing the use of predefined variables, macro libraries, secrets, and job functions.
Other Jinja-compatible components can access the output of pre-hooks using the following variable:
{{ _prehook }}
Query Parameters
URL parameters, also known as query strings, are segments of a URL that convey data to web applications. They are commonly used to transmit parameters to a server-side script or application.
This property is Jinja compatible and can utilize pre-hook output, predefined variables, secrets, macro libraries, and functions.
The output must be in JSON format.
Headers
HTTP headers are key-value pairs sent in HTTP requests and responses between clients and servers. They provide essential metadata about the request or response, such as content type, content length, server type, client type, caching policies, and authentication credentials.
This property is Jinja-compatible and can utilize prehook output, predefined variables, secrets, macro libraries, and functions.
The output must be in JSON format.
Payload
The payload refers to the data segment of an HTTP request that contains the actual information being sent to the remote server.
This property is Jinja-compatible and can utilize prehook output, predefined variables, secrets, macro libraries, and functions.
The payload will be formatted based on the specified Payload Format upon verification or transmission.
Post-hook
Post-hooks are triggered after a response is received from a web request, allowing for additional processing or actions based on the response data.
This feature is fully compatible with Jinja, allowing you to use predefined variables, macro libraries, functions, status codes, and response data. However, it does not have access to the secrets.
In a post-hook, you can access the HTTP status code, headers, and payload of the response using the following variables:
{{ _status_code }}
{{ _headers }}
{{ _response }}
Replace The Response Payload
When enabled, this feature allows the job to substitute the original web response payload with the output produced by the post-hook. The generated output is subsequently transformed into the designated response format.
Note
If Save job results is enabled, the post-hook output will be stored in the RESULTS table instead.
Generic Job Properties
Output Template
This property defines the Jinja template used by the generic job. It supports the use of predefined variables, macro libraries, and job functions.
Output Format
Specify the format of the job’s output data.
Output Format |
Return Data Type |
Description |
---|---|---|
Text |
Text |
Use this format if the output contains free text data. |
JSON |
Variant |
Use this format if the output is a JSON document. |
SSH Job Properties
Shell Command
The Jinja template for the command to be executed on a remote Linux server. You can use predefined variables, macro libraries, and job functions to create complex and flexible SSH commands. The rendered template will be the actual command executed on the remote server.
SFTP Job Properties
Operation Type
Select whether the job downloads from or uploads to the SFTP server.
SFTP Service File Path
Specify the file path for uploading to or downloading from the SFTP server.
This property is compatible with Jinja and supports predefined variables, macro libraries, and functions. Note that it does not have access to secrets.