Useful information

Considerations

Resource usage

Snoweaver primarily functions as a code generator and requires minimal computational resources. Sharing an XSmall warehouse among developers is typically sufficient for most development and testing scenarios.

Storage costs

Data stored in the RESULTS table or project stages incurs storage costs. To manage these costs effectively:

  • Perform regular cleanup: If you don’t intend to keep data for extended periods, implement periodic cleanup processes.

  • Replicate to raw layer: Consider adding a step in your pipeline to replicate new data or files to your raw data layer. This ensures data preservation in case Snoweaver is uninstalled (accidentally or otherwise).

  • Set retention policies: Implement data retention policies aligned with your business needs and compliance requirements.

Data retention

The retention time for the RESULTS table is set to 1 day, with a maximum data extension time of 14 days.

Copilot costs

  • Snoweaver Copilot leverages Snowflake’s Cortex LLM COMPLETE function, which incurs compute costs. For detailed information, refer to Snowflake’s Cortex cost considerations.

  • Copilot sends the chat history (all previous questions and responses) to Cortex along with each new prompt. As the chat history grows, token consumption per request grows with it. To manage this, click the New Chat button to clear the current history and start a fresh chat.

  • By default, Copilot processes your questions within a system context. The token consumption for this context varies based on the size of your code and the complexity of your Jinja environment, including variables, functions, and macros used in the job. For general questions, you can select the omit context option. This excludes the system context, reducing token usage and potentially speeding up response times.
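The effect of chat history and system context on token usage can be illustrated with a rough arithmetic sketch. The token counts below are made-up figures, and the function assumes that the full history and the system context are resent with every request; it is not a description of how Copilot or Cortex actually meter usage.

```python
def prompt_tokens_per_turn(turns, context_tokens=0):
    """Estimate the tokens sent on each turn when the full history is resent.

    turns: list of (question_tokens, response_tokens) pairs (hypothetical sizes).
    context_tokens: size of the system context, assumed to be sent every turn.
    """
    sent = []
    history = 0
    for question, response in turns:
        # Each request carries the context, the accumulated history, and the new question.
        sent.append(context_tokens + history + question)
        # The question and its response both become part of the history.
        history += question + response
    return sent


# Four turns of 50-token questions and 200-token responses (illustrative only).
turns = [(50, 200)] * 4
with_context = prompt_tokens_per_turn(turns, context_tokens=1000)
without_context = prompt_tokens_per_turn(turns)

print(with_context)     # [1050, 1300, 1550, 1800]
print(without_context)  # [50, 300, 550, 800]
```

Under these assumptions, each request costs more than the one before it, so the total for a long chat grows faster than the number of turns. Starting a new chat resets the history term, and omitting the context removes the fixed per-request overhead.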

Copilot configuration

  • To achieve optimal response quality, we recommend using the most recent language models. Newer models have shown substantial improvements in performance.

  • For cost efficiency, set a small or medium-sized model as the default. Offer users access to one or two larger, high-quality models as alternatives, ensuring they have options if the default responses do not meet their needs.

For further details, please refer to this page: Choosing a model

Environment isolation and access control

The call_proc function can only invoke procedures within the same project and is only usable if the instance type is set to procedure. For interactions with other projects or Snowflake objects, consider using a Snowflake REST API job.

Example: Tutorial: Execute Snowflake statements via Rest API

Upgrading Snoweaver

When upgrading to a newer version or patch of Snoweaver:

  • Launch the Admin console to initiate the API upgrade procedure.

  • After the upgrade, rebuild instances for all jobs to ensure the latest code is applied:

    • Start with Non-Production projects

    • Thoroughly test the rebuilt jobs in these environments

  • Once testing is complete and successful in Non-Production:

    • Proceed to rebuild and test Production project jobs

This approach ensures a smooth upgrade process and minimizes potential disruptions to your production environment.

Handling Large Data Volumes in REST Responses

When dealing with large responses from REST APIs, especially those exceeding the size limit of the VARIANT data type in Snowflake:

  • Save the response as a compressed file (e.g., GZIP) in the DATA stage.

  • Use appropriate Snowflake functions to query and process the data based on its format (JSON, CSV, etc.).
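As a rough illustration of the compression step, the sketch below GZIP-compresses a JSON response body using only the Python standard library and verifies that it round-trips. The payload and the helper names are hypothetical, and writing the resulting file to the DATA stage is not shown, since that step depends on your job configuration.

```python
import gzip
import json


def compress_response(body: str) -> bytes:
    """GZIP-compress a text response body (hypothetical helper)."""
    return gzip.compress(body.encode("utf-8"))


def decompress_response(blob: bytes) -> str:
    """Reverse of compress_response, used here only to check the round trip."""
    return gzip.decompress(blob).decode("utf-8")


# A made-up, repetitive JSON payload standing in for a large REST response.
payload = json.dumps({"items": [{"id": i, "status": "ok"} for i in range(10_000)]})
blob = compress_response(payload)

# Compare the raw and compressed sizes in bytes.
print(len(payload.encode("utf-8")), len(blob))

# The compressed file must decode back to the original body.
assert decompress_response(blob) == payload
```

Repetitive JSON such as this tends to compress well, which reduces storage in the stage as well as the size of the file to be queried later.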

Best practices

Efficient job creation

To ensure correct configuration, avoid creating jobs from scratch locally unless you are well-versed in the YAML properties and values. Use the project console to create a draft job, then export it as a file using Snoweaver CLI, or copy and edit the YAML file of an existing job locally.

Increasing Productivity

  • Run multiple Snoweaver sessions on the same project for efficient cross-referencing of resources like jobs, libraries, and functions. This approach doesn’t consume additional credits when using the same warehouse.

  • Leverage Copilot to generate draft code for complex logic, saving time and effort. This is particularly useful for Jinja components, macros, and Python functions.

  • Use version control systems to track changes and collaborate effectively. Regular commits and backups ensure code safety and enable easy rollbacks if needed.

  • Implement a consistent naming convention for jobs, libraries, and functions to improve code readability and maintainability.

  • Create reusable macros and functions for common operations to reduce code duplication and enhance consistency across your project.

  • For complex Jinja or Python logic, it might be beneficial to develop and test your code locally using an IDE like Visual Studio Code, and use the Snoweaver CLI for code validation and integration. Example: Tutorial: Job development using a client machine

Version control and backup

Commit your changes to a version control system using the export function whenever possible. Schedule regular backups of resource files for restoration or recovery.

Example: Tutorial: Integrate a project with Github

Environment Projects

If using a single Snowflake account for all virtual environments, create different environment projects for the same project to manage the release lifecycle. Grant project roles to different user groups as needed for Role-Based Access Control.

Jinja debugging

To view all variables and resources available in Jinja, execute the following code:

{% debug %}

Jinja Macros

  • When developing macros, it’s recommended to use ‘-’ at the beginning or end of a block, comment, or variable expression to eliminate any unintended whitespace or newline characters. For example,

    {%- macro get_id(response, name) -%}
    
  • When invoking a Jinja macro, it’s recommended to apply the trim filter to prevent leading or trailing spaces and newline characters if they are not intended. For example,

    {% set id=call_proc('get_id',1) | trim %}
    

Troubleshooting

Test call debugging

To troubleshoot unknown issues during test calls, switch the response format to Text if the format is set to JSON, and disable the raise for status option. This will display the raw response content for debugging.

Function naming

When using Save as to create a new function, make sure to rename the Python definition to match the new function name so that it passes field validation.
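The following sketch shows the kind of mismatch involved. It is not Snoweaver's validation logic, which is not documented here; the checker and the function names get_id and get_account_id are hypothetical and only illustrate why the definition must be renamed after Save as.

```python
import ast


def defines_function(source: str, expected_name: str) -> bool:
    """Return True if the source has a top-level def named expected_name."""
    tree = ast.parse(source)
    return any(
        isinstance(node, ast.FunctionDef) and node.name == expected_name
        for node in tree.body
    )


# Code copied by Save as still carries the old definition name.
original = "def get_id(response, name):\n    return response[name]\n"

# Renaming the definition to match the new function name resolves the mismatch.
renamed = original.replace("def get_id(", "def get_account_id(", 1)

print(defines_function(original, "get_account_id"))  # False
print(defines_function(renamed, "get_account_id"))   # True
```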

Project variable update

When the value of a project variable is updated, all job instances need to be rebuilt to reflect the change.