Technology Blog for Data Foundry, Decision Intelligence and Composite AI

FinOps for Snowflake

Written by Dr. Jagreet Kaur Gill | 12 August 2024

Introduction

Snowflake is a widely used cloud data platform that companies depend on for their data management needs. Once users discover how simple it is to work with, usage of the platform typically surges. Who would not want to analyze more data and gain more insights?

FinOps is a structured approach to overseeing cloud expenses that brings together departments such as Engineering, Finance, Data Science, and Product to strengthen financial accountability and optimize business outcomes. Snowflake's pay-as-you-go pricing makes FinOps practices especially important for managing spending on Snowflake resources efficiently.

Integrating FinOps principles involves balancing performance, cost, and quality considerations when making cloud architecture and investment choices.

General Strategies for Managing Snowflake Costs

Before diving deeper into the technical aspects of Snowflake cost management, let's look at some general approaches to managing costs in Snowflake environments.

Choose the right region

Snowflake is available on all three major cloud providers (AWS, Azure, and GCP) in various regions globally. It is crucial to choose the platform and region for your Snowflake account carefully. Ensure that the features you need are supported on your chosen platform, compare storage prices, which vary by platform and region, and consider where you will access Snowflake from, because data egress charges will appear on your bill. This decision is not only about cost optimization but also about improving performance by reducing latency between your users and your Snowflake data.

Monitor resource usage

Regularly monitoring resource usage is a crucial step in effectively managing Snowflake costs. By doing so, you can pinpoint areas where resource utilization can be optimized, and costs can be reduced.

Snowflake offers an account usage dashboard that allows you to monitor usage throughout your organization. This dashboard provides comprehensive insights into compute and storage usage, query activity, and other important metrics. Utilizing this information, you can identify specific areas where resource utilization can be optimized, leading to cost reduction.

Optimize resource utilization

Optimizing resource utilization is crucial to effectively managing Snowflake costs. By identifying resources that are either underutilized or overutilized and adjusting their allocation according to usage patterns, you can minimize overall costs or maximize the value derived from your investments.

A great starting point for this optimization process is Snowflake's Query History view. This view offers comprehensive insights into computing activity across all warehouses, enabling you to gain a holistic understanding of the overall activity.

Armed with this valuable information, you can make informed decisions regarding resource allocation. For instance, during peak usage periods, you can allocate more resources to ensure smooth operations, while during off-peak periods, you can reduce resource allocation to avoid unnecessary expenses.
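As a rough illustration of this kind of decision, the sketch below splits a day's hourly credit usage into peak and off-peak hours. The sample figures and the cut-off rule (hours above the daily average count as peak) are assumptions made for illustration, not output from Snowflake.

```python
from statistics import mean

# Hypothetical credits consumed by one warehouse, keyed by hour of day.
hourly_credits = {
    8: 1.5, 9: 4.0, 10: 6.5, 11: 6.0, 12: 2.0,
    13: 5.5, 14: 5.0, 20: 0.5, 21: 0.5,
}

def split_peak_hours(usage):
    """Return (peak_hours, off_peak_hours), using the mean as a simple cut-off."""
    threshold = mean(usage.values())
    peak = sorted(hour for hour, credits in usage.items() if credits > threshold)
    off_peak = sorted(hour for hour, credits in usage.items() if credits <= threshold)
    return peak, off_peak

peak_hours, off_peak_hours = split_peak_hours(hourly_credits)
print("peak hours:", peak_hours)
print("off-peak hours:", off_peak_hours)
```

In practice the input would come from your own usage data, and a more robust cut-off (for example, a percentile) may suit spiky workloads better than the mean.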

Use storage wisely

Efficiently managing your storage is crucial to minimize its impact on your Snowflake expenses. Here are some recommendations to effectively utilize storage in Snowflake:

  • Compress staged files with options like GZIP to reduce storage usage (Snowflake compresses data stored in tables automatically).

  • Keep infrequently accessed data in lower-cost external object storage such as Amazon S3.

  • Delete unused data regularly to free up storage space.

Deep Dive into Practicing FinOps in Snowflake Environments

Below are some recommendations and follow-up actions for establishing a strong FinOps methodology for Snowflake.

Providing finance teams and Snowflake engineers with visibility into cloud spend

Give finance teams and Snowflake practitioners access to detailed information on expenditure and utilization. This allows them to manage finances and track usage effectively within the FinOps framework. Snowflake offers comprehensive usage data through Account and Organization usage views, catering to the needs of both finance teams and Snowflake practitioners.

To enhance financial accountability and visibility, Snowflake provides various pre-built queries that enable users to analyze spending at various levels of granularity, including per organization, per account, per warehouse, per user, and per job. Additionally, the tagging feature allows for easy categorization and analysis of expenditures.

Moreover, Snowflake offers a user-friendly cost exploration interface that facilitates the visual examination of usage data.

Implementing charge-back & show-back models

A crucial outcome of implementing FinOps practices is enabling enterprises to achieve financial accountability. To accomplish this, we suggest monitoring usage at various levels of detail (such as organization, account, workload, warehouse, user, and task) and attributing expenses to specific teams, business units, customers, or cost centers (through tagging or other cost attribution features) to support "chargeback" or "showback" scenarios. These mechanisms also help hold business units accountable for reporting value realization against clearly defined criteria.

Virtual warehouses consume compute resources that are billable in the form of Snowflake credits. By utilizing separate Virtual Warehouses for each workload, the cost associated with each workload can be determined effortlessly. This information can then be aggregated at the business unit level to support chargeback scenarios.
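A minimal sketch of that aggregation follows, assuming each warehouse is dedicated to a single business unit. The warehouse names, the ownership mapping, the credit figures, and the price per credit are all hypothetical values chosen for illustration.

```python
from collections import defaultdict

# Hypothetical mapping of dedicated warehouses to the business unit that owns them.
WAREHOUSE_OWNER = {
    "ETL_WH": "data-engineering",
    "BI_WH": "analytics",
    "ADHOC_WH": "analytics",
    "ML_WH": "data-science",
}

# Hypothetical credits consumed per warehouse over one month.
credits_used = {"ETL_WH": 120.0, "BI_WH": 80.0, "ADHOC_WH": 20.0, "ML_WH": 60.0}

def chargeback(credits_by_warehouse, owners, price_per_credit):
    """Sum warehouse credits per business unit and convert them to currency."""
    totals = defaultdict(float)
    for warehouse, credits in credits_by_warehouse.items():
        # Warehouses without a known owner are reported separately.
        unit = owners.get(warehouse, "unallocated")
        totals[unit] += credits * price_per_credit
    return dict(totals)

report = chargeback(credits_used, WAREHOUSE_OWNER, price_per_credit=3.0)
for unit, cost in sorted(report.items()):
    print(f"{unit}: {cost:.2f}")
```

Keeping an explicit "unallocated" bucket makes untagged or unowned spend visible instead of silently dropping it from the report.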

Setting up budgets and alerts

It is advisable to establish budgets at various levels, including accounts and custom groupings of resources, among others.

The budget functionality in Snowflake allows users to set a spending limit for a specific time interval on either an account or a custom grouping of credit-consuming objects. This feature offers proactive and reactive alerts regarding spending limits and helps guard against exceeding them.

Budgets not only monitor warehouses and storage, but also keep track of serverless usage, such as automatic clustering, materialized views, search optimization, pipe, and replication. If the spending limit is projected to be surpassed, a daily notification is sent.
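To make the idea of a projected overrun concrete, here is a small sketch that extrapolates month-to-date spend linearly and compares it with a spending limit. The linear projection and the function names are assumptions for illustration; they are not a description of how Snowflake computes its own forecast.

```python
def projected_month_end_spend(spend_to_date, day_of_month, days_in_month):
    """Linearly extrapolate the spend so far to the end of the month."""
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must fall within the month")
    return spend_to_date / day_of_month * days_in_month

def should_notify(spend_to_date, day_of_month, days_in_month, spending_limit):
    """Return True when the projected spend exceeds the spending limit."""
    projected = projected_month_end_spend(spend_to_date, day_of_month, days_in_month)
    return projected > spending_limit

# 400 credits used by day 10 of a 30-day month projects to 1200 credits.
print(should_notify(400, 10, 30, spending_limit=1000))
```

A check like this could run once a day, which mirrors the daily notification cadence described above.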

Optimizing utilization of paid resources

Snowflake provides various metrics that can be used to optimize compute resources. These metrics include warehouse load metrics, warehouse utilization metrics (currently in private preview), metrics on data spilling to disk/object storage, and warehouse events history (currently in public preview). By analyzing these metrics, users can make informed decisions to "right size" their compute resources.

One way to optimize compute resources is to set warehouses to suspend and resume automatically based on activity. This ensures that credits are consumed only while the warehouse is in use. By default, Snowflake automatically suspends a warehouse if it remains inactive for a specified time. Similarly, it automatically resumes the warehouse when a job arrives. It is important to note that these auto-suspend and auto-resume behaviors apply to the entire warehouse, not to individual clusters within the warehouse.

To achieve cost efficiency, it is recommended to set the auto-suspend duration based on the workload's ability to take advantage of warehouse caches. This involves finding the right balance between suspending compute quickly to save costs and keeping the warehouse running long enough to benefit from Snowflake's caching, which can improve price-performance.
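The cost side of this trade-off can be estimated with simple arithmetic. The sketch below computes the credits spent while a warehouse sits idle waiting for auto-suspend, assuming every idle gap is longer than the timeout. The credit rate, the number of idle gaps per day, and the function name are illustrative assumptions, and the sketch deliberately ignores the performance benefit of a warm cache.

```python
def idle_credits_per_day(credits_per_hour, auto_suspend_seconds, idle_gaps_per_day):
    """Credits spent waiting for auto-suspend, assuming each idle gap
    lasts longer than the auto-suspend timeout."""
    idle_hours = auto_suspend_seconds * idle_gaps_per_day / 3600
    return credits_per_hour * idle_hours

# Compare three candidate timeouts for a warehouse billed at 4 credits per hour
# that goes idle 30 times a day (both figures are assumptions).
for timeout in (60, 300, 600):
    credits = idle_credits_per_day(4, timeout, idle_gaps_per_day=30)
    print(f"auto-suspend {timeout:>3}s -> {credits:.2f} idle credits/day")
```

Comparing these idle costs against the slowdown measured after a cold start gives a rough basis for choosing a timeout per workload.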

Using return-on-investment for further investments in workloads

Snowflake's cost visibility tools help users understand the investment needed for each workload, while the return on that investment depends on the business value of the use case. By evaluating return on investment (ROI), teams can decide whether to re-architect a workload, adjust its usage, or retire it to maximize the return on spend. Snowflake also provides virtual warehouse configurations to optimize workload management based on concurrency, throughput, and cost requirements.
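As a simple worked example, ROI can be computed as (business value - cost) / cost and used to rank workloads. The workload names and figures below are hypothetical, and estimating the business value of a workload is the hard part that this sketch takes as given.

```python
def roi(business_value, cost):
    """Return on investment expressed as a fraction of the cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (business_value - cost) / cost

# Hypothetical yearly business value and Snowflake cost per workload.
workloads = {
    "churn_model": {"value": 50_000, "cost": 10_000},
    "daily_dashboard": {"value": 12_000, "cost": 8_000},
    "legacy_export": {"value": 1_000, "cost": 4_000},
}

# Rank workloads from highest to lowest ROI.
ranked = sorted(
    workloads,
    key=lambda name: roi(workloads[name]["value"], workloads[name]["cost"]),
    reverse=True,
)
for name in ranked:
    w = workloads[name]
    print(f"{name}: ROI {roi(w['value'], w['cost']):.0%}")
```

A workload with negative ROI, like the last one here, would be a candidate for re-architecting or retirement.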

Setting up spend limiting guardrails

For mature and large-scale Snowflake use cases managed by a central platform team, it is advisable to implement policies that restrict certain users or teams from carrying out credit-consuming operations such as creating warehouses, resizing them, and changing scaling limits. This helps in managing costs effectively and minimizing unintended expenses caused by additional users.

Snowflake lets you assign permissions for using, creating, and modifying credit-consuming resources (such as warehouse creation, sizing, scaling limits, and the Query Acceleration Service (QAS) scale factor) to manage costs effectively and reduce uncertainty.

The configuration of a virtual warehouse, including its size, number of clusters, scaling policy, auto-suspend timeout, and statement timeouts for running and queued statements, establishes an upper limit on workload expenditure. When utilizing multi-cluster warehouses, the standard scaling policy is designed to prioritize minimizing queuing time over conserving credits.
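The upper limit implied by a warehouse configuration can be approximated as credits per hour for the size, multiplied by the maximum cluster count and the hours in the period. The sketch below assumes the commonly published rates for standard warehouses, which double with each size step; treat the table as an assumption and check it against your own contract and warehouse type.

```python
# Assumed hourly credit rates for standard warehouses; verify for your account.
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

def max_credits(size, max_clusters, hours):
    """Worst-case credits if every cluster runs for the whole period."""
    if max_clusters < 1 or hours < 0:
        raise ValueError("max_clusters must be >= 1 and hours must be >= 0")
    return CREDITS_PER_HOUR[size] * max_clusters * hours

# A Medium multi-cluster warehouse with up to 3 clusters over a 30-day month.
print(max_credits("MEDIUM", max_clusters=3, hours=24 * 30))
```

This is a ceiling, not a forecast: auto-suspend and scaling down idle clusters normally keep actual consumption well below it.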

Additionally, Resource Monitors play a crucial role by providing alerts and setting strict limits to prevent overspending through credit quotas assigned to individual warehouses.
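The logic of quota thresholds can be sketched as follows. This is a plain Python model of the idea, not Snowflake's implementation; the action names mirror the notify and suspend actions that resource monitors support, while the threshold percentages and the function name are assumptions.

```python
def triggered_actions(credits_used, credit_quota, triggers):
    """Return the actions whose percentage threshold has been reached,
    ordered from the lowest threshold to the highest."""
    if credit_quota <= 0:
        raise ValueError("credit_quota must be positive")
    used_percent = credits_used / credit_quota * 100
    return [action for percent, action in sorted(triggers) if used_percent >= percent]

# Hypothetical thresholds: warn at 75%, suspend at 100%, hard stop at 110%.
triggers = [(75, "NOTIFY"), (100, "SUSPEND"), (110, "SUSPEND_IMMEDIATE")]

print(triggered_actions(80, 100, triggers))
print(triggered_actions(105, 100, triggers))
```

Placing a notification threshold well below the suspend threshold gives teams time to react before workloads are interrupted.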

Conclusion

Snowflake, a popular cloud data platform, offers powerful data analysis capabilities. However, companies must manage Snowflake resources and costs responsibly. Implementing FinOps, a structured approach to overseeing cloud expenses, is crucial.

Choosing the right region for Snowflake access is key to optimizing costs. Monitoring resource usage, optimizing utilization, and using storage wisely are also essential steps.

Practicing FinOps in Snowflake involves giving teams visibility into spend, implementing charge-back and show-back models, setting budgets, optimizing resource utilization, and using ROI to guide further investments. Implementing spend-limiting guardrails and Resource Monitors is crucial for effective cost management.