
Improving Data Governance with Snowflake's Quality Management

Navdeep Singh Gill | 24 March 2025


In an era where data drives every decision, ensuring accuracy, security, and compliance is more critical than ever. Organizations can significantly enhance their data governance practices by leveraging Snowflake's quality management capabilities. This means keeping data consistent and reliable and ensuring the right people can access the correct information when needed.

 

Snowflake's integrated tools allow teams to monitor and manage data quality in real time, providing insights that help prevent errors before they become costly. By incorporating these tools, businesses can create a more trustworthy data environment that supports faster decision-making and fuels innovation.

 

In this discussion, we'll explore how improving data governance with Snowflake's Quality Management can transform your data practices, making your organization more agile, secure, and confident in its data-driven strategies.

What is Data Governance? 

Data governance is the broad set of policies, strategies, standards, and technologies that organizations use to ensure the accuracy, consistency, accessibility, and security of data. It is not just about declaring data a strategic asset; it is about having formal practices to manage it across its life cycle. Good data governance strikes the right balance of control and flexibility, helping organizations extract maximum value from their data assets while ensuring compliance and minimizing risk. 

Why Data Governance Matters 

In today's data-driven business landscape, the importance of strong data governance cannot be overstated: 

  • Decision Quality: Better data leads to better decisions. When data is reliable and trustworthy, organizations can confidently base strategic decisions on it. 

  • Regulatory Compliance: With regulations like GDPR, CCPA, and industry-specific mandates, proper data governance is no longer optional. 

  • Operational Efficiency: Eliminating data silos and inconsistencies reduces redundant work and streamlines processes. 

  • Cost Reduction: According to Gartner, poor data quality costs organizations an average of $12.9 million annually. 

  • Innovation Enablement: Clean, accessible data accelerates analytics initiatives and AI/ML projects. 

Snowflake's Approach to Data Governance 

Snowflake has emerged as a leading data platform that recognizes governance as a foundational element rather than an afterthought. Their approach integrates governance capabilities directly into their data cloud platform, focusing on: 

  • Unified Governance: A single framework spanning the entire data lifecycle 

  • Scalable Controls: Governance that grows with your data volume and complexity 

  • Automated Quality Management: Built-in tools to monitor and maintain data quality 

  • Collaborative Oversight: Features that enable cross-functional governance team collaboration 

  • Real-time Visibility: Dashboards and metrics that provide immediate insights into data health 

Getting Started with Snowflake Quality Management 

Overview of Available Tools 

Snowflake's quality management ecosystem includes several powerful components (see the sketch after this list): 

  • Data Quality Rules: Define expectations for your data through SQL-based validation rules 

  • Observability Framework: Continuous monitoring and detection of anomalies 

  • Quality Tags and Scores: Metadata that indicates the reliability of datasets 

  • Validation Testing: Tools to verify data against established quality criteria 

  • Historical Quality Metrics: Track quality trends over time to identify patterns 
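
As a concrete example of SQL-based quality rules, Snowflake's Data Metric Functions (DMFs) can be attached directly to tables. A minimal sketch, assuming a customers table with an email column and the built-in SNOWFLAKE.CORE.NULL_COUNT metric (DMFs require Enterprise Edition or higher; verify availability in your account):

-- Schedule attached metrics to run every 5 minutes
ALTER TABLE customers SET DATA_METRIC_SCHEDULE = '5 MINUTE';

-- Attach a built-in metric that counts NULLs in the email column
ALTER TABLE customers
  ADD DATA METRIC FUNCTION SNOWFLAKE.CORE.NULL_COUNT ON (email);

-- Review the recorded measurements
SELECT *
FROM SNOWFLAKE.LOCAL.DATA_QUALITY_MONITORING_RESULTS
WHERE table_name = 'CUSTOMERS';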

Setting Up Your Environment 

Before implementing quality management tools, ensure your Snowflake environment is configured correctly (a minimal setup sketch follows this list): 

  • Role Setup: Create dedicated roles for quality management with appropriate permissions 

  • Warehouse Configuration: Allocate appropriate compute resources 

  • Storage Preparation: Create dedicated locations for quality check results 
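
The sketch below shows one possible setup. The role, warehouse, database, schema, and table names are illustrative assumptions (reused by later examples in this article), not Snowflake defaults:

CREATE ROLE IF NOT EXISTS quality_admin;

CREATE WAREHOUSE IF NOT EXISTS quality_monitoring_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60      -- suspend after 60 seconds idle to control cost
  AUTO_RESUME = TRUE;

CREATE DATABASE IF NOT EXISTS data_governance;
CREATE SCHEMA IF NOT EXISTS data_governance.quality;

-- Results table referenced throughout this article; check_name and
-- impact_level are optional columns used by the dashboard view later on
CREATE TABLE IF NOT EXISTS data_governance.quality.validation_results (
  check_date   TIMESTAMP_NTZ,
  dataset_name STRING,
  check_name   STRING,
  result       STRING,
  impact_level STRING
);

GRANT USAGE ON WAREHOUSE quality_monitoring_wh TO ROLE quality_admin;
GRANT ALL ON SCHEMA data_governance.quality TO ROLE quality_admin;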

Basic Configuration Steps 

With the foundation in place, configure your quality management infrastructure (an email-channel example follows this list): 

  • Enable Quality Features: Activate quality management in your account settings 

  • Install Quality Extensions: Add necessary extensions from Snowflake Marketplace  

  • Configure Alert Destinations: Set up notification channels (email, Slack, etc.) 

  • Create Baseline Metrics: Establish initial quality measurements for key datasets 
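
For example, an email notification channel can be defined once and reused by alerting procedures. The integration name matches the one used later in this article; the recipient address is a placeholder:

CREATE OR REPLACE NOTIFICATION INTEGRATION data_alerts
  TYPE = EMAIL
  ENABLED = TRUE
  ALLOWED_RECIPIENTS = ('data-team@example.com');  -- replace with real recipients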

Core Quality Management Features 

Data Validation Tools 

Snowflake provides multiple approaches to validate your data: 

Row-level Validation: Check individual records against business rules  

CREATE OR REPLACE PROCEDURE validate_customer_data() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  BEGIN 
    -- Log every customer whose age falls outside a plausible range; 
    -- the WHERE clause already restricts the insert to invalid rows 
    INSERT INTO data_governance.quality.validation_results (check_date, dataset_name, result) 
    SELECT current_timestamp(), 'customers', 'Invalid age' 
    FROM customers 
    WHERE age < 0 OR age > 120; 

    RETURN 'Validation complete'; 
  END; 
$$; 

Aggregate Validation: Ensure dataset-level statistics meet expectations  

CREATE OR REPLACE PROCEDURE validate_orders_completeness() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    missing_values INTEGER; 
  BEGIN 
    -- Count orders missing either of the required foreign keys 
    SELECT COUNT(*) INTO :missing_values 
    FROM orders 
    WHERE product_id IS NULL OR customer_id IS NULL; 

    IF (missing_values > 0) THEN 
      INSERT INTO data_governance.quality.validation_results (check_date, dataset_name, result) 
      VALUES (current_timestamp(), 'orders', 'Missing required values: ' || missing_values); 
    END IF; 

    RETURN 'Validation complete'; 
  END; 
$$; 

Schema Validation: Verify the structural integrity of your data, as sketched below 
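
A minimal schema check can compare a table's actual columns against an expected list via INFORMATION_SCHEMA; the procedure name and expected column list below are illustrative assumptions:

CREATE OR REPLACE PROCEDURE validate_customers_schema()
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
  DECLARE
    missing_columns INTEGER;
  BEGIN
    -- Expected columns that are absent from the live CUSTOMERS table
    SELECT COUNT(*) INTO :missing_columns
    FROM (
      SELECT column_name
      FROM (VALUES ('CUSTOMER_ID'), ('EMAIL'), ('AGE'), ('PHONE')) AS expected(column_name)
      MINUS
      SELECT column_name
      FROM information_schema.columns
      WHERE table_name = 'CUSTOMERS'
    );

    IF (missing_columns > 0) THEN
      RETURN 'FAIL: ' || missing_columns || ' expected columns are missing';
    ELSE
      RETURN 'PASS: Schema matches expectations';
    END IF;
  END;
$$;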

Monitoring Capabilities 

Continuous monitoring ensures proactive quality management (a freshness-check sketch follows this list): 

  • Anomaly Detection: Identify unusual patterns in data quality metrics 

  • Threshold Monitoring: Alert when quality scores fall below acceptable levels 

  • Freshness Tracking: Ensure data is updated according to expected schedules 

  • Volume Monitoring: Detect unexpected changes in data volume 
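
For instance, freshness tracking can be expressed as a check against the most recent load timestamp. The orders table, its order_date column, and the 24-hour threshold are assumptions carried over from earlier examples:

CREATE OR REPLACE PROCEDURE check_orders_freshness()
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
  DECLARE
    hours_since_load INTEGER;
  BEGIN
    -- Hours elapsed since the most recent order was loaded
    SELECT DATEDIFF('hour', MAX(order_date), CURRENT_TIMESTAMP())
    INTO :hours_since_load
    FROM orders;

    IF (hours_since_load > 24) THEN
      RETURN 'FAIL: No new orders loaded in ' || hours_since_load || ' hours';
    ELSE
      RETURN 'PASS: Order data is fresh';
    END IF;
  END;
$$;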

Reporting Functions 

Comprehensive reporting provides visibility into data quality (a trend query follows this list): 

  • Quality Dashboards: Visual representations of quality metrics 

  • Detailed Failure Reports: Analyze specific validation failures 

  • Trend Analysis: Track quality metrics over time 

  • Impact Assessment: Understand how quality issues affect downstream processes 
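
A basic trend query over the validation results table sketched earlier might look like the following; the seven-day window is an arbitrary choice:

-- Daily check outcomes for the past week, per dataset
SELECT DATE_TRUNC('day', check_date) AS check_day,
       dataset_name,
       COUNT_IF(result LIKE 'FAIL%') AS failures,
       COUNT(*) AS total_checks
FROM data_governance.quality.validation_results
WHERE check_date >= DATEADD(day, -7, CURRENT_DATE())
GROUP BY check_day, dataset_name
ORDER BY check_day DESC;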

Creating Your First Quality Checks 

Basic Column Validations 

Start with fundamental checks at the column level: 

Completeness Check: Ensure required fields contain values  

 

CREATE OR REPLACE PROCEDURE check_email_completeness() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    null_percentage FLOAT; 
  BEGIN 
    -- Percentage of customers with no email on file 
    -- (COUNT_IF is Snowflake's equivalent of the Postgres FILTER clause) 
    SELECT (COUNT_IF(email IS NULL) * 100.0 / COUNT(*)) 
    INTO :null_percentage 
    FROM customers; 

    IF (null_percentage > 5) THEN 
      RETURN 'FAIL: ' || null_percentage || '% of email addresses are missing'; 
    ELSE 
      RETURN 'PASS: Email completeness meets standards'; 
    END IF; 
  END; 
$$; 

Format Validation: Verify data matches expected patterns  

 

CREATE OR REPLACE PROCEDURE validate_phone_format() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    invalid_count INTEGER; 
  BEGIN 
    -- Phone numbers that do not match the NNN-NNN-NNNN pattern 
    SELECT COUNT(*) INTO :invalid_count 
    FROM customers 
    WHERE NOT REGEXP_LIKE(phone, '\\d{3}-\\d{3}-\\d{4}'); 

    IF (invalid_count = 0) THEN 
      RETURN 'PASS: All phone numbers match expected format'; 
    ELSE 
      RETURN 'FAIL: ' || invalid_count || ' phone numbers have invalid format'; 
    END IF; 
  END; 
$$; 

Range Check: Ensure numeric values fall within acceptable ranges  

CREATE OR REPLACE PROCEDURE validate_price_ranges() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    invalid_count INTEGER; 
  BEGIN 
    -- Prices must be positive and below the 10,000 business cap 
    SELECT COUNT(*) INTO :invalid_count 
    FROM products 
    WHERE price <= 0 OR price > 10000; 

    IF (invalid_count = 0) THEN 
      RETURN 'PASS: All prices within valid range'; 
    ELSE 
      RETURN 'FAIL: ' || invalid_count || ' products have invalid prices'; 
    END IF; 
  END; 
$$; 

Table-Level Quality Rules 

Move beyond columns to assess entire tables: 

Referential Integrity: Ensure foreign keys have matching primary keys  

 

CREATE OR REPLACE PROCEDURE check_order_integrity() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    orphaned_records INTEGER; 
  BEGIN 
    -- Orders whose customer_id has no matching row in CUSTOMERS 
    SELECT COUNT(*) INTO :orphaned_records 
    FROM orders o 
    LEFT JOIN customers c ON o.customer_id = c.customer_id 
    WHERE c.customer_id IS NULL; 

    IF (orphaned_records > 0) THEN 
      RETURN 'FAIL: ' || orphaned_records || ' orders reference non-existent customers'; 
    ELSE 
      RETURN 'PASS: All orders have valid customer references'; 
    END IF; 
  END; 
$$; 


 

Uniqueness Verification: Check for unexpected duplicates  

 

CREATE OR REPLACE PROCEDURE check_product_uniqueness() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    duplicate_count INTEGER; 
  BEGIN 
    -- Number of SKUs that appear on more than one product row 
    SELECT COUNT(*) INTO :duplicate_count 
    FROM ( 
      SELECT sku 
      FROM products 
      GROUP BY sku 
      HAVING COUNT(*) > 1 
    ) AS dup; 

    IF (duplicate_count > 0) THEN 
      RETURN 'FAIL: ' || duplicate_count || ' duplicate product SKUs found'; 
    ELSE 
      RETURN 'PASS: All product SKUs are unique'; 
    END IF; 
  END; 
$$; 

 

 

Data Balance: Verify expected proportions in categorical data  

 

CREATE OR REPLACE PROCEDURE check_status_distribution() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    cancelled_percentage FLOAT; 
  BEGIN 
    -- Share of orders currently in Cancelled status 
    SELECT (COUNT_IF(status = 'Cancelled') * 100.0 / COUNT(*)) 
    INTO :cancelled_percentage 
    FROM orders; 

    IF (cancelled_percentage > 10) THEN 
      RETURN 'WARN: Cancelled orders (' || cancelled_percentage || '%) exceed expected threshold'; 
    ELSE 
      RETURN 'PASS: Order status distribution within expected parameters'; 
    END IF; 
  END; 
$$; 


Simple Data Testing Patterns 

 

Implement these common testing approaches: 

Year-over-Year Comparison: Check for unexpected trend changes  

 

CREATE OR REPLACE PROCEDURE check_sales_trends() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    yoy_change FLOAT; 
  BEGIN 
    -- Percentage change in sales vs. the prior year 
    -- (NULLIF avoids division by zero when there were no prior-year sales) 
    SELECT ((current_year_sales - previous_year_sales) * 100.0 / NULLIF(previous_year_sales, 0)) 
    INTO :yoy_change 
    FROM ( 
      SELECT  
        SUM(CASE WHEN YEAR(order_date) = YEAR(CURRENT_DATE()) THEN amount ELSE 0 END) AS current_year_sales, 
        SUM(CASE WHEN YEAR(order_date) = YEAR(CURRENT_DATE()) - 1 THEN amount ELSE 0 END) AS previous_year_sales 
      FROM orders 
    ) AS yearly; 

    IF (ABS(yoy_change) > 25) THEN 
      RETURN 'ALERT: Sales changed by ' || yoy_change || '% year-over-year'; 
    ELSE 
      RETURN 'PASS: Sales trends within expected range'; 
    END IF; 
  END; 
$$; 

 

 

Statistical Outlier Detection: Identify unusual data points  

 

CREATE OR REPLACE PROCEDURE detect_order_outliers() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    outlier_count INTEGER; 
    avg_order FLOAT; 
    stddev_order FLOAT; 
  BEGIN 
    SELECT AVG(amount), STDDEV(amount) 
    INTO :avg_order, :stddev_order 
    FROM orders; 

    -- Variables need the colon prefix when referenced inside a SQL statement 
    SELECT COUNT(*) 
    INTO :outlier_count 
    FROM orders 
    WHERE amount > :avg_order + (3 * :stddev_order); 

    RETURN 'Found ' || outlier_count || ' order amount outliers (>3 std deviations)'; 
  END; 
$$; 

 

Change Detection: Monitor for unexpected data modifications  

 

CREATE OR REPLACE PROCEDURE monitor_customer_changes() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    change_percentage FLOAT; 
  BEGIN 
    -- Share of customer records modified in the last 24 hours, 
    -- based on a customers_history change-tracking table 
    SELECT (COUNT(*) * 100.0 / (SELECT COUNT(*) FROM customers)) 
    INTO :change_percentage 
    FROM customers_history 
    WHERE modified_date > DATEADD(day, -1, CURRENT_DATE()); 

    IF (change_percentage > 5) THEN 
      RETURN 'ALERT: ' || change_percentage || '% of customer records modified in last 24 hours'; 
    ELSE 
      RETURN 'PASS: Customer change rate within normal limits'; 
    END IF; 
  END; 
$$; 

Using Snowflake's Quality Dashboard 

Accessing the Dashboard 

Navigate to the quality dashboard through: 

  • The Snowflake web interface via Governance → Data Quality 

  • Custom Snowsight dashboards pointing to quality metrics 

  • External BI tools connected to your quality results tables 

Understanding Key Metrics 

The dashboard provides several essential metrics (a failure-rate query follows this list): 

  • Quality Score: The aggregated measure of overall data quality (0-100) 

  • Failure Rate: Percentage of quality checks that fail 

  • Issue Trends: Patterns in quality problems over time 

  • Impact Assessment: Severity classification of detected issues 

  • Resolution Time: How quickly quality issues are addressed 
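
Several of these metrics can be derived directly from the validation results table used throughout this article; for example, a simple failure-rate query (the 30-day window and the PASS/FAIL result convention are assumptions from the earlier sketches):

-- Failure rate per dataset over the last 30 days
SELECT dataset_name,
       ROUND(COUNT_IF(result LIKE 'FAIL%') * 100.0 / COUNT(*), 1) AS failure_rate_pct
FROM data_governance.quality.validation_results
WHERE check_date >= DATEADD(day, -30, CURRENT_DATE())
GROUP BY dataset_name
ORDER BY failure_rate_pct DESC;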

Customizing Your View 

Tailor the dashboard to meet your specific needs: 

 

Create Custom Views: Focus on specific datasets or quality dimensions  

CREATE OR REPLACE VIEW data_governance.quality.financial_data_quality AS 
SELECT 
  check_date, 
  dataset_name, 
  check_name, 
  result, 
  impact_level 
FROM data_governance.quality.validation_results 
WHERE dataset_name IN ('orders', 'invoices', 'payments') 
ORDER BY check_date DESC; 

  • Set Priority Indicators: Highlight critical quality metrics 

  • Configure Time Windows: Adjust the analysis period for trend detection 

  • Define Custom Thresholds: Set organization-specific quality standards 

Automating Quality Monitoring 

Setting Up Scheduled Checks 

Implement regular quality verification: 

1. Create Task Schedules: Automate quality checks at appropriate intervals  

CREATE OR REPLACE TASK daily_customer_validation 
WAREHOUSE = quality_monitoring_wh 
SCHEDULE = 'USING CRON 0 7 * * * America/New_York'  -- daily at 7:00 AM Eastern 
AS 
CALL validate_customer_data(); 

-- Tasks are created in a suspended state; enable explicitly 
ALTER TASK daily_customer_validation RESUME; 


 

2. Cascade Dependencies: Chain related checks together  

CREATE OR REPLACE TASK verify_order_data 
WAREHOUSE = quality_monitoring_wh 
AFTER daily_customer_validation  -- runs after the parent task completes 
AS 
CALL validate_orders_completeness(); 

-- Child tasks must also be resumed before the chain will run 
ALTER TASK verify_order_data RESUME; 

3. Conditional Execution: Run deeper checks only when basic validation passes  

CREATE OR REPLACE PROCEDURE conditional_product_validation() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  DECLARE 
    basic_check_result VARCHAR; 
  BEGIN 
    -- Capture the return value of the basic check 
    CALL check_product_uniqueness() INTO :basic_check_result; 

    IF (STARTSWITH(basic_check_result, 'PASS')) THEN 
      CALL validate_price_ranges(); 
    END IF; 

    RETURN 'Validation sequence completed'; 
  END; 
$$; 

Creating Basic Alerts 

Establish notification systems: 

1. Configure Alert Thresholds: Define when notifications should trigger 

CREATE OR REPLACE PROCEDURE email_on_failure(check_name STRING, result STRING) 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  BEGIN 
    IF (STARTSWITH(result, 'FAIL')) THEN 
      -- SYSTEM$SEND_EMAIL(integration, recipients, subject, body); 
      -- the recipient address is a placeholder 
      CALL SYSTEM$SEND_EMAIL( 
        'data_alerts', 
        'data-team@example.com', 
        'Quality Check Failure: ' || check_name, 
        'The following quality check failed: ' || check_name || '\n\nResult: ' || result 
      ); 
    END IF; 

    RETURN 'Alert processing complete'; 
  END; 
$$; 

 

2. Set Up Notification Channels: Connect to email, Slack, or other systems 

3. Implement Escalation Logic: Increase alert severity based on issue persistence (see the alert sketch below) 
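
Snowflake's native alerts can tie thresholds and notifications together on a schedule. A sketch, reusing the warehouse, results table, and email procedure from earlier (the alert name and one-hour window are assumptions):

CREATE OR REPLACE ALERT hourly_quality_alert
  WAREHOUSE = quality_monitoring_wh
  SCHEDULE = '60 MINUTE'
  IF (EXISTS (
    SELECT 1
    FROM data_governance.quality.validation_results
    WHERE result LIKE 'FAIL%'
      AND check_date > DATEADD(hour, -1, CURRENT_TIMESTAMP())
  ))
  THEN CALL email_on_failure('hourly_sweep', 'FAIL: new validation failures detected');

-- Alerts, like tasks, are created suspended
ALTER ALERT hourly_quality_alert RESUME;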

Responding to Quality Issues 

Develop a structured response framework: 

  • Automated Quarantine: Isolate potentially problematic data  

CREATE OR REPLACE PROCEDURE quarantine_invalid_orders() 
RETURNS VARCHAR 
LANGUAGE SQL 
AS 
$$ 
  BEGIN 
    -- Copy suspect rows to a quarantine schema, then remove them from the live table 
    INSERT INTO data_governance.quarantine.orders 
    SELECT * FROM orders 
    WHERE amount < 0 OR customer_id IS NULL; 

    DELETE FROM orders 
    WHERE amount < 0 OR customer_id IS NULL; 

    RETURN 'Invalid orders quarantined'; 
  END; 
$$; 

  • Issue Categorization: Classify problems by type and severity 

  • Remediation Workflows: Standardized processes for fixing common issues 

  • Documentation Requirements: Record quality incidents and resolutions 

Building a Data Quality Framework 

Defining Quality Standards 

Establish clear metrics for data quality (a composite-score sketch follows this list): 

  • Completeness: Percentage of required data that is present 

  • Accuracy: Correctness of data values 

  • Consistency: Uniformity across related data elements 

  • Timeliness: Currency and availability of data 

  • Validity: Conformance to defined formats and rules 

  • Uniqueness: Absence of unintended duplication 
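
These dimensions can be rolled up into a single score. A minimal sketch, assuming every dimension check writes a PASS/FAIL result to the validation results table and all dimensions are weighted equally:

-- Composite quality score (0-100): share of passing checks per dataset
SELECT dataset_name,
       ROUND(COUNT_IF(result LIKE 'PASS%') * 100.0 / COUNT(*), 1) AS quality_score
FROM data_governance.quality.validation_results
GROUP BY dataset_name;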

Implementing Team Workflows 

Create effective quality management processes: 

  • Assign Clear Responsibilities: Define quality roles within the organization 

  • Establish Review Procedures: Regular evaluation of quality metrics 

  • Create Escalation Paths: Clear routes for addressing critical issues 

  • Document Standards: Maintain accessible quality requirements 

  • Conduct Training: Ensure team members understand quality protocols 

Measuring Success 

Track the effectiveness of your quality initiatives (a trend query follows this list): 

  • Reduction in Incidents: Fewer data-related issues reported 

  • Faster Resolution: Decreased time to fix quality problems 

  • Improved Trust: Greater confidence in data-driven decisions 

  • Efficient Operations: Reduced rework due to data issues 

  • Compliance Success: Better regulatory audit outcomes 
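
Most of these indicators can be tracked from the same results table; for example, a week-over-week failure count to confirm incidents are trending down (table and result conventions per the earlier sketches):

-- Weekly count of failed checks
SELECT DATE_TRUNC('week', check_date) AS week,
       COUNT_IF(result LIKE 'FAIL%') AS failed_checks
FROM data_governance.quality.validation_results
GROUP BY week
ORDER BY week;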

Tips for Success and Common Mistakes to Avoid 

Best Practices 

  1. Start Small: Begin with critical datasets rather than attempting to cover everything.  

  2. Prioritize Impact: Focus on quality dimensions that most affect business outcomes.

  3. Automate Intelligently: Balance automated checks with manual oversight.

  4. Involve Business Users: Align quality rules with actual business needs.

  5. Document Everything: Maintain precise records of quality standards and procedures.

Common Pitfalls 

  • Excessive Alerts: Creating too many alerts leads to alarm fatigue 

  • Rigid Thresholds: Setting unrealistic quality expectations 

  • Neglecting Root Causes: Fixing symptoms instead of underlying issues 

  • Siloed Approach: Failing to coordinate quality efforts across teams 

  • Static Rules: Not evolving quality checks as data and business needs change 

By leveraging Snowflake's quality management tools and following these guidelines, organizations can establish robust data governance practices that ensure reliable, trustworthy data for decision-making while maintaining compliance with regulatory requirements. Integrating quality management into everyday data operations transforms governance from a burdensome obligation into a strategic advantage that delivers tangible business value. 

Next Steps with Snowflake's Quality Management

Talk to our experts about implementing compound AI systems, and learn how industries and different departments use agentic workflows and decision intelligence to become decision-centric. Utilize AI to automate and optimize IT support and operations, improving efficiency and responsiveness.

More Ways to Explore Us

AI for Real-Time Data Quality Monitoring in Snowflake

Automating Data Quality Checks in Snowflake Workflows

Snowflake for Scalable Computer Vision AI Pipelines

 

 


Navdeep Singh Gill

Global CEO and Founder of XenonStack

Navdeep Singh Gill serves as Chief Executive Officer and Product Architect at XenonStack. His expertise lies in building SaaS platforms for decentralized big data management and governance and an AI marketplace for operationalizing and scaling machine learning. His extensive experience in AI technologies and big data engineering drives him to write about different use cases and their solution approaches.
