a blue background with lines and dots

Migrating from SAS to Snowflake involves moving legacy, proprietary SAS workloads (data storage, ETL, and analytics) into Snowflake’s scalable cloud architecture. This process shifts processing power from fixed-infrastructure SAS servers to the cloud, often by refactoring SAS code into Snowflake SQL or Python/Snowpark.

Assessment and Inventory

  • Audit your SAS estate: Identify all active SAS programs, Macros, PROC SQL, and DATA steps.

  • Determine complexity: Categorize code by function (e.g., data ingestion, data transformations, statistical modeling).

  • Filter redundant data: Older SAS environments often house unnecessary or duplicated files that shouldn't be migrated to the cloud

Data Migration

  • Move historical data: Export historical SAS datasets (.sas7bdat) into cloud storage (e.g., Amazon S3, Azure Blob, or Google Cloud Storage).

  • Load into Snowflake: Use Snowflake’s COPY INTO command or automated data pipelines (e.g., Snowpipe) to load the raw data into base tables.

Code Modernization & Refactoring

Migrating SAS code is often the most complex step because Snowflake does not natively execute SAS macro logic or proprietary syntax. You have two primary paths:

  • Option A: Snowflake SQL & dbt: Rewrite SAS ETL jobs into standard SQL. Many teams use frameworks like dbt (data build tool) to manage and orchestrate the newly written transformations in Snowflake.

  • Option B: Python & Snowpark: Translate SAS logic to PySpark or Python and run it inside Snowflake’s Snowpark environment, which allows you to execute Python code directly on Snowflake compute nodes

Hybrid Operations (Coexistence)

  • SAS/ACCESS: Instead of rewriting everything immediately, use the SAS/ACCESS Interface to Snowflake. This native connector allows you to connect SAS to Snowflake, leaving your storage in the cloud while allowing SAS to manipulate the tables.

  • SAS Viya: If your organization relies heavily on SAS analytics, you can retain SAS Viya for downstream reporting and complex statistical analysis while executing the heavy lifting and ETL jobs in Snowflake

Testing and Validation

  • Reconciliation: Compare row counts, aggregates, and output between your legacy SAS environment and the new Snowflake environment.

  • Performance tuning: Optimize the Snowflake code using clustering keys and materialized views to reduce warehouse compute costs

a blue background with lines and dots