June 6, 2022

How to Leverage the Time Travel Feature on Snowflake

By Sam Hall

Welcome to Time Travel in the Snowflake Data Cloud. You may be tempted to think “only superheroes can Time Travel,” and you would be right. But Snowflake gives you the ability to be your own real-life superhero. 

Have you ever feared deleting the wrong data in your production database? Or that your carefully written script might accidentally remove the wrong records? Never fear, you are here – with Snowflake Time Travel!

What’s The Big Deal?

Snowflake Time Travel, when properly configured, allows for any Snowflake user with the proper permissions to recover and query data that has been changed or deleted up to the last 90 days (though this recovery period is dependent on the Snowflake version, as we’ll see later.) 

This provides comprehensive, robust, and configurable data history in Snowflake that your team doesn’t have to manage! It includes the following advantages:

  • Data (or even entire databases and schemas) can be restored that may have been lost due to a deletion, no matter if that deletion was on purpose or not
  • The ability to maintain backup copies of your data for all past versions of it for a period of time
  • Allowing for inspection of changes made over specific periods of time

How Time Travel Works

Before we learn how to use it, let’s understand a little more about why Snowflake can offer this feature. 

Snowflake stores the records in each table in immutable objects called micro-partitions that contain a subset of the records in a given table. 

Each time a record is changed (created/updated/deleted), a brand new micro-partition is created, preserving the previous micro-partitions to create an immutable historical record of the data in the table at any given moment in time. 

Time Travel is simply accessing the micro-partitions that were current for the table at a particular moment in time.  

How To Configure Time Travel In Your Account

Time Travel is available and enabled in all account types.

However, the extent to which it is available is dependent on the type of Snowflake account, the object type, and the access granted to your user.  

Default Retention Period

The retention period is the amount of time you can travel back and recover the state of a table at a given point and time. It is variable per account type. The default Time Travel retention period is 1 day (24 hours).

PRO TIP: Snowflake does have an additional layer of data protection called fail-safe, which is only accessible by Snowflake to restore customer data past the time travel window.  However, unlike time travel, it should not be considered as a part of your organization’s backup strategy.

Account/Object Type Considerations

All Snowflake accounts have Time Travel for permanent databases, schemas, and tables enabled for the default retention period.

Snowflake Standard accounts (and above) can remove Time Travel retention altogether by setting the retention period to 0 days, effectively disabling Time Travel. 

Snowflake Enterprise accounts (and above) can set the Time Travel retention period for transient databases, schemas, tables, and temporary tables to either 0 or 1 day. The retention period can also be increased to 0-90 days for permanent databases, schemas, and tables.

The following table summarizes the above considerations:

Temporary Tables

Transient Databases, Schemas, Tables

Permanent Databases, Schemas, Tables

Free

N/A

N/A

1 day

Standard +

N/A

N/A

0-1 days

Enterprise +

0-1 days

0-1 days

0-90 days

Changing Retention Period

For the Snowflake Enterprise accounts; two account level parameters can be used to change the default account level retention time.  

  • DATA_RETENTION_TIME_IN_DAYS: How many days that Snowflake stores historical data for the purpose of Time Travel.
  • MIN_DATA_RETENTION_TIME_IN_DAYS: How many days at a minimum that Snowflake stores historical data for the purpose of Time Travel.

The parameter DATA_RETENTION_TIME_IN_DAYS can also be used at an object level to override the default retention time for an object and its children. Example: 

				
					```sql
alter table mytable set data_retention_time_in_days=30;
```
				
			

How To Use Time Travel

Using Time Travel is easy! There are two sets of SQL commands that can invoke Time Travel capabilities:

  • AT or BEFORE: clauses for both SELECT and CREATE .. CLONE statements.  AT is inclusive and BEFORE is exclusive
  • UNDROP: command for restoring a deleted table/schema/database

The following graphic from the Snowflake documentation summarizes this visually:

A screenshot illustrating the the Snowflake Data lifecycle with Time Travel

Query Historical Data

You can query historical data using the AT or BEFORE clauses and one of three parameters:

  • TIMESTAMP:  A specific historical timestamp at which to query data from a particular object.  Example: SELECT * FROM my_table AT (TIMESTAMP => ‘Fri, 01 May 2015 15:00:00 -0700’::TIMESTAMP_TZ);
  • OFFSET: The difference in seconds from the current time at which to query data from a particular object.  Example: CREATE SCHEMA restored_schema CLONE my_schema AT (OFFSET => -4800);
  • STATEMENT: The query ID of a statement that is used as a reference point from which to query data from a particular object.  Example: CREATE DATABASE restored_db CLONE my_db BEFORE (STATEMENT => ‘8e5d0ca9-005e-44e6-b858-a8f5b37c5726’);

The one thing to understand is that these commands will work only within the retention period for the object that you are querying against. So, if your retention time is set to the default one day, and you try to UNDROP a table two days after deleting it, you receive an error and be out of luck! 

PRO TIP: Snowflake does have an additional layer of data protection called fail-safe, which is only accessible by Snowflake to restore customer data past the time travel window. However, unlike time travel, it should not be considered as a part of your organization’s backup strategy.

Restore Deleted Objects

You can also restore objects that have been deleted by using the UNDROP command.  To use this command, another table with the same fully qualified name (database.schema.table) cannot exist.  

Example: UNDROP TABLE my_table

How Time Travel Impacts Snowflake Cost

Snowflake accounts are billed for the number of 24-hour periods that Time Travel data (the micro-partitions) is necessary to be maintained for the data that is being retained. 

Every time there is a change in a table’s data, the historical version of that changed data will be retained (and charged in addition) for the entire retention period. This may not be an entire second copy of the table. Snowflake will try to optimize to maintain only the minimal amount of historical data needed but will incur additional costs. 

As an example, if every row of a 100 GB table were changed ten times a day, the storage consumed (and charged) for this data per day would be 100GB x 10 changes = 1 TB.  

What can you do to optimize cost to ensure your ops team does not wake up to an unnecessarily large Time Travel bill?  Below are a couple of suggestions.

Use Transient and Temporary Tables When Possible

If data does not need to be protected using Time Travel, or there is data only being used as an intermediate stage in an ETL process, then take advantage of using transient and temporary tables with the DATA_RETENTION_TIME_IN_DAYS parameter set to 0. This will essentially disable Time Travel and make sure there are no extra costs because of it. 

Copy Large High-Churn Tables

If you have large permanent tables where a high percentage of records are often changed every day, it might be a good idea to change your storage strategy for these tables based on the cost implications mentioned above.  

One way of dealing with such a table would be to create it as a transient table with 0 Time Travel retention (DATA_RETENTION_TIME_IN_DAYS=0) and copy it over to a permanent table on a periodic basis.  

This would allow you to control the number of copies of this data you maintain without worrying about ballooning Time Travel costs in the background. 

Summary

Time Travel is an incredibly useful tool that removes the need for your team to maintain backups/snapshots/complex restoration processes/etc… as with a traditional database.  Specifically, it enables the following advantages:

  • Data recovery/restoration: use the ability to query historical data to restore old versions of a particular dataset, or recover databases/schemas/tables that have been deleted
  • Backups: If not explicitly disabled, time travel automatically is maintaining backup copies of all past versions of your data for at least 1 day, and up to 90 days
  • Change Auditing: The queryable nature of time travel allows for inspection of changes made to your data over specific periods of time

Final Thoughts

Hopefully, this has helped understand how to use Snowflake Time Travel and the context around how it works, and some of the cost implications.  

If your organization needs help using or configuring Time Travel, or any other Snowflake feature, phData is a certified Elite Snowflake partner, and we would love to hear from you so that our team can help drive the value of your organization’s data forward!

Data Coach is our premium analytics training program with one-on-one coaching from renowned experts.

Accelerate and automate your data projects with the phData Toolkit