SQL Server 2016 introduces System Version Tables, which is the formal name for the long awaited temporal data feature. In this blog post (part 1) I’ll explain what temporal is all about, and my next post will walk you through detailed demos on temporal.
Temporal means, time-related, and in the case of SQL Server, this means that you get point-in-time access to a table, allowing you to query not only the table’s current data, but data as it appeared in the table at any past point in time. So data that you overwrite with one or more update statements, or data that you blow away with a delete statement, is never really lost. It’s always and immediately available simply by telling your otherwise ordinary query to travel back in time when looking at the table.
The mechanism behind this magic is actually rather simple, and completely seamless. SQL Server automatically creates a history table with the same schema as the table enabled for temporal, records every update and delete into that history table along with timestamps for identifying each version of every update or delete. Then, the query engine integrates with the history table and gives you any desired point-in-time access to the temporal table.
Think of all the great uses for this feature.
- Time travel
- Being able to query data as it changes over time yields tremendous business value, where temporal tables make it very easy to perform all sorts of trend analysis against your data.
- Slowly changing dimensions
- This feature is also very handy when you’re incrementally building large data warehouses with slowly changing dimensions, because the history table always contains data changes that are timestamped.
- Temporal tables also give you an inherent auditing solution, when you need to track what data has changed, and when, although it won’t record who made the change.
- Accidental data loss
- You know that heart-dropping moment after an update or delete that you really didn’t mean? Well, rather than panic and scramble to find that backup and restore it, you can much more easily recover from your accident by accessing the lost data from the history table.
It’s pretty easy to get going with temporal, because there are very few pre-requisites. Any table can become a temporal table as long as It has a primary key (which you have in virtually any table), as well as a pair of datetime2 columns, known as the period columns. Given those minimal requirements, you can turn any table into a system versioned table.
SQL Server creates the history table with a schema to match the main table, except that it does not enforce constraints on the history table. This makes sense if you think about it, because multiple versions of the same row, with the same primary key value, will be written to the history table for every change, and so it’s just not possible to enforce the uniqueness of the primary key in the history table.
After creating the history table, SQL Server automatically populates it and preserves the original version of any row affected by an update statement in the main table, as well as retaining any row that gets deleted from the main table. Now certainly, this is nothing that we couldn’t achieve ourselves by writing triggers to do the same thing. However, the real power of a temporal table comes into play at query time. Simply by including the additional syntax FOR SYSTEM_TIME AS OF with a specific point in time in your SELECT statement, SQL Server automatically executes your query against the table – as it appeared at that point in time.
Once enabled for temporal, you continue treating the table pretty much like an ordinary table. In most cases, you can even ALTER the table’s schema, and SQL Server will automatically reflect the schema change in the history table to keep it in sync. However, there are some types of schema changes that won’t be possible unless you first break the temporal connection between the main table and the history table, make the same schema change to both tables in exactly the same manner, and then re-establish the connection between them. Examples of schema changes that require these extra steps include adding an identity column, or a computed column.
Creating a Temporal Table
Here’s an example of a temporal table.
CREATE TABLE Department ( DepartmentID int NOT NULL IDENTITY(1,1) PRIMARY KEY, DepartmentName varchar(50) NOT NULL, ManagerID int NULL, ValidFrom datetime2 GENERATED ALWAYS AS ROW START NOT NULL, ValidTo datetime2 GENERATED ALWAYS AS ROW END NOT NULL, PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo) ) WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = DepartmentHistory))
It’s just like any other table, and includes the two period columns ValidFrom and ValidTo, although these columns can be named anything you like; you just need to add GENERATED ALWAYS AS ROW START and ROW END, and then reference the two columns with PERIOD FOR SYSTEM_TIME. That’s it for the table schema; to actually turn on temporal for the table, we add WITH SYSTEM_VERSIONING = ON, and also set the name for the history table that SQL Server should create, and that must include the schema name; although if you leave out the HISTORY_TABLE name, SQL Server will generate one based on the main table’s internal object ID.
And that’s all there is to it. You just continue working with the table as usual, and SQL Server captures data changes to the history table, and also maintains the date and time for each version of every row.
Querying a Temporal Table
And so, as a result, you can query the table as it appeared at any past point in time simply by including the FOR SYSTEM_TIME AS OF clause like you see here in this example, where the Employee table is being queried as it appeared exactly thirty days ago:
DECLARE @ThirtyDaysAgo datetime2 = DATEADD(d, -30, SYSDATETIME()) SELECT * FROM Employee FOR SYSTEM_TIME AS OF @ThirtyDaysAgo ORDER BY EmployeeId
So any rows that have been deleted in the past thirty days, they’ll be returned by this query. Any new rows created in the past thirty days? Those won’t be returned. And any rows older than thirty days that have been modified in the past thirty days are returned as they appeared exactly thirty days ago. And that’s the magic of temporal tables in SQL Server 2016.