November 22, 2022

phData Toolkit November 2022 Update

By Nick Goble

Hello and welcome to our November update of the phData Tookit, built to accelerate your migration, guide you in enforcing your information architecture, and provide auditing capabilities for your data governance practice.  

This is a CLI and UI-based set of utilities that we’ve built and iterated on as we’ve performed hundreds of customer migrations.  

While each migration is different, there are common tasks that need to be performed for each of these migrations. For example,  standing up platforms like the Snowflake Data Cloud, building out an information architecture, and migrating existing SQL scripts from your source SQL dialect to your target SQL dialect. The Toolkit is purpose-built to tackle these best practices and reduce manual work wherever possible.

Over the course of the last month, we’ve been primarily focusing on translation between SQL dialects, more specifically Microsoft SQL Server to Snowflake.  

Let’s dive in and take a look at some of the changes in more depth.

SQL Translation Updates

SQLMorph, our SQL translation tool, is built to automate the migration of SQL from source to target, primarily with a focus on Snowflake.  This tool works by parsing the source SQL script, putting the logic of the script into an internal representation, and then outputting the same (as much as possible) logic into the target dialect.  

Like most translations in spoken languages, there are situations where things don’t directly translate or additional context has to be added to support the translation.

SQLMorph handles the differences in SQL dialects programmatically for you.

You can think of SQLMorph as the Rosetta Stone of SQL. However, just like spoken languages, translations may need to be updated as the meaning of words changes or new words are introduced to the language. You may also need to continue to add to the “stone” to give future capabilities to translate the language.

We’ve added a number of new translations for the Microsoft SQL Server to Snowflake translation, as well as adjusted a few of the translations.

Microsoft SQL Server to Snowflake

When translating between languages/dialects, one major challenge is that the words or features may not exist in the new system. For example, we’ve added the ability to translate the CREATE COLUMNSTORE INDEX statement. While this is supported in MSSQL, it is not possible in Snowflake as Snowflake doesn’t have indexes on columns.

A few lines of 2 different code dialects

However, there are many situations where functionality is shared between the two systems. For example, we’ve added the ability to translate Raiserror and CREATE TRIGGER statements:

Multiple lines of code in 2 different code dialects
A screenshot of some code snippets

We’ve also introduced a number of fixes to existing translations. This generally includes things like handling parsing issues or handling of keywords. Since SQLMorph parses source dialect syntax into an internal representation, we need to define how to parse those source dialects. We’ve introduced fixes for the following:

  2. Parse user-defined data type in Declare statement
  3. Parse issue with Option clause in Select statement
  4. Parse issue with SET DEADLOCK_PRIORITY statement
  5. Rewrite procedure non-null default values
  6. Translate += assignment operator
  7. Parse AT TIME ZONE clause
  8. Parse table_hint clause with an alias name
  9. Parse multiple semicolons after/before the query
  10. Rewrite MONEY data type to Number(16,2) in snowflake scripting
  11. Parse Create/Drop synonym and give not supported error message
  12. Added not supported error message for Alter view statement
  13. Parse if-stmt and give not supported message

By regularly updating and improving the translations that SQLMorph can generate, we’re ensuring each successive use of the tool will be more efficient and accurate.  

Before deciding to leverage SQLMorph for any customer project, we can run a report to better understand the coverage that SQLMorph is able to provide against the customer’s provided SQL and even make adjustments to the tool before leveraging it for a customer’s migration!

Want to Leverage the phData Toolkit?

As we make updates to the phData Toolkit, it is our hope that you understand the value of the tool and decide to leverage it.  As part of our services, we frequently use these tools to reduce the cost of your migration, ensure data governance and data quality, and ultimately make your organization more successful.  

By reducing the costs of migration, increasing speed-to-market, and enforcing an information architecture built with security in mind, your organization can move faster in a secure manner to drive additional business value.

Get Started with the phData Toolkit

If you haven’t already explored the phData Toolkit, we highly recommend checking it out!

We encourage you to spend a few minutes browsing the apps and tools available in the phData Toolkit today to set yourself up for success in 2022. 

Be sure to follow this series for more updates to the phData Toolkit tools and features. 

Data Coach is our premium analytics training program with one-on-one coaching from renowned experts.

Accelerate and automate your data projects with the phData Toolkit