October 5, 2023

phData Toolkit September 2023 Update

By Nick Goble

Hello, and welcome to our September phData Toolkit blog series update!  We’re excited about all the amazing fall events: apple picking, corn mazes, pumpkin patches… the best!

Speaking of spooky things, our product engineering team has been brewing up some delectable concoctions (updates) just for you.  So make some apple cider, sit back, and enjoy our latest round of updates hand-delivered right to you!

So, what’s new?

That’s a great question, and we’re excited to show you.  Over the last month, we’ve been continuing to focus heavily on adding additional support for our Microsoft SQL Server to Snowflake SQL translations.  

We’ve seen customer after customer performing this migration (something must be in the air?), and our dutiful product engineering team is driven to empower them!  We’ve also made a new release of our provision tool, your one-stop shop for programmatic access to our toolkit.

Let’s dive in.

SQL Server to Snowflake Translations

Over the course of the last month, we’ve been heavily focused on adding additional support for this SQL translation pairing.  Now, when we talk about support, that can mean a lot of different things.  

In some translations, there might not be comparable functionality in the target dialect, while in other situations, it may be something that’s not defined or supported yet.  In the best-case scenario, the SQL dialects that you’re translating between are nearly identical or easily translated.

In the case of our SQL Server translation, there are a number of different translations that are either a work in progress or aren’t supported by Snowflake.  In an ideal case, you would get a warning or error message notifying you that the translation could not be completed.  In the worst-case scenario, your application breaks because something unexpected happened in the translation.  

To that end, we’ve added support messages around many unimplemented or unsupported translations!  While this seems like a pat on our own back, this is critical when running applications.  Errors need to be handled; otherwise, your application crashes.

In other news, we’ve also run into examples where SQL dialects can differ in ways we don’t intuitively think about or anticipate.  For example, did you know that you can put spaces between a less-than-or-equals symbol in SQL Server?  Who would have thought?  Well, we’ve now added support for this situation.

Source: MSSQL
					select * from t where
min < = 10 and max > = 10 and avg ! = 10 and sum < > 5;
Target: Snowflake
  min <= 10
  AND max >= 10
  AND avg != 10
  AND sum <> 5;

We also ran into a situation where we found that in SQL Server, you can reference a table alias within the column definition in a create-table-as-select statement.  Snowflake doesn’t support this, but we’ve been able to implement a solution with a warning included for future reference.

Source: MSSQL
					insert table3 (d.id, dname) select DISTINCT d.deptid, d.dname from t2 d;
Target: Snowflake
  table3 (
    -- WARNING: line 6, character 15: Syntax difference: Referencing a column using a qualified name like table1.col1 or schema1.table1.col1.
  DISTINCT d.deptid,
  t2 d;

In some situations, though, you still have to throw an error and expect that the user is going to have to triage that error due to support not aligning between platforms.  In spoken languages, we run into this all the time.  

A word may exist in English but doesn’t exist in Spanish, so one language will borrow that word from the other directly and expect that people know the meaning of that word from the other language.  However, computers don’t know how to do this, so we need engineers to figure this out.

Source: MSSQL
as e;
Target: Snowflake
					-- WARNING: line 2, character 0: Syntax difference: INTO clause in SELECT statement. Snowflake CTAS is a workaround for this.
  LEFT OUTER JOIN LATERAL -- ERROR: line 4, character 12: Feature not supported: OPENJSON in a SELECT statement.
  OPENJSON(p.PAYLOAD, '$.str') ase AS e;

Toolkit CLI Updates

We’ve recently launched a new version of the Toolkit CLI (0.22.0).  This contains updates to the base set of commands and the individual tools bundled within the Toolkit CLI.

One of the first actions you need to perform when leveraging the toolkit is authenticating with Snowflake.  This allows the tool to perform actions such as provisioning resources, auditing your Snowflake environment, programmatic data source script execution, and other pieces of functionality.  

Previously, when using the toolkit auth --set command, which allows you to set your authentication to Snowflake, the command would fail if you had an expired token.  We now perform a check to ensure that the token you provided hasn’t expired before trying to use it.  

We also have improved the error handling and messaging around RSA keys in general to provide more guidance and support for users.  If you’ve ever gotten a less-than-helpful error message when trying to run an application, you’ll understand why quality error messages are important!

Another excellent addition to the provision tool in the Toolkit CLI is the ability to define tags at the account level.  For companies with only one account, this might not be the most helpful functionality, but if your organization leverages several different Snowflake accounts as part of the greater organization, tags can help define things such as ownership, chargeback, and logical environments.

Bringing It All Together

We love to iterate on our ability to bring more value to our customers.  This means ideating on tooling our customers need to be successful in their platform, project, and support-based efforts, as well as receiving feedback from people like you!  

If there are things you’d like to see our Toolkit be able to perform for you, give us a shout!  We love hearing from you all and look forward to our next update!

phData Toolkit

If you haven’t already explored the phData Toolkit, we highly recommend checking it out!

We encourage you to spend a few minutes browsing the apps and tools available in the phData Toolkit today to set yourself up for success in 2023. 

Be sure to follow this series for more updates on the phData Toolkit tools and features.

Data Coach is our premium analytics training program with one-on-one coaching from renowned experts.

Accelerate and automate your data projects with the phData Toolkit