March 31, 2022

phData Toolkit March 2022 Update

By Nick Goble

Hello and welcome to the first installment of our phData Toolkit update! Previously, this series was aimed exclusively at providing updates to SQLMorph, our free-to-use transpiler, which is now part of the phData Toolkit!

Since these services have been integrated together, we’ve decided that it makes more sense to inform our customers about all the updates to our platform rather than individual tools. The goal of these updates is to let you know about new functionality, upcoming changes, or to refresh your memory on features that already exist!

We want to ensure that customers stay informed about all the great work our internal engineering team is doing and the value these tools bring when paired with our services.

With that said, let’s get into it!

SQL Translation Updates

Due to the nature of being a translator, we constantly are making changes and updates to the sources and targets that our SQL Translation tool supports. As edge cases are found or new requirements are added to the tool, we adjust the translations to ensure we have a certain coverage of statements and that they consistently work.

As a reminder, if you find any issues or have any feedback, you can always send us a message in the support tab in the Toolkit.

Let’s take a look at some of the additions that have been made.

Netezza to the Snowflake Data Cloud

Over the course of the last month, this source-target pair has been our primary focus. The team has been hard at work implementing the following statements:

  • Insert
  • Update
  • Delete
  • Truncate
  • Create Database
  • Create Schema
  • Set
  • Alter Table
  • Merge
  • Alter [Materialized] View
  • Grant
  • Common Table Expressions
  • Drop
  • Revoke
  • Create/Alter Group

Additionally, we:

  • Implemented create database/schema statements
  • Implemented alter table statements
  • Implemented set/ statements



-- WARNING: NetezzaStmtVisitor: line 1, character 26: Snowflake does not support `COLLECT HISTORY` for Create_Database statement.



Hive Family to Snowflake

One of our largest and most complete set of translations, the Hive family (consisting of Hive, Impala, Databricks SparkSQL, and SparkSQL) had a significant upgrade as well. We’ve added the ability to use the sort_array function, which requires injecting a definition for this method into Snowflake.

    SIZE(col1) = = 0,
    concat_ws(',', SORT_ARRAY(col1))
  ) AS dcode

-- WARNING: Translator: Since snowflake doesn't have in-built function to sort array, `PHDATA_SORT_ARRAY` function can be created in public schema and can be used as an equivalent of `SORT_ARRAY` function.
OR REPLACE FUNCTION PHDATA_SORT_ARRAY(A array) RETURNS string LANGUAGE JAVASCRIPT COMMENT = 'Sort array passed as parameter. Return COMMA delimited string.' AS $$ return A.sort(
  function(a, b) { if (typeof(a) = = = 'number' & & typeof(b) = = = 
					'number') { return a - b;
else { if(a < b) { return -1;
} if(a > b) { return 1;
} return 0;
} }
    -- WARNING: FunctionRewriter: line 1, character 10: Snowflake `ARRAY_SIZE` accepts array and variant as input. Returns NULL if variant value is not an array.
    ARRAY_SIZE(col1) = 0,
  ) AS dcode


Tram Updates

Tram, our Snowflake infrastructure as code (IaC) tool, is built around the idea of accelerating onboarding and providing data governance structures for Snowflake. This tool gives you the ability to generate groups and members of those groups to ensure that users are provisioned in a consistent manner with a different grouping of resources and access.

phData built Tram to take the manual grunt work out of the equation and make onboarding users in Snowflake both seamless and painless.

Over the course of the last month, we’ve added a few new features to this helpful tool.

Drop Schema Upgrades

We’ve now implemented the ability to specify CASCADE or RESTRICT when deprovisioning a schema. Tram not only gives you the ability to provision resources but deprovision them as well. When deciding to deprovision a higher level object like a database or a schema, you need to ultimately decide what happens to the objects within that resource. Documentation on the differences can be found in Snowflake.

Tram does not immediately de-provision your objects. Tram will use a pending_drops table to allow administrators to review. In the event that you don’t want an administrator to have to review these drops, you can specify the –max-drops option, which will apply drops up to the maximum value specified.

Snowflake Tags

We’ve also added the ability to manage and create native tags within Snowflake. We provided this as a feature before Snowflake did (one of the benefits of Tram), however, Snowflake has now added native support for this object.  

It’s worth noting that this feature is still in BETA within Snowflake and the API may change in the future. In the event that the API does change, Tram will be updated to match.

phData Toolkit

If you haven’t already explored the phData Toolkit, we highly recommend checking it out!

We encourage you to spend a few minutes browsing the apps and tools available in the phData Toolkit today to set yourself up for success in 2022. 

Be sure to follow this series for more updates to the phData Toolkit tools and features.

Data Coach is our premium analytics training program with one-on-one coaching from renowned experts.

Accelerate and automate your data projects with the phData Toolkit