phData Toolkit March 2022 Update

Hello and welcome to the first installment of our phData Toolkit update! Previously, this series was aimed exclusively at providing updates to SQLMorph, our free-to-use transpiler, which is now part of the phData Toolkit!

Since these services have been integrated together, we’ve decided that it makes more sense to inform our customers about all the updates to our platform rather than individual tools. The goal of these updates is to let you know about new functionality, upcoming changes, or to refresh your memory on features that already exist!

We want to ensure that customers stay informed about all the great work our internal engineering team is doing and the value these tools bring when paired with our services.

With that said, let’s get into it!

SQL Translation Updates

Due to the nature of being a translator, we constantly are making changes and updates to the sources and targets that our SQL Translation tool supports. As edge cases are found or new requirements are added to the tool, we adjust the translations to ensure we have a certain coverage of statements and that they consistently work.

As a reminder, if you find any issues or have any feedback, you can always send us a message in the support tab in the Toolkit.

Let’s take a look at some of the additions that have been made.

Netezza to the Snowflake Data Cloud

Over the course of the last month, this source-target pair has been our primary focus. The team has been hard at work implementing the following statements:

Insert
Update
Delete
Truncate
Create Database
Create Schema
Set
Alter Table
Merge
Alter [Materialized] View
Grant
Common Table Expressions
Drop
Revoke
Create/Alter Group

Additionally, we:

Implemented create database/schema statements
Implemented alter table statements
Implemented set/ statements

Examples

				
					Source:
CREATE DATABASE customers COLLECT HISTORY OFF;

Target:
-- WARNING: NetezzaStmtVisitor: line 1, character 26: Snowflake does not support `COLLECT HISTORY` for Create_Database statement.
CREATE DATABASE customers;

				
					Source:
ALTER TABLE distributors ALTER COLUMN address DROP DEFAULT;

Target:
ALTER TABLE distributors MODIFY COLUMN address DROP DEFAULT;

Hive Family to Snowflake

One of our largest and most complete set of translations, the Hive family (consisting of Hive, Impala, Databricks SparkSQL, and SparkSQL) had a significant upgrade as well. We’ve added the ability to use the sort_array function, which requires injecting a definition for this method into Snowflake.

				
					Source:
SELECT
  IF(
    SIZE(col1) = = 0,
    'Free',
    concat_ws(',', SORT_ARRAY(col1))
  ) AS dcode
FROM
  emp;

Target:
-- WARNING: Translator: Since snowflake doesn't have in-built function to sort array, `PHDATA_SORT_ARRAY` function can be created in public schema and can be used as an equivalent of `SORT_ARRAY` function.
CREATE
OR REPLACE FUNCTION PHDATA_SORT_ARRAY(A array) RETURNS string LANGUAGE JAVASCRIPT COMMENT = 'Sort array passed as parameter. Return COMMA delimited string.' AS $$ return A.sort(
  function(a, b) { if (typeof(a) = = = 'number' & & typeof(b) = = =

				
					'number') { return a - b;
}
else { if(a < b) { return -1;
} if(a > b) { return 1;
} return 0;
} }
);
$$;
SELECT
  IFF(
    -- WARNING: FunctionRewriter: line 1, character 10: Snowflake `ARRAY_SIZE` accepts array and variant as input. Returns NULL if variant value is not an array.
    ARRAY_SIZE(col1) = 0,
    'Free',
    CONCAT_WS(',', PHDATA_SORT_ARRAY(col1))
  ) AS dcode
FROM
  emp;

Tram Updates

Tram, our Snowflake infrastructure as code (IaC) tool, is built around the idea of accelerating onboarding and providing data governance structures for Snowflake. This tool gives you the ability to generate groups and members of those groups to ensure that users are provisioned in a consistent manner with a different grouping of resources and access.

phData built Tram to take the manual grunt work out of the equation and make onboarding users in Snowflake both seamless and painless.

Over the course of the last month, we’ve added a few new features to this helpful tool.

Drop Schema Upgrades

We’ve now implemented the ability to specify CASCADE or RESTRICT when deprovisioning a schema. Tram not only gives you the ability to provision resources but deprovision them as well. When deciding to deprovision a higher level object like a database or a schema, you need to ultimately decide what happens to the objects within that resource. Documentation on the differences can be found in Snowflake.

Tram does not immediately de-provision your objects. Tram will use a pending_drops table to allow administrators to review. In the event that you don’t want an administrator to have to review these drops, you can specify the –max-drops option, which will apply drops up to the maximum value specified.

Snowflake Tags

We’ve also added the ability to manage and create native tags within Snowflake. We provided this as a feature before Snowflake did (one of the benefits of Tram), however, Snowflake has now added native support for this object.

It’s worth noting that this feature is still in BETA within Snowflake and the API may change in the future. In the event that the API does change, Tram will be updated to match.

phData Toolkit

If you haven’t already explored the phData Toolkit, we highly recommend checking it out!

We encourage you to spend a few minutes browsing the apps and tools available in the phData Toolkit today to set yourself up for success in 2022.

Be sure to follow this series for more updates to the phData Toolkit tools and features.

phData Toolkit March 2022 Update

SQL Translation Updates

Netezza to the Snowflake Data Cloud

Examples

Hive Family to Snowflake

Tram Updates

Drop Schema Upgrades

Snowflake Tags

phData Toolkit

More to explore

Snowflake Query Tagging Best Practices

Data Ingestion from PostgreSQL to Snowflake using Openflow

The GenAI Strategy Playbook

Join our team

Partners

Resources

Software

Accelerate and automate your data projects with the phData Toolkit

Industries

Solutions

Company

Technology Partners

Other Technology Partners

Check out our latest insights

Snowflake Query Tagging Best Practices

Data Ingestion from PostgreSQL to Snowflake using Openflow

Data Engineering

Consulting, Migrations, Data Pipelines, DataOps

Change Management, Enablement & Learning

COE, Coaching, PMO

Data Science and Machine Learning Services

MLOps Enablement, Prototyping, Model Development and Deployment

Strategy Services

Data, Analytics, and AI Strategy, Architecture and Assessments

Reporting, Analytics, and Visualization Services

Self-Service, Integrated Analytics, Dashboards, Automation

Elastic Operations

Data Platforms, Data Pipelines, and Machine Learning