Here at phData, we strive to build automation for mundane tasks whenever possible. This allows us to focus on solving the more complex and challenging problems that our customers face.
Over the years, we have built a few different tools (CloudFoundation, Tram, and PAMS to name a few) that increase productivity, security, and development/deployment of data engineering pipelines and processes. These tools continue to evolve as new functionality comes out from phData partners and as these tools are integrated into customer systems.
One tool that we have built, called SQLMorph, continues to get increased usage. This free-to-use tool is a transpiler built to convert SQL statements from source to target. By using this tool either through the UI or published Python script, it allows you to quickly translate your existing SQL statements into their destination. This lets you spend more time on complex topics such as establishing data governance practices.
As we continue to iterate on SQLMorph, there has become a need to broadcast to our existing and potential customers the additional features and functionality we have added. This blog series aims to do just that.
What is SQLMorph?
For those who are new to SQLMorph, it’s worth explaining what SQLMorph is and the complexity that it aims to solve.
Many enterprises are still running an on-premises database or data warehouse, and are looking to transition to a cloud based system. There are many different providers in the cloud data warehouse space, however we’ve noticed many customers moving to the Snowflake Data Cloud. As such, SQLMorph currently is heavily focused on Snowflake as a target transpilation.
While many providers follow the ANSI SQL standard, almost every provider extends this standard, or compliments it with additional functionality and syntax. Just like programming languages, everybody has an opinion of a better way to do things.
For example, let’s say a customer is looking to migrate from Oracle to Snowflake. They have some SQL that selects data out of their database:
Source Dialect: Oracle
select d.department_name, v.employee_id, v.last_name from departments d cross apply ( select * from employees e where e.department_id = d.department_id ) v where d.department_name in ('Marketing', 'Operations', 'Public Relations') order by d.department_name, v.employee_id
SELECT d.department_name, v.employee_id, v.last_name FROM departments d CROSS JOIN ( SELECT * FROM employees e WHERE e.department_id = d.department_id ) v WHERE d.department_name IN ('Marketing', 'Operations', 'Public Relations') ORDER BY d.department_name, v.employee_id;
While this is a simple translation of Oracle’s Cross Apply to Snowflake’s Cross Join, transpilations can get much more complicated. For example:
Source Dialect: Oracle
PROCEDURE for_loop_cursor_update_insert_example (GROUP_SEQ NUMBER, TRANS_SEQ NUMBER, ID NUMBER,TRANS_ID NUMBER, TRANS_SUM NUMBER, UNIT VARCHAR(6), DEBET NUMBER, CREDIT NUMBER) IS BEGIN PP := 1; FOR v_rec in (SELECT unit from xxx) LOOP FOR d_rec in (SELECT ID, TRANS_SUM, UNIT from yyy where unit = v_rec.unit ) LOOP UPDATE YYY_SEQUENCE SET SEQ_COUNT = SEQ_COUNT + 1 WHERE SEQ_NAME = 'GROUP_SEQ' COMMIT; INSERT INTO AAA (ID, SEQ) VALUES (ID, GROUP_SEQ); PP := 1; WHILE PP <= 4 LOOP PP := PP +1; UPDATE YYY_SEQUENCE SET SEQ_COUNT = SEQ_COUNT + 1 WHERE SEQ_NAME = 'GROUP_SEQ' COMMIT; IF (PP < 3) THEN DEBET := TRANS_SUM; CREDIT := 0; ELSE DEBET := 0; CREDIT := TRANS_SUM; END IF; INSERT INTO BBB(ID, SEQ, UNIT, DEBET, CREDIT) VALUES (ID, TRANS_SEQ, UNIT, DEBET, CREDIT); COMMIT; END LOOP; END LOOP; END LOOP; -- from https://stackoverflow.com/a/61185793/14042808 END;
New SQLMorph Features
As previously mentioned, the goal of this blog series is to announce new features to SQLMorph. These features allow us to continue to automate and abstract differences between SQL syntaxes for customers migrating between providers.
In the last month, we’ve added and made adjustments to the following language specific transpilations:
Oracle to Snowflake
- Update non-reserved keyword list based on Oracle document. Such as SORT, ENABLE, etc.
- Translate CONSTRAINT state eg INITIALLY DEFERRED
- PL/SQL – parse procedure calls without parentheses
- PL/SQL – fixed bug where SQL in EXECUTE IMMEDIATE wasn’t translated correctly
Impala to Snowflake
- CAST is now mapped to TRY_CAST due to source CAST behavior
From: SELECT CAST(I AS INT)
To: SELECT TRY_CAST(TO_VARCHAR(I) AS INT)
- String literals using Double Quotes are now mapped to single quotes
From: SELECT LOWER(“ABC”)
To: SELECT LOWER(‘ABC’)
- Snowflake now supports CONCAT_WS and therefore we removed recursive calls to CONCAT
Impala: SELECT CONCAT_WS(‘A’,’B’,’C’,’D’)
Was re-written as: SELECT CONCAT(CONCAT(CONCAT(‘A’, ‘B’), ‘C’), ‘D’)
Now: SELECT CONCAT_WS(‘A’,’B’,’C’,’D’)
Microsoft SQL Server to Snowflake
- Implemented delimited identifiers
From: SELECT [Exchange Rate (Value In USD)]
To: SELECT “Exchange Rate (Value In USD)”
- Fix alias in front of columnName
From: SELECT ’empID’ = EMP_ID
To: SELECT EMP_ID empID
- Implement CASE expression
- Add ON PRIMARY in create table statement
We’ve also made some modifications to the command line interface (CLI) that we provide for SQLMorph. This allows you to bulk translate SQL files rather than manually going through them one at a time in the user interface (UI).
- Ability to choose the target type for date in Oracle to Snowflake
- Ability to Support server side configs
SQLMorph is Now SQL Translation
Fantastic news! phData combined all of its tools and apps into a unified interface called the phData Toolkit. With the Toolkit, you’ll have a centralized place to utilize all of your favorite phData tools including SQLMorph, now called SQL Translation.