Here at phData, we strive to build automation for mundane tasks whenever possible. This allows us to focus on solving the more complex and challenging problems that our customers face.
Over the years, we have built a few different tools (CloudFoundation, Tram, and PAMS to name a few) that increase productivity, security, and development/deployment of data engineering pipelines and processes. These tools continue to evolve as new functionality comes out from phData partners and as these tools are integrated into customer systems.
One tool that we have built, called SQLMorph, continues to get increased usage. This free-to-use tool is a transpiler built to convert SQL statements from source to target. By using this tool either through the UI or published Python script, it allows you to quickly translate your existing SQL statements into their destination. This lets you spend more time on complex topics such as establishing data governance practices.
As we continue to iterate on SQLMorph, there has become a need to broadcast to our existing and potential customers the additional features and functionality we have added. This blog series aims to do just that.
What is SQLMorph?
For those who are new to SQLMorph, it’s worth explaining what SQLMorph is and the complexity that it aims to solve.
Many enterprises are still running an on-premises database or data warehouse, and are looking to transition to a cloud based system. There are many different providers in the cloud data warehouse space, however we’ve noticed many customers moving to the Snowflake Data Cloud. As such, SQLMorph currently is heavily focused on Snowflake as a target transpilation.
While many providers follow the ANSI SQL standard, almost every provider extends this standard, or compliments it with additional functionality and syntax. Just like programming languages, everybody has an opinion of a better way to do things.
Examples
For example, let’s say a customer is looking to migrate from Oracle to Snowflake. They have some SQL that selects data out of their database:
Source Dialect:Â Oracle
select
d.department_name,
v.employee_id,
v.last_name
from
departments d
cross apply (
select
*
from
employees e
where
e.department_id = d.department_id
) v
where
d.department_name in ('Marketing', 'Operations', 'Public Relations')
order by
d.department_name,
v.employee_id
SELECT
d.department_name,
v.employee_id,
v.last_name
FROM
departments d
CROSS JOIN (
SELECT
*
FROM
employees e
WHERE
e.department_id = d.department_id
) v
WHERE
d.department_name IN ('Marketing', 'Operations', 'Public Relations')
ORDER BY
d.department_name,
v.employee_id;
While this is a simple translation of Oracle’s Cross Apply to Snowflake’s Cross Join, transpilations can get much more complicated. For example:
Source Dialect: Oracle
PROCEDURE for_loop_cursor_update_insert_example (GROUP_SEQ NUMBER, TRANS_SEQ NUMBER, ID NUMBER,TRANS_ID NUMBER, TRANS_SUM NUMBER, UNIT VARCHAR(6), DEBET NUMBER, CREDIT NUMBER) IS
BEGIN
PP := 1;
FOR v_rec in (SELECT unit from xxx)
LOOP
FOR d_rec in (SELECT ID, TRANS_SUM, UNIT from yyy where unit = v_rec.unit )
LOOP
UPDATE YYY_SEQUENCE
SET SEQ_COUNT = SEQ_COUNT + 1
WHERE SEQ_NAME = 'GROUP_SEQ'
COMMIT;
INSERT INTO AAA (ID, SEQ) VALUES (ID, GROUP_SEQ);
PP := 1;
WHILE PP <= 4
LOOP
PP := PP +1;
UPDATE YYY_SEQUENCE
SET SEQ_COUNT = SEQ_COUNT + 1
WHERE SEQ_NAME = 'GROUP_SEQ'
COMMIT;
IF (PP < 3) THEN
DEBET := TRANS_SUM;
CREDIT := 0;
ELSE
DEBET := 0;
CREDIT := TRANS_SUM;
END IF;
INSERT INTO BBB(ID, SEQ, UNIT, DEBET, CREDIT) VALUES (ID, TRANS_SEQ, UNIT, DEBET, CREDIT);
COMMIT;
END LOOP;
END LOOP;
END LOOP;
-- from https://stackoverflow.com/a/61185793/14042808
END;
-- INFO: Comments from translated statement. Future work will list them closer to their original location.
-- Comment(s) from line 38, character 0
-- from https://stackoverflow.com/a/61185793/14042808
create or replace procedure for_loop_cursor_update_insert_example ("GROUP_SEQ" FLOAT, "TRANS_SEQ" FLOAT, "ID" FLOAT, "TRANS_ID" FLOAT, "TRANS_SUM" FLOAT, "UNIT" VARCHAR(6), "DEBET" FLOAT, "CREDIT" FLOAT)
returns string
language javascript
as
$$
var PP = 1;
var cursor1 = snowflake.createStatement({
sqlText: "SELECT unit FROM xxx;"
});
var cursor1Result = cursor1.execute();
while (cursor1Result.next()) {
var cursor1 = snowflake.createStatement({
sqlText: "SELECT ID, TRANS_SUM, UNIT FROM yyy WHERE unit = v_rec.unit;"
});
var cursor1Result = cursor1.execute();
while (cursor1Result.next()) {
var loopResult = snowflake.execute({
sqlText: "UPDATE YYY_SEQUENCE SET SEQ_COUNT = SEQ_COUNT + 1 WHERE SEQ_NAME = 'GROUP_SEQ';"
});
COMMIT;
var loopResult = snowflake.execute({
sqlText: "INSERT INTO AAA (ID, SEQ) VALUES (ID, GROUP_SEQ);"
});
var PP = 1;
while (PP <= 4) {
var PP = PP + 1;
var sql_stmt = snowflake.createStatement({
sqlText: "UPDATE YYY_SEQUENCE SET SEQ_COUNT = SEQ_COUNT + 1 WHERE SEQ_NAME = 'GROUP_SEQ';"
});
var sql_result = sql_stmt.execute();
var sql_stmt = snowflake.createStatement({
sqlText: "COMMIT;"
});
var sql_result = sql_stmt.execute();
if ((PP < 3)) {
var DEBET = TRANS_SUM;
var CREDIT = 0;
} else {
var DEBET = 0;
var CREDIT = TRANS_SUM;
}
var sql_stmt = snowflake.createStatement({
sqlText: "INSERT INTO BBB (ID, SEQ, UNIT, DEBET, CREDIT) VALUES (ID, TRANS_SEQ, UNIT, DEBET, CREDIT);"
});
var sql_result = sql_stmt.execute();
var sql_stmt = snowflake.createStatement({
sqlText: "COMMIT;"
});
var sql_result = sql_stmt.execute();
}
}
}
$$
;
New SQLMorph Features
As previously mentioned, the goal of this blog series is to announce new features to SQLMorph. These features allow us to continue to automate and abstract differences between SQL syntaxes for customers migrating between providers.
In the last month, we’ve added and made adjustments to the following language specific transpilations:
Oracle to Snowflake
- Update non-reserved keyword list based on Oracle document. Such as SORT, ENABLE, etc.
- Translate CONSTRAINT state eg INITIALLY DEFERRED
- PL/SQL – parse procedure calls without parentheses
- PL/SQL – fixed bug where SQL in EXECUTE IMMEDIATE wasn’t translated correctly
Impala to Snowflake
- CAST is now mapped to TRY_CAST due to source CAST behavior
From: SELECT CAST(I AS INT)
To: SELECT TRY_CAST(TO_VARCHAR(I) AS INT) - String literals using Double Quotes are now mapped to single quotes
From: SELECT LOWER(“ABC”)
To: SELECT LOWER(‘ABC’) - Snowflake now supports CONCAT_WS and therefore we removed recursive calls to CONCAT
Impala: SELECT CONCAT_WS(‘A’,’B’,’C’,’D’)
Was re-written as: SELECT CONCAT(CONCAT(CONCAT(‘A’, ‘B’), ‘C’), ‘D’)
Now: SELECT CONCAT_WS(‘A’,’B’,’C’,’D’)
Microsoft SQL Server to Snowflake
- Implemented delimited identifiers
From: SELECT [Exchange Rate (Value In USD)]
To: SELECT “Exchange Rate (Value In USD)” - Fix alias in front of columnName
From: SELECT ’empID’ = EMP_ID
To: SELECT EMP_ID empID - Implement CASE expression
- Add ON PRIMARY in create table statement
We’ve also made some modifications to the command line interface (CLI) that we provide for SQLMorph. This allows you to bulk translate SQL files rather than manually going through them one at a time in the user interface (UI).
- Ability to choose the target type for date in Oracle to Snowflake
- Ability to Support server side configs
SQLMorph is Now SQL Translation
Fantastic news! phData combined all of its tools and apps into a unified interface called the phData Toolkit. With the Toolkit, you’ll have a centralized place to utilize all of your favorite phData tools including SQLMorph, now called SQL Translation.