March 14, 2024

Snowflake ETL Face-Off: Alteryx Designer vs. Matillion ETL

By Pedro Faria

In the data analytics processes, choosing the right tools is crucial for ensuring efficiency and scalability. Two popular players in this area are Alteryx Designer and Matillion ETL, both offering strong solutions for handling data workflows with Snowflake Data Cloud integration. Despite their disparate purposes, these two products share similar capabilities.

Alteryx Designer is a desktop software that thrives as a no-code data agnostic software. Alteryx runs on your local computer’s memory, Alteryx boasts numerous features ensuring good performance for a Snowflake connection, with a particular emphasis on the In-DB connection, which we will focus on in this analysis.

Matillion ETL is purpose-built for the cloud, operating smoothly on top of your chosen data warehouse. Its integration with Snowflake is natural and totally integrated, resulting in the efficient push-down of processing tasks to Snowflake.

In this blog, we’ll conduct a comparative analysis of these platforms, emphasizing usability, performance, and the unique strengths each brings to the table when working with Snowflake. 

The goal is to help you define which tool makes the most sense for your specific use case or situation. It’s important to note that we will not compare the direct monetary cost, as these two products employ entirely different billing methods.

Alteryx Designer + Snowflake

Users can leverage a Snowflake connection through two different methods: pulling data into Alteryx by memory when using the regular tools and the In-DB tools. There are a few differences between each method, but in simple terms, the regular tools will use the local computer processing power, while In-DB tools use the desired data warehouse (Snowflake, in this case) processing power.

Generally, in Alteryx Designer, to establish the Snowflake connection, one could use an ODBC driver or even a DSN-less connection. For today, the focus will be on the In-DB tools. In order to work with a Snowflake connection, users must establish a local ODBC connection and utilize the Alteryx interface to select the desired table or view connection. 

To enhance user usability, the Query Builder within the Connect In-DB tool (shown in the image below) offers a great option for a no-code approach. Under the Visual Query Builder, this pop-up window displays all databases, schemas, tables, and views accessible to the role defined, along with the option to write SQL directly when switching to the SQL Editor tab.

The benefits of using In-DB tools:

  • Leverage the processing power of Snowflake and not your local machine

  • Alteryx offers many tools to filter, join, combine, summarize, and transform data without the need to write SQL

  • Allow for a secure data transfer, as no data is actually leaving Snowflake into your local computer

Below you will find an example of an In-DB workflow to push all the processing to Snowflake to remove columns, filter data, summarize, and join another table to then create the desired output data in the format of a table inside Snowflake.

On the backend, Alteryx translates the user’s workflow into Snowflake SQL before sending the script to run inside Snowflake. Within Snowflake, the transformation of the depicted workflow into Snowflake SQL can be visualized in the Query History tab.

In the showcased example, data manipulation involves datasets with over 100 million records. Utilizing the Alteryx tools to translate to SQL and sending the script to Snowflake mirrors running the script directly in Snowflake’s Worksheets. 

The performance metrics indicate an 80-second runtime for the script and a total of 87 seconds for Alteryx to generate the script, Snowflake to execute it, and Snowflake to confirm the successful creation of the TOTAL_CUSTOMER table.

Using regular Alteryx tools in this scenario would involve data being transferred to Alteryx, resulting in significantly slower performance. Leveraging In-DB tools not only ensures safety but also provides optimal performance. It is the best practice to use In-DB tools for Snowflake processing, and the example below illustrates this point.

Matillion ETL + Snowflake

Matillion ETL offers a user-friendly experience through a native interface that is purpose-built specifically for the cloud. Today we will focus on Snowflake as our cloud product. While it may not boast the same drag-and-drop simplicity as Alteryx, Matillion provides a clean, purpose-built environment that streamlines the ETL and ELT process. Users comfortable with SQL may find Matillion’s approach more familiar and seamless.

Matillion truly shines in parallel processing and scalability. Optimized for cloud-based data warehouses like Snowflake, Matillion efficiently distributes workloads across clusters, ensuring high performance even with extensive data processing requirements.

For a Snowflake connection, the user will need to set up a Matillion environment. Users can set multiple environments. When setting it up, users will define the default settings as well as the credentials to be used for the specific Snowflake instance they want to leverage. 

In Matillion, developing workflows with Snowflake in mind is simple and user-friendly. Connecting to a Snowflake table using the Table Input tool involves straightforward steps, and Matillion conducts status checks with an OK indication to ensure the selected options will work. The sample tab provides a handy feature where Snowflake runs a SELECT statement with a limit clause to return a sample of data into Matillion in a fraction of a second.

The final workflow presentation in Matillion closely resembles Alteryx. The key difference lies in the Snowflake connection, allowing quick validation (green outline around the components) and sampling of data as each component is built.

Thanks to native connectivity, Matillion demonstrates faster workflow execution. The entire process in Matillion took 4 seconds, with Snowflake executing many queries, many of them being logistic queries, and the main transformation query taking 2.1 seconds.

Comparing Alteryx Designer and Matillion ETL

Usability is a key factor when evaluating ETL tools, as it directly impacts the efficiency and ease of use for data professionals. Both Alteryx and Matillion offer a simple intuitive drag-and-drop interface that requires no code for data manipulation. Their visual interface makes it accessible for users with varying levels of technical expertise, promoting collaboration across teams.

Matillion features two types of jobs: Orchestration and Transformation Jobs. While Alteryx is a great tool, Matillion stands out by allowing the development of a true ETL process. It offers the ability to orchestrate ETL processes with numerous DDL features, error handling, coding options, metadata handling, native connectors, APIs, and Cloud Platform-specific features. Transformation jobs provide even more flexibility to manipulate data without the need for SQL, surpassing the capabilities of Alteryx in this aspect.

For advanced or admin users, Matillion is a handy option. Snowflake’s DDL options and other advanced features are integrated into Matillion as native components. In comparison to Alteryx, even a simple command like ALTER WAREHOUSE has a specific Matillion component. 

Alteryx’s limitation to only a single command per Connect In-DB tool means that only the script can be placed into the tool, excluding other commands like the ALTER WAREHOUSE command. Consequently, Alteryx developers may need more experience with Snowflake than Matillion users.

Alteryx and Matillion serve different purposes, and their comparison should be based on specific use cases. Data transformation can be achieved on both platforms. Thus, for performance comparison, we leveraged the same transformation logic for both software, and Matillion performed better.

For developers, version control is often associated with ETL processes. Alteryx Designer does not offer an option. The only version control option is through the Alteryx Server (additional cost than just Designer), Matillion offers a git connection natively.

For sampling and validation, Alteryx offers some options with containers, but you don’t have the option to truly validate a single step in the middle of the job without executing the components prior to the step you want. Unlike Alteryx, where running a larger part of the workflow is required to view the data, Matillion enables real-time data preview and sampling as each component is added. 

Keep in mind that each sample is indeed a Snowflake query running on the backend. Both products will execute jobs in your Snowflake instance, but Matillion allows you to single out a specific step in the job.

In addition, for control’s sake, Matillion offers a Plan view where you can visualize the whole workflow in text form. In simple terms, this is just an “Explain” statement executed in Snowflake and returned into Matillion, the image shown below. 

Alteryx offers a “similar” but less complete option with the Dynamic Output In-DB to view the built-out query logic. The Query field must be selected as displayed below. Because we are using the In-DB tools when we run the workflow, we create the complete SQL script and push it into Snowflake.

Conclusion

After considering everything we’ve discussed up to this point, it indeed makes sense why Matillion performed better than Alteryx. At the end of the day, we are truly comparing apples to oranges. Both pieces of software offer great features and capabilities, but one is built with Snowflake in mind, while the other ‘just’ connects to Snowflake

Alteryx can be a great tool for business user developers who require direct access to Snowflake tables and views without the need to perform any DDL or other advanced Snowflake adjustments. 

Envision your marketing department needing to access data inside Snowflake, but their main sources of data are not limited to Snowflake alone; they include SQL Server, Shared Folder Access, Excel Files, etc. Alteryx should be the better software for these needs. If the requirement is for an ETL or ELT process that includes Snowflake processing, it’s best to leave this to another software.

Matillion can be leveraged in a completely different way. Although it has some additional features similar to Alteryx, it is best to understand that they are probably too different to be the answer for the same use case or situation. Snowflake-leveraged ETL processes through this platform are easy and straightforward. 

If there is a need to develop data pipelines, for example, from FTP to S3 Bucket to Snowflake, this can be achieved and managed much more easily through Matillion. Snowflake use and administration are much easier done through Matillion, and as displayed above, it also performs much better.

If you have any additional questions about Alteryx Designer vs. Matillion ETL, reach out to our team of experts!

Data Coach is our premium analytics training program with one-on-one coaching from renowned experts.

Accelerate and automate your data projects with the phData Toolkit