March 3, 2023

Five Most Useful Extensions in KNIME

By John Emery

When you download KNIME Analytics Platform for the first time, you will no doubt notice the sheer number of nodes available to use in your workflows. With hundreds of nodes at your disposal, wanting even more might seem unnecessary.

Of course, as any analytics engineer could tell you, there is no such thing as too many tools. This is where KNIME truly shines and sets itself apart from its competitors: the scores of free extensions available for download. Whether you want nodes to publish your data to Tableau Server, connect to a Snowflake Data Cloud database, or perform image or audio analyses, there is an extension for you.

In this post, we will discuss five of our favorite and most useful extensions in KNIME. Rather than focusing on platform- or database-specific extensions (such as the Tableau and Snowflake extensions mentioned above), we will focus on extensions that provide value to KNIME developers of all types.

Downloading Extensions in KNIME

First, a quick word on how to download extensions within KNIME Analytics Platform.

  1. With KNIME Analytics Platform open, navigate to File → Install KNIME Extensions…
  2. In the menu that appears, either use the search bar to type in a keyword or scroll through the list of folders.
  3. Once you have found the extensions that you wish to install, press Next through a few menus and accept the license agreement. 
  4. Press Finish and allow KNIME to install the selected extensions. Once the installation is complete, restart KNIME. When it reopens, you will see your new extensions in the Node Repository.

KNIME Database

The KNIME Database extension includes 50 nodes that allow you to connect to databases, perform in-database processing, read and write tables, and pull data into a standard KNIME workflow. If you need to connect to a database for any purpose, this extension cannot be ignored. When we download a new version of KNIME, this extension is always one of the first we install.

One of the best features of this extension is the array of optimized connectors for some of the most popular databases on the market. These include Microsoft SQL Server, MySQL, Oracle, and PostgreSQL. If you noticed Snowflake’s absence, don’t worry—it has its own extension.

In addition to these dedicated connectors, the DB Connector node can connect to a generic JDBC database, so don’t fret if your database lacks a dedicated node.

An important feature of this extension is a set of nodes that allow you to utilize the power of the database engine. These nodes, such as DB Pivot and DB Row Filter, allow you to perform operations on your data without the need to read a potentially very large table into KNIME. 
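
The idea behind in-database processing can be illustrated outside of KNIME. The following is a minimal Python sketch, not how the DB nodes are implemented: it uses the standard-library sqlite3 module and a hypothetical `orders` table to contrast pulling every row into the client before filtering with letting the database engine do the filtering.

```python
import sqlite3

# Hypothetical in-memory table standing in for a large database table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 45.5), ("North", 300.0)],
)


def filter_in_client(conn, region):
    """Read the whole table, then filter locally (costly for big tables)."""
    all_rows = conn.execute(
        "SELECT id, region, amount FROM orders ORDER BY id"
    ).fetchall()
    return [row for row in all_rows if row[1] == region]


def filter_in_database(conn, region):
    """Let the database engine filter; only matching rows are transferred."""
    return conn.execute(
        "SELECT id, region, amount FROM orders WHERE region = ? ORDER BY id",
        (region,),
    ).fetchall()


east_rows = filter_in_database(conn, "East")
```

Both functions return the same rows, but the second one moves far less data when the table is large, which is the benefit the in-database nodes aim for.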

KNIME Data Generation

The next extension might seem like a bit of an oddity at first glance. KNIME Data Generation contains nodes that are used to create data on-the-fly. When building workflows, you sometimes need to mock up data; whether the real data set is unavailable, or dynamic data generation is necessary for your workflow, these nodes will help you get the job done.

Some interesting nodes in this extension include Time Series Generator, One Row to Many, and Counter Generation.
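
To give a sense of what mocked-up data of this kind looks like, here is a small Python sketch. It is only an illustration, not the logic of any KNIME node; the function name `mock_daily_sales` and the column names are hypothetical. It produces a running counter, a daily date, and a trending value with random noise.

```python
import random
from datetime import date, timedelta


def mock_daily_sales(n_days, start=date(2023, 1, 1), seed=42):
    """Generate n_days rows of mock data: counter, date, and a noisy value."""
    rng = random.Random(seed)  # seeded so the mock data is reproducible
    rows = []
    for i in range(n_days):
        rows.append(
            {
                "counter": i + 1,  # running row counter starting at 1
                "day": start + timedelta(days=i),  # one row per calendar day
                "sales": round(100 + 2 * i + rng.gauss(0, 5), 2),  # trend + noise
            }
        )
    return rows


mock_rows = mock_daily_sales(7)
```

Because the generator is seeded, rerunning the sketch yields the same rows, which is convenient when a workflow is still being developed.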

KNIME Expressions

This extension is one of the smallest among our favorites. It contains only two nodes: Column Expressions and Variable Expressions. These two nodes do basically the same thing: allow developers to create an arbitrary number of new fields (or update existing ones) using a full suite of JavaScript-based functions. The only difference between these two nodes is that Column Expressions is used to update or create columns, whereas Variable Expressions is used to update or create flow variables.

The reason we like this extension so much is that the default formula-writing nodes in KNIME (String Manipulation and Math Formula, to name two) offer only a specific subset of functions and can manipulate only one column (or variable) at a time. In addition, as of the newest KNIME release, the Expressions nodes can perform functions on row offsets, which the base nodes cannot.

KNIME Python Integration

As powerful and robust as KNIME Analytics Platform is, there are some things that it either cannot do or that are too cumbersome to be practical. In these instances, those with a Python background may wish to use the KNIME Python Integration extension.

Like the KNIME Expressions extension listed above, this extension also includes only two nodes: Python Script and Python View. The first allows developers to execute arbitrary Python 3 scripts. When you download KNIME, a base package of Python libraries is installed, but if you need something else, installing new libraries is very easy. 

We find this KNIME blog very helpful when first setting up your Python environment.
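
As a rough idea of what goes inside a Python Script node, here is a minimal sketch. It assumes a recent KNIME version in which the node exposes the `knime.scripting.io` module; the column names `quantity` and `unit_price` and the helper `add_order_total` are hypothetical. The transformation is kept in a plain function so it can also be tried outside of KNIME, where the KNIME import is simply skipped.

```python
# Sketch of a Python Script node body. Outside KNIME the import below
# fails, so only the plain-Python function is defined.
try:
    import knime.scripting.io as knio
except ImportError:
    knio = None


def add_order_total(rows):
    """Return copies of the rows with a computed 'total' field.

    'quantity' and 'unit_price' are hypothetical column names.
    """
    return [
        {**row, "total": row["quantity"] * row["unit_price"]}
        for row in rows
    ]


if knio is not None:
    import pandas as pd

    # Read the first input port, transform it, write the first output port.
    df = knio.input_tables[0].to_pandas()
    result = add_order_total(df.to_dict("records"))
    knio.output_tables[0] = knio.Table.from_pandas(pd.DataFrame(result))
```

In a real workflow you would more likely operate on the pandas DataFrame directly; the list-of-dictionaries detour is only there to keep the example runnable without KNIME installed.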

The other node in this extension, Python View, also allows developers to execute Python scripts. The difference is that this node outputs a data visualization, such as a bar chart, scatter plot, or geospatial map. The output of the Python View node can be used within a component for making dynamic dashboards and web apps.

KNIME Textprocessing

The last extension that we will discuss here is one of the most fun. KNIME Textprocessing contains a family of nodes that perform a variety of text processing functions. Some of these are simple, such as Case Converter (makes all text upper or lower case) and Punctuation Erasure (removes punctuation characters).

Others are less simple but still approachable for the amateur, such as Dictionary Filter and Sentence Extractor. Still others are highly complex and suited only to those who really know what they are doing, such as Stanford NLP NE Learner and Term Document Entropy.
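
For readers new to text preprocessing, the two simple operations mentioned above (case conversion and punctuation removal) can be sketched in a few lines of Python. This is an illustration of the concept only; the function names are hypothetical, and the KNIME nodes themselves may treat edge cases differently.

```python
import string


def convert_case(text, upper=False):
    """Convert all text to lower case (default) or upper case."""
    return text.upper() if upper else text.lower()


def erase_punctuation(text):
    """Remove ASCII punctuation characters from the text."""
    return text.translate(str.maketrans("", "", string.punctuation))


cleaned = erase_punctuation(convert_case("Hello, KNIME! Text processing is fun."))
# cleaned == "hello knime text processing is fun"
```

Steps like these are typically applied before heavier analysis so that "KNIME!" and "knime" are counted as the same term.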

Across the dozens of nodes available in this extension, you are sure to find what you need for text processing and natural language processing. In addition, some of these nodes can be used creatively for other purposes. For example, you can use the Tika Parser node to extract text from documents and URL sources. We have used this node to pull data from PDFs that contained tables and then formatted that data into an analytic data table in KNIME.

Conclusion

In summary, KNIME is a versatile platform for data analysis, and the extensions highlighted in this blog offer a range of useful features for data processing and visualization. This blog only scratches the surface of the myriad extensions offered—for free, remember—by KNIME Analytics Platform. 

Whether you are a data scientist, analyst, or developer, these extensions can help you to streamline your workflow and increase your productivity. We recommend that you give them a try and see how they can enhance your data analysis experience.
