May 23, 2022

Data Modeling in Sigma Computing: What is the Difference Between Lookups vs. Joins?

By Joe McFarren

Sigma Computing is a data exploration platform that provides several great features for various teams in your analytics environment. Business analysts can explore reports and gain insights from visualizations and data tables. Data analysts can write functions to manipulate data and create their own dashboards and reports. Data engineers can mesh data sources and manipulate data right in the platform. 

Sigma can not only create visualizations in a matter of seconds, but it is also a great tool for data modeling and data engineering. Take a look at these things to consider when starting with Sigma.

In this blog, we’ll go over the similarities and differences between a few ways you can join or manipulate your data: data lookups and data joins. While most data analysts or data engineers understand these concepts, it’s important to note some nuances and how to best implement these choices in Sigma.

What is a Lookup in Sigma Computing?

If you are familiar with Excel, you will pick up lookups in Sigma fairly quickly. In essence, a data lookup in Sigma is a v-lookup in Excel. Sigma recently released this feature, and it’s something me and my team have already used extensively. 

How to Create Lookups in Sigma

A data lookup is a great choice if you are essentially doing a left join and want to add or remove columns very easily. You can click the down arrow on a column and select “Add column via lookup” (note at the time of publishing this blog, this feature is in beta).

a screenshot showing the "add column via lookup" capability in Sigma Computing

From here, you simply select which data element your new columns are coming from. Here I am selecting the Customer table, and I want to add the “Cust Since” column. Since this is not an aggregate, there is only one value for each customer, so I select “None” for aggregate. 

To finish this example, I select my columns to join on; in this case, it is “Cust Key.” You can see that 100% of my keys have a match and then I click done.

a screenshot showing which data elements your new columns are coming from
a screenshot showing the new column with a blue pill highlighting the column

Following my lookup, you can see that I brought a column from one table into another. You’ll notice that the column brought in has a blue pill surrounding it. This indicates it is a lookup column, and you can click pill to see additional fields to bring in. This is especially handy if you want to bring in or remove fields after creating your lookup. 

Considerations When Using Lookups in Sigma

There is the possibility you return more than one row in a lookup: say you want to find the average order price for each customer, you would then select your aggregate to return one line. Sigma will take all of the prices for orders for each customer, average that value, and present one value for the customer since the join is on Cust Key.

One final thing to note is that lookups will only return a single value; if this join was selecting store numbers that customers have purchased orders from, you would see an asterisk denoting that there was more than one row returned that cannot be aggregated (like average of profit, or total sales dollars).

Finally, you will see that Sigma creates a function for you using the Lookup() function. So, if you’d rather write out the function instead of using the user interface, you can do that too.

When to Use Lookups

Lookups are great if you want to do a left join on the fly and want to dynamically add or remove columns for ad hoc analysis. Lookups run quickly in Sigma, so there’s no real concern about performance since a lookup is optimized to run on Snowflake. 

Overall, Sigma makes it easy to add columns to your data tables using lookups, and it is a great concept to add to your repertoire.

What is a Join in Sigma?

Much like lookups, data joins allow you to bring in other columns into your final base table. A join is perfect if you want to combine customer information with customer order information. 

Previously, you could use a data link in Sigma to join tables, but as of the writing of this blog, there is new functionality to allow for joins in workbooks. 

How to Create Joins in Sigma

To join a table, click the “Add Element” button, select a new table, select a new data source, and select “Join.”

a screenshot of how to navigate to add a new join in Sigma Computing

Select the data source you want to bring into Sigma first, and you will see a data canvas that allows you to add joins. To add a second table, click the plus sign next to sources in the top right of the screen, and select your second table. 

The difference between using a join versus a lookup is that you can select the join type you want to use: left outer join, inner join, right outer join, or a full outer join. 

In this example, I used the same Plugs Electronics Hands On Lab Data that I did in the lookup example and used a full outer join to connect the D_CUSTOMER table, matching on “Cust Key.”

a screenshot showing an overview of a join in Sigma Computing

This section of the join screen gives a great overview of your join. There are 100 customers in the customer table that are not represented in the other table under the “Keys with no matches” section. 

Additionally, you can click on the final output to select the columns you want to bring in.

selecting columns you want to bring into the final output

If you don’t want all of these columns, uncheck the columns that are not necessary, and click done to bring in your newly joined table to your workbook.

You’ll also see the data lineage if you have a complex join which is a great visual aid to help you track down where your data is coming from. 

Consideration when Using Joins in Sigma

One thing to keep in mind is that a join can return multiple values per join key. For example, a join can be useful to bring in stores that customers placed an order from, whereas a lookup would return an asterisk since you cannot aggregate these values.

Joins are best if you want to productionalize a data source, that is, if you want to use this joined data source without much problem. Adding columns via lookup is great for ad hoc analysis, but joining the tables in Sigma can provide a more robust data source you can reuse for other tables and vizzes. In terms of best practices, lookups are more ad hoc, and joins are better for reuse.

Tips for Choosing Between a Lookup or Join

Tip 1: Remember, when choosing between a lookup or a join, there are a few things to consider: do you need to edit the columns brought in? Or can they stay the same for future use? Do you simply need a left join? Or do you need other types of joins to bring in more rows?

These are things to consider when using a join or a lookup. We’re always here to help if these are considerations in your Sigma implementation!

Tip 2: You may want to consider a lookup over a join if there are only a few columns you need to bring into your table since lookups can be advantageous performance-wise.

A join in Sigma can provide more robust data modeling, but if you only need to bring in a customer name from your customer table to your order table matching on customer ID, a lookup in Sigma is easy, simple, and performs a bit better.

Closing

Overall, both lookups and joins are useful and powerful features of Sigma. A lookup is great when you want to dynamically add or remove columns into a table and if your use case is suited to only left join. Lookups also only return a single value. 

Data joins, on the other hand, allow for a bit more flexibility in the join type you would like to use and creates a new base table for you. Both features have a GUI-based interface, which makes it easy for just about anyone using the tool to create a new table with a join, or add columns with a lookup. 

Take a look at phData’s overview of Account Types to learn more about what each account type can do.

We are Here to Help!

Data modeling is a crucial step in any modern data analytics stack. phData is Sigma’s largest consulting provider, and our team of expert consultants can provide guidance, strategy, and assistance in modeling your data in Sigma. Reach out to us today!

FAQs

Not at all. Sigma provides an option to create a join through the user interface without writing SQL or any formulas. Simply select “Add Column via Lookup” and use the popup to create your lookup. You can, of course, write out the lookup yourself, but Sigma provides a user-friendly way to do this as well.

Yes, you can join many data sources into one case table in Sigma. This feature lets you do robust data modeling right in the platform without manipulating the data elsewhere. This is great for data analysts that may not have access to Snowflake or if there are ad hoc requests. Sigma’s join feature shows the data canvas front and center and provides a data lineage to show exactly where your data is coming from and where it is curated into a base table.

Data Coach is our premium analytics training program with one-on-one coaching from renowned experts.

Accelerate and automate your data projects with the phData Toolkit