How to Optimize Power BI and Snowflake for Advanced Analytics

The world of business intelligence and data modernization has never been more competitive than it is today. The landscape of companies competing in this very hot industry is ever-changing, with some technologies separating themselves with their innovation and ease of use.  

At phData we have the luxury of working with a diverse array of clients differing in industry, location, and size. Our visibility into a swath of different companies allows us to gauge the market for different emerging technologies and analyze the current technology landscape. 

Two of the platforms that we see emerging as a popular combination of data warehousing and business intelligence are the Snowflake Data Cloud and Power BI. 

In the market, phData has seen Snowflake and Power BI  become more common as a data platform and data visualization pairing that provides best-in-class capabilities from both Snowflake and Microsoft. 

This guide will provide context behind the relationship between the platforms, uncover a few common misconceptions, and explore some best practices for deploying Snowflake and Power BI together. 

*Note: It is important to call out that selecting tools for your analytics needs is just a smaller component of creating and implementing an analytics strategy. Much of what is discussed in this guide will assume some level of analytics strategy has been considered and/or defined.  

Want to Save This Guide for Later?

No problem! Just click this button and fill out the form to download it. (You can read the full guide without giving us your email — keep scrolling!)
Download Snowflake Guide

Table of Contents

Why Discuss Snowflake & Power BI?

Snowflake is a cloud computing–based  data cloud company that provides data  warehousing services that are far more  scalable and flexible than traditional data  warehousing products.  

Having gone public in 2020 with the largest  tech IPO in history, Snowflake continues to  grow rapidly as organizations move to the  cloud for their data warehousing needs. 

Figure 1: Magic Quadrant Cloud Database Systems
Magic Quadrant for Cloud Database Management
Source: Gartner (December 2021)

Power BI is a data visualization and analysis  tool that is one of the four tools within Microsoft’s Power Platform. Debuting in  2015, Power BI has undergone meaningful  updates that have made it a leader not just  in data visualization, but in the business  intelligence space as well.  

Microsoft Power BI has been the leader in  the analytics and business intelligence  platforms category for several years  running.

tool chart

Snowflake and Power BI are both market and thought leaders in their respective industries with each leading in innovations and growth amongst BI and data modernization platforms. Because of this, they are a common combination for companies when selecting tools for their analytics needs. 

Now you might be asking, “if both of these tools are widely adopted, what’s the issue?” 

Here’s where things get fun, let’s dive in!

Despite dozens of clients and countless projects leveraging Snowflake and Power BI, there is still some resistance from both clients and vendors to using the two platforms together.  

One big issue that contributes to this resistance is that although Snowflake is a great cloud data warehousing platform, Microsoft has a data warehousing tool of its own called Synapse. 

In a perfect world, Microsoft would have clients push even more storage and compute to its Azure Synapse platform. 

On the other hand, Snowflake wants to drive as much storage and compute onto their platform as possible too. One of the easiest ways for Snowflake to achieve this is to have analytics solutions query their data warehouse in real-time (also known as DirectQuery). This ensures the maximum amount of Snowflake consumption possible. 

While Microsoft might be fine with your organization using Snowflake – especially when Snowflake is deployed on Azure –  Power BI is most efficiently used when running its reports on an imported or cached model of the data, which limits Snowflake’s consumption.  

We’ll get into some more details in this guide on why that is, but for now, this context should help you understand how the two companies view one another in the market. 

Common Issues, Complaints and Misconceptions

As discussed in a previous section, there are a lot of complexities that can arise when using these two  platforms, along with some misconceptions. In this section, we are going to discuss some of these items  and dive deep into whether these issues have any merit.  

Power BI Doesn't Connect To Snowflake

Our Rating: False

This is actually one of the most common misconceptions that we encounter about using Snowflake and Power BI. Microsoft has developed a native connector for Power BI that allows developers to choose Import or DirectQuery mode as well as the capability to utilize Azure AD Single Sign On (SSO) with Snowflake. 

Even though at face value the claim is outright false, there is some important historical context from which it originates. 

Snowflake was originally launched in October 2014, but it wasn’t until 2018 that Snowflake became available on Azure. At first, the only way to connect to Snowflake from Power BI was via an ODBC connection. 

The December 2019 release of Power BI Desktop introduced a native Snowflake connector that supported SSO and did not require driver installation. The release of the native connector was only four years ago at the time of this guide’s writing. 

The June 2021 release of Power BI Desktop introduced Custom SQL queries to Snowflake in DirectQuery mode. Although this feature brought Power BI up to par with some of its competitors, this lag in integration with Snowflake introduced ambiguity for a lot of people about whether Microsoft was interested in making their platforms work with Snowflake. 

Microsoft Doesn't Want You Using Snowflake

Our Rating: Somewhat True

Microsoft would obviously prefer that clients use Azure Synapse Analytics over other cloud tools such as  Snowflake. However, Snowflake runs better on Azure than it does on AWS – so even though it’s not the ideal situation, Microsoft still sees Azure consumption when organizations host Snowflake on Azure. 

With Azure being a common landing spot for Snowflake, the term “frenemy” is appropriate here, especially if  the client is using other Microsoft tools. Both companies seem to recognize this “necessary evil” dynamic as  they continue to be partners as of 2022.  

You Can't Use Dax With Directquery In Snowflake

Our Rating: Somewhat True

Although this claim is fairly misleading, it is true that the amount of DAX functions available to users is limited when using DirectQuery instead of Import mode. More importantly users must understand that this limitation is not unique to Snowflake. 

The most commonly used functions that you lose when using DirectQuery are Time Intelligence functions such as TOTALYTD, DATESYTD, and EOMONTH. To get around losing Time Intelligence functions, a robust calendar table is suggested to reference for time-based metrics.

While the loss of certain DAX functions is definitely a shortcoming that we hope Microsoft will address in the near future, the impact of these lost DAX functions is not necessarily as big as you would expect. 

Power BI Throttles Snowflake

Our Rating: True, but getting less true

The way that Power BI constructs queries to be sent back to the source data to render the visuals does throttle DirectQuery connections, although this throttling is not unique to Snowflake. This is one of the many reasons that Power BI is more efficient when using Import mode, a cached dataset hosted in Power BI Service.

The good news is that the Snowflake and Microsoft teams both agree that this is an issue and have recently worked together to make improvements to this limitation.

When Power BI loads a visual in DirectQuery, it sends several queries back to the source data instead of a single query. This causes more queries being needed to render a report. However, with the improvements made by Snowflake and Microsoft together with Query Parallelization, users can now alter the number of concurrent queries sent back to the data source. Query Parallelization doesn’t reduce the amount of queries sent, but allows the user to increase how many are packaged together and sent at the same time versus sequentially.

Similar to Query Parallelization, Microsoft introduced Horizontal Fusion in September of 2022. Essentially, Horizontal Fusion reduces multiple queries that have a similar shape into a one query. Power BI then sends that singular complex query back to Snowflake source eliminating any duplicative trips back to Snowflake.

Best Practices For Using Snowflake and Power BI

As the previous section suggests, there are some complexities when using Power BI with Snowflake. Although  using Power BI with Snowflake is similar to using Power BI with other data sources, we do have some specific suggestions when using the two tools together, especially since the use of DirectQuery will be top of mind. 

Use Power BI’S Native Snowflake Connector

You can connect Power BI to Snowflake just like you can connect Power BI to any other database using the native connector that was released in 2019. Prior to this, an ODBC connector was the only way to connect to Snowflake from Power BI. 

In 2021, Microsoft enabled Custom SQL queries to be run to Snowflake in DirectQuery mode further enhancing the connection capabilities between the platforms. 

Using the native connector is not only easier for Power BI users to connect to Snowflake, but it also enables Azure AD SSO to be enabled.  

If you’re interested in learning more, we highly recommend checking out our comprehensive blog that covers this in much more detail. How to Connect Power BI to Snowflake 
How to Connect Power BI to Snowflake 

Choose Import or Directquery Mode Carefully

Power BI offers two main connection types when connecting to data sources, Import and DirectQuery. We will explain the benefits and opportunities for both, when to use one over the other, and when to pair them in a composite model.

Using Import

Import mode is the most common connection type we see when working with customers. Importing data allows you to ingest a copy of the source data into an in-memory database. From there, Power BI’s underlying storage engine, VertiPaq, provides significant compression capabilities.

The VertiPaq engine is columnar based which is different from a basic SQL server, which traditionally computes on rows and offers slow performance.

Power bi data model

VertiPaq can compress data in multiple ways that store the data into memory, speeding up DAX calculations significantly. A general rule of thumb is that you can typically expect about 10x compression when importing data into Power BI.

Since Import mode creates a cache of the dataset, a refresh schedule will need to be set up to keep data current. Refreshes of large datasets will need thoughtful up-front planning. Features like Power BI Premium Large Dataset Storage and Incremental Refresh should be considered for importing large data volumes.

Using DirectQuery

You can also connect to Snowflake with Power BI using DirectQuery mode. Although a majority of use cases for tools like Tableau or Power BI rely on cached data, use cases like near real-time reporting need to utilize direct queries.

VertiPaq

In late 2021, Power BI introduced custom SQL queries to Snowflake using DirectQuery. The connector is built in such a way that the input SQL statement is formatted as Snowflake SQL when the command is submitted, so users only need to write their query in standard SQL. 

Another thing to note is that you will need to include the fully qualified table name in the query to ensure it connects successfully. See below for an example: 

				
					select *
from table(information_schema.task_history(scheduled_time_range_start=>dateadd(‘hour’, -1,
current_timestamp()),task_name=>’load’))
				
			

Using a Composite Model

composite model

For those users who want to leverage the best of Import and DirectQuery methods, Microsoft now offers composite modes. In a composite model, the storage mode can be configured differently for each table. With Snowflake, we recommend setting the largest, most frequently updated fact tables as DirectQuery connections, and the rest as Import or dual mode. This way, each of your contextual data points used for filtering are stored in memory. 

This permits you to only pull the data necessary to make decisions while allowing you to stay updated on the latest information and maintain report performance levels. 

Using a composite model could result in additional consumption of Snowflake queries when compared to Tableau, as Tableau does not have the option to create a composite model. 

Because Tableau only has the option of all Extract or all DirectQuery within their data sets, there may be situations where the Tableau developer chooses to Extract their data where the Power BI developer could utilize a composite model. Even with a composite model, the same respective considerations for Import and DirectQuery hold true. 

For more information on composite models, check out Microsoft’s official documentation.

Wrapping Up DirectQuery vs Import

At phData, we suggest using whatever connection method is best for our client’s specific use case, although selecting DirectQuery as the connection mode when unnecessary can often lead to slower performance of reports and dashboards. 

In Import mode, Power BI efficiently caches the data and can often lead to better report performance. On the other hand, when the data is rapidly changing, not using DirectQuery in your fact tables can result in decisions being made based on data that is not up to date.

It is important to note that this methodology is generally tool agnostic and is recommended by Tableau and other BI platforms as well. 

Model Your Data Appropriately

Once you have chosen the method to connect to your data (Import, DirectQuery, Composite), you will need to make sure that you create an efficient and optimized data model. Creating an efficient data model can be the difference between having good or bad performance, especially when using DirectQuery.

Here are some of our best practices for building data models in Power BI to optimize your Snowflake experience:

1. Use a STAR schema – A star schema will simplify joins and queries, process transactions faster, and  allows for continuous development as new data becomes available without worrying about changing  the granularity of the entire source. We recommend utilizing Snowflake to build out the necessary views  and tables for your comprehensive data models. These tables and views that you build will be pieced  together to create the enterprise reporting data models that will ultimately be used to feed your Power  BI reporting.  

2. Use only the necessary data in your model – Limiting the columns in your data model is almost always  more impactful to the model’s performance than limiting the number of rows. Whether you limit the  columns in your Snowflake reporting view or use Power Query, unnecessary data will slow rendering and data refresh times.  

3. Use materialized views or aggregated tables where possible – It is best practice to create materialized  views for your reporting needs rather than querying the actual source tables in your production  environment. If your reporting does not require your data to be at its lowest level of granularity, consider  aggregating your data to the required level of granularity to reduce the amount of data being pulled in and the subsequent workloads that would otherwise be necessary to aggregate the data once in Power  BI Desktop.  

4. Hide the unnecessary relationship fields – Once you have built your data model in Power BI, hiding the  field in your relationships that you don’t want users to have access to will help with confusion by your  end users.  

5. Assume referential integrity in relationships – Choosing the “Assume Referential Integrity” option will  streamline the queries sent between tables in a relationship and will result in quicker render times. This  tactic can be used in all connection modes but is especially impactful in DirectQuery mode.  

6. Avoid bi-directional filtering – Enabling bi-directional filtering in your relationships not only makes your  data model more complex, but it can also cause undesired end-user behavior.

Set Up Azure Ad SSO To Snowflake

If you use the native Snowflake connector in Power BI Desktop, you will be able to set up Azure AD SSO with Snowflake. Setting up Azure AD SSO allows you to use the security rules that are set up in Snowflake – otherwise, you would rely strictly on controlling access via Power BI.

Setting up Azure AD SSO between Power BI and Snowflake will require actions in three different areas:

Azure portal
Azure Portal 

In Azure Portal you’ll need to set up Snowflake in your Applications menu in order to get the security token to then give Snowflake

snowflake-admin
Snowflake Admin

In Snowflake you’ll need to use the Azure token to create a mapping of Azure AD to Snowflake usernames. This script is provided in Snowflake’s documentation

Power BI Tenant
Power BI Tenant

In the Power BI tenant settings you’ll need to enable Snowflake SSO as well as set up any necessary data sources using gateways

As shown in Snowflake’s documentation on this topic, you can see how the Azure AD process passes your Microsoft credentials to Snowflake to allow for authentication.

How Snowflake and Power BI Communicate

The Azure AD to Snowflake Process

  1. Azure AD creates AAD token (expires every 90 days)
  2. Token is passed to Power BI Service
  3. When authenticating with Microsoft account in Power BI, the token is passed to Snowflake
  4. The mapping in Snowflake authenticates username based on token
  5. Authentication is granted and data is passed back to Azure AD through Power BI
  6. User is then able to query Snowflake based on Azure AD credentials

Use The Right Azure VM For Your Gateways

Many organizations use On-Premises Gateways for cloud-to-cloud connections for the added layer of security. If you’re using On-Premises Gateways to connect Power BI to your Snowflake data sources, make sure to get the right Azure VMs to host the gateways. Now you might be asking, why would someone use an On-Premises Gateway with a cloud-to-cloud connection?

There are two scenarios that we see that call for this configuration:

  1. Connecting to Azure Private Link – in this case, it’s required to traverse a gateway.
  2. Gateways are being used as another layer of security between Snowflake or cloud data source and Power BI users.
  3.  
Once you’ve identified the need for a gateway, you will need to make sure that you host the gateway on a shared machine, not on an individual’s computer. If you haven’t done this before, we have resources on how to select the right virtual machine for your gateways.

Different VMs are meant for different types of jobs. Make sure to find the right one for your use case. Here are some considerations for choosing the right VM:

  1. Decide whether your main connection method will be Import or DirectQuery – there are VMs specific to either use case.
  2. If using Import, identify the largest dataset you’ll need to load and find a VM with enough RAM.
  3. Use the Azure pricing sheet to find out which VMs match your budget.

There are also different types of Azure Virtual Machines that you can choose from:

  1. General Purpose – As the name suggests, these VMs are a jack of all trades that are meant for smaller workloads.
  2. Compute Optimized – Meant for lots of computing which can be better for DirectQuery workloads.
  3. Memory Optimized – Proper VM for large caches which means that it will work best with lots of Import mode use cases.
  4. Storage Optimized – Best for leveraging large data warehousing jobs.
  5. GPU – Very specialized use cases that include heavy graphics and video editing.
  6. High Performance Compute – These are the most powerful (and expensive) VMs offered and designed for real-time use cases.

Ensure Data Center Distance Is Minimized

When deploying Power BI Premium Capacity, you can select which region you want the data center for your dedicated capacity to be located in. Minimizing the distance between your Snowflake and Power BI data centers can reduce latency and improve query performance.

If you are running both of them on Azure, selecting the same data center will result in the best performance for your Snowflake to Power BI connection.

ENSURE DATA CENTER DISTANCE

Use a Dates Table To Get Around DAX Functionality Loss

Sales YTD

Sales YTD = CALCULATE( SUM([Sales]), [YTD Flag]=”TRUE” )

As discussed earlier in this guide, using DirectQuery in Power BI will result in lost DAX functionality, the biggest of these functions being Time Intelligence functions. We recommend that you use a dates table to get around this loss of functionality. 

Having this dates table widely available to developers will ensure that the necessary fields are there to compensate for the loss of DAX. 

Increase Concurrent Query Limit For Your Data Model

When using DirectQuery for your model, the default amount of concurrent queries that can run to the source is 10, but you can increase that to 30 in your Settings in Power BI Desktop.

In your Power BI Desktop report – go to File -> Options and settings -> Options -> Published dataset settings.

*Note: This does increase concurrent queries to the source system if your Snowflake instance is not appropriately sized for the queries being generated. If this is your case, this may not help your performance.

If you are using an On- PremisesGateway, you can adjust the number of mashup engine containers running simultaneously on the gateway. These containers are the functions running your queries, and are the same components that run in Power BI Service and Desktop.

The more mashup containers, the more queries you can concurrently run through the gateway. Finding a good balance here is critical as these mashup containers require resources and will affect the VM’s performance.

Query Limit

*Note: This does increase concurrent queries to the source system if your Snowflake instance is not appropriately sized for the queries being generated. If this is your case, this may not help your performance.

If you are using an On-Premises Gateway, you can adjust the number of mashup engine containers running simultaneously on the gateway. These containers are the functions running your queries, and are the same components that run in Power BI Service and Desktop. 

The more mashup containers, the more queries you can concurrently run through the gateway. Finding a good balance here is critical as these mashup containers require resources and will affect the VM’s performance.

Here are the steps on how to change this setting:

  1. Open the gateway configuration file on the machine.
  2. Find the “MashipDefaultPoolContainerMaxCount” setting and change it from the default 0 to the desired value.
  3. Save and restart the gateway.

Other Considerations

In addition to the other strategies listed above, here are some more general tips for making sure that your Snowflake and Power BI experience is optimized.

Snowflake

  • Specific warehouse for Power BI reporting – sized appropriately
  • Ensure that results caching is taking place

Power BI

  • Write efficient DAX
  • Limit visuals on the page
  • Use filters instead of slicers
  • Use Apply buttons on your filters/slicers

If you are using Snowflake for your data warehousing needs and are now looking for a BI platform, make sure that you consider Power BI. The Microsoft Power Platform has been the Gartner Magic Quadrant Business Intelligence leader over the past several years.

Power BI’s cost, easy learning curve, and continued investment from Microsoft in new and innovative features make it a popular choice for enterprise organizations.

Cost

From a pricing standpoint, Power BI Pro licenses (used by developers to publish content) start at $9.99 per month per user, while Premium Capacity starts at $4,995 a month but depends on the number of developers and deployment capacity.

Organizations at the E5 level of Office get Pro licenses for free and Premium Per User (PPU) licenses for $10 a month per user (compared to $20 for non-E5). Because of its scale and integration with the other Microsoft tools, pricing on Power BI is hard to beat.

Learn more about Power BI Premium Pricing.

The Power Platform Stack

While there is not currently a direct connector for Snowflake within Power Apps or Power Automate, data flows can be configured within Dataverse using Azure Analysis Services to read from and write to Snowflake tables.

This allows customers to take advantage of the Power Platform tools to analyze data, build solutions, automate processes, and even build Microsoft Power Virtual Agents to handle requests without intervention. These additional tools in the Power Platform open up more possible consumption of Snowflake data than there would be otherwise.

Snowflake in Conjunction With Tabular Models

Reading and Writing to Snowflake Data Warehouse from Azure Databricks using Azure Data Factory

Other options include adding a custom connector to reach Snowflake by using CData’s Snowflake driver, although this will introduce additional costs.

https://www.cdata.com/drivers/snowflake/

Microsoft Integration

One of the most attractive features for organizations moving to the Power Platform is the integration with the Office suite. Moving from Excel to Power BI is relatively easy considering that they are both Microsoft products and the syntax is very similar compared to other BI tools. 

As more organizations move away from static reporting tools like Excel, Power BI is seen as a natural next step. For a very similar situation that we have worked on, check out the use case at the link below to find more information about how we transformed a Fortune 500 financial company’s reporting by transitioning them from Excel to using Snowflake and Power BI.

In this engagement, phData helped transition reporting that had been traditionally done in Excel to Power BI reports that were connected to Snowflake. This enabled their data engineering teams to create fast and efficient data pipelines that helped feed Power BI reports and eliminated hours of manual work to update Excel and CSV files. 

Monthly Updates

Microsoft shows continual investment in the product and its user base by updating Power BI monthly. These updates include new features, improvements to existing ones (as requested by users), and additional data source connections.

phData’s Power Platform team stays up to date with the latest feature releases and additions to the Microsoft roadmap. See below for some of our blogs on recent Power BI release:

phData is one of Snowflake’s most trusted partners in North America – so much so that in 2022 phData was awarded Snowflake’s Americas Partner of the Year.  

We are so incredibly grateful to be able to partner with an amazing company like Snowflake and help our common customers achieve their data goals. phData is committed to Snowflake and we firmly believe that Snowflake’s brightest days are still ahead. 

poty-bllog

During the past couple of years, phData has seen tremendous growth in its Power Platform Analytics business, which paved the way for us to be a Microsoft Gold Partner in 2022. 

Our skilled team of Power Platform experts helps clients to implement, enable, enhance, and scale their current Power BI analytics infrastructure. If you are just starting out with Power BI or contemplating a Power BI deployment at your organization, make sure to read our Implementing Power BI Premium Capacity whitepaper

With our deep commitment to Snowflake and our team of Power Platform experts, phData is uniquely positioned to help you and your organization take on the challenges of deploying Snowflake and Power BI to work in unison. 

Client Success Stories with Snowflake + Power BI

Fast-Casual Restaurant Chain Revolutionized Its Enterprise Reporting

A fast-casual chain of restaurants (one of phData’s largest analytics clients) uses both Snowflake and Power BI to enable their enterprise reporting. These reports are sent from centralized BI to thousands of their restaurants on a daily basis informing them of metrics about their productivity, profit margin, and delivery effectiveness. 

While the data models themselves are still housed within Power BI Service and maintained via Tabular Editor, all of the data pipelines and view creation still flow through Snowflake. The customer leverages the strengths of both platforms by allowing Power BI to support its enterprise reporting, while also letting Snowflake do all of the heavy lifting on the data model.  

CPG Enterprise Successfully Implements Power BI Premium and Improves Reporting

A major Consumer Packaged Goods (CPG) company, eager to dive deeper into Power BI Premium and enhance its reporting capabilities, faced a challenging task. The goal was not only to set up the necessary infrastructure to support comprehensive enterprise reporting, but also to seamlessly integrate Power BI with their existing Snowflake Data Cloud environment, enabling single sign-on (SSO) for dashboard users. Recognizing the complexity of this task, they turned to phData for a solution.

Understanding the challenges and objectives of the client, phData devised a tailored strategy. They closely collaborated with the client to deploy Power BI Desktop and Power BI Service across four crucial business groups dispersed over two continents. 

As a result of phData’s solution, the client was not just equipped with a more potent Power BI Premium Capacity, but also gained the ability to view vital dashboards and construct their own ad hoc reports. Through this collaboration, the CPG company was able to take a significant step forward, enhancing its reporting and data capabilities across its multinational operations.

 

Read the Full Case Study >>

Conclusion

In Closing

The future is bright for both Snowflake and Power BI. After a record-breaking IPO in 2020, Snowflake continues to see rapid growth and adoption in all markets and verticals while competitors like Redshift (AWS), Synapse (Azure), and Databricks are not seeing the same widespread adoption.

Microsoft has been leading the Gartner Magic Quadrant for BI tools for a few years running and there’s no reason to see that stopping any time soon. Microsoft continues to invest tons of money into the platform and releases new features every month.

By leveraging Power BI and Snowflake, our customers can accelerate their ability to generate insights, create powerful visualizations, and drive business decisions powered by Snowflake. And while there are nuances to using these two tools, Power BI is the data visualization platform of choice for many of our customers, and our potential customers who use Snowflake

Looking to accelerate your data journey with Snowflake and Power BI? phData would love to help guide you to success. Reach out today for any questions, advice, and guidance.

Data Coach is our premium analytics training program with one-on-one coaching from renowned experts.

Accelerate and automate your data projects with the phData Toolkit