Wouldn’t it be wonderful if all of the data you needed was in one place? An enterprise data warehouse that is immaculately curated: perfectly complete and wonderfully accessible. It’s the data analyst equivalent to a sunset on the beach while sipping on a mojito.
This has been the dream since the 1970s, pioneered by Teradata and IBM in the 1980s, and later popularized by Bill Inmon and Ralph Kimball in the 1990s. Heck, the first interview question I was ever asked was, “Explain the difference between the Inmon and Kimball models.”
So, fifty years later, we all have perfectly curated enterprise data warehouses and reporting is going perfectly. End of blog. Everything’s great. Nothing to do.
Well, no. Very few companies (if any) can claim to have an all-encompassing data warehouse. It’s still a worthy aspiration, and I feel strongly that we should be working toward it. However, in almost every situation it isn’t fully achievable. There is always more data to ingest or share, often in disparate environments. Because an all-encompassing data warehouse remains an aspiration, we need to figure out practical solutions for making data more accessible.
Don’t fear though! The Snowflake Data Cloud has a solution!
Snowflake is a great example of a best-in-class data warehouse in itself. It evolved what it meant to be a data warehouse and was designed with the advantages of cloud computing in mind. If you can centralize all data for reporting into a single Snowflake account, more power to you. However, Snowflake recognizes that this isn’t always an achievable endeavor.
Enter Snowflake Secure Data Sharing. This core feature of Snowflake allows for data collaboration across environments, whether those be internal disparate data products or separate businesses. Best of all, it’s so easy to use.
In this blog, we’ll explore how to develop a use case for leveraging the Snowflake Secure Data Sharing feature so that your business can better utilize data sharing to drive real business value, as well as the strategy to set it up. But first, let’s talk about Snowflake Secure Data Sharing itself.
Simply put, Snowflake Secure Data Sharing is a mechanism to share tables or views from a provider to a consumer. The data provider creates the share and it is imported by the data consumer.
Now, you aren’t sharing the data in the literal sense. No data is copied. The consumers are read-only users for the shared data sets. Since none of the data moves, it is remarkably easy to set up a data share.
Setting this up may be simple, but in practice, this is powerful. Data sharing via Snowflake can enable the dissemination of data throughout a complex organization or group of organizations.
Secure is the keyword here. First, the provider decides what data is shared and only that data is shared. Second, the provider explicitly decides who the consumers are.
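To make the provider side concrete, here is a minimal sketch that builds the SQL a provider might run to create a share, choose what goes into it, and name its consumers. The share, database, schema, table, and account names are all hypothetical, and the helper only assembles statement text; it does not connect to Snowflake.

```python
def provider_share_statements(share, database, schema, tables, consumer_accounts):
    """Build provider-side SQL for one share. All names are hypothetical."""
    statements = [
        # Create the share object itself.
        f"CREATE SHARE {share};",
        # The share needs usage on the containing database and schema.
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share};",
    ]
    # Only the tables listed here become visible to consumers.
    for table in tables:
        statements.append(
            f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share};"
        )
    # The provider explicitly names which accounts may consume the share.
    statements.append(
        f"ALTER SHARE {share} ADD ACCOUNTS = {', '.join(consumer_accounts)};"
    )
    return statements


statements = provider_share_statements(
    share="sales_share",
    database="retail_db",
    schema="reporting",
    tables=["daily_sales", "store_inventory"],
    consumer_accounts=["acme_org.manufacturer_acct"],
)
for statement in statements:
    print(statement)
```

Both halves of "secure" show up here: the `GRANT` statements control what is shared, and the `ALTER SHARE ... ADD ACCOUNTS` statement controls who receives it.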
Disclaimer – The Snowflake Secure Data Sharing use cases that we will talk about below assume that both parties (the providers and consumers) have Snowflake. This is reasonably common due to Snowflake’s current (and expanding) popularity. If only one party has Snowflake, you can still share data with a reader account, but we won’t cover that in this post.
Though not an exhaustive list, these are some of the most common use cases we are seeing with data sharing in the market.
Manufacturers and vendors have complicated symbiotic relationships. Manufacturers supply the products that vendors distribute and sell to consumers. However, the selling of those products to consumers is often completely controlled by the vendors. From a data perspective, this means the vendor owns and controls most of the consumer-side purchasing data.
Manufacturers want to know what is happening with their products, but it is easy to lose sight of what is actually happening. This data is as valuable to the manufacturer as it is to the vendor. Manufacturers want to know all about who is buying their products, their demographics, and any other relevant sales metrics.
The manufacturers rely on the vendors to provide this data back to them so they can make better decisions. The vendors are essentially the gatekeepers of this data, and it is on them as to how they want to provide this data back to the manufacturers.
In our experience here at phData, companies share data in some not-so-great ways. The biggest faux pas is sharing files via email. First, it’s not secure. Files can be easily shared or changed without governance. Second, it heavily relies on individuals to manually generate and send these files. Timing becomes an issue. Humans are busy with competing priorities. Quality is often a concern here too. Creating files manually is prone to mistakes.
Companies may also be sharing these files via SFTP sites. This is arguably much better than email (security is miles better). However, the second issue with email remains: the files are time-consuming to generate and prone to mistakes.
Now, to be fair, both the email and SFTP options could be automated. In our experience, particularly if this process is reliant on business users, these processes tend to not be completely automated. Even when automation is in scope, there is a sizable effort there just to get a single file from A to B.
In addition, the biggest problem is on the manufacturer’s side: the data receiver. The manufacturer needs to ingest these files into its systems, manually or via EDI (electronic data interchange). That can be a large lift. Anything manual with data tends to be time-consuming and prone to error. EDI, in particular, has the disadvantage of needing specialized systems for transmission (an EDI provider), as well as a way to parse and ingest the data files. Neither is an ideal solution.
Both parties, manufacturer and vendor, are spending resources, time, and people just to provide downstream data back upstream. Large manufacturers have dedicated teams, sometimes multiple teams, who focus just on this relationship. Luckily, Snowflake makes this so much easier by using data sharing.
If the vendor (provider) already has the data in Snowflake, the sharing to the manufacturer’s (consumer) Snowflake environment is exceptionally easy. Just a couple of quick administrative steps to get it going. Comparatively much quicker than setting a flow to automate an entire data sharing process.
Also, it is secure. The data never leaves the provider’s instance. You are providing read-only access to the consumer. No data is copied. It also addresses timing concerns. As soon as the data is ingested into the vendor’s Snowflake environment, it is available for the consumer. No extra steps need to be taken.
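On the consumer side, the "couple of quick administrative steps" could look roughly like the sketch below: create a read-only database from the share, then let a role query it. The database, account, share, and role names are hypothetical, and again the helper only builds the statement text.

```python
def consumer_import_statements(local_database, provider_account, share, role):
    """Build consumer-side SQL for importing a share. All names are hypothetical."""
    return [
        # Create a read-only database that points at the provider's share.
        # No data is copied into the consumer account.
        f"CREATE DATABASE {local_database} FROM SHARE {provider_account}.{share};",
        # Let a consumer-side role query the shared objects.
        f"GRANT IMPORTED PRIVILEGES ON DATABASE {local_database} TO ROLE {role};",
    ]


consumer_statements = consumer_import_statements(
    local_database="vendor_sales",
    provider_account="vendor_org.vendor_acct",
    share="sales_share",
    role="analyst",
)
for statement in consumer_statements:
    print(statement)
```

Once these run, queries against `vendor_sales` read the provider's data as it currently stands, which is why no extra refresh step is needed.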
Snowflake makes business-to-business data sharing easy to implement and maintenance-free: data is shared automatically as soon as it is available, with no manual transfer by the consumer’s or provider’s people.
Utilizing Snowflake Secure Data Sharing can make it quick, easy, and secure for businesses with a symbiotic relationship to share the data they need to elevate both businesses. This will strengthen the relationship between partners and suppliers.
There are three main data/analytics operating models that we see in organizations: centralized, decentralized, and federated (sometimes referred to as hybrid). Let’s go through some quick definitions:
We generally recommend a federated model here at phData. Snowflake Secure Data Sharing can be a powerful tool for companies that are decentralized or, ideally, for enterprises moving from a decentralized model to a federated one.
Companies with a decentralized analytics model have a couple of issues, but most distill down to the issue that they are unable to share or consume data from other groups within the company. This can manifest in one of two ways.
First, there is duplication of work. For instance, Marketing and Sales have large overlaps in reporting needs. The second is that data is enhanced significantly with data from other domains. The Supply Chain domain will benefit substantially from data from both HR and Finance.
Snowflake Secure Data Sharing makes it easy to share this data within an organization. HR can make its data mart accessible to the other business domains. So can Finance. If Marketing and Sales share their domains, they can see where their work overlaps and remove possible duplication of work within the organization through sharing resources, rather than recreating them.
Even more so than in the decentralized model, Snowflake Secure Data Sharing helps support the adoption of a federated analytics model within an organization. Yes, ideally there is a centralized data warehouse that supports critical business use cases across the organization.
Snowflake is an excellent choice for this warehouse. However, even with how easy it is to implement and use Snowflake, it will take a while to compile the information needed for supporting all groups within the organization.
This is where Snowflake Secure Data Sharing can be a huge advantage for an organization. Use data shares while you, as the centralized group supporting the organization, explore the data that exists within the organization. Also, use data shares to “keep the lights on” so data stays accessible while you work on centralizing it in one instance for the organization. You may even find that some of the data works well as a share and never needs to go through the effort of centralization.
Snowflake Secure Data Sharing makes collaboration and sharing within the organization a simple task, while also setting the background for ideal analytics operating models like the federated model. At the end of the day, this should help your organization make better business decisions.
If you have been part of a company acquisition or merger, you know how difficult it can be to integrate technology. Each company has its own set of infrastructure for every single aspect, from HR systems to CRMs to data warehouses.
Now luckily, assuming that both parties are using Snowflake, this is a very easy item to address. It is almost certain that the business use cases of these combined companies will move quicker than the ability to integrate everything into a single technology platform. Even with the best technologies and plans, this just takes time and resources.
Data sharing is a great stop-gap solution. In a matter of moments, you can be sharing and analyzing data between multiple entities. Hopefully, this will generate the new opportunities both companies were hoping for when going into the merger.
Data sharing is an excellent way to facilitate collaboration around data, especially between merging entities.
Data monetization is the shining goal of many organizations right now. How can we turn the data we are collecting into a product in itself?
Now you don’t need to be a Facebook to potentially see a benefit from data monetization. Many companies collect data that can benefit other companies, outside of the standard manufacturer-to-vendor relationship we talked about earlier. Essentially, this means selling it to third parties.
This data could take any number of shapes, from geolocation data to demographic data, just to name a few. But what is key here is that your company has and owns data that could be beneficial to the operations of unrelated companies.
So, how do you share it? As more and more companies invest in data monetization, the marketplace will become more crowded. Companies will need to differentiate themselves, and one way is to invest in optimal ways to share the data to the consuming organizations.
Nobody wants to receive flat files of huge amounts of data. Heck, I remember being sent an external hard drive by a vendor before. Nowadays, why not keep it simple? Share with Snowflake.
First, a definition is in order. You might not be familiar with the term “clean room” in the context of data. Essentially, a “clean room” is a protected, de-identified, and aggregated data set. It is generally meant to protect private or proprietary data through obfuscation.
There are many reasons why a company may want to create a data clean room, but a common one would be a shared go-to-market strategy. Two companies may want to partner to target customers together, but may not want to share sensitive data.
Snowflake Secure Data Sharing is an ideal solution for this use case. Both companies, in this scenario, can be data providers and data consumers. The data sets shared can be targeted to be clean, so only the appropriate data is shared. However, the data that is shared is readily accessible, helping both companies strategically co-target customers together.
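As a rough illustration of what "de-identified and aggregated" can mean in practice, the sketch below rolls customer-level rows up to a segment level and suppresses any segment with too few customers. Small-group suppression is one common approach, not something this post prescribes, and the field names, sample rows, and `min_group_size` threshold are assumptions chosen for the example.

```python
from collections import defaultdict


def clean_room_summary(rows, group_key, value_key, min_group_size=3):
    """Aggregate rows by group and drop groups that are too small to share safely."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])

    summary = {}
    for key, values in groups.items():
        # Suppress small groups so individual customers cannot be singled out.
        if len(values) >= min_group_size:
            summary[key] = {"customers": len(values), "total_spend": sum(values)}
    return summary


# Hypothetical customer-level data that should never leave the provider as-is.
rows = [
    {"customer_id": "c1", "region": "Midwest", "spend": 120},
    {"customer_id": "c2", "region": "Midwest", "spend": 80},
    {"customer_id": "c3", "region": "Midwest", "spend": 100},
    {"customer_id": "c4", "region": "West", "spend": 300},
]
summary = clean_room_summary(rows, "region", "spend")
print(summary)
```

Only the aggregated result would be placed in the share: customer identifiers never appear in it, and the single-customer "West" segment is dropped entirely.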
Guess what, we are pretty close to the enterprise data warehouse dream that started 50 years ago. Sure, it might not be on a single infrastructure instance that was originally envisioned, but Snowflake Secure Data Sharing has made data more accessible than ever.
We are in a better position than we have ever been in to make this dream a reality. A centralized instance with all of the data a company needs, accessible via one tool, that can be consumed by business units throughout an enterprise. It’s pretty cool.
Now that we have talked about common reasons why organizations might adopt Snowflake’s Secure Data Sharing, let’s talk about the next steps. How do you, as an organization, create a solid plan to implement data sharing?
That’s where we at phData come in. We have robust experience accelerating data journeys and designing modern data platforms. Data sharing can be a large component of that plan.
We call this a capability assessment. How ready is your organization for adopting and applying a specific technology or concept? After all, you already have Snowflake. It is about enabling your organization to use it effectively.
Within a short engagement, generally about a month, we will provide the following, all focused on enabling data sharing across your organization:
This will help you get started right away and in the right way. Get in touch with one of our data strategy experts today for a free strategy session! We are always happy to answer any questions you might have.