What is Data Engineering? Everything You Need to Know in 2022

A picture with data all around with the phData logo in the center

It’s easy to overlook the amount of data that’s being generated every day, from your smartphone and your Zoom calls to your Wi-Fi-connected dishwasher. It is estimated that the world will have created and stored 200 zettabytes of data by the year 2025. While […]

DataOps: What Is It, Core Principles, and Tools For Implementation

When building a successful company, it’s critical to have a strategy around how you build and scale your business from a technology and data perspective. Your business likely has competitors that are trying to beat you to market, technology is constantly evolving, and so are […]

What Should I Look For in a Data Catalog Tool?

In our previous blog in this series, we spent a lot of time exploring why a data catalog is valuable and who you might need to support it. With that background information in mind, we’re ready to take a look at some actual tools and properly uncover what’s the best data catalog for your business. […]

What Team Supports Your Data Catalog Best?

A diagram displaying an example org structure of people to support the data catalog tool.

Welcome to part two of our trilogy on data catalogs. If you missed our first blog on what a data catalog is, be sure to check it out! In this blog, we’ll explore what the ideal team to support your data catalog looks like. Who Are the Users of a Data Catalog? A tool is […]

What’s a Data Catalog and How to Choose the Right One

Your business might be moving to the cloud, might have just completed its move, or might have been established there for a little while, and you are likely wondering, “What data catalog tool is best for me?” The short answer is…it depends. There are a lot of options available, and choosing the right data catalog for your business will […]

What is Data Modeling and How Do I Choose the Right One?

Building a successful data management solution requires making several correct choices in terms of technology, architecture, and design. Modern cloud-based data platforms like the Snowflake Data Cloud can address most of your technological needs, but you still need to ensure that the design and structure of the data complement the technology you […]

How to Build a Modern Data Platform Utilizing Data Vault

When looking to build out a new data lake, one of the most important factors is to establish the warehousing architecture that will be used as the foundation for the data platform. While there are several traditional methodologies to consider when establishing a new data lake (from Inmon and Kimball, for example), one alternative presents […]

How Do I Use StreamSets Test Framework?

StreamSets Test Framework (STF) is a set of Python tools and libraries that enables developers to write integration tests for StreamSets Data Collector, Control Hub, Data Protector, and Transformer. This unique test framework allows you to script tests for pipeline-level functionality, pipeline upgrades, functionality of individual stages, and much more according to the requirements. But the […]

How does Kubernetes Horizontal Pod Autoscaling Work with Custom Metrics?

Kubernetes is a great way to deploy cloud-native applications in the cloud or on-premises. One of the biggest advantages of Kubernetes Pod Autoscaling is that it automatically scales your application based on demand. This can be extremely helpful when the load an application encounters is variable. Kubernetes has three different types of scaling: Cluster scaling, Vertical […]

How to Know if Your Data Engineering Projects Will be Successful

Data and analytics platform diagram for successful data engineering projects

To make sure data engineering and analytics projects are successful, not only do you need to pick the right technology and have the right people; you also must have the discipline to apply software engineering best practices. What sort of practices am I talking about? Make sure your requirements are clear and communicated to all […]