What is AutoML and is it Right For You?

PyCaret includes built-in visualizations to help evaluate models. Generating these visualizations requires a single function call rather than complex custom code.

Introduction: Automated Machine Learning (AutoML) tools have become wildly popular in the field of data science because they can automate some of the most repetitive tasks across machine learning (ML) projects. These tools are applicable to most ML projects and applications, and can be used in virtually any industry to rapidly develop ML models. By automating […]

What is MLOps and Why Do I Need It?

What is MLOps

There’s a reason why report after report shows explosive growth for machine learning (ML). It’s the key to unlocking the sort of predictive intelligence and innovative automation capabilities you need to be competitive in 2021. However, getting consistent ML results requires a consistent ML approach—hence the need for MLOps. At phData, we think of MLOps […]

Deploying an Emotion Classification System for Videos with Deep Learning

This shows us the emotion score of each frame of each video.

This is a follow-up to our Building an Emotion Classification System for Videos with Deep Learning post. We all know that adopting artificial intelligence at enterprise scale can be challenging and that the challenges don’t stop once a machine learning (ML) model has been developed and trained. Deploying a model to production and integrating it […]

How to Identify PII in Text Fields and Redact It

The DSS Flow to help identify PII data in text fields.

As our customers move data into the cloud, they commonly face the challenge of keeping that data protected. Most applications are designed to store sensitive data in designated fields. Despite interface designers’ best efforts, however, sensitive data is inevitably stored in unintended fields, often as free-form text in a generalized “notes” field. So, we […]

Bayesian Hyperparameter Optimization with MLflow

The number of boosting iterations proved to be the most significant hyperparameter in our search.

Bayesian hyperparameter optimization is a bread-and-butter task for data scientists and machine-learning engineers; basically, every model-development project requires it. Hyperparameters are the parameters (variables) of machine-learning models that are not learned from data, but instead set explicitly prior to training; think of them as knobs that need to be fiddled with in order to […]

Techniques for Labeling Data in Machine Learning

An Introduction to Data Labeling: Imagine you want to start an agribusiness and your goal is to maximize profits by growing abundant, good-quality crops. However, growing large amounts of crops is limited by the resources you have, such as labor and land. And the quality of crops depends on the quality of the […]

Building an Emotion Classification System for Videos with Deep Learning

Overview of the Dataiku DSS flow

Introduction to Deep Learning with Dataiku: At phData, we have seen the value that deep learning brings to organizations that can successfully harness it. From reducing diagnostic errors in radiology to more accurately detecting manufacturing defects, we’ve certainly seen our share of wins, but not without pain. Most organizations will fail to adopt deep learning […]

The Impact of Coronavirus on Machine Learning Models

As the historic coronavirus pandemic continues to unfold, governments, businesses, and individuals all around the world are making unprecedented changes. Most countries have implemented policies like mandated lockdowns and universal basic income that might have been politically unthinkable only a few months ago. Meanwhile, to confront emerging economic and health challenges, citizens have had to […]

To Big Data, And Beyond: Why Organizations Need Machine Learning Engineers

Machine Learning Engineering

Introduction to Machine Learning Engineering: Over the past several years, as data acquisition and storage capabilities have exploded within the information technology landscape, so too has the realization that this historical data can be modeled to provide future insights. On the heels of this big data revolution, data science has emerged as one of the […]

Beyond Data Science: Building Culture and Infrastructure For Data-Driven Transformation

Data Science

Introduction: I’m a data scientist. I’ve seen data scientists uncover important business insights and create artificial intelligence that can transform entire industries. But despite the fact that data science teams can readily produce groundbreaking results, most companies struggle to incorporate those results into business processes and applications. It is all too common for data-science teams […]