Case Study

Medical Device Manufacturer Makes AI a Reality with Dataiku & phData

The Customer’s Challenge

Before a global medical device manufacturer could declare its new machine learning platform implementation a success, it needed help meeting the compliance requirements of a strict security policy.

The company also needed help navigating the AWS landscape and implementing infrastructure according to best practices.

phData’s Solution

Working in concert with internal IT teams and the Dataiku Field Engineering team, phData deployed Dataiku on the customer’s cloud infrastructure in full compliance with corporate cloud and security standards. We helped ensure that AWS services were implemented with best practices and high availability, including EKS with autoscaling, aggregated logging, and on-demand GPUs.

The Full Story

A Fortune 500 medical device manufacturer was coming fresh off a successful Machine Learning PoV project on Dataiku. As they looked toward the next phase, their IT architecture and strategy team wanted to ensure a robust production environment was rolled out for use across the enterprise. To accomplish this, they needed a partner with deep expertise in implementing and operationalizing data platforms.

Why phData?

phData’s commitment to successful project outcomes was already well known to the customer. In past engagements, they had seen the ingenuity and dedication phData pours into making clients successful.

phData’s deep knowledge of Dataiku, staff of passionate experts, and proven track record of navigating corporate IT standards, procedures, and processes were crucial considerations. Additionally, our expertise with the customer’s modern data stack (AWS and the Snowflake Data Cloud) made phData the clear choice.

Implementing an Enterprise Platform for Machine Learning

To allow for access to external data sources, connectivity was established to Snowflake, Oracle, SAP HANA, and MySQL. With the customer moving toward standardizing on Snowflake as their primary data platform, particular emphasis was placed on setting up secure connectivity and allowing users to use the roles and access patterns they were accustomed to. 

To accomplish this, an OAuth integration was created using Azure AD as the identity provider (IdP), which allowed users to authenticate as themselves to Snowflake from within Dataiku.
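To illustrate what per-user OAuth means at the driver level, here is a minimal Python sketch. It is not the Dataiku connection configuration used in this engagement, which is set up through Dataiku’s own connection settings. It assumes the snowflake-connector-python package and an access token that Azure AD has already issued; the helper names snowflake_oauth_params and connect_as_user are hypothetical.

```python
from typing import Any, Dict, Optional


def snowflake_oauth_params(
    account: str,
    access_token: str,
    role: Optional[str] = None,
    warehouse: Optional[str] = None,
) -> Dict[str, Any]:
    """Build connection arguments so a session runs as the token's owner.

    The token is assumed to come from the IdP (Azure AD in this case study);
    acquiring it is out of scope for this sketch.
    """
    if not access_token:
        raise ValueError("an OAuth access token is required")
    params: Dict[str, Any] = {
        "account": account,
        "authenticator": "oauth",  # tells the connector to use the token
        "token": access_token,
    }
    # Optional settings let users keep the role and warehouse they already use.
    if role:
        params["role"] = role
    if warehouse:
        params["warehouse"] = warehouse
    return params


def connect_as_user(**params: Any):
    """Open a Snowflake session with the parameters built above."""
    # Imported lazily so the helper above works without the package installed.
    import snowflake.connector  # third-party: snowflake-connector-python

    return snowflake.connector.connect(**params)
```

Because each session is opened with the user’s own token, Snowflake applies that user’s existing roles and grants rather than those of a shared service account.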

To enable quick onboarding of new users and projects, we coordinated with the IT service management team to refine and automate access to the Dataiku platform. This included the creation of new access management flows within their ITSM tool (ServiceNow) with streamlined approvals and membership to Active Directory groups (which were configured in Dataiku to provide license assignment and platform and project access).

Dataiku does not currently provide native monitoring tools. The dkumonitor open source project can often be a viable option. However, because the customer’s security policy does not allow the use of older Python versions, an alternative was needed.

The customer did not have off-the-shelf monitoring tools available (such as Splunk or Datadog), nor the budget for an AWS-native solution (CloudWatch Synthetics). To ensure a stable, highly available platform, we created a custom monitoring and alerting solution for Dataiku services and APIs.
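The case study does not describe how the custom solution was built. As a rough idea of the approach, a health check of this kind can be sketched in a few lines of standard-library Python. Everything here is an assumption for illustration: the CheckResult type, the function names, and whatever endpoint URLs are passed in are hypothetical, and a real deployment would route alerts to email or a chat channel rather than print.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    name: str
    url: str
    healthy: bool
    detail: str


def http_fetch(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses raise HTTPError; report the code instead.
        return exc.code


def run_checks(
    endpoints: Dict[str, str],
    fetch: Callable[[str], int] = http_fetch,
) -> List[CheckResult]:
    """Probe each named endpoint and record whether it answered with 2xx."""
    results = []
    for name, url in endpoints.items():
        try:
            status = fetch(url)
            healthy = 200 <= status < 300
            detail = f"HTTP {status}"
        except Exception as exc:  # timeout, DNS failure, connection refused
            healthy, detail = False, f"{type(exc).__name__}: {exc}"
        results.append(CheckResult(name, url, healthy, detail))
    return results


def alert_on_failures(
    results: List[CheckResult],
    notify: Callable[[str], None] = print,
) -> int:
    """Send one alert per unhealthy endpoint; return the failure count."""
    failures = [r for r in results if not r.healthy]
    for r in failures:
        notify(f"ALERT {r.name} ({r.url}) unhealthy: {r.detail}")
    return len(failures)
```

A scheduler such as cron could call run_checks with a dictionary of service names and URLs every few minutes, then pass the results to alert_on_failures. Injecting fetch and notify as parameters keeps the checks testable without network access.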

To conform to stringent security standards, we navigated the requirements of numerous compliance tools required by the InfoSec team. We provided implementation and remediation services for the following:

  • Rapid7 (OS vulnerability scans)
  • Dome9 (cloud vulnerability scans)
  • Contrast (web application firewall)
  • Twistlock (EKS security)
  • AWS WAF
  • Artifactory for Docker images (allowed for image scans)

Lastly, we shared guiding principles, tips, and some recommended best practices for early-adopter data science teams that were onboarding onto the platform. This allowed them to get off the ground and succeed with their first data science projects on Dataiku.

Results

Users can now be onboarded onto the Dataiku platform with just a few clicks in their self-service catalog and their manager’s approval. This allows the IT team to automatically track costs and automate chargeback to appropriate business units.

Because the process is so simple, many users have been able to access Dataiku and enterprise data sources. These users are hard at work developing new AI-based insights with machine learning on Dataiku.

Take the next step with phData.

Looking to implement machine learning at your organization? Reach out to phData today to learn how we can help! 