The phData team architected an Internet of Things (IoT) solution to deliver both real-time and historical data into production. They used StreamSets and Kudu to handle the sensor data coming in from the field, built custom software to ingest data from WITSML, and used HDFS for long-term storage. They implemented a lambda architecture between Kudu and HDFS, providing a unifying Impala view to query both hot and cold datasets.
In the unpredictable world of oil and gas (O&G), the safety of onsite workers is paramount, and with all the variables at play in the extraction process, successful operators know to expect the unexpected. That’s what drives one Fortune 500 O&G company based in the U.S. to do whatever it can to improve its safety monitoring systems.
Their drilling rigs are equipped with a variety of sensors that transmit status readings such as well pressure, flow rate, and temperature to a third-party data vendor. This allows the O&G company’s operations team to monitor status indicators — looking for aberrations that might suggest any potential danger to personnel or equipment — via dashboards on their vendor’s web application.
However, because the company was unable to build additional automation on top of the vendor tooling, they decided to build a more robust solution that went beyond basic dashboards to flag anomalies and send out urgent alerts automatically, minimizing risk in the event of an incident. In addition, their data scientists wanted long-term storage for the incoming sensor data to power historical analyses and predictive analytics that could help them prevent future accidents before they happen.
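To make the idea of automatically flagging anomalies concrete, here is a minimal sketch of one simple approach: comparing each new reading against the mean and standard deviation of a short trailing window. The case study does not say which detection method the company uses, so the function name `flag_anomalies`, the window size, the threshold, and the sample pressure values are all illustrative assumptions.

```python
from collections import deque
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return the indices of readings that deviate sharply from the trailing window.

    Hypothetical helper for illustration only; not the method described in the
    case study. A reading is flagged when it lies more than `threshold` sample
    standard deviations from the mean of the previous `window` readings.
    """
    recent = deque(maxlen=window)  # holds only the last `window` readings
    flagged = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # A perfectly flat window (sigma == 0) is skipped in this sketch.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(i)
        recent.append(value)
    return flagged


# Made-up well pressure readings (psi) with one obvious spike.
pressure_psi = [3000, 3002, 2999, 3001, 3000, 3003, 3650, 3001]
print(flag_anomalies(pressure_psi))  # prints [6]
```

In a real deployment the flagged indices would feed an alerting channel rather than `print`, and the threshold would need tuning per sensor type to balance missed incidents against false alarms.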
To support monitoring and forecasting across all their drilling sites, the O&G company would need to build out a new data streaming framework that could process the high volume of data coming from their IoT sensors in real time — all while ensuring a consistent, unified view across their internal teams, their third-party data vendor, and their ecosystem of smaller contractors and subcontractors.
As a result, the company’s monitoring systems and dashboards can continuously query the Kudu table where this data is stored and compare it against the historical data now being compiled by their data science team.
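A unified Impala view over hot data in Kudu and cold data in HDFS is commonly built as a `UNION ALL` of the two tables, split at a time boundary so that no row is counted twice. The sketch below generates such a statement. It shows the general pattern rather than phData's actual implementation, and the view, table, and column names are hypothetical, as is the choice of boundary timestamp.

```python
def unified_view_sql(view, hot_table, cold_table, columns, ts_col, boundary):
    """Build an Impala CREATE VIEW statement spanning a hot and a cold table.

    Hypothetical helper: rows at or after `boundary` are read from the hot
    (Kudu-backed) table, and older rows from the cold (HDFS-backed) table.
    Columns are listed explicitly so both branches return the same schema.
    """
    cols = ", ".join(columns)
    return "\n".join([
        f"CREATE VIEW {view} AS",
        f"SELECT {cols} FROM {hot_table} WHERE {ts_col} >= '{boundary}'",
        "UNION ALL",
        f"SELECT {cols} FROM {cold_table} WHERE {ts_col} < '{boundary}'",
    ])


# All names below are assumptions made for this example.
sql = unified_view_sql(
    view="sensor_readings",
    hot_table="sensor_readings_kudu",
    cold_table="sensor_readings_hdfs",
    columns=["rig_id", "reading_time", "pressure", "flow_rate", "temperature"],
    ts_col="reading_time",
    boundary="2020-01-01 00:00:00",
)
print(sql)
```

Because the two `WHERE` clauses are complementary, the boundary can be moved forward as older rows are offloaded from Kudu to HDFS, and queries against the view never need to know which store holds a given row.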