Client: Chemical Manufacturer

Our client supplies its customers with catalysts for the production of plastics and other products. In addition to providing the raw materials, our client also collects data from its customers' manufacturing processes for research, product improvement and development.

Challenge: No Method of Integrating and Analyzing Client Data

Their clients provide data in many different formats and in varying units of measure. Before any analysis could take place, an exorbitant amount of time and money was required to normalize the information.
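To make the normalization problem concrete, here is a minimal Python sketch of converting readings reported in different units into one canonical unit per quantity. The quantities, unit labels and the `normalize` helper are illustrative assumptions, not details of the client's actual data or of the delivered solution.

```python
# Hypothetical conversion table: each quantity maps a reported unit to a
# function that converts a value into the canonical unit for that quantity
# (degrees Celsius for temperature, bar for pressure).
TO_CANONICAL = {
    "temperature": {
        "C": lambda v: v,
        "F": lambda v: (v - 32) * 5 / 9,
        "K": lambda v: v - 273.15,
    },
    "pressure": {
        "bar": lambda v: v,
        "kPa": lambda v: v / 100,
        "psi": lambda v: v * 0.0689476,
    },
}


def normalize(quantity, value, unit):
    """Convert a reading to the canonical unit for its quantity."""
    try:
        convert = TO_CANONICAL[quantity][unit]
    except KeyError:
        # Unknown quantities or units are rejected rather than guessed at.
        raise ValueError(f"no conversion for {quantity!r} in unit {unit!r}")
    return convert(value)
```

With a table like this, a reading of 212 F and a reading of 100 C land on the same canonical value, which is what makes comparison across clients possible.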

Because each client's information was siloed, our client could not analyze data across its full client base.

Solution: Dunn Solutions Develops Data Pipelines to Cloud Data Lake

To address these challenges, Dunn Solutions designed and developed a data lake built on the Snowflake database in an Azure environment. The data lake stores and integrates data from the various clients in a single location, where it can be used for reporting and analysis.

As part of the project, Dunn Solutions developed a standard “form” for ingestion into the data lake, using Databricks, Blob Storage and FTP sites where clients deposit their data. Data pipelines were created for each type of file to automate data ingestion.
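The idea of one pipeline per file type feeding a single standard form can be sketched in plain Python as follows. This is an illustration only: it does not use Databricks, Blob Storage or Snowflake APIs, and the field names, the `PARSERS` registry and the `to_standard_form` function are hypothetical.

```python
import csv
import io
import json
from pathlib import PurePath


def parse_csv(text):
    """Read CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_json(text):
    """Read JSON text that holds a list of record objects."""
    return json.loads(text)


# Hypothetical registry: one parser per supported file type.
PARSERS = {".csv": parse_csv, ".json": parse_json}


def to_standard_form(filename, text, client_id):
    """Parse a deposited file and map every record to the same fields."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in PARSERS:
        raise ValueError(f"no pipeline registered for {suffix!r} files")
    rows = []
    for raw in PARSERS[suffix](text):
        rows.append({
            "client_id": client_id,  # tag each record with its source client
            "measurement": raw["measurement"],
            "value": float(raw["value"]),  # CSV values arrive as strings
            "unit": raw["unit"],
        })
    return rows
```

Supporting a new file type then means registering one more parser, while everything downstream keeps reading the same standard fields.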

Finally, a “curated zone” allowed analysts to use analytical tools to interact with and analyze the data and to discover trends and outliers.

Result: Centralized Repository Provides Ready Access to Client Data Sets

With this solution, our client can ingest sample data into a central repository that provides access to data across all of its clients. The automation reduced the massive amount of manual data wrangling previously required, allowing our client to focus its efforts on data analysis, product development and product improvement.