Challenge
Global beauty company Coty’s objective is to be an innovation pioneer in the industry, delighting its customers with exciting products and services. To achieve this, it relies heavily on data.
Recent years have seen a sharp increase in both the amount of data gathered and the number of sources it comes from. This has made collecting, accessing and handling data complex and labour intensive. Our challenge at Profusion was to design an advanced data architecture that could:
- Substantially reduce manual data-related tasks
- Break data silos and provide a reliable central source
- Provide the platform to develop new capabilities and exciting data products
Solution
We started by exploring the 10 data sources and 1,000+ variables collected in the many datasets.
After extensive data cleaning and wrangling, we built a data schema that links the various sources and makes the data easier to interpret.
As our brief required a flexible, scalable and secure architecture, we opted for a cloud-based solution hosted on AWS. To handle both structured and unstructured data, we designed a data lake powered by some of the latest technologies available.
Finally, we designed and built complex data pipelines to automate data extraction, cleaning, harmonisation and transformation. Our design makes it very simple to add new data sources to the roadmap of multiple innovation teams at Coty.
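To give a feel for how a pipeline of this kind can stay easy to extend, here is a minimal Python sketch of the four stages named above (extraction, cleaning, harmonisation, transformation). It is an illustration only, not the pipeline built for Coty: the `Source` class, the `sales` and `reviews` sources, and every column name are hypothetical, and the real system runs on AWS rather than on in-memory lists.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

Record = dict[str, object]


@dataclass
class Source:
    """A hypothetical data source: how to read it and how it maps to the shared schema."""
    name: str
    extract: Callable[[], Iterable[Record]]
    column_map: dict[str, str]  # source column name -> shared schema field


def clean(record: Record) -> Record:
    """Strip whitespace from string values and turn empty strings into None."""
    cleaned: Record = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip() or None
        cleaned[key] = value
    return cleaned


def harmonise(record: Record, column_map: dict[str, str]) -> Record:
    """Rename source-specific columns to the shared schema's field names."""
    return {target: record.get(src) for src, target in column_map.items()}


def transform(record: Record, source_name: str) -> Record:
    """Tag each record with the source it came from."""
    return {**record, "source": source_name}


def run_pipeline(sources: Iterable[Source]) -> list[Record]:
    """Run extract -> clean -> harmonise -> transform for every registered source."""
    output: list[Record] = []
    for source in sources:
        for raw in source.extract():
            record = harmonise(clean(raw), source.column_map)
            output.append(transform(record, source.name))
    return output


# Adding a new data source means adding one entry here (all values are made up).
SOURCES = [
    Source(
        name="sales",
        extract=lambda: [{"SKU": " A1 ", "Units": 3}],
        column_map={"SKU": "product_id", "Units": "quantity"},
    ),
    Source(
        name="reviews",
        extract=lambda: [{"item": "A1", "stars": 5, "text": ""}],
        column_map={"item": "product_id", "stars": "rating", "text": "comment"},
    ),
]

rows = run_pipeline(SOURCES)
for row in rows:
    print(row)
```

In this sketch, the shared field `product_id` plays the role of the data schema: once every source is mapped onto it, records from different sources can be linked without source-specific logic in the later stages.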