Lessons on maturing data journeys in developing environments

Natalie Cuzen
10 months ago

Posted by: Natalie Cuzen
Category: Data analytics

At 71point4, much of our work focuses on data engineering (designing and building data pipelines) and data analysis in developing environments – within institutions that are just beginning to capture and work with their data. Our role, and the challenges we expect, differ considerably in these environments from those encountered in more advanced environments that are already unlocking value from data. In this post we’ve compiled our key lessons as analytics consultants working in developing environments.

The approach to data analytics matters. In advanced environments, the systems that manage operational processes are built with a clear recognition that data is a valuable asset. Production and analytical environments are almost identical, including data structures and data scope. In fact, analytical environments often constitute a full replication of the production environment, with the data in the analytical environment updated overnight to reflect the full production view for the previous day. Data analytics design occurs alongside core system design, and system testing protocols inherently incorporate both operational and analytical components. Critically, responsibility for support and maintenance of operational and analytical environments rests within the same organization.

In developing environments, however, the picture is different. Most systems are built to exclusively serve an operational purpose, with little upfront consideration of analytical requirements. An interest in system analytics typically emerges only after the operational system is live. At the same time, analytical skillsets may be in short supply. As a result, responsibility for functions that support analytics – designing and building systems to capture, transform, store, and analyse data at scale, as well as the maintenance of those systems – often falls to external resources (consultants, typically), while maintenance of the operational system remains within the primary organization.

 

Developing data environments are institutions that are just beginning to capture and work with their data; advanced data environments are institutions that are already unlocking value from their data

 

This split in oversight between (in-house) production and (typically outsourced) analytical environments introduces complexity. System changes are inevitable, and the separation of responsibility across two dependent environments is a notorious source of communication breakdown. Updates to the operational system unforeseen by those supporting the analytical environment can break integration procedures between the environments. This can put analytics on pause and force the analytics team to play catch-up on updating the integration and analytical data structures. Depending on the extent of the changes, a complete overhaul of the analytical environment may be required. These events reduce the cadence of analytical updates and can be particularly problematic where real-time insights are expected.
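One way to catch a breaking change before it stalls the integration is to compare each incoming extract against the structure the analytical environment expects. The following is a minimal sketch in Python, not a description of any particular pipeline: the table columns, the `EXPECTED_SCHEMA` mapping and the `find_schema_drift` helper are all hypothetical names chosen for illustration.

```python
# Hypothetical expected structure of a table replicated from the
# operational system. Column names and types are illustrative only.
EXPECTED_SCHEMA = {
    "customer_id": int,
    "signup_date": str,
    "region": str,
}


def find_schema_drift(rows, expected=EXPECTED_SCHEMA):
    """Return a list of human-readable schema problems found in `rows`.

    `rows` is a list of dicts, one per record in the incoming extract.
    An empty list means the extract matches the expected structure.
    """
    problems = []
    for index, row in enumerate(rows):
        missing = sorted(set(expected) - set(row))
        unexpected = sorted(set(row) - set(expected))
        if missing:
            problems.append(f"row {index}: missing columns {missing}")
        if unexpected:
            problems.append(f"row {index}: unexpected columns {unexpected}")
        for column, expected_type in expected.items():
            value = row.get(column)
            # Nulls are tolerated here; only present values are type-checked.
            if value is not None and not isinstance(value, expected_type):
                problems.append(
                    f"row {index}: {column} is {type(value).__name__}, "
                    f"expected {expected_type.__name__}"
                )
    return problems
```

Run at the start of each load, a non-empty result gives the analytics team an explicit list of what changed, rather than a failed job to debug after the fact.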

A more serious threat, however, is a change to the operational system that remains undetected by the analytics team. In this scenario, integration procedures remain intact but the completeness or substance of the underlying data is undermined. In the absence of a warning sign, these data changes can go undetected and corrupt analytics results.
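Because nothing visibly breaks in this scenario, the warning sign has to be manufactured. A simple option is to profile each extract and compare it with the previous one. The sketch below assumes a single column of interest and uses illustrative thresholds; `profile`, `compare_profiles` and the default tolerances are hypothetical and would need tuning to the data in question.

```python
def profile(rows, column):
    """Summarise one column of an extract: volume, null rate, distinct values."""
    values = [row.get(column) for row in rows]
    total = len(values)
    nulls = sum(1 for v in values if v is None or v == "")
    return {
        "row_count": total,
        "null_rate": nulls / total if total else 0.0,
        "categories": {v for v in values if v is not None and v != ""},
    }


def compare_profiles(baseline, current, max_row_drop=0.2, max_null_increase=0.05):
    """Return warnings where `current` differs materially from `baseline`.

    The thresholds are assumptions for illustration, not recommendations.
    """
    warnings = []
    if current["row_count"] < baseline["row_count"] * (1 - max_row_drop):
        warnings.append(
            f"row count fell from {baseline['row_count']} to {current['row_count']}"
        )
    if current["null_rate"] - baseline["null_rate"] > max_null_increase:
        warnings.append(
            f"null rate rose from {baseline['null_rate']:.2f} "
            f"to {current['null_rate']:.2f}"
        )
    new_values = current["categories"] - baseline["categories"]
    if new_values:
        warnings.append(f"previously unseen values: {sorted(map(str, new_values))}")
    return warnings
```

A sudden rise in nulls or the appearance of a new status code does not prove that the operational system changed, but it is a prompt to ask the operational team before the results are published.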

Data analytics on immature systems compounds this complexity. New or immature systems are expected to evolve rapidly – responding to feedback to remove, improve or add features to better meet the system’s core needs. Short release cycles on immature systems are therefore critical for system advancement. But the associated system changes, which can be frequent and extensive, compound the complexities of shared responsibility across in-house production and outsourced analytics teams. Performing data analytics on immature systems within developing environments is a double whammy for data scientists.

In light of this, should we even bother with analytics in this context?

Early data analysis following system deployment (in other words, using immature systems) offers significant benefits, often demonstrating the case for the system’s existence and revealing insights that improve the system itself. We typically wouldn’t want to delay analytics capability until the system is more mature and stable because we are likely to miss critical signals on its performance and opportunities for improvement.

And so the question becomes: how do outsourced analytics consultants mitigate risks and manage the effort required to produce reliable data analytics for immature systems in developing environments? In our experience, an explicit recognition of the risks of separate responsibilities across production and analytical environments goes a long way. It goes without saying that we need good communication between support teams as well as leadership buy-in on the importance of analytics. But in practice we find that the operational team’s data mindset and skillset are a binding constraint on any realised benefit. Comprehensive regression testing prior to generating updated analytics is a guard rail against unreliable communication on system changes. We should also anticipate extended lead times to validate analytical outputs. Do not underestimate the value of communicating analytics results back to the operational team themselves; even if the team’s data capability is limited, they are typically closer to the realities of what the data represents and can often easily detect erroneous results.
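As one illustration of what regression testing can look like here, the sketch below recomputes a metric for a period that has already closed and compares it with the figures approved in an earlier run. Figures for a closed period should not normally move, so a mismatch suggests that something upstream has changed. This is an assumed approach rather than a prescribed one, and the metric, the sample rows and the names `average_balance_by_region`, `APPROVED_BASELINE` and `regression_failures` are all hypothetical.

```python
import math


def average_balance_by_region(rows):
    """Example analytical output: mean balance per region (illustrative metric)."""
    totals, counts = {}, {}
    for row in rows:
        region = row["region"]
        totals[region] = totals.get(region, 0.0) + row["balance"]
        counts[region] = counts.get(region, 0) + 1
    return {region: totals[region] / counts[region] for region in totals}


# Hypothetical rows from the latest extract, restricted to a closed period.
CLOSED_PERIOD_ROWS = [
    {"region": "north", "balance": 100.0},
    {"region": "north", "balance": 300.0},
    {"region": "south", "balance": 50.0},
]

# Hypothetical figures signed off in a previous analytics run.
APPROVED_BASELINE = {"north": 200.0, "south": 50.0}


def regression_failures(actual, expected, rel_tol=1e-9):
    """List every key whose value is missing, unexpected, or has shifted."""
    failures = []
    for key in sorted(set(actual) | set(expected)):
        if key not in actual:
            failures.append(f"{key}: missing from output")
        elif key not in expected:
            failures.append(f"{key}: not in approved baseline")
        elif not math.isclose(actual[key], expected[key], rel_tol=rel_tol):
            failures.append(f"{key}: got {actual[key]}, expected {expected[key]}")
    return failures


if __name__ == "__main__":
    latest = average_balance_by_region(CLOSED_PERIOD_ROWS)
    for failure in regression_failures(latest, APPROVED_BASELINE):
        print(failure)
```

Any failure reported this way is a natural item to take back to the operational team, who are often best placed to say whether the change is real or an error.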

And finally, expect a bumpy ride because progress under these conditions is hard-won. The good news, however, is that it will get easier in the long term. Recognition of data as an asset has grown enormously, and analytical skillsets will catch up. Even for now, the reward is worth the effort.
