The Approach vs Technology Confusion: Where do Data Products Fit In? | Issue #38
Lifecycles, Fundamental Anatomy, and Value for Aspiring Metric-Driven Orgs
The Data Product sits right at the intersection of Approach and Technology, much like matter has a dual nature: wave and particle.
♻️ Data Products as an Approach
An approach is a PROCESS: a way to do things. Data Products establish that process through the Data Product Lifecycle.
One might argue Data Products is more an approach than technology, much like Agile for Software. Agile went through its share of pushback from sticky processes and emerged as a must-have. Post-adoption, Agile became a game changer for software teams and products.
However, an approach built for software cannot necessarily be duplicated for data. We have seen that over most of the last decade.
Interestingly, the Data Product approach has changed that. It works for the one unpredictable element that every data stack has and the software stack lacks: Data. And it needs to be actively enforced through a Data Product Platform that integrates with existing data stacks.
Agile tools, the likes of JIRA and Asana, enable Agile Software Development, but establishing a repeatable and accountable cycle for data development is slightly more challenging since the main ingredient, data, is volatile and subject to constant change.
Data Developer Platforms (a data product infrastructure standard) solve this problem by enabling the Data Product Approach for data development. They establish a repeatable and reliable cycle despite the dynamic nature of data.
Similar to the benefits of Agile, the data product lifecycle’s reliability comes from being testable and changeable at any point in the lifecycle.
🦾 Data Products as a Technology
Technology refers to a TANGIBLE MACHINE/mechanism. Data Products are tangible products that users can “purchase” and use. The lifecycle or approach discussed above is enabled through this tangible product - the product’s structure or anatomy.
The benefit of the data product lies in its anatomy itself. The secret is tightly coupling data with infrastructure, code, and metadata instead of loosely processing and projecting data for various applications.
How does that help?
When data floats freely across your infra, there’s hardly any scope for reproducible cycles of incremental evolution.
💡 Just as diamonds cut diamonds and fire fights fire, an evolving or changing framework optimally handles ever-changing data.
Evolving lifecycles are the best bet to optimize the value of data that is intrinsically dynamic. Coupling data with infrastructure, code, and metadata allows it. The tight coupling fosters quick changes while maintaining high stability and interoperability with a plethora of tools, sources, and consumption endpoints that the data stack necessitates.
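To make the coupling a little more concrete, here is a minimal Python sketch of a data product as one versioned unit. It is an illustration under assumptions, not how any particular Data Developer Platform models it: the DataProduct class, its field names, the evolve method, and the orders example are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable

Rows = list[dict]


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical bundle: data, metadata, code, and infra travel together."""
    name: str
    data: Rows                          # the records themselves
    metadata: dict                      # owner, purpose, and similar context
    transform: Callable[[Rows], Rows]   # the code that shapes the data
    infra: dict                         # resources the product runs on
    version: int = 1

    def evolve(self, raw: Rows) -> "DataProduct":
        # A new version reuses the same code, metadata, and infra,
        # so every iteration of the cycle is reproducible.
        return replace(self, data=self.transform(raw), version=self.version + 1)


def drop_incomplete(rows: Rows) -> Rows:
    """Illustrative transform: keep only rows that carry an amount."""
    return [r for r in rows if r.get("amount") is not None]


orders_v1 = DataProduct(
    name="orders",
    data=[],
    metadata={"owner": "sales-analytics", "purpose": "revenue metrics"},
    transform=drop_incomplete,
    infra={"compute": "small", "storage": "warehouse"},
)

# Fresh data arrives; the product moves to a new version as a single unit.
orders_v2 = orders_v1.evolve([{"id": 1, "amount": 40.0}, {"id": 2, "amount": None}])
```

Because the earlier version is left untouched, a change can be compared against it or rolled back, which is the kind of testable, changeable cycle the lifecycle depends on.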
Every Data Product serves a specific purpose or drives a set of business metrics which can be organized as a metric tree.
Metric trees are a web of associated metrics cutting across the entire vertical of the data’s journey. The coupling between infrastructure and data layers makes such transparent metric trees possible and ensures that all initiatives across technical and business verticals are observable in the context of the target metrics, that is, whether they are fuelling or harming the business.
We explain this ability through a compact metric tree model for the office of CDO and CMO, where every low-level to high-level metric across tech and business layers is interconnected and traceable due to the anatomy of the data product (data + metadata + code + infra).
As a consequence, decision-makers can roll out informed decisions across solutions, operations, and development tracks without having to know the ins and outs of every vertical - only the factors that move the needle.
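The traceability described above can be sketched in a few lines of Python. This is only a toy model under assumptions: the Metric class, the trace function, and the metric names (revenue, conversion_rate, pipeline_freshness_hours, and so on) are made up for illustration and do not come from a real metric tree.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Metric:
    """Hypothetical node in a metric tree."""
    name: str
    layer: str                                   # "business" or "tech"
    children: list["Metric"] = field(default_factory=list)


def trace(root: Metric, target: str, path: tuple = ()) -> Optional[list]:
    """Depth-first search returning the chain of metric names from root to target."""
    path = path + (root.name,)
    if root.name == target:
        return list(path)
    for child in root.children:
        found = trace(child, target, path)
        if found is not None:
            return found
    return None


# Illustrative tree: one business goal broken down into lower-level metrics.
freshness = Metric("pipeline_freshness_hours", "tech")
conversion = Metric("conversion_rate", "business", [freshness])
retention = Metric("retention_rate", "business")
revenue = Metric("revenue", "business", [conversion, retention])

# Which higher-level metrics does a low-level tech metric feed into?
lineage = trace(revenue, "pipeline_freshness_hours")
```

Here lineage holds the chain from revenue through conversion_rate down to pipeline_freshness_hours, so a change in a tech-layer metric can be read in the context of the business metric it ultimately moves.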
Community Insights 🫂
If you’re looking for more ideas and insights on similar lines, here’s an interesting piece that talks about an agile-like approach specifically in the context of data instead of a direct translation of Agile in the context of software. The article also delves into a high-level strategy on essentials you must acquire to get going.
Erik Lenhard, partner and director at BCG, also talks about the necessity of this convergence between Agile and Data tracks through Data Mesh (or data products as core building blocks).
The article is crisp and highlights high-value points directly without beating around the bush. An excerpt that’s a personal favourite:
The best way to begin the journey for the strategic build of a Data Mesh is via pilot projects, starting from a clear problem statement. Next, we deploy end-to-end pilot projects that support business value creation, involving multiple data sources and best schema changes.
We set up cross-functional teams to build iteratively, rather than initiating a ‘big-bang,’ driven by use-cases with scale-out/industrialisation as key goal from the start. It’s important to view this process as a change journey, not simply an introduction to a new set of tools.
Once again, thanks for sticking to the end! Here’s a bonus meme 🤡
📚 Related Reads
https://datadeveloperplatform.org/ - a community-influenced repo specifying the fundamentals of a data platform infrastructure standard, ideal for data product development.