Why do companies keep releasing new data platforms? Because we haven't solved the problem yet. Whenever I see a product landscape this crowded, it always means the same thing. Customers don't fully understand their needs, and solution providers aren't done guessing. The amount of money flowing into the data platform space proves there's ROI to be had by whichever business finally figures it out.
SAP rolled out Datasphere, its data warehouse cloud solution, in March. Like many others, it has difficulty explaining where its technology fits in the modern firm and the data platform product landscape. It's not that their message isn't on point. They've articulated the best vision I've seen so far, but that's not the challenge.
SAP, and most other large enterprise software vendors, are speaking to an audience still coming to terms with the problems they're trying to solve. The micro is well understood, but most businesses struggle with the macro, where technical solutions must extend across the enterprise. Technology is a strategic asset, and defining it that way has been challenging.
Each business unit has clarity into its own needs and chooses solutions to fit. When solutions lived in isolation, the micro picture was all that mattered. The need to pull data from across these business unit micro technology environments has created a macro technology platform. Most data platforms are built to live in a micro technology environment and solve business unit level micro challenges. Businesses need a new type of platform to manage enterprise-wide use cases.
Building Best-In-Class Ecosystems
It's impossible for a single provider like SAP to create best-in-class solutions across enterprise use cases and technology stacks. There's no way for one enterprise software company to do everything well. Salesforce has attempted to acquire its way to dominance, but that strategy has proven difficult to execute.
Smaller startups have the advantage and will build best-in-class solutions for a targeted set of use cases better than incumbents. The only way for a company to become a macro technology platform is to create a partner ecosystem. Companies like SAP have bitterly resisted that approach, but it seems like attitudes are changing.
The ecosystem model is the only one that makes sense, so the move towards developing them is unavoidable. A strong core product with a best-in-class ecosystem is the way this should work. SAP is developing its ecosystem with partners like Collibra, Databricks, and Confluent. In the past, enterprise application companies made moving data across different parts of the ecosystem difficult.
Implementing a technical-strategic architecture (I'll explain that concept more next) like data fabric is impossible without an open ecosystem. Data must flow into and out of all the micro platforms to support a true macro technology ecosystem. A monolithic, single-vendor solution can't work.
Open-source components can add massive value and should also be part of the ecosystem. That's an even more significant challenge for large enterprise software companies, but without open source, it's not a true ecosystem. The need for customization and niche implementations will be a constant for these large platforms.
Enterprise-Wide Workflows And Business Challenges
The first time I encountered the macro challenge was building a pricing model for a large retailer. Pricing touches every part of the retail business, and building the model required data from multiple organizations. I spent more time getting the data engineering side working than building and improving the model.
Big screen TVs occupy a lot of warehouse space and have very low margins. When warehouses fill up, that space is precious, and optimizing the margin per usable cubic foot is essential. Discounting TVs to free up space for higher-margin or faster-selling products can significantly impact the bottom line. I cannot tell you how much of a pain building that data pipeline was.
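The space-versus-margin trade-off can be sketched as a simple ranking. This is a minimal illustration with hypothetical product names and numbers, not the actual model: it computes the margin each cubic foot of warehouse space generates per week and surfaces the products that earn the least for the space they occupy.

```python
# Hypothetical example: rank products by the weekly margin each cubic foot
# of warehouse space generates. Low-ranked products tie up the most space
# per dollar earned and are candidates for discounting when space runs low.

products = [
    # (name, unit margin in dollars, volume in cubic feet, units sold per week)
    ("65-inch TV", 40.0, 18.0, 5),
    ("Blender", 12.0, 1.2, 30),
    ("Office chair", 35.0, 9.0, 8),
]

def margin_per_cubic_foot_per_week(margin, volume, sell_through):
    # Dollars of margin one cubic foot of space produces per week.
    return (margin * sell_through) / volume

# Sort ascending: the worst use of space comes first.
ranked = sorted(
    products,
    key=lambda p: margin_per_cubic_foot_per_week(p[1], p[2], p[3]),
)

for name, margin, volume, sell_through in ranked:
    rate = margin_per_cubic_foot_per_week(margin, volume, sell_through)
    print(f"{name}: ${rate:.2f} margin per cubic foot per week")
```

In this toy data, the TV earns far less per cubic foot than the blender, which is the intuition behind discounting TVs to free up space. The hard part in practice wasn't this arithmetic; it was assembling margin, dimensions, and sell-through from separate business units.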
The biggest challenge was rebuilding the business context required to support that part of the pricing model. The connections between product size, margin, sell-through rate, floor space, and inventory planning lived in a few employees' heads. Most data platforms don't capture the data businesses need to maximize the value of data science initiatives.
There are dozens of other examples of business context that I've seen during my career. Business context contains domain knowledge, assumptions, expertise, and decision-making processes. It focuses on the business's goals and supports connecting technical initiatives with outcomes the company cares about. With current data platforms and models, business context, like the connection between pricing and warehouse space, must be rediscovered as part of data analysis.
It's painful and expensive. Businesses keep every piece of data because they don't know which data contains signal that helps reassemble the business context. Data platforms must capture more than the business's data to solve the macro challenge. They must also capture parts of the business itself. Thinking about data platforms with that objective changes the solution.
A Single Enterprise Vs. A Collection Of Business Units
Companies like SAP have an enterprise platform view of the modern business, and they see the macro picture. Data creates the opportunity for businesses to understand their own macro pictures. Current architecture patterns make that expensive and time-consuming, putting it out of reach for most.
Enterprise structure and technical application architecture must be in alignment. Architecture must be built to fit the business's operational structure and core strategy. Today, architecture is built for multiple micro ecosystems. The technology organization must stitch together the macro technology platform from the patchwork of micro ecosystems. That's why ITOps is so complex and expensive.
The need to connect data from across the business to support use cases makes technology a top-level strategic construct. Business data fabric is a strategic and architectural construct that fits into the macro technology platform paradigm. Over the next 2 years, we will see an increasing number of hybrid techno-strategic architectures and patterns proposed.
Data fabric solves the challenge of maintaining business context as much as it solves the challenges around data engineering and governance. I think we've all heard the technical case for data fabric or data mesh. SAP is, intentionally or unintentionally, making the argument for data fabric as the solution to a strategic problem too. That's where I see the potential for their Datasphere platform to gain traction.
Technology platforms are a new type of digital team that supports products and operations across the firm. Data must fit into that structure and increase its value. The data platform must live on the business's emerging macro technology platform to connect data from different micro ecosystems.
The more parts of the business that connect through the data platform, the more complex reassembling those connections through data analysis and modeling becomes. Data engineering and governance costs rise while ROI plummets. It makes more sense for the macro technology platform and architecture to preserve the semantic connections that define the data's business context.
Positioning Data As An Asset
C-level technology leaders and teams are strategic partners. The macro technology ecosystem must support that goal by prioritizing the business need over the data or technology. Most needs are discovered after the fact but should be defined as the upfront objective for data gathering and model development.
On the surface, macro ecosystems look and sound complex. That's why so many enterprise software companies struggle to get traction with this message. When you compare macro ecosystems to the alternative, they are actually much more straightforward. Most businesses don't see how technology's place in the business is evolving until it's too late, and they're locked into a patchwork of micro ecosystems instead of a single macro technology ecosystem.
Low- and no-code solutions are developed differently from software-engineered solutions, but shouldn't they follow a similar lifecycle? Should self-service analytics solutions go through different verification, integration management, and monitoring processes than analyst- and scientist-built solutions? The shadow IT, apps, and analytics problems are caused by different processes being applied to self-service vs. developer-built solutions.
SAP's low- and no-code environments integrate with developer environments. They can pull solutions from each other and enforce similar processes. As complex as that sounds, the alternative is much harder to manage. The challenges associated with integrating solutions and processes across different platforms are one reason it rarely gets done.
SAP's Datasphere paradigm is the way most businesses will go in the future. The concept of data engineering, data science, software engineering, self-service tools, and enterprise apps serviced by different solutions doesn't make much sense. That logic applies to most other use cases and workflows that extend across the enterprise.
It will be interesting to see how the next 18 months treat our currently oversized data tools landscape. There's not enough space for all the players we now have. However, the macro technology ecosystem concept makes space for more players. That will ultimately be good for companies coming to terms with the big picture.
Technical strategy is becoming a necessity for businesses. Anything that spans the business needs a top-level strategic construct. Companies like SAP offer a massive range of solutions, and the danger is that businesses will implement more solutions than they need. Technology shouldn't be in the driver's seat. Just because it's available and feasible doesn't mean it's valuable. The business's technology ecosystem and architecture should reflect its structure and needs. When strategy drives technology, the result is higher ROI.
> Big screen TVs occupy a lot of warehouse space and have very low margins. ... Discounting TVs to free up space for higher margin or faster selling products can significantly impact the bottom line.
That's fascinating. Very entertaining dive into the complexity of seemingly ordinary stuff. Thanks for sharing.
> I cannot tell you how much of a pain building that data pipeline was.
Did the project survive after all? Or did it slowly die on crutches? :D
> It will be interesting to see how the next 18 months treat our currently oversized data tools landscape. There's not enough space for all the players we now have.
I'm not sure I get why there's not enough space