Hello Dataspace: From Big Data to Better Data
The problem: Bad data = bad economics
Data analytics and artificial intelligence are suffering from a productivity crisis, and data is the core problem. Even the best software won’t work without the right data. According to meta-research, more than 80% of the time budget of a data analytics project is spent on data wrangling instead of results (Schlueter Langdon 2019). This turns the 80/20 Pareto principle, a cornerstone of business efficiency, upside down (e.g., Newman 2005).
The solution: Dataspace = better data
“Are you still storing, or are you already sharing?” One solution to the data crisis is data sharing. Let a software app pull the right data from the source when it is needed. Treat data like modern logistics: allow for on-demand and just-in-time delivery. Until 2022, such data sharing was impeded by a lack of data sovereignty: How can I protect my rights to data when I share it? The solution is a dataspace. It facilitates data sharing with data sovereignty protection. A first industrial version will be launched by the Catena-X network of leading automakers and tier 1 suppliers in 2023 (link).
What is a dataspace?
A dataspace is a data communication or data dial-tone system that enables peer-to-peer data sharing transactions. A data consumer (e.g., a software app like Material Traceability or Circular Economy) initiates the transaction to pull data on demand from a data provider for a specific data product (a digital twin, for example; see Schlueter Langdon & Sikora 2020). The transaction comes with (a) data sovereignty protection through access and usage policy enforcement and (b) Gaia-X compliance (such as self-description and verification), so that the data provider retains the power to control rights to the data (first use cases in Schlueter Langdon & Schweichhart 2022). For a C-level description, please read:
Schlueter Langdon, C. 2023. From Big Data to Better Data – Dataspace Top 10. In: Mertens, C., et al. (eds.). Data Move People, Anthology (version 2.0, January), International Data Spaces Association, Berlin (forthcoming)
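The pull transaction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Catena-X, Gaia-X, or any dataspace connector is actually implemented. The names UsagePolicy, DataProvider, offer, and pull, the policy fields, and the sample data product are all hypothetical. The sketch only shows the idea that the data stays at the source and the provider checks its access and usage policy on every consumer-initiated request.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class UsagePolicy:
    """Hypothetical policy a provider attaches to a data product."""
    allowed_consumers: frozenset  # access policy: who may pull
    allowed_purposes: frozenset   # usage policy: what the data may be used for
    valid_until: date             # usage policy: last day the offer is valid


@dataclass
class DataProvider:
    """Keeps data products at the source and checks the policy on every pull."""
    name: str
    products: dict = field(default_factory=dict)  # product id -> payload
    policies: dict = field(default_factory=dict)  # product id -> UsagePolicy

    def offer(self, product_id, payload, policy):
        """Publish a data product together with its policy."""
        self.products[product_id] = payload
        self.policies[product_id] = policy

    def pull(self, consumer, product_id, purpose, today):
        """Serve a consumer-initiated request only if the policy allows it."""
        policy = self.policies.get(product_id)
        if policy is None:
            raise KeyError(f"unknown data product: {product_id}")
        if consumer not in policy.allowed_consumers:
            raise PermissionError("access policy: consumer not authorized")
        if purpose not in policy.allowed_purposes:
            raise PermissionError("usage policy: purpose not permitted")
        if today > policy.valid_until:
            raise PermissionError("usage policy: offer expired")
        return self.products[product_id]


# Illustrative usage with made-up identifiers and a made-up payload.
provider = DataProvider("demo-provider")
provider.offer(
    "twin-1",
    {"part_id": "demo-123"},
    UsagePolicy(
        allowed_consumers=frozenset({"material-traceability-app"}),
        allowed_purposes=frozenset({"traceability"}),
        valid_until=date(2023, 12, 31),
    ),
)
twin = provider.pull(
    "material-traceability-app", "twin-1", "traceability", date(2023, 6, 1)
)
```

In this sketch the consumer never holds a copy of the catalog or the policy. It asks for one data product for one purpose, and the provider decides at request time, which is one simple way to picture the provider retaining control over its data.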
Our “Data Analytics Innovation” miniseries
This article continues our series on data analytics innovation. Previous episodes include:
- Behavioral Analytics: Auto Interior & UX: Shift to behavioral variables (link, 2016)
- Technology Personified – AMA: Shift to recommendations (link, 2014)
References
Newman, M. E. 2005. Power laws, Pareto distributions, and Zipf’s law. Contemporary Physics 46(5): 323–351
Schlueter Langdon, C. 2019. Data is broken: The data productivity crisis. Telekom Data Intelligence Hub Blog Story, T-Systems International, Frankfurt, link
Schlueter Langdon, C., and K. Schweichhart. 2022. Data Spaces: First Applications in Mobility and Industry. In: Otto, B., et al. (eds.). Data Spaces – Part IV Solutions & Applications. Springer Nature, Switzerland: 493–511, link
Schlueter Langdon, C., and R. Sikora. 2020. Creating a Data Factory for Data Products. In: Lang, K. R., J. J. Xu, et al. (eds.). Smart Business: Technology and Data Enabled Innovative Business Models and Practices. Springer Nature, Switzerland: 43–55, link