Network Theory 2.0 - Part 1: Data Gravity
First part of an article series on Network Theory 2.0 and how the Digital Data Universe is changing
Metcalfe's Law needs an amendment - an upgrade, if you will
Metcalfe's Law states that the value of a network increases with the square of the number of its users. It does not, however, account for the value of the data that each user in the network produces.
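For reference, here is the standard reasoning behind the square: n users allow n(n-1)/2 possible pairwise connections, so network value V is said to scale roughly with n squared.

```latex
% Metcalfe's Law: n users give n(n-1)/2 possible pairwise connections
\frac{n(n-1)}{2} \approx \frac{n^2}{2}
\quad\Longrightarrow\quad
V \propto n^2
```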
In this article, we will go through three core concepts:
1. How networks evolve beyond Metcalfe’s Law once users start adding more data
2. Why AI models will become commoditized while the value of data is enhanced
3. How new business models will be built around the concept of “Data Gravity”
How Networks Evolve
Let’s take a very simple example: the global telephone network.
Back in the days of fixed-line telephones, each new user who joined added to the total value of the network; according to Metcalfe, that value scaled with the square of the number of users. However, how any given individual used the network had very little impact on its overall quality.
If two network members chose to call each other for days on end, no one else really benefited from that (in fact, if network traffic grew too high, all members faced detrimental congestion). The telephone system therefore had “Low Data Gravity”.
On the other hand, when Social Media entered the stage, the value of data added to the network by individual participants started to become apparent.
Even if you didn’t know all members of the network, the data they added (e.g. through their likes, photos, or preferences) fed the recommendation algorithms, and the value proposition of the platform grew. However, this required a network where users actively participated and shared their data, as opposed to most other systems, where users often want to keep their insights hidden from view.
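To make that mechanism concrete, here is a minimal, hypothetical sketch of item-based collaborative filtering, one classic way interaction data (likes) can feed a recommender. The matrix and function names are purely illustrative, not any platform’s actual implementation:

```python
import numpy as np

# Hypothetical like-matrix: rows = users, columns = items (1 = liked).
# Every new row or column of data sharpens the similarity estimates below.
likes = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
])

def item_similarity(m: np.ndarray) -> np.ndarray:
    """Cosine similarity between item columns."""
    norms = np.linalg.norm(m, axis=0, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for never-liked items
    unit = m / norms
    return unit.T @ unit

def recommend(user: int, m: np.ndarray, top_k: int = 2) -> np.ndarray:
    """Rank items the user has not liked yet by similarity to their likes."""
    scores = item_similarity(m) @ m[user]
    scores[m[user] == 1] = -np.inf  # exclude items already liked
    return np.argsort(scores)[::-1][:top_k]

print(recommend(0, likes))  # suggests items 2 and 3 to user 0
```

The algorithm itself is not the point; the point is that every additional user interaction improves the similarity estimates, and hence the recommendations, for everyone on the platform.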
Social Media, therefore, saw some of the early examples of “High Data Gravity Systems”.
What’s different now is that, with recent developments in Artificial Intelligence (AI), we are likely to see Low Data Gravity systems convert into High Data Gravity systems, and High Data Gravity accelerate as a major driver of competitive advantage.
The Value of Data
Let’s think of a non-networked piece of software; for simplicity, Microsoft Excel.
(I’ve yet to meet many people who have never used it.)
At first glance, it's hard to argue that Excel has strong network effects, beyond of course the valuable communities that spring up in online forums, courses, blogs, etc. The fact is that if the number of Excel users doubled overnight, or you spent an extra hour working on your Excel model one night, the value proposition of Excel would probably not fundamentally change for you in the short term.
More Data + AI = Increased Importance of Data Gravity
The more data you add to a given application, the more you can (now, with more accessible AI) learn both about your data and about how you use it.
Thus, if one super-user of an AI-enhanced Excel adds more data to their own files or starts using Excel in novel and innovative ways, that data can teach the application how to continuously improve itself much faster. Yes, there have been forms of data capture and performance enhancement in the past, but traditionally, utilizing such data has required cumbersome analysis and has struggled to scale when data volumes grew too large.
Thought experiment: You want to build an advanced Excel model for your business - it could be a market model, valuation model, pricing model, etc. Today, you either need to build that model from scratch, or find a template from someone else who has built something similar.
What if, instead, Excel had trained on all the models that you, your colleagues, and the community at large have built, and from that could generate a model template for you to start from?
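As a hedged illustration of what just the retrieval half of such a feature might look like, here is a minimal sketch that ranks previously built models by textual similarity to a new request. All names and descriptions are hypothetical; a real system would learn from the model structures themselves, not just short descriptions:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical descriptions of models the community has already built
template_library = [
    "DCF valuation model with WACC sensitivity tables",
    "SaaS pricing model with tiered plans and churn assumptions",
    "Bottom-up market sizing model by region and segment",
]

def suggest_template(request: str, library: list[str]) -> str:
    """Return the library entry most textually similar to the request."""
    vectors = TfidfVectorizer().fit_transform(library + [request])
    scores = cosine_similarity(vectors[-1], vectors[:-1])
    return library[scores.argmax()]

# The more models the community adds, the better the suggestions get
print(suggest_template("valuation model for an acquisition target", template_library))
```

That closing comment is the whole argument in one line: suggestion quality is a direct function of how much data users have already poured into the system.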
Increased access to data means that both incumbents and challengers can start leveraging the proprietary data that often already exists in their systems in entirely new ways. (This assumes, however, that companies can access their own data in a frictionless way.)
The "new" law of Network Effects thus becomes that the value of the system increases by both the number of users and the data added by those users.
Even if you have only two users in a system, if those users keep adding their data, exploring it, and using it in new ways, the system itself can learn from that and keep improving its value proposition.
This is the concept of DATA GRAVITY.
Although there is, at the time of writing, a lot of excitement about the recent progress of Large Language Models (LLMs), smarter chatbots, etc., those models will soon become relatively commoditized as more and more companies figure out how to best apply the mathematical principles behind LLMs.
Now, don't get me wrong - there is a huge first-mover advantage up for grabs for the best-trained models that can be directly deployed to solve valuable customer use cases. However, the value of these models still lies primarily in the data sets they were trained on.
At the time of writing, OpenAI (LINK - Research report) has a lead on Google because of its superiorly trained model, but the GPT-3 model itself was trained on “The Internet” and thus does not rest on a proprietary training set. Such models and their efficiency are fundamentally guided by mathematical principles - and new, more efficient models will likely keep emerging in a continuous arms race of AI models.
New Business Models
To stay on the OpenAI example: being early matters!
However, ChatGPT's head start is not just a first-mover advantage of capturing the first users and locking them in through usage, community, and best practices (actually, much like Excel has done).
A less obvious but still important long-term value driver comes from the millions of text prompts captured every day by ChatGPT, revealing not only users’ interests, but also how they engage and speak with a chatbot-based LLM.
Applying these prompts to learn from and enhance the user experience, much like Google has learnt from search engine queries in the past, will open up many more innovative and exciting business models.
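As a hedged sketch of one way such prompt logs could be mined (my own illustration, not how OpenAI or Google actually do it), here is a minimal example that clusters prompts to surface common user intents:

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical prompt log; in reality this would be millions of entries
prompts = [
    "write a cover letter for a marketing role",
    "draft a cover letter for a sales position",
    "explain transformers in simple terms",
    "explain how neural networks learn",
    "summarize this quarterly report",
    "summarize the attached meeting notes",
]

# Represent each prompt as a TF-IDF vector and group similar intents
vectors = TfidfVectorizer().fit_transform(prompts)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)

for cluster in range(3):
    members = [p for p, label in zip(prompts, labels) if label == cluster]
    print(f"intent cluster {cluster}: {members}")
```

Each recovered cluster is a candidate feature, template, or product, and the clusters only sharpen as more prompts flow in: Data Gravity at work again.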
The same will happen inside large corporations, where data flows between users and systems can be leveraged to improve overall business efficiency and value creation.
Hence, new and evolving business models will increasingly revolve around data - impacting how all companies are run. Companies will need to manage their data far more efficiently, liberate it from system silos, and make it accessible to more advanced applications.
Thus, they must find tools and systems with high data gravity that can help accumulate the valuable data that they need, and AI will be a major lever here. As Peter Diamandis puts it:
“There will be two kinds of companies at the end of this decade... Those that are fully utilizing AI, and those that are out of business.”
Example: One startup leveraging Data Gravity to enhance its network effects is Intella, a real-time intelligence provider of language services for market research, multi-dialect Arabic voice transcription, and chatbots.
By assembling one of the largest databases of labelled voice data across different Arabic dialects, Intella has built a platform that leverages data to continuously enhance its insight delivery to customers. As the platform extends across voice-to-text use cases, overall customer value grows, because Intella's artificial intelligence engine keeps fine-tuning its domain-specific performance.
(Disclaimer: I hold a small stake in Intella through an angel syndicate, based on the hypothesis of their data gravity potential)
Next Steps
However, "Network Effects 2.0" and of high Data Gravity comes with new challenges
More and more data will increasingly be the offspring of LLMs - just think of all the marketing copy, social media content, and sheer volume of software code being created by AIs every single day, all adding to the potential training set of future LLMs. This may challenge end-user trust in the data, unless trust mechanisms (or trustless guarantees?) can be put in place. Here enter the topics of Data Ancestry and Data Shelf-Life, which will be covered in Part 2 and Part 3 of this series.
For now, let's close this first part on the importance of building Data Gravity into new and existing data systems. Any system or corporation looking to evolve its competitive advantage will soon see the value proposition compound from accelerating data creation, optimizing data operations, and understanding its own data consumption.
"They" have long said that "Data is the new oil"
Well, now it's ready to grease the machinery of the new Digital Data Universe
About Author: Although trained as a robotics engineer, I’ve spent nearly a decade as a consultant working on strategy development for global corporations.
Now, I spend my time strategizing on the Future of Tech and how new technologies will impact sectors and business models across industries, advising both corporates and startups on digital strategy development.
LinkedIn, Twitter, Website