Elon Musk, CEO of Twitter, Tesla, and SpaceX, has accused Microsoft of illegally using Twitter’s data to train its artificial intelligence model.
Musk’s accusation came after reports that Microsoft was dropping Twitter from its advertising platform, which allows ad buyers to manage their social media accounts in one place.
OpenAI was created as an open source (which is why I named it “Open” AI), non-profit company to serve as a counterweight to Google, but now it has become a closed source, maximum-profit company effectively controlled by Microsoft.
Not what I intended at all.
— Elon Musk (@elonmusk) February 17, 2023
In response to the reports, Musk tweeted, “They trained illegally using Twitter data. Lawsuit time.” However, it is worth noting that Musk has a reputation for making bold statements on social media that don’t always come to fruition. As of this writing, no lawsuit has been filed.
This incident highlights the growing concern around data ownership in the development of artificial intelligence.
Large technology companies, like Microsoft, are investing heavily in creating advanced AI models, such as OpenAI’s GPT, while data owners are becoming increasingly aware of the value of their data and looking for ways to stop Big Tech from using it or to charge for its use.
Microsoft has been creating its own large language models (LLMs) while also selling access to OpenAI’s models. In January 2023, Microsoft reportedly invested $10 billion in OpenAI in a unique deal.
Musk was one of the co-founders of OpenAI but left its board in 2018. Recently, Musk has criticized OpenAI’s shift from a non-profit to a highly valuable for-profit business, a change he believes Microsoft has influenced.
They trained illegally using Twitter data. Lawsuit time.
— Elon Musk (@elonmusk) April 19, 2023
This situation underscores the fact that data ownership is becoming an increasingly contentious issue in the development of AI. As AI continues to advance, data is becoming even more valuable, and it is likely that more disputes over data ownership will arise. It will be up to regulators to determine how to resolve these disputes in a fair and transparent manner.
The development of large language models (LLMs) such as GPT relies on vast amounts of data, much of which is collected from online platforms like Twitter, Stack Overflow, and Reddit. This data is particularly valuable for training AI models because it captures informal, conversational language.
Musk vs. Microsoft: The Fight for Ownership of AI Development
As these new AI models transition from research institutions to the corporate world, the issue of data ownership has become increasingly important.
Reddit recently announced that it would begin charging companies for access to its application programming interface (API), which is used to feed Reddit conversations into AI training software.
Universal Music Group also made headlines when it claimed that AI-generated music violated copyright law and breached its agreements. Similarly, Getty Images is suing Stability AI, the company behind Stable Diffusion, for copying its content to train the AI image generator.
Musk announced in December that Twitter would “pause” OpenAI’s access to its database. He has also announced plans to create his own large language model, called TruthGPT, within one of his companies.
The impact of the growing dispute over data ownership in the development of AI models is difficult to predict with certainty. However, two potential outcomes could result from these conflicts.
Firstly, data owners may become more selective about who they allow to access their data for AI training purposes. This could limit the number of companies or institutions that are able to develop advanced AI models, potentially slowing down the pace of AI innovation.
Secondly, disputes over data ownership could lead to increased legal battles and regulatory oversight. As more companies seek to monetize their data, it’s likely that we’ll see more copyright claims and other legal challenges. This could make it more difficult for smaller companies or individuals to access the data they need to develop new AI models.
Overall, the rise of LLMs and the need for vast amounts of training data have brought the issue of data ownership to the forefront. As data becomes increasingly valuable, data owners are seeking to exert greater control over how their data is used.
These disputes will likely continue to arise as AI development progresses, and it will be up to regulators to determine how to balance the interests of data owners and AI developers.