AI: Blurring lines between 'Open' & 'Closed' LLM AIs. RTZ #406
...a distinction increasingly without a difference
One of the enduring debates in this AI Tech Wave thus far is the one between open-source and ‘closed’, or proprietary, LLM AI models, despite the blurring lines between them.
For now, LLM AI leaders OpenAI and Anthropic remain on the ‘closed’ side. Their partners Microsoft, Amazon, Google and others also play primarily on the ‘closed’ side, despite having some large and small models in the ‘open’ bucket.
Amongst the Big Tech ‘Magnificent 7’ LLM AI developers, only Meta has claimed the ‘Open-Source’ LLM AI crown, with its Llama 3 series of models getting the most accolades on this front. Founder/CEO Mark Zuckerberg has made the most of this distinction, and that reputation is an important driver of the company’s ability to attract top AI talent for its AI-related efforts.
That alone matters, given how important talent is as an input to AI innovation, along with AI GPU chips, data center infrastructure, Power, and other relatively scarce Inputs.
But as The Information makes clear in “What Counts as Open-Source [AI]?”, the lines are blurring between open and closed:
“When Meta Platforms’ chief artificial intelligence scientist Yann LeCun last month posted on LinkedIn about Meta’s strategy of releasing its large language models for free, some commenters praised its approach, saying it was “reshaping industry collaborations.” Others disagreed with LeCun’s description of the strategy as “open-source.”
“They should absolutely get credit for open model NOT open source—calling something open source without being open source is really misrepresenting the open source movement,” one commenter wrote. “It's a shame open-source in this context is just marketing spin for data laundering,” another said.”
“A third suggested that Meta refer to LLMs such as its Llama model as “open-weight” instead of “open-source,” because the company shares model weights, or the settings that determine how a model responds to queries, but not information like training data.”
“This may seem like an argument over semantics. But whether some models—including those from Meta, France’s Mistral and Germany’s Aleph Alpha—are truly open-source has become a recurring debate among people who work on AI, particularly academic researchers.”
“The answer could have policy consequences, too. The European Union’s AI Act exempts open-source models from some of the law’s requirements. If the EU deems a model open-source, it could limit how much information the model maker has to share about how it developed the model.”
“This debate stems from the lack of a settled definition for open-source AI. The Open Source Initiative, which sets standards for open-source software, is currently working on a definition for open-source AI. Its definition of open-source software has several criteria, including that the software must allow free redistribution and include source code.”
Some debate Meta’s claim for the open-source LLM AI crown:
“But some open-source proponents say the traditional definition of open-source software doesn’t translate well to AI, and that many self-proclaimed open-source developers aren’t truly open-source because they have restrictive licenses or share little about how their models are trained and fine-tuned. Meta’s Llama 3, for example, requires companies with more than 700 million monthly active users to request licenses that could have fewer rights than the general agreement for Llama 3.”
“OSI hasn’t approved a license for Meta, which would indicate that the organization believes that a developer had followed its standards for open-source software.”
The other reason this distinction may be moot is that the training and inference of dozens of LLM AI models, open and closed, large and small, increasingly get blended in the AI results that millions of users ultimately see in their AI applications and services.
This line between open and closed increasingly becomes a mere technical point of debate, especially when Data gets added to the mix. That is even more the case as an increasing number of open and closed models are used together to generate ‘on-the-fly’ ‘Synthetic Data’ for multimodal AI models and their prompt ‘outputs’, both via direct user inputs and APIs.
The blurring is all the more pronounced given the ‘reinforcement learning loops’, in all their TeraFlop matrix math glory, that distinguish this AI Tech Wave from all prior tech waves, as previously discussed.
But for now, the debate between Open and Closed continues unabated, for a range of reasons, from technical purity to regulatory impunity.
LLM AI companies will use this distinction to talk their book, regardless of which side of the open-closed debate they’re on. For now. Despite blurring lines. Stay tuned.
(NOTE: The discussions here are for information purposes only, and not meant as investment advice at any time. Thanks for joining us here)