I have previously outlined how open source AI efforts continue to drive significant innovation and commercial development by researchers, developers, and businesses worldwide.
This is especially true after Meta’s release of its Llama open source AI models back in March, on top of other open source software tools from Meta over the years, like React and PyTorch. Meta’s Llama 2 release a few days ago is a meaningful update and an industry catalyst on the open source front.
Some numbers shared by industry observer Indrajeet on Twitter colorfully flesh out the details, highlighting how Meta is leading the open source model pack:
The chart shows the sheer number of open source models on the ‘GitHub for AI’ software distribution hub site Hugging Face, by major open source and AI developers Meta, Google, Microsoft, Salesforce, and Nvidia.
Hugging Face Founder/CEO Clement Delangue then summarizes it, with links to the core companies on Hugging Face that provide the base models and additional resources:
“Numbers of public models on @huggingface:
- @Meta: 689 including MusicGen, Galactica, Wav2Vec, RoBERTa,... - huggingface.co/facebook
- @Google: 591 including BERT, Flan, T5, mobilnet,... - huggingface.co/google
- @Microsoft : 252 including DialoGPT, BioGPT, layoutLM, uniML, Deberta,... -huggingface.co/microsoft
- @Salesforce: 88 including CodeGen, Blip,... - huggingface.co/Salesforce
- @nvidia: 86 including Megatron, Segformer,... - huggingface.co/nvidia
Inspiring to see all the contributions of big technology companies to open-source AI. Let's go!”
Another industry observer, Daniel Bender, adds OpenAI to the mix:
“For comparison, I add the company which pushed generative AI forward the most.
@OpenAI: 30 models including Whisper, Clip El, Shap-E.”
Note these open source models are also available on other cloud data center hubs like Amazon AWS, Microsoft Azure, and many others, so the Hugging Face numbers are but a subset of what’s getting distributed out there.
But the numbers above provide relative and directional metrics on the thousands of open source AI models and software tools that the industry is actively experimenting with and building around.
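To make "relative and directional" a bit more concrete, here is a minimal Python sketch that turns the counts quoted in the tweets above into shares of the combined total. The counts are a point-in-time snapshot copied from those tweets, not live data, and the names `MODEL_COUNTS`, `relative_shares`, and `ratio_to` are purely illustrative.

```python
# Public model counts on Hugging Face, copied from the tweets quoted above.
# These are a snapshot; the live numbers will have changed since.
MODEL_COUNTS = {
    "Meta": 689,
    "Google": 591,
    "Microsoft": 252,
    "Salesforce": 88,
    "Nvidia": 86,
    "OpenAI": 30,
}


def relative_shares(counts):
    """Return each organization's share of the combined total, in percent."""
    total = sum(counts.values())
    return {org: 100 * n / total for org, n in counts.items()}


def ratio_to(counts, org, baseline):
    """Return how many times more models `org` has than `baseline`."""
    return counts[org] / counts[baseline]


# Print the organizations from largest to smallest share.
shares = relative_shares(MODEL_COUNTS)
for org, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{org:<11}{MODEL_COUNTS[org]:>5} models  {share:5.1f}% of the six listed")

print(f"Meta vs. OpenAI: about {ratio_to(MODEL_COUNTS, 'Meta', 'OpenAI'):.0f}x")
```

Keep in mind that these shares cover only the six organizations named in the tweets, and only what they publish on Hugging Face.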
I’ve also previously emphasized how Meta has been a meaningful catalyst, most recently with its release of the Llama 2 open source foundation LLM AI models in various sizes:
“LLaMA 2 will be available with weights and commercial licenses, in three sizes with 7, 13 and 70 billion parameters. That compares to a 175 billion parameters for OpenAI’s GPT 3 LLM AI model, and reported 1.8 trillion parameters for GPT 4. So in that context, these Foundation models vary across the size spectrum. But even in these sizes, they have powerful research and commercial uses.”
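For a rough sense of that size spectrum, the sketch below compares the parameter counts from the quote. It also estimates the memory needed just to hold each model's weights, under my own simplifying assumption of 16-bit weights (2 bytes per parameter); real deployments also need memory for activations and caches, and may use other precisions. The GPT-4 figure is the reported number from the quote, not a confirmed one, and the function names are illustrative.

```python
# Parameter counts in billions, taken from the quote above.
# The GPT-4 figure is a reported number, not an official one.
PARAMS_BILLIONS = {
    "Llama 2 7B": 7,
    "Llama 2 13B": 13,
    "Llama 2 70B": 70,
    "GPT-3": 175,
    "GPT-4 (reported)": 1800,
}


def times_smaller(model, reference):
    """Return how many times fewer parameters `model` has than `reference`."""
    return PARAMS_BILLIONS[reference] / PARAMS_BILLIONS[model]


def weight_memory_gb(params_billions, bytes_per_param=2):
    """Estimate memory for the weights alone, in decimal gigabytes.

    Assumes 2 bytes per parameter (16-bit weights) by default. This is a
    simplification: it ignores activations, caches, and runtime overhead.
    """
    return params_billions * bytes_per_param


for name, size in PARAMS_BILLIONS.items():
    print(f"{name:<18}{size:>6}B params  ~{weight_memory_gb(size):>5} GB of weights at 16-bit")

print(f"Llama 2 70B is {times_smaller('Llama 2 70B', 'GPT-3'):.1f}x smaller than GPT-3")
```

Under these assumptions the smallest Llama 2 model needs on the order of 14 GB for its weights, which is part of why the smaller sizes are attractive for experimentation.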
The reaction from developers and companies since Meta’s release of Llama 2 has been positive, and it should drive much further activity and innovation in the months to come.
As I’ve highlighted in follow-up posts, there is a lot to look forward to in the coming months and years, on both the open source and closed LLM AI model innovation fronts:
“AI is still being invented and reinvented: Core AI innovation at the deep technical level is at the ‘beginning of the beginning’ phase. The best technical innovations are still emerging every week out of academic research labs, in the form of AI papers, along with open source efforts by developers worldwide, then being flooded with venture dollars to create a Cambrian explosion of new ‘AI Native’ startups.”
“Assumptions on AI capabilities and ‘Compute Costs’ in particular, are likely to change meaningfully just this year alone. Expect this to continue into next year and accelerate.”
The numbers of open source models from Meta and others above illustrate the increasing amount of activity and experimentation in these very early days, even in the dog days of summer. Stay tuned.