Meta is going all in on open-source AI. The company is today unveiling LLaMA 2, its first large language model that’s available for anyone to use, free of charge.
Since OpenAI released its hugely popular AI chatbot ChatGPT last November, tech companies have been racing to release models of their own in hopes of unseating it.
In February, when competitors Microsoft and Google announced their AI chatbots, Meta rolled out a first, smaller version of LLaMA, restricted to researchers.
In fact, the company is releasing a suite of AI models: versions of LLaMA 2 in different sizes, plus a version of the model that people can build into a chatbot, similar to ChatGPT. Unlike ChatGPT, which anyone can access through OpenAI’s website, the model must be downloaded from Meta’s launch partners Microsoft Azure, Amazon Web Services, and Hugging Face.
LLaMA 2 also has the same problems that plague all large language models: a propensity to produce falsehoods and offensive language.
Greater access to the code behind generative models is fueling innovation.
The idea, says Ahmad Al-Dahle, Meta’s vice president of generative AI, is that by releasing the model into the wild and letting developers and companies tinker with it, Meta will learn important lessons about how to make its models safer, less biased, and more efficient.
A powerful open-source model like LLaMA 2 poses a considerable threat to OpenAI, says Percy Liang, director of Stanford’s Center for Research on Foundation Models.
Liang was part of the team of researchers who developed Alpaca, an open-source competitor to GPT-3, an earlier version of OpenAI’s language model.
In its research paper, Meta admits there is still a large gap in performance between LLaMA 2 and GPT-4, which is now OpenAI’s state-of-the-art AI language model.
A more customizable and transparent model, such as LLaMA 2, might help companies create products and services faster than a big, sophisticated proprietary model can, Liang says.
Getting LLaMA 2 ready to launch required a lot of tweaking to make the model safer and less likely than its predecessor to spew toxic falsehoods, Al-Dahle says.
Meta has stumbled here before. Its language model for science, Galactica, was taken offline after only three days, and the previous LLaMA model, which was meant only for research purposes, was leaked online. The leak sparked criticism from politicians, who questioned whether Meta was taking proper account of the risks associated with AI language models, such as disinformation and harassment.
Meta’s approach to training LLaMA 2 had more steps than usual for generative AI models, says Sasha Luccioni, a researcher at AI startup Hugging Face.
The model was trained on 40% more data than its predecessor.
Meta’s commitment to openness is exciting, says Luccioni, because it allows researchers like her to properly study AI models’ biases, ethics, and efficiency.
Because LLaMA 2 is an open-source model, external researchers and developers will be able to probe it for security flaws, which will make it safer than proprietary models, Al-Dahle says.