Move Over, LLMs. Small AI Models Are the Next Big Thing
OpenAI, Google, and Microsoft are investing in more affordable alternatives to large language models.
August 8, 2024
(Bloomberg) -- For years, tech giants like Google and startups such as OpenAI have been racing to build ever bigger and costlier artificial intelligence models using a tremendous amount of online data. Deployed in chatbots like ChatGPT, this technology can handle a wide range of complex queries, from writing code and planning trips to drafting Shakespearean sonnets about ice cream.
Mark McQuade is betting on a different strategy. Arcee.AI, the startup he co-founded last year, helps companies train and deploy an increasingly popular, and much smaller, kind of AI: small language models. Rather than try to do everything ChatGPT can, Arcee’s software tackles a narrower set of day-to-day corporate tasks, such as a service that fields only tax-related questions, without requiring as much data. “I’d say 99% of business use cases, you probably don’t need to know who won an Olympic gold medal in 1968,” McQuade said.
Miami-based Arcee is one of a growing number of companies rethinking the tech industry’s conventional wisdom that bigger is always better for AI. Fueled by billions in venture capital, startups have one-upped each other to develop ever more powerful large language models to support AI chatbots and other services, with Anthropic Chief Executive Officer Dario Amodei predicting that training a model will eventually cost $100 billion, up from about $100 million today.
That thinking certainly still exists, but startups like Arcee, Sakana AI and Hugging Face are now attracting investors and customers by embracing a smaller — and more affordable — approach. Big tech companies are learning to think small, too. Alphabet Inc.’s Google, Meta Platforms Inc., OpenAI and Anthropic have all recently released software that is more compact and nimble than their flagship large language models, or LLMs.
The momentum around small models is driven by several factors: new technological improvements, growing awareness of the immense energy demands of large language models and a market opportunity to offer businesses a more diverse range of AI options for different uses. Small language models are cheaper not just for tech companies to build but also for business customers to use, lowering the bar for adoption. With investors increasingly worried about the high cost and uncertain payoff of AI ventures, more tech companies may choose to go this route.
“In general, small models make a lot of sense,” said Thomas Wolf, co-founder and chief science officer of Hugging Face, which makes AI software and hosts it for other companies. “It’s just for a long time we didn’t really know how to make them well.”
Hugging Face has honed techniques such as curating training datasets more carefully and training models more efficiently, Wolf said. In July, the startup released a trio of open-source, general-purpose small models called SmolLM, which are compact enough to run directly on smartphones and laptops. That can make running AI software faster, cheaper and more secure than connecting to a remote cloud service, as larger models require.
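How small is small? A minimal sketch, assuming the open-source transformers library and the SmolLM checkpoints Hugging Face publishes on its Hub (the checkpoint name below is the one used at release and is illustrative), shows a 135-million-parameter model generating text on an ordinary laptop CPU:

```python
# A minimal sketch: running the smallest SmolLM model locally on a CPU.
# Assumes `pip install transformers torch`; the checkpoint name is the
# one Hugging Face used at release and may change over time.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM-135M"  # smallest of the three SmolLM sizes
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)  # fits easily in laptop RAM

inputs = tokenizer("Small language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

No remote API call is involved; the weights download once and inference runs entirely on the local machine, which is what makes the on-device pitch possible.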
There is clear demand for smaller alternatives. Arcee.AI, which raised a $24 million Series A round last month, trained a small model that can answer tax questions for Thomson Reuters and built a career coach chatbot for Guild, an upskilling company. Both companies run those models through their own Amazon Web Services accounts.
Guild, which works with employees at Target and Disney, began weighing a large language model like those powering OpenAI’s ChatGPT more than a year ago as a way to provide career advice to more people than its team of human coaches could reach. While ChatGPT did an okay job, it didn’t have the feel the company was looking for, according to Matt Bishop, Guild’s head of AI.
The small language model from Arcee, which Guild is currently testing, was trained on hundreds of thousands of anonymized conversations between its human coaches and users, Bishop said, far less data than is fed to a typical LLM. The service “really embodies our brand, our tone, our ethos,” he said, and Guild’s staff preferred its responses to ChatGPT’s 93% of the time.
“You can be more narrow and focused with your model when it’s a smaller model and really zone in on the task and use case,” McQuade said, “as opposed to having a model that can do everything and anything you need to do.”
OpenAI, like other big AI companies, is also diversifying its offerings and trying to compete on all fronts. Last month, OpenAI introduced the “mini” version of its flagship GPT-4o model as a more efficient and affordable option for customers. Olivier Godement, the head of product for OpenAI’s API, said he expects developers will use GPT-4o mini for summarization, basic coding and data extraction, while the company’s larger, more expensive models continue to handle more complicated tasks.
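For developers, switching to the smaller model is largely a one-line change. A hedged sketch using OpenAI’s Python SDK (the prompt and surrounding code are illustrative, not from the article) shows a summarization request routed to the mini model:

```python
# Illustrative sketch: assumes `pip install openai` and an OPENAI_API_KEY
# environment variable. The prompt text is made up for this example.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # swap in a larger model name for harder tasks
    messages=[{
        "role": "user",
        "content": "Summarize in one sentence: small language models "
                   "are cheaper to train and run than large ones.",
    }],
)
print(response.choices[0].message.content)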
“We of course want to continue doing the frontier models, pushing the envelope here,” Godement previously told Bloomberg News. “But we also want to have the best small models out there.”
Even as the tech industry embraces small models, not everyone agrees on how to define them. McQuade said the term is “subjective,” but for him it refers to AI systems with 70 billion or fewer parameters, the internal variables a model learns during training. By this measure, Hugging Face’s SmolLM models, which range from 135 million to 1.7 billion parameters, are practically microscopic. (If those numbers still sound large, consider that Meta’s Llama AI model comes in three sizes, ranging from 8 billion to 400 billion parameters.)
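For readers wondering what a parameter count actually measures, it is the number of trainable weights in a network. A toy PyTorch sketch (an illustrative two-layer network, not any model mentioned in this story) makes the bookkeeping concrete:

```python
import torch.nn as nn

# A toy two-layer network; production LLMs stack far larger versions of these.
model = nn.Sequential(
    nn.Linear(512, 1024),  # a 1024x512 weight matrix plus 1,024 biases
    nn.ReLU(),             # activations add no parameters
    nn.Linear(1024, 512),  # a 512x1024 weight matrix plus 512 biases
)

# Every trainable weight and bias counts toward the "parameters" total.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")  # 1,050,112 here; SmolLM starts at 135 million
```

Scaling that same bookkeeping up by several orders of magnitude is what separates a 135-million-parameter SmolLM from a 400-billion-parameter Llama.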
As with so many other aspects of the fast-moving field of AI, the standards for small models are likely to keep changing. David Ha, co-founder and chief executive officer of Tokyo-based small model startup Sakana, said AI models that seemed outrageously large a few years ago now seem “modest” today.
“Size is always relative,” Ha said.