AI Startup Anthropic Says New Models Cut Hallucination Risks
The startup says new versions of Claude will be twice as likely to answer a question correctly.
March 6, 2024
(Bloomberg) — Artificial intelligence startup Anthropic, one of the sector's most closely watched companies, is rolling out new software for its chatbot Claude that will be better at carrying out complicated instructions and less prone to making things up.
On Monday, San Francisco-based Anthropic introduced three new AI models: Claude 3 Opus, Sonnet and Haiku. The literary names hint at the capabilities of each model, with Opus being the most powerful and Haiku the lightest and quickest. Opus and Sonnet are available to developers now, while Haiku will arrive in the coming weeks, the company said.
Chatbots capable of mimicking human conversation have become an increasing focus of Silicon Valley companies — with fast tech advances fueling an investing frenzy. Although chatbots themselves are by no means new, the technology powering Claude and competitors' bots is a more powerful tool known as a large language model, which is trained on massive swaths of the internet in order to generate text, such as an answer to a question or a poem. Such tools are an application of generative AI, systems that consider input such as a text prompt and use it to output new content.
But the technology has issues. For example, the chatbots are prone to saying things that aren't true, an issue sometimes referred to as hallucinations. "These models are still just trained to predict the next word — it's very, very hard to get to zero percent hallucination rate," Anthropic President Daniela Amodei said.
In its latest launch, the company has tried to address the problem, a priority for Anthropic customers, Amodei said. The company said the new versions of Claude software are twice as likely to offer correct answers to questions and less likely to make things up.
Anthropic was formed in 2021 by former employees of OpenAI, including Daniela Amodei and her brother Dario, who serves as its chief executive officer. The company has since become one of OpenAI's most formidable competitors, raising billions in venture capital funding. Most of its customers are businesses, ranging from search engine DuckDuckGo to travel guide publisher Lonely Planet.
Anthropic has emphasized developing AI safely and responsibly, which has at times limited its models' performance. For example, older versions of Claude often refused to respond to harmless queries, the company said, because they appeared to the software to be problematic. The new models introduced Monday do this much less often, the company said.
"The science of controlling the training of AI systems remains imperfect — it's getting better every day but it remains imperfect," Dario Amodei said.
In an effort to help users feel that they can trust the results they get from Claude, the latest versions of the software will soon start citing sentences in reference materials to back up the responses they generate, the company said. In an interview with Bloomberg Television Monday, Daniela Amodei said that in addition to making Claude more capable broadly, the new models' increased accuracy would help Anthropic better serve its business clients.
The new models will be able to analyze images, a feature Bloomberg previously reported the company had been working on. That technology enables the programs to perform tasks like identifying the breed of a dog in a photo, comparing two pictures of T-shirts or describing a piece of art — something that Alphabet Inc.'s Google Gemini and OpenAI's ChatGPT already offer.
The company is opting against adding the ability to generate images, as OpenAI and Google's chatbots can. Dario Amodei said Anthropic customers aren't clamoring for such a feature.
In a move that the CEO admitted is "very Silicon Valley-centric," the company tested the new models' ability to recall specific bits of text placed in lengthy documents, using a set of essays written by Paul Graham, who co-founded startup accelerator Y Combinator. Other startups have said they use the same testing practice.
The company's middle-tier model, Sonnet, is now powering the publicly available version of Claude online. People who pay for a Claude Pro subscription can use the most powerful version, Opus.