OpenAI debuts GPT-4 and claims it’s less prone to ‘hallucinations’

15 Mar 2023

OpenAI CEO Sam Altman at TechCrunch Disrupt SF 2017. Image: Steve Jennings/Getty Images for TechCrunch (CC by 2.0)

The new AI model is available for ChatGPT subscribers, while various companies including Microsoft and Intercom have already integrated GPT-4 into their products.

OpenAI has revealed GPT-4, its latest large language model, which it claims is its most reliable AI system to date.

The company says this new system can understand both text and image inputs and is able to “solve difficult problems with greater accuracy”. However, image input is still a developing feature and is not yet publicly available.

The model is available to ChatGPT premium subscribers, while a waitlist has been set up for developers who want to dive into the model.

OpenAI said GPT-4 has been in development behind the scenes for months, aided by feedback from users of ChatGPT, which runs on GPT-3.5.

“We spent 6 months making GPT-4 safer and more aligned,” OpenAI said. “GPT-4 is 82pc less likely to respond to requests for disallowed content and 40pc more likely to produce factual responses than GPT-3.5 on our internal evaluations.”

OpenAI has also worked with several companies to create early products that already use GPT-4. Microsoft has revealed that its AI-boosted Bing is now running on a customised version of GPT-4.

The tech giant has been one of OpenAI’s biggest supporters, investing $1bn in 2019 and supporting OpenAI’s services with an Azure supercomputer.

Other early adopters of GPT-4 include Stripe, which is using the system to improve user experiences and combat fraud. Duolingo has also launched a new GPT-4-powered learning experience that can explain answers.

Intercom has also revealed a new customer service bot, Fin, which is powered by GPT-4. The company said this bot is optimised for accuracy and designed to reduce “hallucinations”, a problem noted in GPT-3.5.

“We’ve seen significant accuracy improvements with GPT-4 that reduced the so-called hallucinations and we’re excited to now start testing Fin with customers and see how well it performs in the real world,” said Intercom co-founder Des Traynor.

Challenges remain

In a blog post, OpenAI said the distinction between GPT-4 and the previous GPT-3.5 is “subtle” at first glance.

“The difference comes out when the complexity of the task reaches a sufficient threshold – GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5,” OpenAI said.

However, the company acknowledged that the latest system still has “similar limitations” to previous models, including a risk of making reasoning errors and hallucinating facts.

OpenAI CEO Sam Altman tweeted that the model is still flawed and limited, adding that it “still seems more impressive on first use than it does after you spend more time with it”. But he also highlighted the model’s improvements over its predecessors.

“It is more creative than previous models, it hallucinates significantly less, and it is less biased,” Altman said. “It can pass a bar exam and score a five on several AP exams.

“We hope you enjoy it and we really appreciate feedback on its shortcomings.”

