ChatGPT named least reliable work chatbot in new AI reliability report

2025/12/11 02:38

ChatGPT may dominate the AI chatbot market, but a new report suggests popularity does not equal trustworthiness. A December 2025 study examining how leading AI chatbots perform in everyday work scenarios has ranked ChatGPT as the least reliable option for professional tasks. The findings raise fresh concerns for businesses that increasingly depend on AI tools for daily operations.

The study, conducted by Relum, didn’t just look at specs on paper; it stress-tested ten major AI chatbots in real-world professional scenarios. The results? A massive disconnect between hype and reality.

The study assessed each chatbot across four key criteria. These were hallucination rate, customer product ratings, response consistency across tasks, and downtime frequency. Each factor contributed to a composite reliability risk score, with higher scores indicating greater potential workplace issues.
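The report does not publish how Relum combined the four factors, so the snippet below is only a rough sketch of how a composite risk score of this kind could be computed. The equal weights, the normalisation of each factor to a 0-to-1 risk, and the names `ChatbotMetrics` and `risk_score` are all assumptions made for illustration; only the four criteria and the 99-point ceiling come from the article.

```python
from dataclasses import dataclass


@dataclass
class ChatbotMetrics:
    """Hypothetical container for the four criteria named in the study."""
    hallucination_rate: float  # share of answers with fabricated content, 0..1
    user_rating: float         # customer product rating on an assumed 1..5 scale
    consistency: float         # response consistency across tasks, 0..1 (1 = fully consistent)
    downtime_rate: float       # share of time the service was unavailable, 0..1


def risk_score(m: ChatbotMetrics,
               weights=(0.25, 0.25, 0.25, 0.25),
               max_score: int = 99) -> int:
    """Combine the four factors into one score; higher means riskier.

    Equal weights and linear scaling are assumptions, not Relum's method.
    """
    # Convert every factor to a 0..1 risk where larger is worse.
    risks = (
        m.hallucination_rate,
        (5.0 - m.user_rating) / 4.0,  # a 5-star rating maps to 0 risk, 1 star to 1
        1.0 - m.consistency,
        m.downtime_rate,
    )
    weighted = sum(w * r for w, r in zip(weights, risks))
    return round(max_score * weighted / sum(weights))


# Placeholder inputs for illustration only; these are not figures from the report.
example = ChatbotMetrics(hallucination_rate=0.35, user_rating=4.0,
                         consistency=0.8, downtime_rate=0.02)
print(risk_score(example))
```

Under a scheme like this, a tool with strong user ratings can still land a high risk score if its hallucination rate or downtime is poor, which is the pattern the study reports for the market leaders.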

Here is the stat that should keep business leaders up at night: Despite controlling a massive 81% of the market and boasting high user ratings, ChatGPT recorded a hallucination rate of 35%.

In plain English, that means more than one out of every three answers it gives contains fabricated or incorrect information. If you are using it to draft a fantasy novel, that’s fine, but if you are using it for compliance reports or financial decision-making, that is a recipe for disaster. Consequently, the study slapped ChatGPT with a reliability risk score of 99 out of 99, the worst in the group.

Google didn’t fare any better. While Gemini had better uptime, it actually performed worse on pure accuracy, registering the highest hallucination rate of the entire group at 38%. It highlights a weird paradox in the current AI market: the tools we use the most are often the ones struggling the hardest to keep their facts straight.

Claude and Meta AI occupy a murky middle ground. Claude, despite being a favourite for its writing style, ranked as the second least reliable due to frequent downtime and a 17% hallucination rate. Meta AI was more accurate (15% hallucination), but users seem not to like the experience, giving it the lowest satisfaction rating of the bunch (3.4 out of 5).

The “underdogs” – Grok and DeepSeek steal the show from ChatGPT

If the big names are dropping the ball, who is actually doing the work? Surprisingly, the study points to Grok and DeepSeek as the most reliable tools for professional use. They don’t have the massive marketing budgets or brand recognition of OpenAI, but they simply worked better. DeepSeek recorded zero service outages and kept hallucinations to a minimum.

Kimi also scored well, finding a sweet spot between consistency and uptime. Meanwhile, paid options like Perplexity AI were solid but raised questions about whether the subscription cost is worth it when cheaper, lesser-known alternatives are outperforming them.

Relum’s Chief Product Officer, Razvan-Lucian Haiduc, warned that reliability should be a central factor in AI adoption decisions. He noted that around 65% of US companies now use AI chatbots in daily workflows. Nearly 45% of employees admit to sharing sensitive company information with these tools.

As AI becomes more embedded in routine work, the risks of misinformation multiply. Haiduc emphasised that the most widely used chatbot is not always the best fit for every industry. Accuracy, uptime and task-specific performance should outweigh brand familiarity.

The report serves as a reality check for the industry. Trust shouldn’t be given just because a chatbot is famous; it should be earned through consistent, verifiable truth. Right now, it looks like the market leaders have some serious catching up to do.
