Enhancing the Dependability of Generative AI with Wolfram Research

The excitement surrounding generative AI and the potential of large language models (LLMs), led by OpenAI’s ChatGPT, once seemed unstoppable. It was everywhere you looked. In fact, more than 25% of the funding for US startups this year went to AI-related companies, according to Crunchbase. OpenAI also revealed that ChatGPT is one of the fastest-growing services of all time.

However, there is still something missing. Or rather, something problematic continues to be added.

One of the major issues with LLMs is their tendency to hallucinate, or make things up. Estimates vary, but studies have put hallucination rates anywhere from 15% to 27%. This wouldn’t be so bad if LLMs didn’t assert those hallucinations as confidently as facts. Jon McLoone, Director of Technical Communication and Strategy at Wolfram Research, compares it to a “loudmouth know-it-all you meet in the pub” who says anything to seem clever, whether it’s true or not.

The truth is that hallucination is baked into how LLMs work. These models aren’t designed to know facts; they’re designed to be fluent, saying things that sound plausible even when they aren’t true.

To address this issue, Wolfram Research has developed a ChatGPT plugin that injects objectivity into the process. The plugin gives ChatGPT access to powerful computation, accurate math, curated knowledge, real-time data, and visualization. It also enables the synthesis of code.
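
The plugin itself is wired up through OpenAI’s plugin mechanism, but to get a feel for what grounding a question in Wolfram’s curated knowledge looks like, here is a minimal sketch against Wolfram|Alpha’s public Short Answers API (the `WOLFRAM_APPID` environment variable and the example query are illustrative assumptions):

```python
import os
import urllib.parse
import urllib.request

# Minimal sketch: answer a factual question with Wolfram|Alpha's public
# Short Answers API instead of letting an LLM guess. Assumes an AppID
# from developer.wolframalpha.com is set in WOLFRAM_APPID (illustrative).
def wolfram_short_answer(query: str) -> str:
    params = urllib.parse.urlencode(
        {"appid": os.environ["WOLFRAM_APPID"], "i": query}
    )
    url = f"https://api.wolframalpha.com/v1/result?{params}"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")

print(wolfram_short_answer("distance from Earth to Mars today"))
```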

Wolfram’s approach is different from scraping the web for information. They have human curators who give data meaning and structure, and they use computation to synthesize new knowledge. The plugin teaches the LLM to recognize the kinds of things that Wolfram|Alpha, their knowledge engine, knows.

While Wolfram sits on the symbolic side of AI, which suits logical reasoning use cases, OpenAI and ChatGPT focus on statistical AI, which suits pattern recognition and object classification. Despite their different approaches, both companies share a common goal of using computation to automate knowledge.

As OpenAI was building its plugin architecture, Wolfram was asked to be one of the first providers. Having analyzed where LLMs are strong and where they are weak, the team found that the combination of ChatGPT’s language skills and Wolfram’s computational mathematics works well, as long as the LLM is given strong instructions.
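
Neither company has published the plugin’s actual prompt, so the following is only a hypothetical sketch of that instruction-plus-delegation pattern; `call_llm` and `wolfram` are stand-in callables, not real library functions:

```python
# Hypothetical sketch of the "strong instructions" pattern: tell the
# model to delegate anything computational rather than answer from memory.
SYSTEM_PROMPT = (
    "You are a helpful assistant. For anything involving math, dates, "
    "units, or real-time data, do NOT answer from memory. Instead reply "
    "with exactly: WOLFRAM(<query>) and wait for the result."
)

def answer(question: str, call_llm, wolfram) -> str:
    reply = call_llm(SYSTEM_PROMPT, question)
    if reply.startswith("WOLFRAM(") and reply.endswith(")"):
        # Run the computation, then feed the computed fact back so the
        # final answer is grounded in it rather than in model memory.
        fact = wolfram(reply[len("WOLFRAM("):-1])
        return call_llm(SYSTEM_PROMPT, f"{question}\nTool result: {fact}")
    return reply

# Toy usage with stubbed dependencies:
fake_llm = lambda sys, msg: ("WOLFRAM(2^128)" if "Tool result" not in msg
                             else "2^128 is 340282366920938463463374607431768211456.")
print(answer("What is 2^128?", fake_llm, lambda q: str(2 ** 128)))
```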

Wolfram’s plugin has various use cases, including performing data science on unstructured GP medical records. It can correct transcriptions, find correlations within the data, and handle unstructured information effectively.
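
Wolfram hasn’t published that pipeline, but the general shape of the task, pulling structure out of free text and then computing over it, can be sketched with invented toy records (the fields and values below are made up for illustration):

```python
import re
import pandas as pd

# Toy "unstructured GP notes" invented purely for illustration; real
# records (and Wolfram's actual pipeline) are far messier.
notes = [
    "Age 54, BP 148/92, BMI 31.2",
    "Age 37, BP 122/78, BMI 24.0",
    "Age 61, BP 155/95, BMI 33.8",
]

rows = []
for note in notes:
    m = re.search(r"Age (\d+), BP (\d+)/\d+, BMI ([\d.]+)", note)
    rows.append({"age": int(m.group(1)),
                 "systolic": int(m.group(2)),
                 "bmi": float(m.group(3))})

# Once the free text is structured, correlations are a computation,
# not a guess.
df = pd.DataFrame(rows)
print(df.corr())
```

Once the fields are extracted, every downstream statistic comes from computation rather than from the model’s recall, which is the whole point of the approach.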

Looking ahead, McLoone believes there will be incremental improvements in LLMs, better training practices, and potentially faster performance with hardware acceleration. However, he doesn’t expect a sea-change on the scale of the past year due to high compute costs and potential copyright limitations on training sets.

The reliability problem for LLMs remains a challenge, especially for computational tasks: computation is still the best way to synthesize new knowledge and to work with data. But when computation generates that knowledge and injects it into the LLM, McLoone says he has never seen the model ignore the facts.

In conclusion, Wolfram Research is injecting reliability into generative AI by combining ChatGPT’s language capabilities with Wolfram’s computational mathematics. This collaboration addresses the issue of hallucinations and provides a more objective approach to generating responses. While there are still challenges to overcome, the combination of these two technologies shows promise for the future of generative AI.
