5 Limitations of Generative AI & LLMs
As covered in our overview blog looking at Generative AI and Large Language Models (LLMs), programs like ChatGPT and Bard are revolutionising how different industries are approaching their work. Whether it be by streamlining operations or providing inspiration for creative tasks, these innovative tools are quickly becoming invaluable resources for professionals wishing to employ smarter approaches to work.
That being said, these programs are not without their limitations. If individuals and corporations wish to implement Generative AI & LLMs into their day-to-day operations, they must be aware of the pitfalls that come with doing so. To that end, we’ve outlined five of the most common limitations generative AI & LLM users experience, alongside practical advice on mitigating their impact. While it’s tempting to rely heavily on this new technology for a variety of different tasks, it’s paramount that users also employ a degree of scepticism. With this blog, we hope to shed some light on how users can make the most out of Generative AI & LLMs without letting over-reliance compromise the quality of their work.
1. Incorrect Responses
Looking principally at ChatGPT 3.5 (easily the most popular LLM among general users), the program boasts an extensive vocabulary that, while impressive, often conceals a lack of understanding behind complex terminology. This will be apparent to any subject matter expert who takes the time to scrutinise what is being generated. The concern is that the confident tone ChatGPT 3.5 employs is more than capable of convincing non-experts that its information is accurate when the opposite is often true. These hallucinatory responses take on a variety of different and often amusing forms, although the common denominator is misinformation that risks compromising the authority of your website and overall brand. Below is one such example of this inaccurate generation arising from a relatively simple prompt.
Although delivered very confidently, this response from ChatGPT 3.5 contains a series of errors. The longest recorded 3-point shot was a whopping 89 feet, made by Baron Davis in 2001. Some confusion has arisen from the fact that Stephen Curry set a record on November 7, 2016, although this was for the number of 3-pointers scored in a single game and had nothing to do with how far the ball travelled. Often, these mistakes are all the more damaging because they’re partially based on reality, making it hard to distinguish fact from fiction. It’s important to remember that while these programs are incredibly sophisticated, they’re not infallible and shouldn’t be trusted over the word of experienced professionals.
“These AI tools can simply make up facts and get things wrong. So, sure, use it to help you create your content, but you better edit these pieces and research the responses before posting it on your site.” – Barry Schwartz (Executive Editor – Search Engine Roundtable)
2. Outdated Information
One of the reasons behind these hallucinatory responses lies in the outdated nature of the data LLMs possess. In this case, the data set fed to ChatGPT 3.5 only dates up until September 2021. As such, it’s incapable of providing users with up-to-date information on queries that rely heavily on current trends and data such as keyword research or technical website auditing. Thankfully, ChatGPT 3.5 often draws attention to this limitation as you can see in the following example:
These programs simply don’t have access to information concerning recent search trends or the most up-to-date version of your website. Furthermore, business owners shouldn’t take the suggestions of Generative AI as gospel; it’s far better suited to providing inspiration that bolsters your own independent research than to being relied on entirely.
“LLMs have a very high wow factor, but they have no clue about your website; don’t use them for diagnosing potential issues with it.” – Gary Illyes (Google Search Analyst)
3. Biased Responses
Unfortunately, Generative AI & LLMs are often susceptible to producing biased, discriminatory, and sometimes even offensive results. This stems from the data these LLMs have been fed within their training sets. As you can imagine, not all the content found on websites like Wikipedia and Reddit is of the highest quality, in both an ethical and a linguistic sense (typos and grammatical errors abound). As a result, outputs can be offensive and/or nonsensical. Davey Alba has written an insightful article on some of the more egregious examples of the biased musings of ChatGPT that’s worth checking out, as it suggests that the safeguards currently in place for LLMs are ineffective.
4. Americanised Content
One of the tell-tale signs that Generative AI or LLMs have been used without human review is the Americanised English that ChatGPT 3.5 produces by default. The most common offender among these “Americanisms” is the saturation of the letter ‘z’ throughout the content, although there are myriad other examples: ‘personalized’ instead of ‘personalised’, ‘fiber’ rather than ‘fibre’, ‘flavor’ over ‘flavour’. You get the idea.
While not a problem for American business owners, this reads as impersonal and frankly quite careless coming from UK companies. Beyond being slightly embarrassing, it can be a damning signal to clients and managers alike that thorough reviewing has not taken place. Thankfully, it can be resolved by specifying the use of British English in the initial prompt. Just remember that while you’re dotting your ‘i’s and crossing your ‘t’s, you should also be watching out for those ‘z’s.
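For teams reviewing a lot of AI-generated copy, this kind of check is also easy to automate. Below is a minimal, illustrative sketch of a spot-check for common American spellings; the word list is a small sample of our own choosing, not an exhaustive dictionary.

```python
import re

# A small sample of US -> UK spelling pairs for illustration only;
# a real review workflow would use a much fuller word list.
AMERICANISMS = {
    "personalized": "personalised",
    "optimized": "optimised",
    "fiber": "fibre",
    "flavor": "flavour",
    "color": "colour",
}

def find_americanisms(text):
    """Return (found_word, suggested_uk_spelling) pairs found in text."""
    hits = []
    for us, uk in AMERICANISMS.items():
        # \b word boundaries stop "fiber" matching inside "fibre", etc.
        for match in re.finditer(rf"\b{us}\b", text, flags=re.IGNORECASE):
            hits.append((match.group(0), uk))
    return hits

draft = "Our personalized service offers every flavor of fiber broadband."
print(find_americanisms(draft))
```

A script like this won’t catch every Americanism (many, like ‘gotten’ or ‘sidewalk’, aren’t simple spelling swaps), so it complements a human review rather than replacing one.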
5. Formulaic Content
The final limitation of Generative AI & LLMs we’d like to address is how formulaic the content created by these tools can be unless they’re properly instructed with targeted prompting. For this example, we’ll be comparing how ChatGPT 3.5 approaches writing blogs and how the output adheres to a surprisingly rigid structure unless otherwise specified. While not glaringly obvious in any individual post, the pattern becomes clear when ChatGPT 3.5 is repeatedly employed using the same prompts. See below the side-by-side comparison between two blog posts on wildly different subjects that employ the same prompt structure (you don’t need to read them fully to see the pattern):
As you can see, the resultant output for both prompts is nearly identical structurally. Without properly specified parameters, Generative AI programs often revert to default templates for headings, subheadings and the lengths of individual sections. The result is tiring, repetitive content completely lacking in originality. If you want the content these programs produce to engage audiences, then further specification is required within the prompt itself. To help with this, there is a range of plugins currently available, such as AIPRM, which serves as a free extension to ChatGPT 3.5 and gives users access to a library of targeted prompts that generate far more compelling responses.
Get in Touch Today
For more information on how to effectively utilise Generative AI & LLMs in your business without compromising overall work quality, reach out to a member of our experienced Digital Marketing team. We’ll help deliver tailored, data-driven solutions that keep your business at the forefront of an ever-evolving digital landscape.