
What comes after LLMs? The next wave in generative AI
Large language models ushered in a new era of AI, but they're not without their fair share of limitations. What technology comes next?
Without large language models, generative AI as we know it wouldn't exist.
But LLMs are also subject to some significant limitations -- including a propensity to hallucinate, heavy demand for compute resources and an inability to reason.
Because these problems pose major challenges for certain generative AI use cases, it's reasonable to expect that other types of models will enter the generative AI ecosystem sooner or later. LLMs aren't likely to disappear entirely within the foreseeable future, but they might be complemented by other forms of AI that excel in areas where LLMs fall short.
Research on LLM alternatives remains limited, so it's impossible to accurately predict the next big thing in generative AI after LLMs. But it's not too early to make some educated guesses -- looking to technology such as logical reasoning systems, real-time learning models, liquid learning networks and small language models.
What is an LLM, exactly?
Broadly speaking, the term LLM refers to any machine learning model trained on large amounts of textual data -- hence the "large" descriptor.
However, when some people talk about LLMs, they're referring specifically to models that use the transformer architecture. That architecture's hallmark is the ability to break text into tokens and process many tokens in parallel, using a technique called an attention mechanism to estimate the relative importance of the various words and other tokens in a passage of text. This parallelism sets transformers apart from model types that process data strictly sequentially.
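To make the attention idea concrete, here is a minimal, illustrative sketch of scaled dot-product attention -- the core operation behind transformer attention mechanisms -- written in plain Python with NumPy. The token embeddings and dimensions are invented for illustration and are not drawn from any particular model.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute attention weights over all tokens at once, then mix their values."""
    d_k = K.shape[-1]
    # Compare every query token against every key token in one matrix multiply.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns scores into weights estimating each token's relative importance.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted blend of all value vectors -- no sequential loop needed.
    return weights @ V

# Toy example: 4 tokens, each represented by an 8-dimensional embedding (made-up numbers).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
output = scaled_dot_product_attention(tokens, tokens, tokens)
print(output.shape)  # (4, 8): one updated representation per token
```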
All major LLMs that have made big splashes in recent years -- such as OpenAI's GPT models, Anthropic's Claude models and Meta's Llama models -- use the transformer architecture. But there are other types of LLMs, such as Mamba, that use different approaches.
The category can also be ambiguous because there is no formal definition of how large a model's training data set must be for the model to qualify as an LLM. The big-name LLMs train on massive volumes of data, likely including most information freely available on the internet.
If developers were to train a model on a smaller subset of data, it's debatable whether it would still count as an LLM or would be better labeled as a small language model (SLM). In this sense, the term LLM is similar to the term big data -- there are no strict rules governing how "large" or "big" the relevant data sets need to be.
The imperfections of LLMs
LLMs can do impressive things, like interpret natural-language prompts and generate novel content in response. LLMs can also be combined with image and video generators to create multimodal content.
Yet LLMs are also subject to some notable challenges and drawbacks:
- Hallucination risks. LLMs hallucinate, meaning they generate responses that include false information, largely because they produce statistically plausible text rather than verified facts. Certain types of LLMs are more prone to hallucinations than others; some non-transformer LLMs, like EMMA, report low hallucination rates. Still, it does not appear possible to create a hallucination-proof LLM.
- Demand for compute resources. Some types of LLMs are more computationally efficient than others -- transformer-based models are among the least efficient -- but all require vast compute resources during training and inference.
- Lack of memory. LLMs "remember" their training data because it is baked into their parameters, but they can't retain information they encounter during inference or recall a previous prompt when processing a new one. Some generative AI services, like ChatGPT, can consider a user's previous prompts when processing new ones, but they appear to do this using capabilities external to the underlying models (see the sketch after this list).
- No support for continuous learning. LLMs can't learn new information continuously. They can only generate content based on whatever data they were trained on. To update an LLM's knowledge base, you need to train it on additional data, and there is currently no reliable way to train a model in real time so that its knowledge remains continuously up to date.
- Inability to reason. LLMs can't apply logic to situations that fall outside their training data. They can only compare input to patterns within that data and use those statistical relationships to generate output.
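To illustrate the memory point above, here is a hedged sketch of how a chat service might simulate memory outside the model: the application stores earlier exchanges and resends them with each new prompt. The `call_llm` function is a hypothetical stand-in for whatever stateless model API a service actually uses.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a stateless LLM API call; the model itself retains nothing."""
    return f"[model response to: {prompt[:40]}...]"

class ChatSession:
    """Simulated 'memory': the application, not the model, keeps the conversation history."""

    def __init__(self):
        self.history = []  # list of (user, assistant) turns stored outside the model

    def send(self, user_message: str) -> str:
        # Rebuild the full context every turn, because the model can't recall prior prompts.
        context = "\n".join(f"User: {u}\nAssistant: {a}" for u, a in self.history)
        prompt = f"{context}\nUser: {user_message}\nAssistant:"
        reply = call_llm(prompt)
        self.history.append((user_message, reply))
        return reply

session = ChatSession()
session.send("My name is Dana.")
print(session.send("What's my name?"))  # works only because the history was resent, not remembered
```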
These challenges can create roadblocks for certain use cases. For instance, hallucinations make generative AI unreliable in contexts where accuracy is absolutely critical -- like the law, as one attorney famously learned the hard way when he used ChatGPT to assist with legal research.
Likewise, any use case that requires information to be completely up to date -- for example, searching recently published news articles -- is challenging to support with an LLM because the LLM's knowledge base will never be fully in sync with data that changes in real time. And the computationally intense nature of LLMs makes them expensive to build and operate, which could potentially limit the long-term growth of LLM technology.
Thinking beyond LLMs
No single LLM alternative can solve all the issues facing LLMs and guarantee a better UX. However, various AI design techniques and approaches can mitigate some of their drawbacks.
Logical reasoning systems
Although LLMs can't reason, other types of AI systems can process data based on logic. In a basic sense, logical reasoning is one of the oldest forms of AI; it's the approach used to program computers to play checkers based on predefined procedures as far back as the 1950s. AI programming languages designed for logical reasoning, like Prolog, have also been around for decades.
The drawback of logical reasoning as an AI tool is that it requires developers to define logic rules, and it's impossible to anticipate every type of scenario that might require an AI tool to perform reasoning. Logical reasoning will, therefore, probably never be sufficient on its own to power generative AI tools. However, logical systems could be combined with LLMs to mitigate some of the latter's limitations. For example, logic rules could evaluate LLM output and identify content that appears to be hallucinated.
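As a rough illustration of that hybrid idea, the sketch below applies simple hand-written logic rules to claims extracted from LLM output. The rules, the knowledge base and the example claims are all hypothetical; a real system would need far richer rules and a structured, trusted data source.

```python
# Hypothetical knowledge base of trusted facts (country -> capital).
KNOWN_CAPITALS = {"France": "Paris", "Australia": "Canberra"}

def check_capital_claim(country: str, claimed_capital: str) -> str:
    """Apply a simple logic rule: flag claims that contradict a trusted fact."""
    expected = KNOWN_CAPITALS.get(country)
    if expected is None:
        return "unverified"  # no fact covers this claim, so the rule can't decide
    return "consistent" if expected == claimed_capital else "possible hallucination"

# Claims extracted from (hypothetical) LLM output.
claims = [("France", "Paris"), ("Australia", "Sydney"), ("Peru", "Lima")]

for country, capital in claims:
    print(f"{capital} is the capital of {country} -> {check_capital_claim(country, capital)}")
```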
Real-time learning models
While LLMs have been making headlines in recent years, some researchers have been working on other types of models capable of "learning" new data continuously. An example is AIGO, whose developers designed the model using an approach they call an integrated neurosymbolic architecture (INSA). Publicly available technical details on exactly how the model works are limited, but it's not an LLM, and it's evidently capable of continuously adding to its knowledge base.
If LLM alternatives like AIGO catch on, they could open the door to a host of novel generative AI use cases that require models to operate based on fully up-to-date information.
Liquid learning networks
Liquid learning networks (LLNs) are another alternative to LLMs that promise the ability to learn new information on a continuous basis. Unlike LLMs, LLNs can modify their parameters in real time based on incoming data.
Historically, LLNs have been used mostly for processing time-series data, not for interpreting open-ended natural-language queries and generating novel content. But they could potentially be adapted for generative AI use cases as well.
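For readers curious what adapting to incoming data might look like, below is a very rough, hedged sketch of a single liquid time-constant-style neuron update, based on published descriptions of liquid networks rather than any specific implementation. The constants and the input stream are invented for illustration; the key point is that the neuron's effective time constant shifts with each new input, so its dynamics change as data streams in.

```python
import math

def ltc_step(x, inp, dt=0.05, tau=1.0, A=1.0, w=0.8, b=0.0):
    """One Euler step of a liquid time-constant-style neuron (illustrative only).

    The state's decay rate depends on the current input, so the neuron's
    dynamics adapt as new data arrives -- the 'liquid' part of the idea.
    """
    f = math.tanh(w * inp + b)            # input-dependent nonlinearity
    dx = -(1.0 / tau + f) * x + f * A     # state decays at an input-dependent rate
    return x + dt * dx

# Feed a toy time series (made-up values) through the neuron, one step at a time.
state = 0.0
for t, value in enumerate([0.1, 0.4, 0.9, 0.2, -0.5, -0.1]):
    state = ltc_step(state, value)
    print(f"step {t}: input={value:+.2f} state={state:+.4f}")
```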
Small language models
As noted above, an SLM is like an LLM but is trained on a smaller set of data.
In general, bigger is better when it comes to training data, especially if you want to create a model that can support a wide range of use cases -- as most LLMs do. But models trained on narrower data sets require less computing power and tend to be less likely to hallucinate, since hallucinations can arise when a model's training data is so vast that it struggles to identify the connections between the input and the most relevant training data.
Because of these advantages, it's possible that we'll see an increase in the use of SLMs as LLM replacements, particularly for use cases that are narrow in scope -- like responding to user queries for a specific type of business -- and require a higher degree of accuracy than LLMs can provide.
Chris Tozzi is a freelance writer, research adviser, and professor of IT and society who has previously worked as a journalist and Linux systems administrator.