A large language model (LLM) is an artificial intelligence system designed to understand, process and generate human-like text. It does not process text directly, though. Instead, text is broken into numerical tokens, each of which can represent part of a word, a whole word or multiple words.
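
To make the idea of tokens concrete, here is a minimal sketch of turning text into numbers. It is not how any particular LLM tokenizes: the vocabulary, the `tokenize` function and the `UNKNOWN_ID` value are all made up for illustration, and real tokenizers learn vocabularies with tens of thousands of entries from data. The sketch simply takes the longest matching vocabulary entry at each position.

```python
# Hypothetical toy vocabulary, for illustration only.
# Entries cover part of a word, a whole word, and multiple words.
VOCAB = {
    "un": 0, "believ": 1, "able": 2, " ": 3,
    "New York": 4, "cat": 5, "s": 6, "the": 7,
}
UNKNOWN_ID = -1  # assumed placeholder for text the vocabulary cannot cover


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match: at each position, take the longest vocab entry."""
    ids = []
    pos = 0
    max_len = max(len(piece) for piece in vocab)
    while pos < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for length in range(min(max_len, len(text) - pos), 0, -1):
            piece = text[pos:pos + length]
            if piece in vocab:
                ids.append(vocab[piece])
                pos += length
                break
        else:
            # No entry matched: emit the unknown ID and skip one character.
            ids.append(UNKNOWN_ID)
            pos += 1
    return ids


print(tokenize("unbelievable"))  # [0, 1, 2]    three parts of one word
print(tokenize("the cats"))      # [7, 3, 5, 6]
print(tokenize("New York"))      # [4]          one token spanning two words
```

The model only ever sees the lists of numbers, never the original characters.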

LLMs are typically trained on massive text datasets containing up to trillions of tokens. From these large corpora, the LLM learns complex patterns and relationships within the language of the dataset.
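
As a rough intuition for what "learning patterns from text" means, the sketch below counts which word tends to follow which in a tiny made-up corpus and uses those counts to guess the next word. This is a simple bigram counter, a much older and far simpler technique than the neural networks real LLMs use; the corpus and the function names `train_bigram` and `predict_next` are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token is followed by each other token."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if never seen."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]


# Tiny illustrative corpus; a real training set holds up to trillions of tokens.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)

print(predict_next(model, "the"))  # cat  ("the cat" appears twice, "the mat" once)
print(predict_next(model, "dog"))  # None (never seen in the corpus)
```

An LLM captures far richer relationships than adjacent-word counts, but the principle is similar: its predictions come from patterns in the training text.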

The primary function of an LLM is to process and generate text, focusing on tasks like:

  • text generation - writing stories, poems, articles and more
  • translation - converting text from one language to another
  • summarization - condensing larger texts into shorter summaries
  • answering questions - providing answers to questions based on the given context
  • dialogue systems - engaging in human-like conversations

Weaknesses

One common mistake when using LLM systems is to treat them like a Google search. While an LLM has learned many facts during its training, it will readily make up a plausible-sounding answer to any question you give it. Math problems are also still quite challenging for LLMs.

Don’t treat an LLM like a database of knowledge, at least not one you fully trust.

Strengths

LLMs are text processors. They are great at manipulating text, through tasks like summarization. They can also be helpful during brainstorming sessions, as something to bounce ideas off.

Examples

Some examples of commercially available LLMs are:

  • GPT by OpenAI
  • Gemini by Google
  • Claude by Anthropic
  • Grok by xAI

Some examples of open-source LLMs are:

  • Gemma by Google
  • Llama by Meta
  • Mixtral by Mistral AI
  • Phi-3 by Microsoft