🧠 LLMs

A very well-read librarian

The Well-Read Librarian Analogy

Imagine a librarian who has read every book in a massive library - billions of pages of text from the internet, books, and articles.

Now you ask them a question. They don't look anything up. Instead, they predict what words would most likely come next based on everything they've read.

That's how a Large Language Model (LLM) works.

An LLM is a neural network trained on vast amounts of text. It learns patterns in language - grammar, facts, reasoning styles, writing conventions - and uses those patterns to generate human-like text.


How LLMs Actually Work

At its core, an LLM is a prediction machine. Given some text, it predicts what word comes next.

The Generation Process

Input: "The capital of France is"

LLM calculates probabilities:
  "Paris"      → 95%
  "a"          → 2%
  "located"    → 1%
  "the"        → 1%
  ...

Output: "Paris"

Then it feeds the new token back in and predicts the next one, repeating until done.
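This loop can be sketched with a toy probability table (the numbers and the `PROBS` lookup are made up for illustration; a real model computes a distribution over its entire vocabulary with a neural network at every step):

```python
# Toy next-token tables (made-up numbers; a real LLM computes these
# with a neural network over a vocabulary of tens of thousands of tokens).
PROBS = {
    "The capital of France is": {" Paris": 0.95, " a": 0.02, " located": 0.02, " the": 0.01},
    "The capital of France is Paris": {".": 0.90, ",": 0.10},
}

def generate(prompt: str, max_tokens: int = 5) -> str:
    """Greedy decoding: repeatedly append the most likely next token."""
    text = prompt
    for _ in range(max_tokens):
        dist = PROBS.get(text)
        if dist is None:                  # no known continuation -> stop
            break
        text += max(dist, key=dist.get)   # pick the top-probability token
    return text

print(generate("The capital of France is"))
# → The capital of France is Paris.
```

Greedy decoding always picks the top token; real systems usually sample from the distribution instead, which is why outputs vary between runs.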

Tokens, Not Words

LLMs don't think in words - they think in tokens. A token might be a word, part of a word, or a single character.

"ChatGPT is amazing" → ["Chat", "G", "PT", " is", " amazing"]
# 5 tokens for 3 words
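A greedy longest-match splitter over a tiny made-up vocabulary gives the flavor (real tokenizers use byte-pair encoding over tens of thousands of entries, but the core idea of matching known sub-strings is similar):

```python
VOCAB = ["Chat", "G", "PT", " is", " amazing"]  # tiny made-up vocabulary

def tokenize(text: str) -> list[str]:
    """Greedy longest-match tokenization over a fixed vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        # take the longest vocabulary entry that matches at position i,
        # falling back to a single character if nothing matches
        match = max(
            (t for t in VOCAB if text.startswith(t, i)),
            key=len,
            default=text[i],
        )
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("ChatGPT is amazing"))
# → ['Chat', 'G', 'PT', ' is', ' amazing']
```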

The Scale

"Large" is an understatement:

  • Parameters: often tens of billions (or more)
  • Training data: huge text corpora (web pages, books, code)
  • Training compute: lots of hardware for a long time

What LLMs Can Do

1. Text Generation

prompt = "Write a haiku about programming:"
# Bugs hide in the code
# The debugger seeks them out
# Console.log saves

2. Question Answering

prompt = "What causes earthquakes?"
# Earthquakes occur when tectonic plates suddenly slip
# past each other, releasing built-up energy as seismic waves.

3. Summarization

prompt = f"Summarize this article:\n{long_article}"
# The article discusses the impact of remote work on...

4. Code Generation

prompt = "Write a Python function to check if a number is prime"
# import math
#
# def is_prime(n):
#     if n < 2:
#         return False
#     for i in range(2, math.isqrt(n) + 1):
#         if n % i == 0:
#             return False
#     return True

5. Translation

prompt = "Translate to French: Hello, how are you?"
# Bonjour, comment allez-vous?

What LLMs Cannot Do

They Don't Actually "Know" Things

LLMs don't have a database of facts. They have learned statistical patterns. This means:

  • They can be confidently wrong (hallucinations)
  • They can't verify their own outputs
  • Their knowledge often has a training cutoff date

They Can't Truly Reason

Despite appearing intelligent, LLMs are pattern matchers. They can simulate reasoning by following patterns from training data, but they don't understand in the human sense.

They Don't Remember Conversations

Each API call is stateless. The model doesn't remember previous conversations unless you include them in the prompt (context window).
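In practice that means resending the history on every call. A minimal sketch, using the role/content message convention common to chat APIs (the exact shape is illustrative, not tied to any specific provider):

```python
def build_prompt(history: list[dict], user_message: str) -> list[dict]:
    """Every call must resend the conversation so far: the model
    itself keeps no state between calls."""
    return history + [{"role": "user", "content": user_message}]

history = [
    {"role": "user", "content": "My name is Ada."},
    {"role": "assistant", "content": "Nice to meet you, Ada!"},
]

# Without resending `history`, the model has no idea who "Ada" is.
messages = build_prompt(history, "What is my name?")
print(len(messages))  # → 3
```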

They Can't Access the Internet

By default, most LLMs don't have live internet access. Some apps integrate tools (search, browsing, databases), so it depends on the system you're using.


Common Mistakes and Gotchas

Trusting Output Without Verification

LLMs can generate plausible-sounding but incorrect information. Verify facts, especially for:

  • Medical advice
  • Legal information
  • Statistics and dates
  • Code that handles edge cases

Ignoring the Context Window

Every LLM has a maximum context size. Exceeding it means the model forgets earlier content.

Context windows vary a lot: some are a few thousand tokens, others are far larger. Check your specific model or app's limits.
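A common workaround is trimming the oldest messages until the conversation fits. A rough sketch, using the approximate rule of thumb of ~4 characters per English token (a real system would count tokens with the model's actual tokenizer):

```python
def rough_token_count(text: str) -> int:
    # Crude rule of thumb: roughly 4 characters per token in English.
    return max(1, len(text) // 4)

def trim_to_budget(messages: list[str], budget: int) -> list[str]:
    """Drop the oldest messages until the total fits the token budget."""
    kept = list(messages)
    while kept and sum(rough_token_count(m) for m in kept) > budget:
        kept.pop(0)  # oldest message goes first
    return kept

history = ["x" * 40, "y" * 40, "z" * 40]  # ~10 "tokens" each
print(trim_to_budget(history, 25))        # keeps only the two most recent
```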

Prompt Quality Matters

Vague prompts get vague answers. Be specific:

Bad:  "Write something about dogs"
Good: "Write a 100-word paragraph explaining why golden retrievers
       make good family pets, focusing on their temperament"

Expecting Consistency

Ask the same question twice and you may get different answers: LLMs sample from probability distributions. Using a low temperature (often 0) makes outputs more consistent.
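Temperature controls this trade-off: sampling weights are rescaled as p^(1/T), so a temperature near 0 concentrates on the top token while higher values flatten the distribution. A minimal sketch with made-up probabilities:

```python
import random

def sample(dist: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Temperature-scaled sampling: near 0 is almost deterministic,
    higher values produce more varied output."""
    if temperature == 0:
        return max(dist, key=dist.get)  # always the single most likely token
    # rescale: p ** (1/T) sharpens (T < 1) or flattens (T > 1) the distribution
    weights = [p ** (1 / temperature) for p in dist.values()]
    return rng.choices(list(dist), weights=weights)[0]

dist = {"Paris": 0.95, "a": 0.02, "located": 0.02, "the": 0.01}
print(sample(dist, 0, random.Random(0)))  # → Paris (every time)
```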


LLMs vs Search Engines

Aspect           Search Engine          LLM
How it works     Finds existing pages   Generates new text
Accuracy         Links to sources       May hallucinate
Freshness        Real-time              Training cutoff
Output           List of links          Direct answer
Often good for   Finding sources        Explaining, creating

FAQ

Q: What is the difference between GPT and LLM?

LLM is the category (any large language model). GPT (Generative Pre-trained Transformer) is OpenAI's specific family of models. GPT-4 is an LLM. Not all LLMs are GPT.

Q: Why do LLMs hallucinate?

Because they're predicting likely next words, not retrieving facts. If trained text often says X follows Y, the model will say it, even if it's factually wrong in this specific context.

Q: What is the context window?

The maximum amount of text an LLM can consider at once - both your input and the model's output. Think of it as the model's "working memory."

Q: Can LLMs learn from conversations?

Not during normal inference. The model's weights are fixed. It can adapt within the current prompt (in-context), but to permanently change behavior you need fine-tuning (or updating the underlying model/data).

Q: Why are LLMs so expensive to train?

Training requires processing huge amounts of data on lots of GPUs/TPUs for weeks or months.

Q: What is the difference between LLM and chatbot?

An LLM is the underlying model. A chatbot (like ChatGPT) is an application built on top of an LLM, adding conversation handling, safety filters, and a user interface.


Summary

LLMs are neural networks that predict text based on patterns learned from vast training data. They're remarkably capable but fundamentally different from human intelligence.

Key Points:

  • LLMs predict the next token, word by word
  • They learn patterns, not facts
  • Hallucinations happen because they prioritize plausibility over truth
  • Context window limits how much text they can consider
  • Prompt engineering significantly affects output quality
  • They don't truly understand or reason - they pattern match

Understanding these fundamentals helps you use LLMs effectively while avoiding their pitfalls. They're powerful tools, not magical oracles.
