The Well-Read Librarian Analogy
Imagine a librarian who has read every book in a massive library - billions of pages of text from the internet, books, and articles.
Now you ask them a question. They don't look anything up. Instead, they predict what words would most likely come next based on everything they've read.
That's how a Large Language Model (LLM) works.
An LLM is a neural network trained on vast amounts of text. It learns patterns in language - grammar, facts, reasoning styles, writing conventions - and uses those patterns to generate human-like text.
How LLMs Actually Work
At its core, an LLM is a prediction machine. Given some text, it predicts what word comes next.
The Generation Process
Input: "The capital of France is"
LLM calculates probabilities:
"Paris" → 95%
"a" → 2%
"located" → 1%
"the" → 1%
...
Output: "Paris"
Then it feeds the new token back in and predicts the next one, repeating until done.
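This feedback loop can be sketched in a few lines of Python. Everything here is a toy: `FAKE_PROBS` stands in for the model's learned probabilities, and real decoding usually samples from the distribution rather than always taking the top token:

```python
# Toy stand-in for an LLM: a lookup table of next-token probabilities.
# The table and its numbers are made up for illustration.
FAKE_PROBS = {
    "The capital of France is": {"Paris": 0.95, "a": 0.02, "located": 0.02, "the": 0.01},
    "The capital of France is Paris": {"<end>": 1.0},
}

def generate(context, max_tokens=10):
    for _ in range(max_tokens):
        probs = FAKE_PROBS.get(context, {"<end>": 1.0})
        token = max(probs, key=probs.get)  # greedy: pick the most likely token
        if token == "<end>":
            break
        context = context + " " + token  # feed the new token back in
    return context

print(generate("The capital of France is"))
# → The capital of France is Paris
```

The structure is the real thing; only the "model" is fake. Swap `FAKE_PROBS` for a neural network and greedy selection for sampling, and this is autoregressive generation.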
Tokens, Not Words
LLMs don't think in words - they think in tokens. A token might be a word, part of a word, or a single character.
"ChatGPT is amazing" → ["Chat", "G", "PT", " is", " amazing"]
# 5 tokens for 3 words
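Real tokenizers are typically trained with byte-pair encoding, but the core idea, greedily matching the longest known piece of text, can be sketched with a hand-picked toy vocabulary (`VOCAB` below is invented for this example):

```python
# Toy greedy longest-match tokenizer. Real tokenizers (BPE, WordPiece)
# learn their vocabulary from data; this one is hand-picked.
VOCAB = {"Chat", "G", "PT", " is", " amazing"}

def tokenize(text):
    tokens = []
    i = 0
    while i < len(text):
        # Find the longest vocabulary entry matching at position i.
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in VOCAB:
                tokens.append(piece)
                i += length
                break
        else:
            tokens.append(text[i])  # unknown character: fall back to one char
            i += 1
    return tokens

print(tokenize("ChatGPT is amazing"))
# → ['Chat', 'G', 'PT', ' is', ' amazing']
```

Notice how "ChatGPT" splits into three tokens because the whole word isn't in the vocabulary: this is exactly why token counts and word counts differ.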
The Scale
"Large" is an understatement:
- Parameters: often tens of billions (or more)
- Training data: huge text corpora (web pages, books, code)
- Training compute: lots of hardware for a long time
What LLMs Can Do
1. Text Generation
prompt = "Write a haiku about programming:"
# Bugs hide in the code
# The debugger seeks them out
# Console.log saves
2. Question Answering
prompt = "What causes earthquakes?"
# Earthquakes occur when tectonic plates suddenly slip
# past each other, releasing built-up energy as seismic waves.
3. Summarization
prompt = f"Summarize this article:\n{long_article}"
# The article discusses the impact of remote work on...
4. Code Generation
prompt = "Write a Python function to check if a number is prime"
# import math
#
# def is_prime(n):
#     if n < 2:
#         return False
#     for i in range(2, math.isqrt(n) + 1):
#         if n % i == 0:
#             return False
#     return True
5. Translation
prompt = "Translate to French: Hello, how are you?"
# Bonjour, comment allez-vous?
What LLMs Cannot Do
They Don't Actually "Know" Things
LLMs don't have a database of facts. They have learned statistical patterns. This means:
- They can be confidently wrong (hallucinations)
- They can't verify their own outputs
- Their knowledge often has a training cutoff date
They Can't Truly Reason
Despite appearing intelligent, LLMs are pattern matchers. They can simulate reasoning by following patterns from training data, but they don't understand in the human sense.
They Don't Remember Conversations
Each API call is stateless. The model doesn't remember previous conversations unless you include them in the prompt (context window).
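In practice, "memory" is simulated by resending the whole conversation on every call. A minimal sketch, where `call_llm` is a hypothetical stand-in for whatever API client you actually use:

```python
def call_llm(messages):
    # Stand-in for a real API call. A real client would send `messages`
    # to the model and return its generated reply.
    return f"(reply to: {messages[-1]['content']})"

history = []

def chat(user_message):
    # The model is stateless: we must resend the full history every time.
    history.append({"role": "user", "content": user_message})
    reply = call_llm(history)
    history.append({"role": "assistant", "content": reply})
    return reply

chat("Hi, my name is Sam.")
chat("What is my name?")
# The second call's message list includes the first exchange -
# that inclusion is the only reason a real model could answer.
```

The `{"role": ..., "content": ...}` shape mirrors common chat APIs, but the key point is format-independent: if it isn't in the prompt, the model has never seen it.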
They Can't Access the Internet
By default, most LLMs don't have live internet access. Some apps integrate tools (search, browsing, databases), so it depends on the system you're using.
Common Mistakes and Gotchas
Trusting Output Without Verification
LLMs can generate plausible-sounding but incorrect information. Verify facts, especially for:
- Medical advice
- Legal information
- Statistics and dates
- Code that handles edge cases
Ignoring the Context Window
Every LLM has a maximum context size. Exceed it and, depending on the system, the request fails or earlier content gets silently truncated. Context windows vary widely - some models handle a few thousand tokens, others far more - so check your specific model or app's limits.
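A common workaround is trimming the oldest messages so the conversation still fits. A rough sketch, using a crude one-token-per-word estimate (real tokenizers count differently, so treat the numbers as approximate):

```python
def estimate_tokens(text):
    # Crude approximation: ~1 token per whitespace-separated word.
    # Real tokenizers usually produce more tokens than this.
    return len(text.split())

def trim_to_window(messages, max_tokens):
    """Keep the most recent messages that fit in the token budget."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk backwards from the newest
        cost = estimate_tokens(msg)
        if total + cost > max_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))

msgs = ["first message here", "second one", "third and most recent message"]
print(trim_to_window(msgs, max_tokens=8))
# → ['second one', 'third and most recent message']
```

Dropping the oldest messages first is the simplest strategy; production systems often summarize old turns instead of discarding them outright.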
Prompt Quality Matters
Vague prompts get vague answers. Be specific:
Bad: "Write something about dogs"
Good: "Write a 100-word paragraph explaining why golden retrievers
make good family pets, focusing on their temperament"
Expecting Consistency
Ask the same question twice, get different answers. LLMs sample from probability distributions. Using a low temperature (often 0) can make outputs more consistent.
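Temperature works by rescaling the model's raw scores (logits) before sampling: low temperature sharpens the distribution toward the top token, high temperature flattens it. A minimal sketch over a made-up score table:

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Sample a token from raw scores; lower temperature = more deterministic."""
    if temperature == 0:
        # Temperature 0 is conventionally treated as greedy decoding.
        return max(logits, key=logits.get)
    # Softmax with temperature: divide scores before exponentiating.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    biggest = max(scaled.values())  # subtract max for numerical stability
    weights = {tok: math.exp(s - biggest) for tok, s in scaled.items()}
    total = sum(weights.values())
    r = random.uniform(0, total)
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok  # float rounding fallback

logits = {"Paris": 5.0, "a": 1.0, "located": 0.5}
print(sample_with_temperature(logits, temperature=0))    # always "Paris"
print(sample_with_temperature(logits, temperature=1.0))  # usually "Paris", not always
```

This is why temperature 0 gives (near-)repeatable answers while higher temperatures give variety: the same probabilities are there either way, but sampling draws from them instead of always taking the peak.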
LLMs vs Search Engines
| Aspect | Search Engine | LLM |
|---|---|---|
| How it works | Finds existing pages | Generates new text |
| Accuracy | Links to sources | May hallucinate |
| Freshness | Real-time | Training cutoff |
| Output | List of links | Direct answer |
| Often good for | Finding sources | Explaining, creating |
FAQ
Q: What is the difference between GPT and LLM?
LLM is the category (any large language model). GPT (Generative Pre-trained Transformer) is OpenAI's specific family of models. GPT-4 is an LLM. Not all LLMs are GPT.
Q: Why do LLMs hallucinate?
Because they're predicting likely next words, not retrieving facts. If the training data often pairs X with Y, the model will reproduce that pairing, even when it's factually wrong in the specific context.
Q: What is the context window?
The maximum amount of text an LLM can consider at once - both your input and the model's output. Think of it as the model's "working memory."
Q: Can LLMs learn from conversations?
Not during normal inference. The model's weights are fixed. It can adapt within the current prompt (in-context), but to permanently change behavior you need fine-tuning (or updating the underlying model/data).
Q: Why are LLMs so expensive to train?
Training requires processing huge amounts of data on lots of GPUs/TPUs for weeks or months.
Q: What is the difference between LLM and chatbot?
An LLM is the underlying model. A chatbot (like ChatGPT) is an application built on top of an LLM, adding conversation handling, safety filters, and a user interface.
Summary
LLMs are neural networks that predict text based on patterns learned from vast training data. They're remarkably capable but fundamentally different from human intelligence.
Key Points:
- LLMs generate text one token at a time, each predicted from what came before
- They learn patterns, not facts
- Hallucinations happen because they prioritize plausibility over truth
- Context window limits how much text they can consider
- Prompt engineering significantly affects output quality
- They don't truly understand or reason - they pattern match
Understanding these fundamentals helps you use LLMs effectively while avoiding their pitfalls. They're powerful tools, not magical oracles.