The Fingerprint Analogy
A fingerprint is unique to you:
- Can identify you among millions
- Can't recreate your body from a fingerprint
- Same person typically = Same fingerprint
Hashing creates a digital fingerprint of data. For a given hash function, the same input produces the same output, but you can't feasibly reverse it.
What Is Hashing?
Input of (almost) any size → Fixed-size output (hash/digest)
"Hello" → 185f8db32271fe25...
"Hello World" → a591a6d40bf42040...
A large file → 7d865e959b2466...
(SHA-256 digests, truncated for display; each full digest is 64 hex characters.)
Fixed size for a chosen algorithm, regardless of input size.
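A minimal sketch of the fixed-size property, using Python's standard `hashlib`. The helper name `sha256_hex` and the one-megabyte stand-in for "a large file" are assumptions made for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# The last entry stands in for "a large file" (1 MB of the letter x).
inputs = [b"Hello", b"Hello World", b"x" * 1_000_000]

for data in inputs:
    digest = sha256_hex(data)
    # Every digest is 64 hex characters (256 bits), whatever the input length.
    print(f"{len(data):>8} bytes -> {digest[:16]}... ({len(digest)} hex chars)")
```

Swapping `hashlib.sha256` for another algorithm changes the digest length, but the length stays constant for that algorithm.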
Key Properties
1. DETERMINISTIC
Same input → Same hash (for the same algorithm and encoding)
2. ONE-WAY
Can't get input back from the hash in any practical way (for strong cryptographic hashes)
3. FIXED SIZE
Any input → Same output size
4. AVALANCHE EFFECT
Tiny input change → Completely different hash
One-Way: Can't Reverse It
Forward (easy):
"password123" → hash → "ef92b778bafe..."
Reverse (not practical):
"ef92b778bafe..." → ??? → Can't get "password123"
For strong cryptographic hashes, reversing is designed to be computationally infeasible.
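Information is lost during hashing: infinitely many inputs map to each fixed-size output, so the hash alone can't tell you which input produced it.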
Why This Matters
Storing passwords:
Store hash, not password
Even if the database leaks, the passwords are not directly exposed.
Attacker has: "ef92b778bafe..."
Attacker wants: "password123"
Attacker method: Guess and check every possible password
Avalanche Effect
"hello" → 5d41402abc4b2a76b9719d911017c592
"Hello" → 8b1a9953c4611296a827abf8c47804d7
"HELLO" → eb61eead90e3b899c6bcbe27ac581660
One tiny change → Completely different hash!
Similar inputs produce unrelated-looking hashes, so comparing hashes reveals nothing about how close two inputs are.
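One way to see the avalanche effect is to count how many output bits change when a single letter changes case. This is a sketch: the listing above shows MD5 digests, while the bit count here uses SHA-256, and `differing_bits` is a hypothetical helper name.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where the SHA-256 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

# MD5 of "hello", the same algorithm as the listing above.
print(hashlib.md5(b"hello").hexdigest())

# For a well-behaved hash, roughly half of the 256 bits are expected to flip.
print(differing_bits(b"hello", b"Hello"), "of 256 bits differ")
```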
Common Hash Functions
For General Use (Fast)
| Algorithm | Digest Size | Status |
|---|---|---|
| MD5 | 128 bits | ⚠️ Legacy (collisions exist) |
| SHA-1 | 160 bits | ⚠️ Legacy (collisions exist) |
| SHA-256 | 256 bits | ✅ Common choice |
| SHA-3 | 224, 256, 384, or 512 bits | ✅ Modern family |
For Passwords (Intentionally Slow)
| Algorithm | Purpose | Status |
|---|---|---|
| bcrypt | Password hashing | ✅ Standard |
| scrypt | Memory-hard | ✅ Good |
| Argon2 | Password hashing | ✅ Recommended |
| PBKDF2 | Key derivation | ✅ Acceptable |
Why Password Hashes Are Different
Fast Hashes = Bad for Passwords
SHA-256: A single modern GPU can compute billions of hashes per second
Attacker tries every password:
"a" → hash → compare
"b" → hash → compare
"aa" → hash → compare
...
"password123" → MATCH!
Fast hash = Fast cracking.
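The guess-and-check loop above can be sketched in a few lines. This assumes an unsalted SHA-256 hash and a search space of short lowercase strings; `brute_force` and the sample password "cat" are made up for the example.

```python
import hashlib
from itertools import product
from string import ascii_lowercase

def sha256_hex(text: str) -> str:
    """SHA-256 of a UTF-8 string, as hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def brute_force(target_hash: str, max_len: int = 3):
    """Try every lowercase string up to max_len; return the match or None."""
    for length in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=length):
            guess = "".join(letters)
            if sha256_hex(guess) == target_hash:
                return guess
    return None

leaked = sha256_hex("cat")    # pretend this came from a leaked database
print(brute_force(leaked))    # prints "cat"
```

The whole three-letter space is only 18,278 guesses, which a fast hash gets through almost instantly. Real attacks use word lists and GPUs, but the loop is the same.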
Slow Hashes = Good for Passwords
bcrypt: Often tens to hundreds of hashes per second (intentionally slow)
Same attack:
Orders of magnitude fewer guesses per second
Cracking takes years instead of hours
Slow down the attacker!
Work Factor
bcrypt(password, cost=12)
Approximate rates on a modern CPU (varies by hardware):
cost=10: ~10-25 hashes/sec
cost=12: ~2-6 hashes/sec (PHP 8.4 default)
cost=14: <1 hash/sec
Higher cost = Slower = More resistant to guessing
Common tuning goal: on the order of a fraction of a second per verification.
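bcrypt's cost is a base-2 logarithm, so each +1 doubles the work. The sketch below shows that arithmetic, then times a slow hash at a few settings. Since bcrypt is not in Python's standard library, PBKDF2 (`hashlib.pbkdf2_hmac`) stands in for the timing part. The iteration counts, password, and salt are arbitrary choices for illustration.

```python
import hashlib
import time

def bcrypt_rounds(cost: int) -> int:
    """bcrypt's cost is a base-2 logarithm: the round count is 2**cost."""
    return 2 ** cost

def time_pbkdf2(iterations: int) -> float:
    """Seconds taken by one PBKDF2-HMAC-SHA256 call (a stand-in for bcrypt)."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"password123", b"0123456789abcdef", iterations)
    return time.perf_counter() - start

for cost in (10, 12, 14):
    print(f"cost={cost}: {bcrypt_rounds(cost)} rounds")

# Doubling the iteration count should roughly double the time per hash.
for iterations in (50_000, 100_000, 200_000):
    print(f"{iterations} iterations: {time_pbkdf2(iterations) * 1000:.1f} ms")
```

Running the timing loop on your own server is a reasonable way to pick a setting that lands near your target verification time.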
Salting: Defeating Rainbow Tables
The Problem
Rainbow table: Pre-computed hashes (unsalted MD5 shown)
"password" → 5f4dcc...
"123456" → e10adc...
"qwerty" → d8578e...
Attacker looks up hash → Instant password!
The Solution: Salt
Salt = Random data added before hashing
User 1: hash("password" + "x7g9A2") → [unique hash 1]
User 2: hash("password" + "k3mP9z") → [unique hash 2]
Same password → Different hashes!
Rainbow tables useless.
Each user gets unique random salt.
Salt stored with the hash (not secret).
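A small sketch of the effect of a salt, mirroring the hash("password" + salt) lines above. SHA-256 is used here only to keep the example short; it is too fast for real password storage. The helper name `salted_hash` is an assumption.

```python
import hashlib
import secrets

def salted_hash(password: str, salt: bytes) -> str:
    """Hash password + salt. Illustration only: use a slow password hash in practice."""
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()

# Each user gets an independent random 16-byte salt.
salt_user1 = secrets.token_bytes(16)
salt_user2 = secrets.token_bytes(16)

# Same password, different salts: the two hashes do not match.
print(salted_hash("password", salt_user1))
print(salted_hash("password", salt_user2))
```

A precomputed table would have to be rebuilt for every salt value, which removes its advantage.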
Use Cases for Hashing
1. Password Storage
Don't store plaintext passwords.
Registration:
hash = bcrypt(password, salt)
store(hash)
Login:
hash = bcrypt(entered_password, stored_salt)
compare(hash, stored_hash)
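The registration and login steps above might look like this in Python. This sketch swaps in PBKDF2 from the standard library for bcrypt, because bcrypt needs a third-party package. `register`, `verify`, and the `ITERATIONS` value are assumptions for illustration, not a recommended configuration.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed value; tune for your own hardware

def register(password: str) -> tuple[bytes, bytes]:
    """Return (salt, hash) to store for a new user."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify(entered: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the entered password with the stored salt and compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", entered.encode("utf-8"), salt, ITERATIONS)
    # compare_digest runs in constant time, which avoids leaking timing information.
    return hmac.compare_digest(candidate, stored)

salt, stored_hash = register("password123")
print(verify("password123", salt, stored_hash))  # True
print(verify("letmein", salt, stored_hash))      # False
```

Libraries for bcrypt and Argon2 usually pack the salt and cost into the stored hash string, so you store one value instead of two.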
2. File Integrity
Download a file, verify it wasn't tampered:
Original SHA-256: a7b3c9f2e...
Your file SHA-256: a7b3c9f2e...
Match? The file matches the expected hash (integrity). This shows authenticity only if the expected hash itself came from a trusted source.
No match? File was modified!
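A sketch of the check above. It reads the file in chunks so large downloads don't need to fit in memory. The function names, the chunk size, and the demo file are assumptions for the example.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file incrementally and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA-256 with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()

# Demo: write a throwaway file, then check it against a good and a bad hash.
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "download.bin")
    data = b"example payload" * 1000
    with open(demo_path, "wb") as f:
        f.write(data)
    demo_ok = matches_expected(demo_path, hashlib.sha256(data).hexdigest())
    demo_tampered = matches_expected(demo_path, "0" * 64)

print(demo_ok, demo_tampered)  # True False
```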
3. Data Structures
Hash tables use hashes for fast lookup:
hash(key) → bucket location
~O(1) average-case lookup!
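A toy hash table shows how hash(key) picks a bucket. It uses Python's built-in `hash()`, which is a fast non-cryptographic hash, and the class name `TinyHashTable` is made up for this sketch. Real implementations also resize as they fill up.

```python
class TinyHashTable:
    """Fixed number of buckets; collisions are handled by chaining in a list."""

    def __init__(self, bucket_count: int = 8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # hash(key) -> bucket location
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default

table = TinyHashTable()
table.put("alice", 30)
table.put("bob", 25)
table.put("alice", 31)       # replaces the earlier value
print(table.get("alice"))    # 31
print(table.get("carol"))    # None
```

Lookup only scans one bucket, which is why the average case stays near O(1) as long as the buckets stay short.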
4. Digital Signatures
Hash the document, sign the hash.
Verify hash of document matches signed hash.
Common Mistakes
1. Using MD5 or SHA-1 for Security
These have known collision vulnerabilities.
Use SHA-256 or SHA-3 for new projects.
2. Using Fast Hashes for Passwords
SHA-256(password) is fast to crack.
Use bcrypt, scrypt, or Argon2 instead.
3. No Salt
Same password → Same hash
Precomputed tables (rainbow tables) work against unsalted hashes.
Use a unique salt per password.
4. Weak Salt
Salt should be:
- Random
- Unique per user
- Long enough (often 16+ bytes)
Not: username, empty, or constant.
FAQ
Q: Hashing vs Encryption?
Hashing: one-way, can't get original back
Encryption: two-way, can decrypt with key
Q: Why not encrypt passwords?
If you can decrypt, so can attackers who get the key. One-way hashing is safer.
Q: Is bcrypt still good?
Yes. bcrypt is still widely used. Argon2 is newer and often recommended for new projects.
Q: How do I verify a hashed password?
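Hash the entered password with the same algorithm, salt, and cost settings, then compare the result to the stored hash, ideally with a constant-time comparison. Most password-hashing libraries provide a verify function that does this for you.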
Summary
Hashing creates one-way, fixed-size fingerprints of data - essential for password storage and data integrity.
Key Takeaways:
- Hash = one-way function, can't reverse
- Same input → Same output (deterministic)
- Small change → Completely different hash
- Use bcrypt/Argon2 for passwords (slow!)
- Use a unique salt with passwords
- Use SHA-256/SHA-3 for general hashing
Hashing helps protect passwords even if your database is stolen.