The Guest List Analogy
At a party with a guest list:
- Each person can only appear once on the list
- Checking if someone is invited is fast (just scan the list)
- Order doesn't matter - just "are they on the list or not?"
Sets work like this - they store unique elements with O(1) lookup, and duplicates are automatically ignored.
What Makes Sets Special
Uniqueness
Adding to a set:
add("Alice") → Set: {Alice}
add("Bob") → Set: {Alice, Bob}
add("Alice") → Set: {Alice, Bob} ← Alice not added twice!
Duplicates are automatically ignored.
Fast Membership Testing
Is "Alice" in the set?
Array: Check each element → O(n)
Set: Hash lookup → O(1)
For 1 million elements:
Array: 1,000,000 comparisons
Set: 1 comparison!
Set Operations
Core Operations
ADD - Add element (O(1))
Set: {1, 2}
add(3) → {1, 2, 3}
add(2) → {1, 2, 3} (no change, 2 already exists)
REMOVE - Remove element (O(1))
Set: {1, 2, 3}
remove(2) → {1, 3}
CONTAINS - Check membership (O(1))
Set: {1, 2, 3}
contains(2) → true
contains(5) → false
SIZE - Count elements (O(1))
Set: {1, 2, 3}
size() → 3
Mathematical Set Operations
A = {1, 2, 3}
B = {2, 3, 4}
UNION (A ∪ B): All elements in either set
{1, 2, 3, 4}
INTERSECTION (A ∩ B): Elements in both sets
{2, 3}
DIFFERENCE (A - B): Elements in A but not B
{1}
SYMMETRIC DIFFERENCE: Elements in one but not both
{1, 4}
Visual Example
Removing duplicates from: [1, 2, 2, 3, 1, 4, 3, 5]
Add each to set:
1 → {1}
2 → {1, 2}
2 → {1, 2} (duplicate ignored)
3 → {1, 2, 3}
1 → {1, 2, 3} (duplicate ignored)
4 → {1, 2, 3, 4}
3 → {1, 2, 3, 4} (duplicate ignored)
5 → {1, 2, 3, 4, 5}
Result: {1, 2, 3, 4, 5} - all unique!
Where Sets Are Used
1. Remove Duplicates
Input: [1, 2, 2, 3, 1]
unique = set(input)
Output: {1, 2, 3}
2. Fast Lookup Tables
Banned words: {"spam", "scam", "phishing"}
For each comment:
if any word in banned_words:
flag comment
3. Tracking Visited Items
BFS/DFS traversal:
visited = set()
if node in visited:
skip
else:
visited.add(node)
process node
4. Finding Common Elements
Friends of Alice: {"Bob", "Carol", "Dave"}
Friends of Eve: {"Carol", "Frank", "Dave"}
Mutual friends = Alice ∩ Eve = {"Carol", "Dave"}
5. Set Theory in Queries
Users who like Python AND JavaScript:
python_users ∩ javascript_users
Users who like Python OR JavaScript:
python_users ∪ javascript_users
Classic Set Problems
1. Contains Duplicate
Array: [1, 2, 3, 1]
Has duplicate? Yes!
Approach:
Add each element to set.
If element already in set → duplicate found!
2. Intersection of Two Arrays
nums1 = [1, 2, 2, 1]
nums2 = [2, 2]
set1 = {1, 2}
set2 = {2}
intersection = {2}
3. Single Number
Every element appears twice except one.
Array: [2, 2, 1]
Answer: 1
Approach:
XOR all elements (a XOR a = 0)
Or use set: add if not present, remove if present
4. Happy Number
Repeatedly sum squares of digits until 1 or cycle.
Use set to detect cycles:
If sum seen before → cycle → not happy
If sum = 1 → happy!
Set Implementations
Hash Set
Based on hash table:
Pros:
- O(1) add, remove, contains
- Fast!
Cons:
- Unordered
- Uses more memory
Tree Set (Sorted Set)
Based on balanced tree:
Pros:
- Elements maintained in sorted order
- Range queries possible
Cons:
- O(log n) operations
- More complex
Which to Use?
| Need | Use |
|---|---|
| Fast lookup | Hash Set |
| Sorted order | Tree Set |
| Min/max element | Tree Set |
| Simple dedup | Hash Set |
Set vs Other Structures
| Operation | Set | List/Array | Hash Map |
|---|---|---|---|
| Contains | O(1) | O(n) | O(1) |
| Add | O(1) | O(1) | O(1) |
| Remove | O(1) | O(n) | O(1) |
| Get by index | ❌ | O(1) | ❌ |
| Allows duplicates | ❌ | ✓ | ❌ for keys |
Set = Hash Map without values (keys only).
Common Mistakes
1. Expecting Order
Most sets are unordered. Don't rely on iteration order!
2. Unhashable Elements
Lists and dicts can't be in sets (mutable). Use tuples or frozensets.
3. Modifying During Iteration
Don't add/remove while looping over a set!
4. Forgetting O(n) for Set Operations
Union/intersection are O(n), not O(1).
FAQ
Q: Set vs list for uniqueness?
Set: O(1) lookup, unique by design, unordered. List + manual check: O(n) lookup, preserves order.
Q: Can sets contain sets?
No (sets are mutable). Use frozenset for nested sets.
Q: Why is order not preserved?
Hash sets use hash positions, not insertion order. Python 3.7+ dicts preserve order, but sets may not.
Q: When is a list better than a set?
When you need: ordering, duplicates, or index access.
Summary
A set is a collection of unique elements with O(1) membership testing.
Key Takeaways:
- Unique elements only - duplicates ignored
- O(1) add, remove, contains
- Unordered (hash set) or ordered (tree set)
- Set operations: union, intersection, difference
- Used for: deduplication, fast lookup, visited tracking
- Elements must be hashable (immutable)
Sets answer "is this element in the collection?" instantly!
Related Concepts
Leave a Comment
Comments (0)
Be the first to comment on this concept.
Comments are approved automatically.