
Behind the Build: SR Mesh – Your Thoughts as a 3D Galaxy

2025-12-22
5 min read

I built a personal knowledge graph where your notes cluster by meaning - and it never phones home.

Every note becomes a star in a 3D galaxy. Similar thoughts drift together. Related ideas connect with glowing edges. And all of it runs 100% in your browser.


The Problem

Every AI-powered note app sends your personal thoughts to the cloud. Your journal entries, half-formed ideas, raw reflections - all processed on someone else's servers.

I wanted semantic search and intelligent clustering without sacrificing privacy.

The technical challenge: run meaningful AI on-device, without the APIs, without the latency, without the data leaving.


The Learning Curve

This project forced me to learn everything from scratch:

Cosine Similarity: How do you measure whether two pieces of text are "similar"? You convert them to vectors and compute the cosine of the angle between them. A value closer to 1 means more similar.

similarity = dotProduct(a, b) / (magnitude(a) * magnitude(b))

I implemented this by hand because I wanted to understand what the numbers actually meant.
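The hand-rolled version is only a few lines. This is a sketch of the idea, not the exact code from the project:

```javascript
// Cosine similarity between two embedding vectors.
// Matches the formula above: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a, b) {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}
```

Identical vectors score 1, orthogonal vectors score 0, and everything in between measures how much the two directions overlap.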

IndexedDB: Browser storage that's async, persistent, and weird. Unlike localStorage, it survives across sessions and can store structured objects. But the API is callback-based and unintuitive. I used the idb wrapper library to make it sane.
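With idb, the store setup and a save/load pair look roughly like this. The database and store names here are illustrative, not necessarily the ones the project uses:

```javascript
import { openDB } from 'idb';

// Open (or create) the database; runs the upgrade callback on first use.
const dbPromise = openDB('sr-mesh', 1, {
  upgrade(db) {
    db.createObjectStore('notes', { keyPath: 'id' });
  },
});

// Persist a note along with its embedding (typed arrays store fine).
export async function saveNote(note) {
  const db = await dbPromise;
  await db.put('notes', note);
}

// Load everything back on startup.
export async function getAllNotes() {
  const db = await dbPromise;
  return db.getAll('notes');
}
```

The wrapper turns IndexedDB's event callbacks into promises, so the rest of the app can just await reads and writes.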

D3-force-3d: Physics simulation for 3D graphs. Nodes attract their neighbors and repel distant nodes. Getting the forces balanced took three complete iterations before notes stopped colliding or flying off-screen.


Browser-Based Embeddings

The core AI runs entirely client-side using Transformers.js with the all-MiniLM-L6-v2 model.

It's just 23MB. Downloads once, caches in IndexedDB, runs forever without network.

Each note gets converted to a 384-dimensional vector. These vectors capture semantic meaning - "machine learning" and "neural networks" end up close together, while "grocery list" sits far away.

The embedding happens in a Web Worker so the UI stays responsive. Even with hundreds of notes, there's no freeze.


K-means++ Clustering from Scratch

I implemented K-means clustering myself. Not because I had to - there are libraries - but because I wanted to understand exactly how my notes were being grouped.

The "plus-plus" initialization is the key insight. Standard K-means picks random starting centroids, which often produces suboptimal clusters. K-means++ picks centroids that are spread apart:

  1. Pick the first centroid uniformly at random
  2. For every point, compute its distance to the nearest centroid chosen so far
  3. Pick the next centroid randomly, weighting each point by that distance squared (so very distant points are strongly preferred) - repeat until you have K

This prevents the algorithm from converging to local minima where half your notes end up in one giant cluster.

I also made K dynamic: Math.ceil(Math.sqrt(noteCount)). Two notes get distinct clusters. Twenty notes get five clusters. A hundred notes get ten.
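Both pieces fit in a short sketch. Assuming points are plain number arrays, the weighted sampling in the initialization is the only subtle part:

```javascript
// Squared Euclidean distance between two vectors.
function sqDist(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += (a[i] - b[i]) ** 2;
  return s;
}

// K-means++ initialization: each later centroid is sampled with
// probability proportional to its squared distance from the nearest
// centroid already chosen.
function kmeansPlusPlusInit(points, k, random = Math.random) {
  const centroids = [points[Math.floor(random() * points.length)]];
  while (centroids.length < k) {
    // Distance from each point to its nearest existing centroid.
    const d2 = points.map(p => Math.min(...centroids.map(c => sqDist(p, c))));
    const total = d2.reduce((a, b) => a + b, 0);
    // Sample the next centroid weighted by d^2.
    let r = random() * total;
    let idx = 0;
    while (r > d2[idx]) { r -= d2[idx]; idx++; }
    centroids.push(points[idx]);
  }
  return centroids;
}

// Dynamic K, as described above.
const chooseK = noteCount => Math.ceil(Math.sqrt(noteCount));
```

Points that already sit on a centroid get weight zero, so duplicates are never picked twice.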


The Classification Layer

Here's something I discovered after implementing K-means: semantic clustering tells you what's similar, but not what it means.

A cluster might contain notes about cooking AND chemistry - they're semantically similar ("mixing ingredients", "combining elements") but topically different.

So I built a 422-line text classifier that analyzes content patterns:

Category and pattern examples:

  • Questions - ends with "?", starts with "what", "why", "how"
  • Facts - contains "is a", "refers to", "defined as"
  • Learning - "need to learn", "studying", "tutorial"
  • Personal - "I feel", "my goal", "I want"
  • Projects - "building", "implement", "deploy"
  • Work - "meeting", "deadline", "client"
  • Ideas - "what if", "maybe", "could be"
  • Insights - "I think", "in my opinion", "seems like"
  • Creative - "poem", "story", "song"

Each note gets both:

  • A cluster color from K-means (semantic similarity)
  • A category label from text classification (content type)

The galaxy view uses colors from clustering; the labels come from pattern matching. Dual signals, better understanding.
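The classifier's core loop is simple pattern matching. Here is a heavily trimmed sketch - a handful of illustrative regexes standing in for the real ~200 patterns, with first-match-wins ordering:

```javascript
// Each category owns a list of regexes; the first category with a
// matching pattern wins. Patterns here are illustrative only.
const CATEGORY_PATTERNS = [
  ['Questions', [/\?\s*$/, /^(what|why|how)\b/i]],
  ['Facts',     [/\bis a\b/i, /\brefers to\b/i, /\bdefined as\b/i]],
  ['Personal',  [/\bI feel\b/i, /\bmy goal\b/i, /\bI want\b/i]],
  ['Projects',  [/\bbuilding\b/i, /\bimplement\b/i, /\bdeploy\b/i]],
];

function classifyNote(text) {
  for (const [category, patterns] of CATEGORY_PATTERNS) {
    if (patterns.some(re => re.test(text))) return category;
  }
  return 'Uncategorized';
}
```

Ordering matters: a note like "Why deploy on Fridays?" hits both Questions and Projects patterns, and the earlier category takes it.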


Cosine Similarity for Edges

Notes in the same cluster are positioned near each other in 3D space. But I also wanted to show explicit connections between highly related notes.

I compute pairwise similarity between all notes and draw edges where similarity exceeds a threshold (default: 0.6). The edge weight determines its brightness.

This is O(n²), which gets slow at scale. For now, I recalculate everything when notes change. With 1000+ notes, I'd need incremental updates.
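The edge pass is a straightforward double loop over every pair. A sketch, assuming each note carries its embedding:

```javascript
// Cosine similarity between two embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Build an edge for every pair of notes whose similarity clears the
// threshold. O(n^2) over all pairs, as noted above; the weight later
// drives edge brightness.
function buildEdges(notes, threshold = 0.6) {
  const edges = [];
  for (let i = 0; i < notes.length; i++) {
    for (let j = i + 1; j < notes.length; j++) {
      const sim = cosineSimilarity(notes[i].embedding, notes[j].embedding);
      if (sim > threshold) {
        edges.push({ source: i, target: j, weight: sim });
      }
    }
  }
  return edges;
}
```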


IndexedDB + Web Workers = Smooth UX

The architecture:

  1. Main thread: React UI, Three.js rendering, user input
  2. Web Worker: Embedding generation (Transformers.js is heavy)
  3. IndexedDB: Persistent storage for notes and embeddings

When you add a note:

  • Main thread sends text to Worker
  • Worker generates embedding, returns it
  • Main thread saves to IndexedDB
  • Clustering recalculates
  • Galaxy updates

The UI never blocks. Even generating embeddings for a 1000-word note feels instant.
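The main-thread side of that handoff can be sketched like this, with the worker body shown in comments. File names are illustrative; the worker code follows the documented Transformers.js feature-extraction API and handles one request at a time for simplicity:

```javascript
// Main thread: hand embedding work to the worker.
const worker = new Worker(new URL('./embed.worker.js', import.meta.url), { type: 'module' });

function embed(text) {
  return new Promise(resolve => {
    worker.onmessage = e => resolve(e.data.embedding);
    worker.postMessage({ text });
  });
}

// embed.worker.js: load the model once, then answer embedding requests.
//
// import { pipeline } from '@xenova/transformers';
// const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
// self.onmessage = async (e) => {
//   const output = await extractor(e.data.text, { pooling: 'mean', normalize: true });
//   self.postMessage({ embedding: Array.from(output.data) });
// };
```

Because the model load and inference both live in the worker, the main thread only ever sees a postMessage round trip.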


What I Learned

23MB is plenty. Modern sentence embedding models are surprisingly capable at small sizes. You don't need GPT-4 for every task. The all-MiniLM-L6-v2 model handles English semantic similarity beautifully.

Rule-based classifiers work. With ~200 carefully chosen patterns, you can categorize text accurately without any ML. My classifier handles questions, facts, opinions, and projects reliably.

Privacy through architecture. By designing for local-first from day one, privacy isn't a feature - it's a constraint that shapes every decision. No user accounts. No sync. No tracking. Just your notes, in your browser, on your device.


What I'd Do Differently

Currently, adding a note recalculates all clusters. I'd implement incremental updates - shift the centroid a tiny bit rather than recomputing everything.
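That shift is cheap because a centroid is just a running mean. A sketch, assuming each cluster tracks its member count:

```javascript
// Fold one newly assigned point into a centroid without recomputing
// the whole cluster: the mean of n+1 points equals the old mean plus
// (newPoint - oldMean) / (n + 1).
function updateCentroid(centroid, newPoint, clusterSize) {
  const n = clusterSize + 1;
  return centroid.map((c, i) => c + (newPoint[i] - c) / n);
}
```

One O(d) update per added note, instead of re-running K-means over everything.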

I'd also add hierarchical clustering. Right now, all notes are at the same level. But "Machine Learning" should contain "Supervised Learning" which contains "Linear Regression." A tree structure would be more powerful than a flat galaxy.

And I'd export to Obsidian format. My notes are trapped in IndexedDB - I'd love to sync them as Markdown files.

