In Part 1, I mentioned that tokens become "embeddings" before the model processes them. In Part 3, we covered how output is generated.
Now we reach the missing bridge: how text becomes something a system can compare.
Embeddings are the answer. And understanding them is essential for building anything with semantic search, RAG, or vector databases.
What Is an Embedding?
An embedding is a list of numbers (a vector) that represents the meaning of a piece of text.
The idea: text that means similar things ends up as similar vectors.
"The cat sat on the mat" → [0.12, -0.34, 0.56, ..., 0.89] (hundreds to thousands of numbers)
"A feline rested on a rug" → [0.11, -0.33, 0.55, ..., 0.88] (similar numbers)
"Stock market performance" → [-0.45, 0.78, -0.12, ..., 0.23] (very different numbers)
The "distance" between vectors corresponds (roughly) to semantic similarity.
Important: For a fixed model snapshot and preprocessing pipeline, embeddings are typically deterministic: the same input yields the same vector. (Hosted APIs can change behavior across model updates.) This contrasts with LLM generation, where sampling introduces randomness.
Why This Matters
Embeddings unlock:
- Semantic search - Find relevant documents even without keyword matches
- RAG systems - Retrieve context for LLM prompts
- Clustering - Group similar content automatically
- Classification - Categorize text by meaning
- Deduplication - Find near-duplicates in large datasets
Without embeddings, you're stuck with keyword matching. With embeddings, you can search by meaning.
How Similarity Works: Cosine Similarity
The most common way to measure embedding similarity is cosine similarity.
The Intuition
Think of vectors as arrows pointing in a direction. Cosine similarity measures how similar the directions are:
- Same direction (cosine ≈ 1) → Very similar meaning
- Perpendicular (cosine ≈ 0) → Unrelated
- Opposite (cosine ≈ -1) → Opposite meaning (rarely cleanly represented in practice)
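The three cases above can be checked with a minimal cosine similarity in plain Python (standard library only), which makes the geometry concrete:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Same direction (one vector is a scaled copy of the other) -> ~1.0
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))

# Perpendicular -> ~0.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))

# Opposite direction -> ~-1.0
print(cosine_similarity([1.0, 2.0], [-1.0, -2.0]))
```

Note that magnitude is ignored: `[1, 2]` and `[2, 4]` score 1.0 because they point the same way.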
Why Cosine (and Sometimes Dot Product)
Cosine similarity compares direction, not magnitude.
In practice:
- Many systems use cosine similarity directly.
- Some use dot product, especially when embeddings are normalized to unit length (then dot product and cosine similarity are equivalent).
- Defaults vary by provider and database, so confirm what metric you're actually using.
Engineer takeaway: Treat similarity thresholds (like 0.85) as model- and domain-specific. Calibrate them on your own data.
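The cosine/dot-product equivalence is easy to verify: once vectors are scaled to unit length, their dot product is their cosine similarity (the denominator |a|·|b| becomes 1). A quick sketch with made-up vectors:

```python
import math

def normalize(v):
    """Scale a vector to unit length (L2 norm = 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a = normalize([0.12, -0.34, 0.56])
b = normalize([0.11, -0.33, 0.55])

# For unit vectors: cos(a, b) = dot(a, b) / (|a| * |b|) = dot(a, b) / 1
print(dot(a, b))  # close to 1.0 -- the toy vectors point in nearly the same direction
```

This is why some vector databases default to dot product: if your embeddings are pre-normalized, it gives the same ranking as cosine at lower cost.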
Embedding Models: The Landscape
Embeddings come from specialized models trained specifically for this task.
Key Options (Examples)
Provider/model offerings change frequently, so treat this as a starting map (verify current docs before committing to a choice):
| Model / Family | Dimensions | Notes |
|---|---|---|
| OpenAI text-embedding-3-small | 1536 (default) | Supports a dimensions parameter to reduce vector size |
| OpenAI text-embedding-3-large | 3072 (default) | Supports a dimensions parameter to reduce vector size |
| Cohere embed-v4.0 | Configurable output dimension | Supports different input_type values (e.g., query vs document); supports multilingual inputs per provider docs |
| Sentence Transformers (local) | Varies by model | Open source; runs locally; no API cost |
| Other providers (Voyage, Jina, etc.) | Varies | Useful for specialized constraints (domain, latency, licensing, deployment) |
Dimensions Matter
Higher dimensions = richer representation = more storage/compute.
The tradeoff:
- More dimensions → potentially better retrieval accuracy → higher costs
- Fewer dimensions → faster, cheaper → acceptable accuracy for many tasks
Some providers let you choose the output dimension, which can be useful for balancing quality vs cost.
Engineer takeaway: Start with a strong general-purpose embedding model. Increase dimensions only if your evaluation shows retrieval quality is the bottleneck.
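The storage side of the tradeoff is easy to quantify. Assuming float32 vectors (4 bytes per dimension) and ignoring index-structure overhead, a back-of-the-envelope sketch:

```python
def index_size_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw vector storage for an embedding index (float32 by default),
    ignoring metadata and index-structure overhead."""
    return num_vectors * dimensions * bytes_per_value

ONE_GB = 1024 ** 3

# One million chunks at two common dimension counts:
print(index_size_bytes(1_000_000, 1536) / ONE_GB)  # roughly 5.7 GB
print(index_size_bytes(1_000_000, 3072) / ONE_GB)  # double that
```

Doubling dimensions doubles storage (and roughly doubles similarity-computation cost), so the "richer representation" has a concrete price tag.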
Where Semantic Search Fails
Embeddings are powerful, but they have blind spots:
1. Negation
"I love this product" and "I don't love this product" can end up with surprisingly similar embeddings.
Why? The words are almost identical - only "don't" differs. The embedding model may not capture the semantic flip.
2. Rare Terms and Proper Nouns
If a term is rare in training data, its embedding may not be meaningful.
An internal product code, SKU, or rare proper noun can embed poorly if it rarely appeared in training.
3. Short Queries vs Long Documents
A 3-word query and a 500-word document live in the same vector space. But their embeddings are qualitatively different.
Some models handle this better than others. Query-document asymmetry is a real issue.
4. Conceptual Similarity ≠ Answer Similarity
"What is the capital of France?" is semantically similar to "What is the capital of Germany?"
But they have different answers. Semantic similarity isn't necessarily what you want.
Engineer takeaway: Test your specific failure modes. Don't assume "similar" means "useful."
Embedding for RAG: What Goes In
In a RAG system, you embed:
- Documents (chunked) → stored in vector database
- Queries → embedded at query time, compared to document embeddings
Critical decisions:
What to Embed
- Full document? Usually too long - embeddings have a max input length
- Chunks? Yes - but chunking strategy matters (covered in Post 06)
- Metadata? Some systems embed metadata alongside content
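Chunking gets its own post, but the basic mechanic behind "chunks, not full documents" is simple: split text into bounded, optionally overlapping pieces before embedding. A minimal character-based sketch (real systems often split on tokens or sentence boundaries instead):

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks. Overlap means content
    near a boundary appears in two adjacent chunks, so it can't be lost
    to an unlucky split point."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "word " * 300  # a 1500-character stand-in for a long document
pieces = chunk_text(doc, chunk_size=500, overlap=50)
print(len(pieces), len(pieces[0]))
```

Each chunk then gets its own embedding and its own row in the vector database, usually alongside a pointer back to the source document.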
Consistency
Use the same embedding model for documents and queries. Different models produce incompatible vector spaces.
Refresh Strategy
If you update your embedding model, you need to re-embed your entire corpus. Plan for this.
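One way to make that plan enforceable is to store the embedding model's identifier next to every vector and refuse to compare across models. A sketch with a hypothetical record layout (the field names are illustrative, not any particular database's schema):

```python
# Keep the model identifier alongside each vector so a model upgrade is
# detected explicitly instead of silently producing garbage matches.
CURRENT_MODEL = "all-MiniLM-L6-v2"  # example identifier; any stable string works

index = [
    {"id": "doc-1", "model": "all-MiniLM-L6-v2", "vector": [0.1, 0.2]},
    {"id": "doc-2", "model": "text-embedding-3-small", "vector": [0.3, 0.4]},
]

def needs_reembedding(index, current_model):
    """Return the ids of records embedded with a different model."""
    return [rec["id"] for rec in index if rec["model"] != current_model]

stale = needs_reembedding(index, CURRENT_MODEL)
print(stale)  # doc-2 was embedded with a different model and must be redone
```

A re-embedding job can then iterate over the stale ids instead of guessing which vectors are compatible.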
Local vs API Embeddings
| Aspect | API (OpenAI, Cohere) | Local (Sentence Transformers) |
|---|---|---|
| Setup | API key, pay per token | Install library, run on CPU/GPU |
| Cost | Per-request pricing | Compute cost only |
| Latency | Network + inference | Inference only |
| Privacy | Data leaves your system | Stays local |
| Quality | Generally higher | Good, improving rapidly |
For production systems with sensitive data, local embeddings may be required.
For quick prototyping or when quality is paramount, API models are convenient.
Engineer takeaway: Local models (like all-MiniLM-L6-v2) are surprisingly good for many use cases. Don't assume you need paid APIs.
Debug Checklist: Embedding Issues
When semantic search isn't working:
- Are you using the same model for indexing and querying? (Mismatch = broken)
- Is your similarity threshold appropriate? (e.g., 0.7 might be too strict or too loose; thresholds are model-specific)
- Are queries too short? (Add context or use query expansion)
- Is the failure a negation or rare term issue? (Keyword hybrid search helps)
- Are your chunks too long or too short? (Chunking affects embedding quality)
- Did the embedding model see this domain? (Technical/niche content may embed poorly)
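The keyword-hybrid fix from the checklist can be sketched as a simple blend: combine a vector-similarity score with a keyword-overlap score so exact terms (product codes, negation words, rare names) still count. Everything below is toy data; the overlap function and the 0.7 weight are placeholders for a real BM25-plus-vector setup:

```python
def keyword_overlap(query, doc):
    """Fraction of query terms that appear verbatim in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def hybrid_score(vector_score, keyword_score, alpha=0.7):
    """Blend semantic and keyword scores; alpha weights the vector side."""
    return alpha * vector_score + (1 - alpha) * keyword_score

# Toy example: pretend vector scores slightly favor the wrong document,
# but exact keyword matches on the rare product code correct the ranking.
query = "FrogWidget3000 manual"
docs = {
    "FrogWidget3000 user manual and setup guide": 0.55,
    "General widget maintenance handbook": 0.60,
}
ranked = sorted(
    docs,
    key=lambda d: hybrid_score(docs[d], keyword_overlap(query, d)),
    reverse=True,
)
print(ranked[0])
```

Production systems typically run keyword and vector retrieval separately and merge the result lists (e.g., reciprocal rank fusion), but the principle is the same: don't rely on embeddings alone for exact-match signals.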
Try This Yourself
Experiment 1: Visualize Similarity
Using any embedding API or library:
- Embed these sentences:
  - "The quick brown fox jumps over the lazy dog"
  - "A fast auburn fox leaps above a sleepy canine"
  - "Stock prices rose sharply yesterday"
- Calculate pairwise cosine similarity
- Verify: the first two should be similar, the third should be different
Experiment 2: Test Failure Modes
- Embed: "I love this product" and "I hate this product"
- Calculate similarity - how close are they?
- Try: "The FrogWidget3000 is excellent" - does it cluster with positive sentiment?
Experiment 3: Build Mini Search
```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "Python is a programming language",
    "Machine learning uses algorithms to learn from data",
    "The weather is sunny and warm",
]

# Embed the corpus once; normalizing makes cosine similarity a plain dot product
doc_embeddings = model.encode(docs, convert_to_tensor=True, normalize_embeddings=True)

# Embed the query at search time with the SAME model
query = "What is ML?"
query_embedding = model.encode(query, convert_to_tensor=True, normalize_embeddings=True)

scores = util.cos_sim(query_embedding, doc_embeddings)[0]
best_idx = int(scores.argmax())
print(f"Best match: {docs[best_idx]}")
```
About a dozen lines of code. That's semantic search.
Key Takeaways
- Embeddings turn text into vectors that capture semantic meaning
- Cosine similarity measures how similar two embeddings are
- Embedding models vary in dimensions, quality, and cost - choose based on needs
- Semantic search has blind spots - negation, rare terms, query-document asymmetry
- Same model for indexing and querying - mismatched models = broken retrieval
- Local models work well for many use cases - don't assume you need APIs
Key Terms
| Term | Meaning |
|---|---|
| Embedding | A vector (list of numbers) representing the meaning of text |
| Vector | An ordered list of numbers, representing a point in high-dimensional space |
| Cosine Similarity | Measure of how similar two vectors are (based on angle, not distance) |
| Dimensions | The number of values in an embedding vector (e.g., 1024, 3072) |
| Semantic Search | Finding relevant items by meaning, not keyword matching |
| Vector Database | Database optimized for storing and searching embeddings |
What's Next
Now you understand how text becomes searchable geometry. But how do you actually use this in a full system?
In the next post, we'll cover RAG End-to-End - the complete pipeline from query to cited answer, and how all these pieces fit together.
Further Reading
- OpenAI embeddings guide (dimensions + normalization notes): https://platform.openai.com/docs/guides/embeddings
- Cohere embeddings docs (input_type, multilingual support, output dimensions): https://docs.cohere.com/docs/embeddings
- Sentence-Transformers similarity helpers (cos_sim, normalization): https://www.sbert.net/docs/package_reference/util.html
In This Series
- What is an LLM? - the fundamentals
- Tokenization - why wording matters
- Decoding & Sampling - temperature, top-p, determinism
- Embeddings (You are here) - text as searchable geometry
- RAG End-to-End - query to cited answer (coming soon)