Beyond Search: What You Can Build with Fully In-Browser Semantic AI
The video “In-Browser Semantic AI Search – free and open source!” demonstrates something far more powerful than a clever search demo. It introduces a new application architecture: AI that runs entirely inside the browser, with no backend, no API calls, and no data leaving the user’s device.
Once you understand this methodology, it becomes clear that semantic search is just the beginning. This article explores what else you can build using the same approach—and why it matters.
The Core Methodology (In Plain Terms)
The architecture shown in the video follows a repeatable pattern:
Run an embedding model locally using Transformers.js
Store vectors locally in PGlite (Postgres running in the browser)
Enable pgvector for similarity search
Persist data in IndexedDB
Query by meaning, not keywords
Operate fully offline after initial load
This creates a self-contained AI system:
No servers
No inference APIs
No per-request costs
No data shared with third parties
The browser becomes the runtime, the database, and the AI engine.
Why This Architecture Is a Big Deal
Traditionally, AI apps depend on:
Cloud inference
External databases
Network latency
Usage-based pricing
This approach flips that model entirely.
Instead of sending user data to AI, you bring AI to the data.
That unlocks:
Privacy-first applications
Offline-first experiences
Zero marginal cost per query
New classes of edge and personal software
Once embeddings and vector search are local, any problem that can be expressed as “find similar meaning” becomes fair game.
Below is a step-by-step follow-along guide for the video “In-Browser Semantic AI Search – free and open source!” using the same architecture:
Next.js (frontend-only)
Transformers.js to generate embeddings in-browser
Web Worker to keep the UI responsive while the model downloads/runs
PGlite (Postgres in the browser) + pgvector for local vector search
IndexedDB for persistence
Inner product similarity query to return top matches
1) Create the project
Terminal
npx create-next-app@latest semantic-search-browser
cd semantic-search-browser
npm run dev
2) Install dependencies (video chapter “Installing the dependencies”)
Terminal
npm i @xenova/transformers
npm i @electric-sql/pglite
(No separate install is needed for pgvector: the extension ships inside @electric-sql/pglite and is imported from the "@electric-sql/pglite/vector" subpath.)
3) Create a Web Worker for embeddings (chapter “Generate embeddings in a web worker”)
The video offloads model download + embedding computation to a web worker so the main thread stays smooth.
Prompt (use in Cursor/ChatGPT)
Create a web worker that uses Transformers.js to generate embeddings.
Use a singleton pipeline pattern so the model loads once.
Use the feature-extraction task with the model "supabase/gte-small".
For each message { text }, return { status: "complete", embedding }.
Use pooling and normalization (pooling: "mean", normalize: true).
Also post progress updates during model loading.
✅ In the video they use supabase/gte-small, a small embedding model that outputs 384-dimensional vectors.
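For reference, here is a minimal sketch of what that worker could produce. The file name (worker.ts), the message shapes, and the singleton class are assumptions based on the prompt above, not the video's exact code; the model ID casing follows the Hugging Face repo Supabase/gte-small.

```ts
// worker.ts - a singleton pipeline so the model loads only once.
import { pipeline } from '@xenova/transformers';

class EmbeddingPipeline {
  static task = 'feature-extraction' as const;
  static model = 'Supabase/gte-small'; // casing follows the Hugging Face repo
  static instance: any = null;

  static async getInstance(progress_callback?: (p: unknown) => void) {
    if (this.instance === null) {
      this.instance = await pipeline(this.task, this.model, { progress_callback });
    }
    return this.instance;
  }
}

self.addEventListener('message', async (event: MessageEvent<{ text: string }>) => {
  // Forward model download/loading progress to the main thread.
  const extractor = await EmbeddingPipeline.getInstance((progress) =>
    self.postMessage({ status: 'progress', progress })
  );

  // Mean pooling + normalization, as the prompt specifies.
  const output = await extractor(event.data.text, { pooling: 'mean', normalize: true });
  self.postMessage({ status: 'complete', embedding: Array.from(output.data) });
});
```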
4) Create the PGlite database in the browser using IndexedDB (chapter “Use PGlite in the browser with indexedDB”)
The video uses an IndexedDB-backed database name (example shown: something like "idb://super-semantic-search").
Prompt
Create a database module that initializes PGlite in the browser using IndexedDB.
Load the pgvector extension.
Expose:
- getDb(): returns a ready db instance
- initSchema(db): creates extension + table + index
- seed(db): inserts a small set of sample texts with embeddings
- search(db, embedding, matchThreshold, limit): runs a vector similarity query
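A minimal sketch of the initialization half of that module, assuming the idb://super-semantic-search name from the video; initSchema is sketched under step 5, and seed and search under steps 6 and 8.

```ts
// db.ts - PGlite backed by IndexedDB, with the pgvector extension loaded.
import { PGlite } from '@electric-sql/pglite';
import { vector } from '@electric-sql/pglite/vector';

let dbPromise: Promise<PGlite> | null = null;

// getDb(): returns a ready db instance (created once per page).
export function getDb(): Promise<PGlite> {
  if (!dbPromise) {
    dbPromise = (async () => {
      // The "idb://" prefix makes PGlite persist to IndexedDB under this name.
      const db = new PGlite('idb://super-semantic-search', {
        extensions: { vector },
      });
      await initSchema(db); // defined in step 5 below
      return db;
    })();
  }
  return dbPromise;
}
```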
5) Initialize schema + enable pgvector (chapters “Set up and seed the PGlite database” + “Enable pgvector in PGlite”)
Your schema should match the model dimensionality:
embedding vector(384), because gte-small produces 384-dimensional vectors
Prompt
Write SQL for PGlite that:
1) creates extension if not exists vector;
2) creates a table embeddings(id bigserial primary key, content text, embedding vector(384));
3) creates an HNSW index on embedding for fast similarity search.
(The video specifically calls out creating the vector column with 384 dims and adding an HNSW index.)
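Continuing the db.ts sketch from step 4, the schema setup could look like the following. Using vector_ip_ops for the HNSW index is an assumption made to match the inner product queries in step 8.

```ts
// db.ts (continued) - enable pgvector, create the table, add an HNSW index.
export async function initSchema(db: PGlite) {
  await db.exec(`
    create extension if not exists vector;

    create table if not exists embeddings (
      id bigserial primary key,
      content text not null,
      embedding vector(384) -- must match gte-small's 384 dimensions
    );

    -- vector_ip_ops matches the inner product queries in step 8.
    create index if not exists embeddings_hnsw_idx
      on embeddings using hnsw (embedding vector_ip_ops);
  `);
}
```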
6) Seed the database (store text + vectors)
The video seeds a small list of “items” (content) and stores their vectors, then searches those locally.
Prompt
Add a seed step that inserts ~20 short items (mix of furniture, food, animals, electronics).
For each item:
- send the text to the embedding worker
- insert { content, embedding } into the embeddings table
Only seed if the table is empty (count(*) = 0).
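A sketch of that seed step, with a trimmed-down item list. The embed() helper, which wraps a single request/response round trip with the worker, is an assumption rather than something the video names.

```ts
// db.ts (continued) - seed only when the table is empty.
const SAMPLE_ITEMS = [
  'desk', 'bed', 'chair', 'couch',        // furniture
  'banana', 'tomato', 'hot dog', 'apple', // food
  'dog', 'cat', 'mouse',                  // animals
  'laptop', 'keyboard', 'monitor',        // electronics
];

export async function seed(db: PGlite, embed: (text: string) => Promise<number[]>) {
  const { rows } = await db.query<{ count: number }>(
    'select count(*)::int as count from embeddings'
  );
  if (rows[0].count > 0) return; // already seeded

  for (const content of SAMPLE_ITEMS) {
    const embedding = await embed(content);
    // pgvector accepts the '[1,2,3]' text form, which JSON.stringify produces.
    await db.query(
      'insert into embeddings (content, embedding) values ($1, $2)',
      [content, JSON.stringify(embedding)]
    );
  }
}
```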
7) Wire the UI: query → worker → search(db, embedding) (chapter “How semantic search works”)
The video’s runtime flow is:
user types a query
worker returns an embedding
app runs a DB query using pgvector
results are returned ranked (top 3 by default)
Prompt
In the Next.js page:
- Add an input box for the user query
- On submit, post the query text to the embedding web worker
- When the worker returns { status: "complete", embedding }:
call search(db, embedding, matchThreshold, limit)
- Render the top 3 matching rows with their content and score
Show a loading state while the model is downloading/embedding is generating.
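Here is a simplified sketch of that flow, assuming the getDb and search helpers from the db module above (steps 4 and 8). The component structure and the 0.8 threshold are assumptions, not the video's exact code; progress handling is elided for brevity.

```tsx
// page.tsx - query goes to the worker; the embedding goes to the database.
'use client';
import { FormEvent, useEffect, useRef, useState } from 'react';
import { getDb, search } from './db';

type Row = { content: string; similarity: number };

export default function Page() {
  const workerRef = useRef<Worker | null>(null);
  const [query, setQuery] = useState('');
  const [results, setResults] = useState<Row[]>([]);
  const [loading, setLoading] = useState(false);

  useEffect(() => {
    workerRef.current = new Worker(new URL('./worker.ts', import.meta.url), {
      type: 'module',
    });
    workerRef.current.onmessage = async (e: MessageEvent) => {
      if (e.data.status !== 'complete') return; // progress updates elided here
      const db = await getDb();
      setResults(await search(db, e.data.embedding, 0.8, 3));
      setLoading(false);
    };
    return () => workerRef.current?.terminate();
  }, []);

  const onSubmit = (e: FormEvent) => {
    e.preventDefault();
    setLoading(true);
    workerRef.current?.postMessage({ text: query });
  };

  return (
    <form onSubmit={onSubmit}>
      <input value={query} onChange={(e) => setQuery(e.target.value)} />
      <button disabled={loading}>{loading ? 'Working…' : 'Search'}</button>
      <ul>
        {results.map((r) => (
          <li key={r.content}>
            {r.content} ({r.similarity.toFixed(3)})
          </li>
        ))}
      </ul>
    </form>
  );
}
```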
8) Implement the pgvector inner product search (chapter “Perform inner product search with pgvector”)
The video uses an inner product-style query and returns a limited number of results (default top 3).
Prompt
Implement search(db, embedding, matchThreshold, limit) using pgvector inner product similarity.
- stringify the embedding array to pass it into SQL
- use a match threshold filter
- order by similarity descending
- limit results (default 3)
Return [{ content, similarity }] rows.
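A sketch of that function, completing the db.ts module. One detail worth knowing: pgvector's <#> operator returns the negative inner product (so ascending index scans surface the best matches first), which is why the query multiplies by -1 to get a positive similarity score. With normalized embeddings (step 3), that score equals cosine similarity.

```ts
// db.ts (continued) - inner product search with pgvector.
export async function search(
  db: PGlite,
  embedding: number[],
  matchThreshold = 0.8,
  limit = 3
): Promise<{ content: string; similarity: number }[]> {
  const { rows } = await db.query<{ content: string; similarity: number }>(
    `select content, (embedding <#> $1) * -1 as similarity
       from embeddings
      where (embedding <#> $1) * -1 > $2
      order by similarity desc
      limit $3`,
    // The embedding array is stringified so Postgres can parse it as a vector.
    [JSON.stringify(embedding), matchThreshold, limit]
  );
  return rows;
}
```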
9) Test the demo queries shown in the video
Try searches like:
furniture → desk / bed / chair
food → banana / tomato / hot dog
electronics vs fruit (shows the "apple" ambiguity)
animals vs "mouse" (shows the "mouse" ambiguity)
This is the “meaning-based” behavior the video highlights (not strict keyword matching).
10) Run and iterate
Terminal
npm run dev
Then iterate using prompts like:
Prompt pack (copy/paste)
1) Add progress events from the worker (model download %, loading, ready).
2) Cache the model so it doesn't re-download on refresh.
3) Add a "Reset DB" button that clears the IndexedDB database.
4) Increase the dataset size and show how HNSW index keeps queries fast.
5) Add a slider for matchThreshold and a dropdown for limit (topK).
Categories of Apps You Can Build
1. Offline Knowledge Bases
You can build personal or organizational knowledge tools that work without internet access.
Examples:
Local documentation search
Offline company handbooks
Research archives on laptops
Use case:
A consultant, researcher, or field worker can search large document collections anywhere, anytime.
2. Semantic Notes & Personal Knowledge Management
Traditional note search breaks down when wording changes. Semantic search doesn’t.
Examples:
Notes apps that understand intent
Journals searchable by concept
Meeting notes discoverable by topic
This turns unstructured personal data into something genuinely usable.
3. Resume, Portfolio, and Matching Tools
Semantic similarity excels at matching concepts across different phrasing.
Examples:
Resume ↔ job description matching
Portfolio search by skill intent
Candidate ranking tools
All without uploading sensitive personal data to third-party services.
4. Product & Catalog Search
E-commerce search is a classic failure case for keyword matching.
Examples:
“Comfortable chair for long hours”
“Beginner-friendly camera”
With embeddings, results are driven by meaning, not exact words—entirely in the browser.
5. Chat & Conversation History Search
Conversations are especially hard to search with keywords.
Examples:
Slack exports
Customer support logs
Personal chat histories
Semantic search understands context, not just vocabulary.
6. Code & Developer Tools
Developers think in intent, not filenames.
Examples:
“Example of pagination”
“Database connection snippet”
“Auth middleware pattern”
A local semantic index of code snippets becomes a powerful personal developer assistant.
7. Education & Study Tools
Learning materials are often large, unstructured, and offline-friendly.
Examples:
Textbook search
Lecture note exploration
Exam revision tools
Students can search concepts, not page numbers.
8. Privacy-Sensitive Domains
Some fields simply cannot send data to the cloud.
Examples:
Legal references
Medical guidelines
Compliance documentation
Local semantic search enables AI assistance without regulatory risk.
9. Media & Metadata Exploration
Embeddings aren’t limited to raw text.
Examples:
Image caption search
Audio transcript exploration
Video metadata discovery
As long as content can be represented as text, it can be indexed semantically.
10. Educational AI Demos & Tools
Finally, this architecture is ideal for teaching and experimentation.
Examples:
AI playgrounds
Workshops and hackathons
Interactive explanations of embeddings and vectors
It demystifies AI by making everything visible and local.
The Bigger Insight
This methodology isn’t really about search.
It’s about local AI systems:
Local models
Local databases
Local intelligence
Local ownership of data
Search just happens to be the most obvious first application.
Final Thought
For years, powerful AI has required cloud infrastructure. This approach shows that assumption is no longer true.
When AI runs entirely in the browser:
Privacy improves
Costs drop to zero
Network latency disappears
New product categories emerge
The video demonstrates the technology—but the real opportunity lies in how broadly this pattern can be applied.
Semantic search is only the start.