Beyond Search: What You Can Build with Fully In-Browser Semantic AI

The video “In-Browser Semantic AI Search – free and open source!” demonstrates something far more powerful than a clever search demo. It introduces a new application architecture: AI that runs entirely inside the browser, with no backend, no API calls, and no data leaving the user’s device.

Once you understand this methodology, it becomes clear that semantic search is just the beginning. This article explores what else you can build using the same approach—and why it matters.

The Core Methodology (In Plain Terms)

The architecture shown in the video follows a repeatable pattern:

  1. Run an embedding model locally using Transformers.js

  2. Store vectors locally in PGlite (Postgres running in the browser)

  3. Enable pgvector for similarity search

  4. Persist data in IndexedDB

  5. Query by meaning, not keywords

  6. Operate fully offline after initial load

This creates a self-contained AI system:

  • No servers

  • No inference APIs

  • No per-request costs

  • No privacy concerns

The browser becomes the runtime, the database, and the AI engine.

Why This Architecture Is a Big Deal

Traditionally, AI apps depend on:

  • Cloud inference

  • External databases

  • Network latency

  • Usage-based pricing

This approach flips that model entirely.

Instead of sending user data to AI, you bring AI to the data.

That unlocks:

  • Privacy-first applications

  • Offline-first experiences

  • Zero marginal cost per query

  • New classes of edge and personal software

Once embeddings and vector search are local, any problem that can be expressed as “find similar meaning” becomes fair game.


Follow-Along Build Guide

Below is a step-by-step guide to building the project from the video “In-Browser Semantic AI Search – free and open source!” using the same architecture:

  • Next.js (frontend-only)

  • Transformers.js to generate embeddings in-browser

  • Web Worker to keep the UI responsive while the model downloads/runs

  • PGlite (Postgres in the browser) + pgvector for local vector search

  • IndexedDB for persistence

  • Inner product similarity query to return top matches

1) Create the project

Terminal

npx create-next-app@latest semantic-search-browser
cd semantic-search-browser
npm run dev

Open http://localhost:3000

2) Install dependencies (video chapter “Installing the dependencies”)

Terminal

npm i @xenova/transformers
npm i @electric-sql/pglite
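
pgvector support ships inside the PGlite package itself, so there is nothing extra to install; the extension is imported from a package subpath:

Code sketch

// pgvector comes bundled with PGlite; import the extension from the subpath:
import { vector } from '@electric-sql/pglite/vector';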

3) Create a Web Worker for embeddings (chapter “Generate embeddings in a web worker”)

The video offloads model download + embedding computation to a web worker so the main thread stays smooth.

Prompt (use in Cursor/ChatGPT)

Create a web worker that uses Transformers.js to generate embeddings.
Use a singleton pipeline pattern so the model loads once.
Use the feature-extraction task with the model "supabase/gte-small".
For each message { text }, return { status: "complete", embedding }.
Use pooling and normalization (pooling: "mean", normalize: true).
Also post progress updates during model loading.

✅ The video uses supabase/gte-small, a small embedding model that outputs 384-dimensional vectors.
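
Here is a minimal sketch of what that worker could look like. It follows the prompt above; the file name and message shapes are illustrative, not the video’s exact code.

Code sketch (worker.ts)

import { pipeline } from '@xenova/transformers';

// Singleton: the pipeline promise is created once, so the model is
// downloaded and initialized a single time per worker.
let extractorPromise: Promise<any> | null = null;

function getExtractor(progressCallback: (p: unknown) => void) {
  if (!extractorPromise) {
    extractorPromise = pipeline('feature-extraction', 'supabase/gte-small', {
      progress_callback: progressCallback,
    });
  }
  return extractorPromise;
}

self.addEventListener('message', async (event: MessageEvent<{ text: string }>) => {
  // Forward model-loading progress updates to the main thread.
  const extractor = await getExtractor((progress) =>
    self.postMessage({ status: 'progress', progress })
  );

  // Mean pooling + normalization, as specified in the prompt.
  const output = await extractor(event.data.text, { pooling: 'mean', normalize: true });
  self.postMessage({ status: 'complete', embedding: Array.from(output.data) });
});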

4) Create the PGlite database in the browser using IndexedDB (chapter “Use PGlite in the browser with indexedDB”)

The video persists the database to IndexedDB by prefixing the database name with idb:// (the example shown is something like "idb://super-semantic-search").

Prompt

Create a database module that initializes PGlite in the browser using IndexedDB.
Load the pgvector extension.
Expose:
- getDb(): returns a ready db instance
- initSchema(db): creates extension + table + index
- seed(db): inserts a small set of sample texts with embeddings
- search(db, embedding, matchThreshold, limit): runs a vector similarity query
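
A minimal sketch of the getDb() piece, assuming the IndexedDB name from the video (initSchema, seed, and search are sketched in the following steps):

Code sketch (db.ts)

import { PGlite } from '@electric-sql/pglite';
import { vector } from '@electric-sql/pglite/vector';

let dbPromise: Promise<PGlite> | null = null;

export function getDb(): Promise<PGlite> {
  if (!dbPromise) {
    // The "idb://" prefix persists the database to IndexedDB instead of memory.
    dbPromise = (async () => {
      const db = new PGlite('idb://super-semantic-search', {
        extensions: { vector }, // registers pgvector so "create extension vector" works
      });
      await db.waitReady;
      return db;
    })();
  }
  return dbPromise;
}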

5) Initialize schema + enable pgvector (chapters “Set up and seed the PGlite database” + “Enable pgvector in PGlite”)

Your schema should match the model dimensionality:

  • embedding vector(384), because gte-small produces 384-dimensional embeddings

Prompt

Write SQL for PGlite that:
1) creates extension if not exists vector;
2) creates a table embeddings(id bigserial primary key, content text, embedding vector(384));
3) creates an HNSW index on embedding for fast similarity search.

(The video specifically calls out creating the vector column with 384 dims and adding an HNSW index.)
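
As a concrete sketch, initSchema(db) could run that SQL as follows. The table and index names are illustrative; vector_ip_ops is the inner product operator class, matching the <#> queries in step 8.

Code sketch (initSchema in db.ts)

import type { PGlite } from '@electric-sql/pglite';

export async function initSchema(db: PGlite) {
  await db.exec(`
    create extension if not exists vector;

    create table if not exists embeddings (
      id bigserial primary key,
      content text not null,
      embedding vector(384)  -- must match gte-small's 384 dimensions
    );

    -- HNSW index with the inner product opclass, used by the <#> searches later.
    create index if not exists embeddings_hnsw_ip
      on embeddings using hnsw (embedding vector_ip_ops);
  `);
}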

6) Seed the database (store text + vectors)

The video seeds a small list of items (the content column) along with their vectors, then searches them locally.

Prompt

Add a seed step that inserts ~20 short items (mix of furniture, food, animals, electronics).
For each item:
- send the text to the embedding worker
- insert { content, embedding } into the embeddings table
Only seed if the table is empty (count(*) = 0).
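
A minimal sketch of the seed step. embedText() is a hypothetical helper that wraps the worker round-trip from step 3 and resolves with the embedding array.

Code sketch (seed in db.ts)

import type { PGlite } from '@electric-sql/pglite';

// Trim or extend toward ~20 items, as in the prompt.
const SAMPLE_ITEMS = ['desk', 'chair', 'bed', 'banana', 'tomato', 'hot dog', 'laptop', 'mouse'];

export async function seed(db: PGlite, embedText: (text: string) => Promise<number[]>) {
  // Only seed once: skip if rows already exist.
  const { rows } = await db.query<{ count: number }>(
    'select count(*)::int as count from embeddings;'
  );
  if (rows[0].count > 0) return;

  for (const content of SAMPLE_ITEMS) {
    const embedding = await embedText(content);
    // pgvector accepts a JSON-style array literal, so stringify the array.
    await db.query('insert into embeddings (content, embedding) values ($1, $2);', [
      content,
      JSON.stringify(embedding),
    ]);
  }
}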

7) Wire the UI: query → worker → search(db, embedding) (chapter “How semantic search works”)

The video’s runtime flow is:

  1. user types a query

  2. worker returns an embedding

  3. app runs a DB query using pgvector

  4. results are returned ranked (top 3 by default)

Prompt

In the Next.js page:
- Add an input box for the user query
- On submit, post the query text to the embedding web worker
- When the worker returns { status: "complete", embedding }:
  call search(db, embedding, matchThreshold, limit)
- Render the top 3 matching rows with their content and score
Show a loading state while the model is downloading/embedding is generating.
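
A condensed sketch of that wiring in the page component. Schema setup, seeding, and error handling are omitted for brevity, and the file and helper names are illustrative.

Code sketch (app/page.tsx)

'use client';
import { useEffect, useRef, useState } from 'react';
import { getDb, search } from './db';

export default function Home() {
  const workerRef = useRef<Worker | null>(null);
  const [query, setQuery] = useState('');
  const [results, setResults] = useState<{ content: string; similarity: number }[]>([]);
  const [busy, setBusy] = useState(false);

  useEffect(() => {
    // Spawn the embedding worker once; the bundler resolves the URL at build time.
    workerRef.current = new Worker(new URL('./worker.ts', import.meta.url), { type: 'module' });
    return () => workerRef.current?.terminate();
  }, []);

  async function onSubmit(e: React.FormEvent) {
    e.preventDefault();
    setBusy(true);
    workerRef.current!.onmessage = async (event: MessageEvent) => {
      if (event.data.status !== 'complete') return; // progress updates could drive a loader here
      const db = await getDb();
      setResults(await search(db, event.data.embedding, 0.8, 3));
      setBusy(false);
    };
    workerRef.current!.postMessage({ text: query });
  }

  return (
    <form onSubmit={onSubmit}>
      <input value={query} onChange={(e) => setQuery(e.target.value)} placeholder="Search by meaning" />
      <button disabled={busy}>{busy ? 'Working...' : 'Search'}</button>
      <ul>
        {results.map((r) => (
          <li key={r.content}>{r.content} ({r.similarity.toFixed(3)})</li>
        ))}
      </ul>
    </form>
  );
}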

8) Implement the pgvector inner product search (chapter “Perform inner product search with pgvector”)

The video uses an inner product query and returns a limited number of results (top 3 by default).

Prompt

Implement search(db, embedding, matchThreshold, limit) using pgvector inner product similarity.
- stringify the embedding array to pass it into SQL
- use a match threshold filter
- order by similarity descending
- limit results (default 3)
Return [{ content, similarity }] rows.
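
A sketch of search() following that prompt. One subtlety worth knowing: pgvector’s <#> operator returns the negative inner product, so similarity is its negation and the threshold comparison is flipped.

Code sketch (search in db.ts)

import type { PGlite } from '@electric-sql/pglite';

export async function search(
  db: PGlite,
  embedding: number[],
  matchThreshold = 0.8,
  limit = 3
) {
  const res = await db.query<{ content: string; similarity: number }>(
    `select content, -(embedding <#> $1) as similarity
       from embeddings
      where embedding <#> $1 < $2  -- i.e. similarity > matchThreshold
      order by embedding <#> $1    -- ascending negative inner product = most similar first
      limit $3;`,
    [JSON.stringify(embedding), -matchThreshold, limit]
  );
  return res.rows; // [{ content, similarity }]
}

Because the worker normalizes embeddings, the inner product here is equivalent to cosine similarity, which is why a fixed match threshold behaves predictably.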

9) Test the demo queries shown in the video

Try searches like:

  • furniture → desk/bed/chair

  • food → banana/tomato/hot dog

  • electronics vs fruit (shows “apple” ambiguity)

  • animals vs electronics (shows “mouse” ambiguity)

This is the “meaning-based” behavior the video highlights (not strict keyword matching).

10) Run and iterate

Terminal

npm run dev

Then iterate using prompts like:

Prompt pack (copy/paste)

1) Add progress events from the worker (model download %, loading, ready).
2) Cache the model so it doesn't re-download on refresh.
3) Add a "Reset DB" button that clears the IndexedDB database.
4) Increase the dataset size and show how HNSW index keeps queries fast.
5) Add a slider for matchThreshold and a dropdown for limit (topK).
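
For item 3 in the pack, one simple approach is a reset function that drops and rebuilds the schema rather than deleting the underlying IndexedDB store; a minimal sketch, assuming the getDb() and initSchema() modules from earlier:

Code sketch (resetDb)

import { getDb, initSchema } from './db';

export async function resetDb() {
  const db = await getDb();
  // Drop the data, then recreate the extension, table, and index.
  await db.exec('drop table if exists embeddings;');
  await initSchema(db);
}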



Categories of Apps You Can Build

1. Offline Knowledge Bases

You can build personal or organizational knowledge tools that work without internet access.

Examples:

  • Local documentation search

  • Offline company handbooks

  • Research archives on laptops

Use case:
A consultant, researcher, or field worker can search large document collections anywhere, anytime.

2. Semantic Notes & Personal Knowledge Management

Traditional note search breaks down when wording changes. Semantic search doesn’t.

Examples:

  • Notes apps that understand intent

  • Journals searchable by concept

  • Meeting notes discoverable by topic

This turns unstructured personal data into something genuinely usable.

3. Resume, Portfolio, and Matching Tools

Semantic similarity excels at matching concepts across different phrasing.

Examples:

  • Resume ↔ job description matching

  • Portfolio search by skill intent

  • Candidate ranking tools

All without uploading sensitive personal data to third-party services.

4. Product & Catalog Search

E-commerce search is a classic failure case for keyword matching.

Examples:

  • “Comfortable chair for long hours”

  • “Beginner-friendly camera”

With embeddings, results are driven by meaning, not exact words—entirely in the browser.

5. Chat & Conversation History Search

Conversations are especially hard to search with keywords.

Examples:

  • Slack exports

  • Customer support logs

  • Personal chat histories

Semantic search understands context, not just vocabulary.

6. Code & Developer Tools

Developers think in intent, not filenames.

Examples:

  • “Example of pagination”

  • “Database connection snippet”

  • “Auth middleware pattern”

A local semantic index of code snippets becomes a powerful personal developer assistant.

7. Education & Study Tools

Learning materials are often large, unstructured, and consumed offline.

Examples:

  • Textbook search

  • Lecture note exploration

  • Exam revision tools

Students can search concepts, not page numbers.

8. Privacy-Sensitive Domains

Some fields simply cannot send data to the cloud.

Examples:

  • Legal references

  • Medical guidelines

  • Compliance documentation

Local semantic search enables AI assistance while keeping data on-device, reducing regulatory risk.

9. Media & Metadata Exploration

Embeddings aren’t limited to raw text.

Examples:

  • Image caption search

  • Audio transcript exploration

  • Video metadata discovery

As long as content can be represented as text, it can be indexed semantically.

10. Educational AI Demos & Tools

Finally, this architecture is ideal for teaching and experimentation.

Examples:

  • AI playgrounds

  • Workshops and hackathons

  • Interactive explanations of embeddings and vectors

It demystifies AI by making everything visible and local.

The Bigger Insight

This methodology isn’t really about search.

It’s about local AI systems:

  • Local models

  • Local databases

  • Local intelligence

  • Local ownership of data

Search just happens to be the most obvious first application.

Final Thought

For years, powerful AI has required cloud infrastructure. This approach shows that assumption is no longer true.

When AI runs entirely in the browser:

  • Privacy improves

  • Costs drop to zero

  • Network latency disappears

  • New product categories emerge

The video demonstrates the technology—but the real opportunity lies in how broadly this pattern can be applied.

Semantic search is only the start.
