Building Real-World RAG Apps

Building a Retrieval-Augmented Generation (RAG) demo is easy.
Shipping a production-ready AI application with authentication, secure data access, file ingestion, embeddings, and a real chat interface is not.

In “The missing pieces to your AI app (pgvector + RAG in prod)”, the Supabase team walks through an end-to-end architecture that closes this gap. This article distills that methodology into actionable steps, then shows how the same architecture can power many real-world applications simply by changing the data you ingest and how retrieval is scoped.

The Core Production RAG Architecture

The video implements a repeatable pattern:

  1. Authentication & Authorization (RLS)

  2. File ingestion (object storage)

  3. Document chunking

  4. Embedding generation (pgvector)

  5. Similarity search & retrieval

  6. Chat interface with grounded responses

Once this pipeline exists, you can reuse it across products.
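
As a rough map of how those six stages show up in application code, here is a minimal supabase-js sketch. The 'files' bucket, table names, env var names, and match_document_sections parameters are illustrative assumptions, not the workshop's exact identifiers.

import { createClient } from '@supabase/supabase-js'

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,      // assumed env var names
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!,
)

// Each numbered comment corresponds to a stage of the pipeline above.
async function ragPipelineSketch(file: File, queryEmbedding: number[]) {
  // 1. Auth & RLS: every call below runs as the signed-in user.
  const { data: auth } = await supabase.auth.getUser()
  if (!auth.user) throw new Error('Sign in first')

  // 2. File ingestion: uploads land in object storage under a user-scoped path.
  await supabase.storage.from('files').upload(`${auth.user.id}/${file.name}`, file)

  // 3-4. Chunking and embedding generation happen server-side
  //      (tables, triggers, and an Edge Function in the workshop).

  // 5. Similarity search: a Postgres function performs the pgvector lookup.
  const { data: sections } = await supabase.rpc('match_document_sections', {
    embedding: queryEmbedding, // assumed parameter name
    match_count: 5,            // assumed parameter name
  })

  // 6. Chat: the matched sections become grounding context for the LLM response.
  return sections
}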


Step-by-Step Guide

Below is a follow-along, step-by-step checklist for Supabase’s workshop video “The missing pieces to your AI app (pgvector + RAG in prod)” (the “ChatGPT your files” style app). It’s organized to match the workshop flow: auth → file storage → document splitting → embeddings (pgvector) → retrieval → chat UI.

Notes

The video uses checkpoint branches (Step 1/2/3/4). Expect to occasionally need a DB reset when switching checkpoints due to migration name changes.

If you run Edge Functions locally, you can serve them and watch logs with npx supabase functions serve.

0) Prereqs (do this once)

Install & verify

  • Node.js (LTS)

  • Docker (running)

  • Supabase CLI (supabase --version)

Get the workshop code

git clone https://github.com/supabase-community/chatgpt-your-files.git
cd chatgpt-your-files

Install dependencies

Use whatever the repo expects (often npm or pnpm):

npm install
# or: pnpm install

1) Run Supabase locally (baseline)

Start the local Supabase stack

npx supabase start

(Optional but common) Reset to a clean slate

Use this whenever migrations/checkpoints get out of sync:

npx supabase db reset

This “clean slate” move is explicitly called out in the workshop when switching checkpoints.

2) Step 0 / Demo: run the web app

In a second terminal:

npm run dev
# or: pnpm dev

Open the local app URL (usually http://localhost:3000).

3) Step 1 — Auth + File Storage + RLS (uploads)

Checkout the Step 1 checkpoint

The workshop uses step-based checkpoints. Typically:

git checkout step-1

If you had local changes:

git stash

Reset DB after switching steps (recommended)

npx supabase db reset

(Again: migration naming differences between checkpoints are a known gotcha.)

Apply migrations (if not covered by reset)

If you’re not using reset:

npx supabase db push

Confirm storage bucket creation

The workshop emphasizes creating buckets via migrations (not only through the UI) so the setup is reproducible across a team.

Improve upload RLS (policy tightening)

The video introduces a helper like “uid_or_null” to harden upload checks and avoid malformed owner IDs.
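
On the client side, the practical counterpart is uploading under a path the policy can check against auth.uid(). A minimal sketch, assuming a 'files' bucket and a user-id-prefixed path convention (both assumptions, not necessarily the workshop's exact layout):

import { createClient } from '@supabase/supabase-js'

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,      // assumed env var names
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!,
)

// Upload under a folder named after the current user's id so a storage policy
// can compare the first path segment against auth.uid().
async function uploadOwnFile(file: File) {
  const { data: auth, error: authError } = await supabase.auth.getUser()
  if (authError || !auth.user) throw new Error('Must be signed in to upload')

  const path = `${auth.user.id}/${crypto.randomUUID()}-${file.name}`
  const { error } = await supabase.storage.from('files').upload(path, file)
  if (error) throw error
  return path
}

With this convention, the storage policy only has to check that the first folder in the object path matches the caller's auth.uid().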

Run and test Step 1

  1. Sign up / sign in

  2. Upload a file

  3. Confirm it appears in Storage and the app UI

Test scenario (in your head / notes):

  • “Upload a PDF, then verify only your user can see it (RLS).”

(Optional) Serve Edge Functions locally (if Step 1 introduces one)

npx supabase functions serve

The video notes that running this manually is useful for monitoring logs, even though the local supabase start stack can also serve functions automatically.

4) Step 2 — Documents table + Chunking / Splitting (ingestion pipeline)

Checkout Step 2

git stash
git checkout step-2
npx supabase db reset

What you’re building in Step 2

  • A documents table (file metadata)

  • A document_sections (chunks) table

  • A pipeline that splits uploaded file text into chunks (“sections”) suitable for embeddings + retrieval (a rough sketch follows below)
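
Here is a minimal sketch of that splitting step. It assumes a document_sections table with document_id and content columns and uses naive paragraph-based chunking; the workshop has its own splitting logic, so treat this as illustrative only.

import { createClient } from '@supabase/supabase-js'

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,      // assumed env var names
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!,
)

// Naive paragraph-based splitter: keeps chunks under a rough character budget.
function splitIntoSections(text: string, maxChars = 1000): string[] {
  const sections: string[] = []
  let current = ''
  for (const paragraph of text.split(/\n\s*\n/)) {
    if (current && current.length + paragraph.length > maxChars) {
      sections.push(current.trim())
      current = ''
    }
    current += paragraph + '\n\n'
  }
  if (current.trim()) sections.push(current.trim())
  return sections
}

// Store one row per chunk, linked back to the parent document.
async function ingestDocument(documentId: number, fullText: string) {
  const rows = splitIntoSections(fullText).map((content) => ({
    document_id: documentId,
    content,
  }))
  const { error } = await supabase.from('document_sections').insert(rows)
  if (error) throw error
  return rows.length
}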

Sanity checks

  • Upload a file

  • Confirm you now see chunk records (sections) created in the DB for that file

Good test prompt to keep in mind

  • “If I upload a long doc, do I get many document_sections rows?”

5) Step 3 — Embeddings generation + pgvector storage (production-style)

Checkout Step 3

git stash
git checkout step-3
npx supabase db reset

What you’re building in Step 3

  • Generating embeddings for each document chunk

  • Storing them on document_sections (or similar)

  • Wiring a trigger/queue so embeddings are created automatically

  • Using an Edge Function named embed to generate embeddings

Create the embed Edge Function (as shown in the workshop)

The workshop explicitly creates an Edge Function called embed, which generates embeddings.

npx supabase functions new embed

Then:

  • Open supabase/functions/embed/index.ts

  • Replace the placeholder code with the workshop’s embed implementation (the video says to delete everything and paste the provided code). A sketch of the general shape follows below.
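
This is not the workshop's code, but for orientation, an embed function along these lines could look like the sketch below. The gte-small model, the section_ids payload, the document_sections.embedding column, and the import specifiers are all assumptions.

// supabase/functions/embed/index.ts (sketch only)
import { createClient } from 'npm:@supabase/supabase-js@2'
import { pipeline } from 'npm:@xenova/transformers'

// Small embedding model, loaded once per function instance.
const extractor = await pipeline('feature-extraction', 'Supabase/gte-small')

Deno.serve(async (req) => {
  const { section_ids } = await req.json() // assumed payload shape

  // Service role client so the function can update rows regardless of RLS.
  const supabase = createClient(
    Deno.env.get('SUPABASE_URL')!,
    Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!,
  )

  const { data: sections, error } = await supabase
    .from('document_sections')
    .select('id, content')
    .in('id', section_ids)
  if (error) return new Response(error.message, { status: 500 })

  for (const section of sections ?? []) {
    // Mean-pooled, normalized embedding for the chunk text.
    const output = await extractor(section.content, { pooling: 'mean', normalize: true })
    const embedding = Array.from(output.data)

    await supabase
      .from('document_sections')
      .update({ embedding })
      .eq('id', section.id)
  }

  return new Response(JSON.stringify({ embedded: sections?.length ?? 0 }), {
    headers: { 'Content-Type': 'application/json' },
  })
})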

Serve functions locally (so you can see embed logs)

npx supabase functions serve

Verify embeddings are being created

Upload a file, then check:

  • The chunks exist

  • Embeddings are populated per chunk (“embedding generated on each and every document section”). A quick way to check this is sketched below.
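
One way to spot-check that from a small script, assuming the vector lives in an embedding column on document_sections:

import { createClient } from '@supabase/supabase-js'

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,      // assumed env var names
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!,
)

// Count chunks that still have no embedding; this should trend toward zero
// shortly after an upload finishes processing.
const { count, error } = await supabase
  .from('document_sections')
  .select('id', { count: 'exact', head: true })
  .is('embedding', null)

if (error) throw error
console.log(`${count} sections still waiting for embeddings`)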

Quick test prompt

  • “Upload a file and confirm embeddings trickle in for each chunk.”

6) Step 4 — Retrieval + Chat UI (RAG end-to-end)

Checkout Step 4

git stash
git checkout step-4
npx supabase db reset

The workshop again notes a DB reset is commonly required at this checkpoint.

Add the vector search SQL function: match_document_sections

In Step 4, you create a final SQL migration that defines a Postgres function responsible for similarity search (“matching logic”).

Typical flow:

  1. Create a new migration file in supabase/migrations

  2. Paste in the workshop function: match_document_sections(...)

Conceptually, the function takes a query embedding plus a match limit and other parameters, and returns the most similar document sections.
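
From the app, that function is typically called through supabase-js as an RPC. A sketch, with the parameter names assumed rather than copied from the workshop:

import { createClient } from '@supabase/supabase-js'

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,      // assumed env var names
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!,
)

// Fetch the sections most similar to an already-embedded query.
async function matchSections(queryEmbedding: number[]) {
  const { data, error } = await supabase.rpc('match_document_sections', {
    embedding: queryEmbedding, // assumed parameter name
    match_threshold: 0.8,      // assumed: minimum similarity to keep
    match_count: 5,            // assumed: how many sections to return
  })
  if (error) throw error
  return data
}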

Chat approach used in the workshop

The workshop’s Step 4 chat generates embeddings for the user’s query and then uses match_document_sections to fetch relevant chunks before calling the chat completion. It explicitly calls out that user messages need embeddings too.

It also notes that, in this implementation, the query embedding is generated in the browser/frontend.
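
A sketch of that browser-side step, assuming Transformers.js and the gte-small embedding model (reasonable defaults for this stack, but treat the package and model names as assumptions):

import { pipeline } from '@xenova/transformers'

// Load the embedding model once in the browser; the chat "Send" button can stay
// disabled until this finishes loading.
const extractorPromise = pipeline('feature-extraction', 'Supabase/gte-small')

// Embed the user's message with the same model used for the document chunks.
async function embedQuery(question: string): Promise<number[]> {
  const extractor = await extractorPromise
  const output = await extractor(question, { pooling: 'mean', normalize: true })
  return Array.from(output.data)
}

The resulting vector is what gets passed to match_document_sections (as in the previous sketch) before the chat completion call.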

End-to-end test prompts (copy/paste)

Use prompts that force retrieval from the uploaded file:

  • “What kinds of food did they eat?”

  • “Summarize the document in 5 bullets and cite the relevant section.”

  • “Find the paragraph that mentions pricing/termination/security and quote it.”

What success looks like

  • Chat “Send” becomes enabled once the embedding model loads (the workshop demonstrates this UI change).

  • Responses clearly reflect content from your uploaded docs (not generic answers).

7) Debugging cheatsheet (most common issues)

“Things are weird after switching steps”

Do:

npx supabase db reset

This is the workshop’s go-to fix for checkpoint migration mismatches.

“My embed function isn’t running”

  • Ensure it exists (supabase/functions/embed)

  • Serve functions locally and watch logs:

npx supabase functions serve

“No matches from retrieval”

  • Confirm embeddings exist for document sections

  • Confirm your match_document_sections migration is applied

  • Confirm your query text is being embedded and passed into the function



What You Can Build with This Architecture

Once the pipeline works, new apps are mostly data + policy changes.

1. Internal Knowledge Assistant (Company GPT)

Changes

  • Data: internal docs, tickets, wikis

  • RLS: org / team scoped

Use cases

  • “How does our billing system work?”

  • “What was decided in the Q2 architecture review?”

2. Customer Support Copilot

Changes

  • Data: help docs, FAQs

  • Retrieval filters: product, plan, language

Use cases

  • Draft support replies

  • Customer-facing AI chat

3. Legal / Contract Review Assistant

Changes

  • Data: contracts, policies

  • Chunking: clause-aware

Use cases

  • “Does this contract allow termination for convenience?”

  • “Compare NDA v2 vs v3”

4. Research & Literature Review Tool

Changes

  • Data: academic PDFs

  • Filters: author, year, topic

Use cases

  • “Summarize findings on X since 2021”

  • “What contradicts this paper?”

5. Personalized Learning Tutor

Changes

  • Data: textbooks, notes

  • RLS: per student

Use cases

  • “Explain this concept using my notes”

  • “Quiz me on chapter 4”

6. Sales Enablement Assistant

Changes

  • Data: pitch decks, CRM notes

  • Filters: industry, deal stage

7. Codebase Q&A / Dev Assistant

Changes

  • Data: repo files, ADRs

  • Chunking: syntax-aware

8. Personal Knowledge Vault

Changes

  • Data: journals, emails

  • RLS: user-only

9. Compliance & Policy Checker

Changes

  • Data: regulations + policies

  • Output: structured compliance gaps

10. Vertical-Specific AI Apps

Vertical       Example
Healthcare     Clinical guideline assistant
Finance        Investment memo analyzer
Real estate    Lease & zoning Q&A
HR             Employee handbook GPT
Education      Course-specific tutor

Why This Methodology Scales

This architecture works because:

  • Auth & RLS are solved once

  • Embeddings are reusable

  • Retrieval is data-agnostic

  • The chat UI never changes

Only data + access rules evolve.

Next Steps

You can extend this system by:

  • Adding citations per answer

  • Introducing multi-query retrieval (sketched below)

  • Supporting multiple embedding models

  • Designing multi-tenant org schemas
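
As an example of the multi-query idea, retrieval can be broadened by embedding several phrasings of the question and merging the matches. A sketch that builds on the earlier assumptions (the Step 4 embedQuery helper and the assumed match_document_sections parameters):

import type { SupabaseClient } from '@supabase/supabase-js'

// Embed several paraphrases of the question, retrieve matches for each, and
// merge them by section id so recall improves on vaguely-worded queries.
async function multiQueryRetrieve(
  supabase: SupabaseClient,
  embedQuery: (question: string) => Promise<number[]>, // e.g. the Step 4 helper
  paraphrases: string[],
) {
  const merged = new Map<number, any>()

  for (const phrasing of paraphrases) {
    const embedding = await embedQuery(phrasing)
    const { data, error } = await supabase.rpc('match_document_sections', {
      embedding,      // assumed parameter names, as in earlier sketches
      match_count: 5,
    })
    if (error) throw error
    for (const section of data ?? []) merged.set(section.id, section)
  }

  return [...merged.values()]
}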
