From Semantic Image Search to Full-Scale AI Apps: What You Can Build with Embeddings

Modern AI applications are increasingly built on a simple but powerful idea: represent meaning as vectors, store them efficiently, and search by similarity instead of keywords.
A recent demo using Amazon Bedrock, Supabase, and the Amazon Titan multimodal embedding model shows just how far this approach can go, starting with image search and extending to a wide range of applications.

This article explores that methodology and highlights the wide range of applications you can build on top of it.

The Core Methodology: One Pattern, Many Apps

At its heart, the system demonstrated in the video follows a repeatable pattern:

  1. Ingest data (images, text, or both)

  2. Generate embeddings using a multimodal model

  3. Store vectors in a Postgres database with pgvector

  4. Query by semantic similarity

  5. Optionally filter using structured metadata

  6. Return results directly—or feed them into a generative model

In the demo, images are converted to base64, embedded using Amazon Titan via Amazon Bedrock, stored in Supabase Vector, and queried using natural language like “happy remote worker” or “bike in front of a wall”.

Once you understand this loop, the use cases expand rapidly.

1. Visual & Media Applications

Visual Product Search

Users upload a photo and instantly find visually similar products. This is ideal for fashion, furniture, or collectibles, where traditional text-based search often fails.

Smart Photo Libraries

Instead of tagging photos manually, users search their personal or company photo archives with natural language like “team offsite by the ocean”.

Stock Image & Creative Marketplaces

Creators and agencies can search for images by concept, mood, or style, rather than relying on brittle keyword systems.

2. Document & Knowledge Systems

Semantic Document Search

PDFs, docs, and slides are embedded and searched by meaning. Queries like “how do we handle security incidents?” return relevant sections, not just keyword matches.

Internal Company Knowledge Bases

Engineering docs, policies, and onboarding materials become queryable in plain language—perfect for internal tools.

Legal & Compliance Research

Contracts and clauses can be searched by semantic similarity, helping teams find related language across large document sets.

3. AI Assistants & RAG Systems

Retrieval-Augmented Generation (RAG)

Vector search retrieves relevant context, which is then passed into a text-generation model. This enables accurate, grounded AI assistants.

Customer Support Assistants

Support tickets, FAQs, and documentation are embedded so the system can suggest answers based on prior cases and official docs.

Learning & Research Companions

Students and researchers can query large collections of notes and textbooks conceptually, not just by keyword.

4. Recommendations & Creative Tools

Recommendation Engines

Movies, music, or books can be recommended based on vibe or theme, not just genre labels.

Fashion & Interior Design Tools

Users can discover outfits or room designs that match a specific aesthetic, using images or text prompts interchangeably.

5. Developer & Operational Intelligence

Log & Incident Analysis

Error logs and incident reports are embedded, making it easy to find similar past issues and their resolutions.

Internal Code Search

Engineering teams can search internal snippets by intent, such as “pagination with cursor-based Postgres queries”.


Step-by-Step Guide

This guide follows the workflow shown in the video: create a Supabase Postgres DB (pgvector) → generate multimodal embeddings with Amazon Titan via Bedrock → upsert vectors → run semantic search queries.

0) Prerequisites

You need:

  • An AWS account with Bedrock access enabled

  • A Supabase account (to create a Postgres DB with pgvector enabled)

  • Python 3.10+ recommended

  • Either:

    • Poetry (as used in the video), or

    • pip/venv (both options are shown below)

1) Create a Supabase project with pgvector

  1. Go to Supabase Dashboard → New project

  2. Choose an org, name the project (e.g. amazon-bedrock-test)

  3. Generate a strong DB password and save it (you’ll need it for the connection string)

  4. Wait for the project to finish provisioning

  5. Go to Project Settings → Database

  6. Copy your connection string (the video uses the “connection pooler” and mentions “session” vs “transaction” mode; session is fine for a direct script run)

✅ Keep your connection details secret (don’t commit them).

2) Get AWS credentials for Bedrock

You’ll need:

  • AWS region (e.g. us-west-2 as shown)

  • Access key ID

  • Secret access key

  • (Optional/if using temporary credentials) session token

3) Set environment variables (recommended)

Create a .env file (or export env vars in your shell). Example:

macOS/Linux

export DATABASE_URL="postgresql://USER:PASSWORD@HOST:PORT/postgres"
export AWS_REGION="us-west-2"
export AWS_ACCESS_KEY_ID="YOUR_KEY"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET"
export AWS_SESSION_TOKEN="YOUR_SESSION_TOKEN"  # only if you have one

Windows PowerShell

setx DATABASE_URL "postgresql://USER:PASSWORD@HOST:PORT/postgres"
setx AWS_REGION "us-west-2"
setx AWS_ACCESS_KEY_ID "YOUR_KEY"
setx AWS_SECRET_ACCESS_KEY "YOUR_SECRET"
setx AWS_SESSION_TOKEN "YOUR_SESSION_TOKEN"

(Note: setx only affects new shells. Restart your terminal afterwards, or use $env:NAME = "value" to set a variable for the current session.)

4) Create the Python project + install dependencies

Option A: Poetry (matches the video)

poetry new bedrock-supabase-image-search
cd bedrock-supabase-image-search
poetry add vecs boto3
poetry shell

If you’re instead working from a repo that already has a pyproject.toml, install its dependencies with:

poetry install

(The video calls out vecs + boto3 and uses Poetry to manage deps.)

Option B: pip + venv

python -m venv .venv
source .venv/bin/activate  # macOS/Linux
# .\.venv\Scripts\activate # Windows

pip install vecs boto3
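
Whichever option you choose, you can optionally sanity-check both connections before seeding. The snippet below is not from the video; it is a minimal sketch that assumes the environment variables from step 3 are set.

# preflight.py -- optional sanity check (not part of the video's demo)
import os

import boto3
import vecs

# Fail early if any required variable is missing
for var in ("DATABASE_URL", "AWS_REGION", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"):
    if not os.environ.get(var):
        raise SystemExit(f"Missing environment variable: {var}")

# Connect to Supabase Postgres via vecs (raises if the connection string is wrong)
vx = vecs.create_client(os.environ["DATABASE_URL"])
print("Connected to Supabase Postgres")

# Create a Bedrock runtime client (credentials are picked up from the environment)
bedrock = boto3.client("bedrock-runtime", region_name=os.environ["AWS_REGION"])
print("Bedrock runtime client created for", os.environ["AWS_REGION"])

vx.disconnect()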

5) Prepare a small image dataset

Create a folder like:

mkdir images

Put a few JPG/PNG images in there (the video uses a tiny set like grapes, bike/wall, etc.)

6) Create a “seed” script to embed images + upsert vectors

Create seed.py with these responsibilities (matching the video’s logic; a hedged sketch follows the list):

  • Connect to Supabase Postgres using vecs

  • Create a collection (e.g. image_vectors) with 1024 dimensions (must match Titan config)

  • For each image:

    • Read file → convert to base64

    • Build the Bedrock JSON request body

    • Call Bedrock invoke_model for Titan multimodal embeddings

    • Upsert the embedding + metadata into Supabase Vector

  • Create an index for performance
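
Here is a minimal sketch of what seed.py can look like. It is not the video’s exact code: the Titan model ID (amazon.titan-embed-image-v1), the metadata fields, and the file-matching logic are assumptions, and error handling is omitted.

# seed.py -- minimal sketch, not the video's exact code
import base64
import json
import os
from pathlib import Path

import boto3
import vecs

MODEL_ID = "amazon.titan-embed-image-v1"  # assumed Titan multimodal embedding model ID
DIMENSIONS = 1024                         # must match the collection dimension below

bedrock = boto3.client("bedrock-runtime", region_name=os.environ["AWS_REGION"])
vx = vecs.create_client(os.environ["DATABASE_URL"])
images = vx.get_or_create_collection(name="image_vectors", dimension=DIMENSIONS)

records = []
for path in Path("images").glob("*"):
    if path.suffix.lower() not in (".jpg", ".jpeg", ".png"):
        continue
    # Read the file and convert it to base64, as the Titan API expects
    image_b64 = base64.b64encode(path.read_bytes()).decode("utf-8")

    # Build the Bedrock JSON request body for Titan multimodal embeddings
    body = json.dumps({
        "inputImage": image_b64,
        "embeddingConfig": {"outputEmbeddingLength": DIMENSIONS},
    })
    response = bedrock.invoke_model(
        modelId=MODEL_ID,
        body=body,
        contentType="application/json",
        accept="application/json",
    )
    embedding = json.loads(response["body"].read())["embedding"]

    # (id, vector, metadata) tuples -- the metadata fields are illustrative
    records.append((path.name, embedding, {"type": "image", "path": str(path)}))

# Upsert all vectors and build an index for faster queries
images.upsert(records=records)
images.create_index()
print(f"Inserted {len(records)} vectors and created the index")

As written, this runs with python seed.py. To run it as poetry run seed like the video does, wrap the logic in a main() function and register it under [tool.poetry.scripts] in pyproject.toml.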

Terminal step: run the seed script

If you wire it as a Poetry script, you can run it like the video:

poetry run seed

Or directly:

python seed.py

✅ Expected output in the video: it prints embeddings and confirms inserts + index creation.

7) Verify vectors in Supabase Dashboard

In Supabase Dashboard:

  • In the Table Editor, switch to the vecs schema and open the collection table (e.g. image_vectors)

  • Confirm you see rows/vectors inserted (the demo shows 4 vectors)

8) Create a “search” script to do semantic image search

Create search.py that does the following (a sketch follows the list):

  1. Creates the vecs client

  2. Loads the image_vectors collection

  3. Uses Bedrock Titan multimodal to create a text embedding from your query

  4. Runs a similarity search (limit=1 in the video)

  5. Optionally applies metadata filtering

  6. Prints/opens the matching image
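
A matching sketch for search.py, with the same caveats as before (the model ID, collection name, and output handling are assumptions rather than the video’s exact code):

# search.py -- minimal sketch, not the video's exact code
import json
import os
import sys

import boto3
import vecs

MODEL_ID = "amazon.titan-embed-image-v1"  # assumed Titan multimodal embedding model ID
DIMENSIONS = 1024

query = sys.argv[1] if len(sys.argv) > 1 else "bike in front of wall"

bedrock = boto3.client("bedrock-runtime", region_name=os.environ["AWS_REGION"])
vx = vecs.create_client(os.environ["DATABASE_URL"])
images = vx.get_or_create_collection(name="image_vectors", dimension=DIMENSIONS)

# Titan multimodal can embed text-only input, which lets us compare a text
# query against the image vectors stored earlier
body = json.dumps({
    "inputText": query,
    "embeddingConfig": {"outputEmbeddingLength": DIMENSIONS},
})
response = bedrock.invoke_model(
    modelId=MODEL_ID,
    body=body,
    contentType="application/json",
    accept="application/json",
)
query_embedding = json.loads(response["body"].read())["embedding"]

# Similarity search; limit=1 returns only the closest match, as in the video.
# To filter on metadata as well, pass e.g. filters={"type": {"$eq": "image"}}.
results = images.query(data=query_embedding, limit=1)
print("Best match:", results)

By default vecs returns only the matching record IDs (the file names used in seed.py); pass include_metadata=True or include_value=True to query() if you also want the stored metadata or the distance score.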

Terminal step: run searches (with prompts)

The video runs queries like these:

Prompt 1 (simple semantic match)

poetry run search "bike in front of wall"

Prompt 2 (conceptual / associative query)

poetry run search "Jesus turned water into what"

Prompt 3 (vibe-based query)

poetry run search "happy remote worker"

If you’re not using Poetry:

python search.py "bike in front of wall"
python search.py "Jesus turned water into what"
python search.py "happy remote worker"

✅ Expected behavior (as in the video): it returns the most semantically similar image, even when the prompt isn’t a literal description.

9) Troubleshooting checklist

  • No results / weird results

    • Confirm you used 1024 dimensions consistently (collection + Titan output embedding length)

  • Bedrock permission errors

    • Ensure Bedrock is enabled for your account/region and your IAM policy allows bedrock:InvokeModel

  • DB connection fails

    • Re-check Supabase connection string + password

  • Serverless environment

    • The video mentions “transaction mode” pooling for serverless; for local scripts, session mode is usually fine

10) Optional next steps

  • Add metadata filters (e.g., image type, tags, source)

  • Store multiple embeddings (caption embedding + image embedding)

  • Build a tiny UI:

    • Upload image → embed → store

    • Search box → embed query → display top-k results

  • Swap in a RAG workflow: retrieve images/docs, then have an LLM explain “why” the match was chosen (a sketch follows below)
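
To illustrate that last point, here is a hedged sketch of the “explain why” step. It is not part of the video: the generation model ID (Anthropic Claude via the Bedrock converse API) and the metadata fields are assumptions; substitute any text model enabled in your account.

# rag_explain.py -- hedged sketch of the "explain the match" idea (not from the video)
import os

import boto3

# Assume `query` is the user's search text and `match_metadata` came back from
# the vector search in search.py (the fields below are illustrative)
query = "bike in front of wall"
match_metadata = {"type": "image", "path": "images/bike.jpg"}

bedrock = boto3.client("bedrock-runtime", region_name=os.environ["AWS_REGION"])

prompt = (
    f"A user searched for: {query!r}.\n"
    f"The vector search returned this item: {match_metadata}.\n"
    "In one or two sentences, explain why this result is a plausible semantic match."
)

# The converse API gives a uniform request shape across Bedrock text models;
# the model ID below is an assumption -- use any text model you have access to
response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{"role": "user", "content": [{"text": prompt}]}],
)
print(response["output"]["message"]["content"][0]["text"])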


Why This Approach Scales So Well

What makes this methodology powerful is its generality:

  • The same embedding pipeline works for text, images, or multimodal data

  • Postgres + pgvector keeps the system simple and production-ready

  • Metadata filtering allows precision on top of semantic recall

  • The system integrates naturally with generative AI when needed

In other words, image search is just the beginning.

Conclusion

The demo built with Amazon Bedrock, Amazon Titan, and Supabase Vector illustrates a foundational pattern for modern AI apps: store meaning, not just data.
Once you adopt this approach, you can build search engines, assistants, recommendation systems, and internal tools, all on the same architectural backbone.

The real question is no longer “Can we build this?” but rather:

Which problem should we solve first?
