Family PA To-Do List App
1. Product Objective
To reduce the mental load of household management by providing a "Personal Assistant" interface via WhatsApp. The system will use event-driven messaging to capture, categorise, and track family tasks without requiring users to leave their primary messaging app.
2. Target Audience
• Busy Parents: Individuals who need to capture tasks on the move using voice.
• Family Units: Groups requiring a shared authoritative state for chores and schedules.
3. Core Features
A. Voice-to-Task Conversion (The "PA" Capability)
• Requirement: Users must be able to share a WhatsApp voice note with the bot.
• Mechanism: An Edge Function receives the audio file event, validates the request, transcribes the audio with a speech-to-text model, and passes the transcript to a Large Language Model (LLM) for structured parsing.
• Output: The PA extracts the task title, assignee, and due date, then inserts it into the Supabase tasks table.
B. WhatsApp-Native Interface
• Requirement: Interaction occurs entirely within WhatsApp, using OTP-based authentication linked to real-world phone ownership.
• Functionality: Users can query the PA (e.g., "What is on the list for today?") and receive a real-time streamed response or a formatted summary.
C. Shared Family "Brain" (RAG)
• Requirement: The assistant should "remember" previous instructions and context.
• Mechanism: Use Retrieval-Augmented Generation (RAG) to store family notes and tasks as embeddings in a vector database. This allows the PA to answer conceptual questions like, "When did we say the plumber was coming?".
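As a rough illustration of the retrieval step, the sketch below ranks stored notes by cosine similarity to a query embedding. The Note shape, the topK helper, and the three-dimensional toy vectors are assumptions made for this example; in the real system the embeddings would come from an embedding model and the search would run inside the vector database rather than in memory.

```typescript
// Hypothetical in-memory stand-in for a vector search.
interface Note {
  text: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k notes most similar to the query embedding, best match first.
function topK(query: number[], candidates: Note[], k: number): Note[] {
  return candidates
    .map((note) => ({ note, score: cosineSimilarity(query, note.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.note);
}

// Toy data: made-up vectors, not real model output.
const notes: Note[] = [
  { text: "Plumber is coming Thursday at 10am", embedding: [0.9, 0.1, 0.0] },
  { text: "Buy milk and eggs", embedding: [0.0, 0.2, 0.9] },
  { text: "School play is on Friday", embedding: [0.1, 0.9, 0.1] },
];
const plumberQuery = [1.0, 0.0, 0.0];
const best = topK(plumberQuery, notes, 1);
```

The retrieved notes would then be placed into the LLM prompt as context, so the answer is grounded in what the family actually recorded.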
D. Conflict-Free Scheduling
• Requirement: Prevent overlapping family commitments.
• Mechanism: Use database-level constraints in Postgres to block conflicting task assignments or calendar events automatically.
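The rule such a constraint enforces can be stated as a half-open interval overlap test. The sketch below only illustrates that predicate in application code, assuming a hypothetical Commitment shape with an assignee and start/end times (the schema later in this document defines only due_at). The database remains the authority; in Postgres this kind of rule is typically expressed as an exclusion constraint over a time range, which this sketch does not replace.

```typescript
// Hypothetical shape for illustration; not part of the documented schema.
interface Commitment {
  assignee: string;
  startsAt: Date;
  endsAt: Date; // exclusive end
}

// Two half-open intervals [start, end) overlap when each starts before the other ends.
function overlaps(a: Commitment, b: Commitment): boolean {
  return (
    a.assignee === b.assignee &&
    a.startsAt.getTime() < b.endsAt.getTime() &&
    b.startsAt.getTime() < a.endsAt.getTime()
  );
}

// Return the existing commitments that a candidate would collide with.
function findConflicts(candidate: Commitment, existing: Commitment[]): Commitment[] {
  return existing.filter((c) => overlaps(candidate, c));
}

const existingCommitments: Commitment[] = [
  {
    assignee: "alex",
    startsAt: new Date("2025-01-10T09:00:00Z"),
    endsAt: new Date("2025-01-10T10:00:00Z"),
  },
  {
    assignee: "sam",
    startsAt: new Date("2025-01-10T09:30:00Z"),
    endsAt: new Date("2025-01-10T10:30:00Z"),
  },
];

// Starts inside alex's 09:00-10:00 slot, so it collides with it.
const clash = findConflicts(
  {
    assignee: "alex",
    startsAt: new Date("2025-01-10T09:30:00Z"),
    endsAt: new Date("2025-01-10T11:00:00Z"),
  },
  existingCommitments,
);

// Starts exactly when alex's slot ends, so it does not overlap.
const backToBack = findConflicts(
  {
    assignee: "alex",
    startsAt: new Date("2025-01-10T10:00:00Z"),
    endsAt: new Date("2025-01-10T11:00:00Z"),
  },
  existingCommitments,
);
```

Treating the end time as exclusive lets back-to-back commitments coexist, which is usually what a family calendar wants.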
STEP-BY-STEP GUIDE
0) Prereqs
Install: Node.js, Git, Docker Desktop
Create accounts/projects:
GitHub repo
Supabase project
Vercel project
Twilio account with WhatsApp enabled (Sandbox or Business)
OpenAI API key (for transcription + parsing)
1) GitHub: repo + safety rails
Create repo: family-pa
Locally:
git init
add .gitignore with:
node_modules/
.env*
MCP.json (and any agent tool caches)
Add branch discipline:
main protected (PRs required)
work in feat/* branches
Commit convention (helps AI + humans):
intent: product step
schema: migrations/RLS
feat: features
fix: repairs
2) Supabase: local stack + project link
Install CLI; then:
supabase init
supabase start (local Postgres/Auth/Storage)
Link to cloud project:
supabase link --project-ref <ref>
Set secrets for Edge Functions (cloud):
supabase secrets set OPENAI_API_KEY="..." TWILIO_AUTH_TOKEN="..." WEBHOOK_SHARED_SECRET="..."
3) Supabase: schema-first migrations (families, tasks, transcriptions)
Create migration files in supabase/migrations/ (never edit applied ones; create new).
Tables (recommended minimal):
families (id uuid pk, name, created_at)
family_members (family_id, user_id, phone_e164, role, created_at)
tasks (id uuid pk, family_id, title, category, assignee_user_id, due_at, status, source_type, source_media_url, confidence, created_by_user_id, created_at)
voice_transcriptions (id uuid pk, family_id, from_phone, media_url, transcript, raw_payload, created_at)
Enable extensions if using semantic search later:
create extension if not exists vector;
Apply locally:
supabase db reset
4) Supabase: RLS isolation by family_id
Enable RLS on all tables.
Policies pattern:
membership table controls access
reads/writes allowed only if exists (select 1 from family_members where user_id = auth.uid() and family_id = <row>.family_id)
Enforce identity in DB:
default created_by_user_id = auth.uid() on inserts (don’t trust client-provided IDs)
5) Supabase Edge Function: whatsapp-webhook (event → action)
Scaffold:
supabase functions new whatsapp-webhook
Local test serving:
supabase functions serve whatsapp-webhook --no-verify-jwt (local only; do not ship open endpoints)
Implement core logic:
Accept POST webhook from Twilio
Validate:
Twilio signature (recommended) and/or a shared secret query param
Parse inbound fields:
Twilio sends webhook params in application/x-www-form-urlencoded, including media fields when present
Idempotency:
store Twilio MessageSid (or hash) and ignore duplicates (Twilio retries)
Fetch audio from MediaUrl0 if present, store raw payload for audit
Call OpenAI transcription (next section)
LLM parse transcript → {title, assignee, due_date, category}
Insert into tasks + voice_transcriptions
Reply to WhatsApp via Twilio response (TwiML) or Twilio API (confirmation message)
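The parsing and idempotency steps above can be sketched as follows. This is a minimal illustration, not the function itself: InboundMessage, parseInbound, and SeenMessages are hypothetical names, and the in-memory set stands in for what would more likely be a unique column on a database table, since Edge Function instances do not share memory. The field names read from the body (MessageSid, From, Body, NumMedia, MediaUrl0) are Twilio's standard webhook parameters.

```typescript
// Fields this sketch reads from Twilio's form-encoded webhook body.
interface InboundMessage {
  messageSid: string;
  from: string;
  body: string;
  mediaUrls: string[];
}

// Parse an application/x-www-form-urlencoded body into the fields we need.
function parseInbound(rawBody: string): InboundMessage {
  const params = new URLSearchParams(rawBody);
  const messageSid = params.get("MessageSid");
  if (!messageSid) throw new Error("missing MessageSid");

  // NumMedia tells us how many MediaUrl{N} fields to expect.
  const numMedia = Number(params.get("NumMedia") ?? "0");
  const mediaUrls: string[] = [];
  for (let i = 0; i < numMedia; i++) {
    const url = params.get(`MediaUrl${i}`);
    if (url) mediaUrls.push(url);
  }

  return {
    messageSid,
    from: params.get("From") ?? "",
    body: params.get("Body") ?? "",
    mediaUrls,
  };
}

// Hypothetical in-memory dedupe store (assumption: production would use a DB unique key).
class SeenMessages {
  private seen = new Set<string>();

  // Returns true the first time a SID is seen, false for retries.
  markIfNew(sid: string): boolean {
    if (this.seen.has(sid)) return false;
    this.seen.add(sid);
    return true;
  }
}

// Example payload with made-up values.
const sampleBody =
  "MessageSid=SM123&From=whatsapp%3A%2B447700900123&Body=&NumMedia=1" +
  "&MediaUrl0=https%3A%2F%2Fexample.com%2Fmedia%2Fabc";
const inbound = parseInbound(sampleBody);
const seenStore = new SeenMessages();
const firstDelivery = seenStore.markIfNew(inbound.messageSid);
const retryDelivery = seenStore.markIfNew(inbound.messageSid);
```

When markIfNew returns false the handler can acknowledge the request and return early, so a Twilio retry never creates a second task.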
6) OpenAI: speech-to-text + structured task extraction
Use OpenAI Audio Transcriptions:
endpoint: audio/transcriptions
models include gpt-4o-mini-transcribe, gpt-4o-transcribe, whisper-1
Recommended flow inside Edge Function:
Download audio bytes from Twilio media URL
Send bytes to OpenAI transcription
Send transcript to an LLM prompt that outputs strict JSON:
title, assignee (optional), due_date (optional), category, confidence, clarifying_question (optional)
Confidence gate:
High confidence → create task + confirm
Low confidence → ask a single clarification question in WhatsApp
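One way to sketch the strict-JSON check and the confidence gate is shown below. The names ParsedTask, GateResult, parseTaskJson, and gate are hypothetical, and two details are assumptions rather than part of the plan above: that confidence is a number between 0 and 1, and that 0.8 is a sensible cut-off. Both would need tuning against real transcripts.

```typescript
// Shape the prompt is asked to return; field names follow the list above.
interface ParsedTask {
  title: string;
  assignee?: string;
  due_date?: string;
  category: string;
  confidence: number;
  clarifying_question?: string;
}

type GateResult =
  | { action: "create"; task: ParsedTask }
  | { action: "clarify"; question: string };

// Assumed cut-off on an assumed 0..1 scale.
const CONFIDENCE_THRESHOLD = 0.8;

// Parse the model's raw text and reject anything that is not the expected shape.
function parseTaskJson(raw: string): ParsedTask {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("expected a JSON object");
  }
  const obj = data as Record<string, unknown>;
  const title = obj["title"];
  const category = obj["category"];
  const confidence = obj["confidence"];
  if (typeof title !== "string" || title.trim() === "") {
    throw new Error("title must be a non-empty string");
  }
  if (typeof category !== "string" || category.trim() === "") {
    throw new Error("category must be a non-empty string");
  }
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) {
    throw new Error("confidence must be a number between 0 and 1");
  }
  // Optional fields are kept only when they are non-empty strings.
  const optionalString = (key: string): string | undefined => {
    const value = obj[key];
    return typeof value === "string" && value !== "" ? value : undefined;
  };
  return {
    title,
    category,
    confidence,
    assignee: optionalString("assignee"),
    due_date: optionalString("due_date"),
    clarifying_question: optionalString("clarifying_question"),
  };
}

// High confidence creates the task; low confidence asks exactly one question.
function gate(task: ParsedTask): GateResult {
  if (task.confidence >= CONFIDENCE_THRESHOLD) {
    return { action: "create", task };
  }
  const fallback = `Could you confirm the details for "${task.title}"?`;
  return { action: "clarify", question: task.clarifying_question ?? fallback };
}

const confident = gate(
  parseTaskJson('{"title":"Book dentist","category":"health","confidence":0.93,"due_date":"2025-01-10"}'),
);
const unsure = gate(
  parseTaskJson('{"title":"Call him","category":"other","confidence":0.4,"clarifying_question":"Who should be called?"}'),
);
```

Validating the shape before the gate means a malformed model response fails loudly instead of producing a half-filled task row.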
7) Twilio + WhatsApp: inbound webhook wiring
In Twilio Console:
Set the WhatsApp sender (Sandbox or approved number)
Configure “When a message comes in” webhook URL → your Supabase Edge Function URL
Confirm inbound request shape:
WhatsApp inbound uses the same general webhook format as SMS/MMS, with From/To prefixed whatsapp:
For media:
ensure Twilio is configured to pass through media URLs (you’ll receive MediaUrl0, etc.)
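Because From arrives with the whatsapp: prefix while family_members stores phone_e164, the webhook needs a small normalisation step before looking up the sender. A minimal sketch follows, assuming a hypothetical toE164 helper; the pattern encodes the E.164 rule of a plus sign followed by at most 15 digits with a non-zero first digit.

```typescript
// E.164: "+" followed by up to 15 digits, first digit non-zero.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

// Strip Twilio's "whatsapp:" channel prefix and validate the remainder.
// Returns null when the result is not a plausible E.164 number.
function toE164(address: string): string | null {
  const prefix = "whatsapp:";
  const phone = address.startsWith(prefix) ? address.slice(prefix.length) : address;
  return E164_PATTERN.test(phone) ? phone : null;
}

// Made-up example numbers.
const normalised = toE164("whatsapp:+447700900123");
const malformed = toE164("whatsapp:12345");
```

Returning null rather than throwing lets the handler reply with a polite "number not recognised" message instead of failing the webhook.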
8) Next.js app: UI + Supabase client
Scaffold Next.js (App Router + TS + Tailwind)
Add Supabase client:
npm i @supabase/supabase-js
Use environment variables:
NEXT_PUBLIC_SUPABASE_URL
NEXT_PUBLIC_SUPABASE_ANON_KEY
(NEXT_PUBLIC_ is required for browser exposure; these get inlined at build time)
Build pages:
Family dashboard (tasks list + filters)
Task create/edit
“Recent transcriptions” feed
Auth (phase after CRUD):
OTP / passwordless (aligning with your WhatsApp identity model)
9) Vercel: deploy + env vars
Push to GitHub; import repo in Vercel
Add env vars (Preview + Production):
NEXT_PUBLIC_SUPABASE_URL
NEXT_PUBLIC_SUPABASE_ANON_KEY
any server-only keys only if used in server routes (avoid if possible)
Deploy
Optional: install Supabase ↔ Vercel integration to manage env vars more easily
10) Cursor: agent-first implementation loop
Create a BUILD_PLAN.md checklist:
Schema → RLS → Edge Function → UI → Auth → hardening
Add .cursorrules:
Next.js App Router + TS + Tailwind
SQL migrations in supabase/migrations
All data access via Supabase client
No secrets in repo
Every change includes tests or a manual verification checklist
Work phase-by-phase:
Cursor Agent implements one phase
You run the app + webhook tests
Commit with an intent-based message
11) Production hardening checklist
Webhook security:
Verify Twilio signature + shared secret
Idempotency:
no duplicate tasks on retries
Auditability:
store original payload + media URL + transcript
RLS verification:
role-switch tests in Supabase
Observability:
structured logs, error capture, dead-letter table for replaying failed events
Rate limits:
protect OpenAI calls & media downloads
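For the rate-limit item, one common approach is a token bucket in front of the OpenAI and media-download calls. The sketch below is an assumption about how that might look, not a prescribed design: TokenBucket is a hypothetical class, the capacity and refill rate are placeholder values, and the state is held in memory, so a real deployment would need shared storage (for example a database row per family) for the limit to hold across function instances.

```typescript
// Token bucket: holds up to `capacity` tokens, refilled at `refillPerSecond`.
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full
    this.lastRefillMs = nowMs;
  }

  // Try to spend one token at time `nowMs`; false means the caller should back off.
  tryTake(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    // Never move the refill clock backwards if timestamps arrive out of order.
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Placeholder limits: burst of 2 calls, then 1 call per second.
const bucket = new TokenBucket(2, 1, 0);
const attempts = [
  bucket.tryTake(0), // allowed, 1 token left
  bucket.tryTake(0), // allowed, 0 tokens left
  bucket.tryTake(0), // refused, bucket empty
  bucket.tryTake(1000), // allowed, one token refilled after 1 second
];
```

Passing the clock in as a parameter keeps the limiter deterministic, which makes it straightforward to test without waiting on real time.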