Matt Smith

Embedded

AI-powered voice memo app that transcribes, summarizes, identifies speakers, and makes everything searchable. Native iOS app with a full cloud AI pipeline and open-source connector ecosystem.

swift python openai gemini firebase supabase obsidian

#What it does

Embedded turns voice memos into searchable, summarized knowledge. Record a thought, and AI handles the rest — transcription, speaker identification, executive summaries, and semantic search across everything you’ve ever said.
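The semantic search works by comparing embedding vectors rather than keywords. A minimal sketch of the idea, with cosine similarity over toy vectors (function names and data here are illustrative, not the app's actual code):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_vec: list[float], memo_vecs: dict, top_k: int = 3):
    """Rank memo embeddings by similarity to the query embedding."""
    scored = [(memo_id, cosine_similarity(query_vec, vec))
              for memo_id, vec in memo_vecs.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]
```

In production this ranking happens inside Postgres via pgvector rather than in application code, but the scoring principle is the same.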

#The AI Pipeline

Every voice memo flows through a multi-stage cloud pipeline:

  • Transcription — OpenAI Whisper converts speech to text with high accuracy
  • Summarization — GPT-4o-mini generates concise executive summaries
  • Speaker ID — Pyannote.ai identifies who said what in multi-person recordings
  • Embeddings — Google Gemini creates 3072-dimensional vectors for semantic search
  • Chunking — Long recordings are automatically split into overlapping chunks, so there is no practical limit on file size
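The chunking step can be sketched as a sliding window over the recording's timeline. This is a minimal illustration; the chunk and overlap lengths below are placeholder values, not the app's actual settings:

```python
def chunk_audio(duration_s: float, chunk_s: float = 600.0,
                overlap_s: float = 30.0) -> list[tuple[float, float]]:
    """Split a recording into overlapping (start, end) windows.

    Overlap keeps sentences that straddle a boundary intact in at
    least one chunk, so transcripts can be stitched back together.
    """
    if duration_s <= chunk_s:
        return [(0.0, duration_s)]
    step = chunk_s - overlap_s
    chunks = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        if end >= duration_s:
            break
        start += step
    return chunks
```

A 25-minute recording, for example, yields three windows whose edges overlap by 30 seconds each.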

#Architecture

  • iOS App — Native Swift/SwiftUI with Firebase Auth and Storage
  • Backend — Google Cloud Functions (Python) triggered by audio uploads
  • Database — Firestore for metadata, Supabase + pgvector for embeddings
  • Security — Row Level Security ensures users only access their own data
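Putting the pieces together, the upload-triggered backend can be pictured as a dispatcher that runs each stage in order. This is a hypothetical sketch: the real backend runs as a Google Cloud Function on storage events, and the stage functions below are stand-in stubs, not the actual API calls:

```python
# Stand-in stage functions; the real app calls the cloud services
# named in the pipeline section (Whisper, GPT-4o-mini, Pyannote.ai,
# Gemini).
def transcribe(path): return f"transcript of {path}"
def summarize(text): return text[:20]
def diarize(path): return ["Speaker 1"]
def embed(text): return [0.0] * 8  # real vectors are 3072-dimensional

def handle_upload(event: dict) -> dict:
    """Run each pipeline stage for a newly uploaded memo and build
    the record that would be written to Firestore/Supabase."""
    uid, path = event["uid"], event["path"]
    transcript = transcribe(path)
    return {
        "uid": uid,                        # keyed per user for RLS
        "transcript": transcript,
        "summary": summarize(transcript),
        "speakers": diarize(path),
        "embedding": embed(transcript),
    }
```

Keying every record by `uid` is what lets Row Level Security on the database side restrict each user to their own rows.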

#Embedded Connect

The open-source connector ecosystem lets you export your data to the tools you already use:

  • Obsidian — Smart Vault with dashboards, people routing, relationship graphs, and 1:1 file management
  • Notion — Database pages with structured properties and rich content
  • JSON Export — Full data dump with optional embedding vectors
  • Python SDK — Authenticate and fetch your memos in 3 lines of code
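To give a feel for the SDK's shape, here is a hypothetical client sketch using only the standard library. The endpoint URL, class name, and field names are illustrative placeholders, not the real SDK's API:

```python
from urllib.parse import urlencode
from urllib.request import Request

class EmbeddedClient:
    """Illustrative connector client: authenticate with a token,
    then build authorized requests for your memos."""

    BASE_URL = "https://example.com/api"  # placeholder endpoint

    def __init__(self, token: str):
        self.token = token

    def memos_request(self, limit: int = 10) -> Request:
        """Build a bearer-authenticated request for recent memos."""
        url = f"{self.BASE_URL}/memos?{urlencode({'limit': limit})}"
        return Request(url, headers={"Authorization": f"Bearer {self.token}"})
```

The real SDK wraps this pattern so that authenticating and fetching memos takes just a few lines.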

View Embedded Connect on GitHub

#Stack

Layer | Technology
App | Native iOS (Swift/SwiftUI)
Transcription | OpenAI Whisper
Summaries | GPT-4o-mini
Speaker ID | Pyannote.ai
Embeddings | Google Gemini
Vector DB | Supabase + pgvector
Backend | Google Cloud Functions
Storage | Firebase Storage + Firestore
