From S3 to PGVector: A Lesson in Operational Simplicity
This week was one of those weeks.
I moved our search stack down a hard path: S3 vectors → Pinecone → PGVector, spanning two services, without breaking the core user experience.
From the outside, this sounds like a backend detail. From the inside, this is the difference between “search kind of works” and “search is reliable enough to trust.”

Why I Made This Move
As we scaled, I had to ask a simple founder question: Are we adding complexity because it helps users, or because it’s the default architecture everyone copies?
We already run Postgres at the center of our system. User data, media metadata, jobs, auth: all there.
Keeping vectors far away in another runtime sounded flexible, but in reality it created friction:
More moving parts
Harder debugging
Identity mismatches between metadata and vectors
Too many places where failures could hide
PGVector wasn’t just a cost or tooling choice. It was a coherence choice.
When your vectors live near your source-of-truth data, the system gets easier to reason about.

What Broke (and What It Taught Me)
The hard part wasn’t ANN math. The hard part was identity discipline.
We had multiple ID forms floating around:
S3 object keys
Generated vector IDs
Canonical media IDs in Postgres
Those are not interchangeable.
In old flows, some embeddings were being written with IDs that looked valid but didn’t match relational truth. Once we enforced foreign keys in PGVector tables, those bad writes started failing loudly.
It hurt for a day. It saved us months of silent corruption.
That’s a founder lesson I keep relearning: systems don’t fail because of one big wrong decision. They fail because of small naming mismatches no one thought were important.

Why PGVector Won for Us (Right Now)
Not because it’s trendy. Because it matches our product reality.
We need strong ownership boundaries per user.
We need deterministic joins between vectors and metadata.
We need one operational surface we can monitor deeply.
We need to debug incidents quickly, not guess across multiple platforms.
PGVector gave us that.
Does this mean external vector systems are bad? No. It means for our current stage, operational simplicity beats architectural novelty.

The Migration Principle I Care About Most
I didn’t want a one-way rewrite.
So I kept a provider abstraction in place:
postgres as the current primary
s3 as a switchable path
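The shape of that abstraction, as a minimal sketch (the interface and class names are my own illustration, not our actual code):

```python
from typing import Protocol


class VectorStore(Protocol):
    """What every backend must offer; callers depend only on this."""

    def upsert(self, media_id: str, vector: list[float]) -> None: ...
    def query(self, vector: list[float], top_k: int) -> list[str]: ...


class PostgresVectorStore:
    """Current primary: vectors live next to relational truth."""

    def upsert(self, media_id: str, vector: list[float]) -> None: ...
    def query(self, vector: list[float], top_k: int) -> list[str]: ...


class S3VectorStore:
    """Kept as a switchable path, not deleted in the migration."""

    def upsert(self, media_id: str, vector: list[float]) -> None: ...
    def query(self, vector: list[float], top_k: int) -> list[str]: ...


PROVIDERS = {"postgres": PostgresVectorStore, "s3": S3VectorStore}


def make_store(name: str) -> VectorStore:
    # One config value decides the backend; callers never change.
    return PROVIDERS[name]()
```

Switching providers becomes a config change plus a data backfill, not a rewrite of every call site.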
Why? Because founders don’t optimize for being “right forever.” We optimize for being adaptable without rewrites.
The goal is not “never change infra again.” The goal is: changing infra should not require rebuilding your company.

What Changed Culturally in Our Engineering Process
This migration also reinforced a process shift for me:
No more “it passed locally, ship it.”
Every critical path gets deterministic smoke checks.
Every bug class gets explicit logs.
Every production-like failure gets turned into a codified test scenario.
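A deterministic smoke check can be very small. A sketch of the pattern, with a stub search path standing in for the real one (function names and the fixture ID are hypothetical):

```python
def smoke_check_search(search_fn) -> None:
    """A fixed, seeded query must return a fixed, known ID on every deploy.

    No 'it passed locally' judgment calls: the fixture is planted in every
    environment, so a miss is always a real failure.
    """
    expected_id = "smoke-media-001"  # seeded fixture, present everywhere
    results = search_fn("known smoke query")
    assert expected_id in results, (
        f"smoke check failed: {expected_id} not in {results}"
    )


# Stub search path for demonstration; in CI this would be the real one.
def fake_search(query: str) -> list[str]:
    return ["smoke-media-001", "media-42"]


smoke_check_search(fake_search)
print("smoke check passed")
```

The same shape works for any critical path: plant a known input, assert a known output, run it on every deploy.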
And maybe most importantly: we now treat “debuggability” as a feature, not an afterthought.

What I’m Proud Of
Not that we switched a vector backend. I’m proud that we did it while preserving product flow, learning from failures quickly, and making the system more understandable for the team after me.
Startups often celebrate speed. I still value speed. But I’m increasingly convinced that clarity is compounding.
Clear systems. Clear IDs. Clear ownership. Clear logs.
That’s what scales.

