Why Object Storage is Eating the Database World
Three teams. Three problems. Same architecture. That’s not a trend, that’s a pattern.
The Pattern
Dumb storage coupled with a smart index, queried through ranged GET requests: that pattern is what made decoupling compute from storage possible in the modern cloud era. The primitive behind it is the HTTP Range header, a request header formatted as Range: bytes=X-Y, and it has been a standard HTTP feature since 1999.
When you scrub a YouTube video to 2:16, your browser doesn’t download the entire video file from the beginning. Instead, it sends an HTTP request with a Range header telling the server, “give me bytes 10,230,000 to 10,432,000”, and the server returns only that slice. Every modern object store, whether AWS S3, Google Cloud Storage, or Azure ADLS, supports the same HTTP ranged request natively. So instead of downloading 10 GB of Parquet files to read 6 row groups, you can ask S3 for exactly the byte ranges of those row groups, provided you have a smart index that tells you where they sit. S3 doesn’t care what’s in those bytes; it serves them up blindly, as long as the intelligence about which bytes to ask for lives in your index.
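Here is that contract in miniature: a toy server-side function that answers a Range: bytes=X-Y header the way an object store would (inclusive byte offsets, a Content-Range response header). This is a sketch of the semantics, not any real server's code.

```python
def serve_range(obj: bytes, range_header: str) -> tuple[bytes, str]:
    """Mimic how an object store answers `Range: bytes=X-Y` (inclusive ends)."""
    spec = range_header.split("=", 1)[1]          # e.g. "10-19"
    start_s, end_s = spec.split("-")
    start, end = int(start_s), int(end_s)
    body = obj[start:end + 1]                     # HTTP byte ranges are inclusive
    content_range = f"bytes {start}-{end}/{len(obj)}"
    return body, content_range

# A 100-byte "object"; ask for bytes 10-19, the way a browser scrubs a video.
obj = bytes(range(100))
body, content_range = serve_range(obj, "Range: bytes=10-19")
```

The server never inspects the bytes it returns; knowing that 10-19 is the interesting slice is entirely the client's (the index's) job.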
That’s the exact “dumb storage, smart index” pattern I explore in this post across three distinct domains: Data Lakes, Physical AI and Robotics, and Vector Search at scale.
You might ask: if the technology has existed since 1999, why has the primitive only now caught on? The answer lies in a few notable advancements over the last decade that finally made the idea practical for data architectures.
The Latency Layer: NVMe in Cloud (2017)
NVMe killed the SATA bottleneck. For high-performance systems like Databricks, Snowflake, or turbopuffer, this is the essential “hot/cold” divide: hot data stays on local NVMe, while cold data sits on S3. This local SSD caching is exactly what makes sub-10ms latency possible; without it, S3 round-trips would destroy performance.
The Consistency Fix: S3 Strong Consistency (2020)
Before 2020, S3 was “eventually consistent,” making it a nightmare for databases to perform immediate reads after writes. The shift to strong consistency was the “floodgate” moment. It transformed object storage from a simple dumping ground into a viable transactional substrate, ending the era of complex workarounds.
The Concurrency Fix: S3 Compare-and-Swap (2024)
Compare-and-swap (CAS) was the final piece of the puzzle. Until late 2024, S3 lacked this fundamental concurrency primitive, without which two concurrent writers could corrupt shared state. That forced architects to bolt on external coordinators like DynamoDB so that multiple concurrent writers could safely update the same metadata. Now that S3 supports conditional, atomic updates, the entire metadata layer can live on object storage.
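S3 exposes this as conditional writes (If-None-Match to create-if-absent, If-Match against an ETag to update). The toy in-memory class below models only the semantics, not the real API; the class name, method names, and keys are all invented for illustration.

```python
import hashlib

class ToyBucket:
    """In-memory model of a conditional (compare-and-swap) object PUT."""
    def __init__(self):
        self._objects = {}  # key -> (etag, body)

    def put_if_match(self, key: str, body: bytes, if_match):
        """Write only if the caller's ETag matches the current one.
        if_match=None means 'create only if the key is absent'
        (analogous to If-None-Match: *)."""
        current = self._objects.get(key)
        current_etag = current[0] if current else None
        if current_etag != if_match:
            return False, current_etag      # precondition failed: lost the race
        etag = hashlib.md5(body).hexdigest()
        self._objects[key] = (etag, body)
        return True, etag

bucket = ToyBucket()
# First writer creates the metadata object and learns its ETag.
ok, etag = bucket.put_if_match("meta/log.json", b'{"v": 1}', if_match=None)
# A second writer racing with a stale view fails instead of clobbering.
stale_ok, _ = bucket.put_if_match("meta/log.json", b'{"v": 99}', if_match=None)
# The writer holding the current ETag succeeds.
fresh_ok, _ = bucket.put_if_match("meta/log.json", b'{"v": 2}', if_match=etag)
```

This is exactly the primitive a transaction log needs: the loser of a race gets a failure it can retry, instead of silently overwriting the winner's commit.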
Now that these advancements have brought us to this point, let me show you how the “dumb storage, smart index” pattern is already being implemented across three distinct domains.
Why Now?
This week, AWS announced S3 Files: S3 buckets accessible as POSIX file systems. Legacy file-based applications can now read and write S3 directly without code changes. Object storage isn’t just eating databases anymore. It’s set to eat file systems too.
Part 1 - Delta Lake (or your favorite kind of lettuce)
Delta Lake (or Iceberg, or Hudi) implements a transaction log that acts as a metadata index, speeding up queries through file-level statistics, partition pruning, and file pruning. The log lets a reader or writer jump straight to the exact files an operation needs on an object store like S3. This “dumb storage, smart index” pattern, popularized by Databricks over the last decade, has significantly eaten into legacy data warehouse revenues (RIP Teradata and Netezza) and forced proprietary warehouses like Snowflake to reluctantly adopt open formats like Iceberg.
Part 2 - Physical AI and MCAP
Most engineers don’t know this one. MCAP is an open standard, much like Apache Parquet, designed by Foxglove to efficiently store and retrieve robotics telemetry. It follows the same pattern: each MCAP file contains a Summary section (playing the role of the transaction log) and chunk offsets (playing the role of Parquet row groups), so a reader can fetch exactly the data it needs with three ranged GETs instead of one full download.
The Summary section at the end of every MCAP file is a chunk index: it records the byte offset and length of every chunk in the file, along with which topics it contains.
Three ranged GETs to retrieve only what you need:
- Footer (28 bytes): where does the Summary section start?
- Summary section: which chunks contain /robot/status, and at what offsets?
- Only those chunks: everything else is never touched.
In production, MCAP files span hours of multi-robot telemetry, and this pattern skips gigabytes of unrelated sensor data.
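The real MCAP binary layout has typed records, CRCs, and magic bytes; the toy below keeps only the shape that matters for the three-GET sequence: chunks, a summary index, and a footer pointing at the summary. Topic names, offsets, and the 8-byte footer are all illustrative, not the actual format.

```python
import json
import struct

def ranged_get(blob: bytes, offset: int, length: int) -> bytes:
    """Stand-in for `GET ... Range: bytes=offset-(offset+length-1)`."""
    return blob[offset:offset + length]

# --- Build a toy file: [chunk A][chunk B][summary JSON][8-byte footer] ---
chunk_a = b"gps telemetry ..."
chunk_b = b"/robot/status telemetry ..."
chunk_index = {
    "/robot/gps":    (0, len(chunk_a)),
    "/robot/status": (len(chunk_a), len(chunk_b)),
}
summary = json.dumps(chunk_index).encode()
summary_offset = len(chunk_a) + len(chunk_b)
footer = struct.pack("<Q", summary_offset)   # last 8 bytes point at the summary
blob = chunk_a + chunk_b + summary + footer

# --- Three ranged GETs, never a full download ---
# 1. Footer: where does the Summary section start?
(sum_off,) = struct.unpack("<Q", ranged_get(blob, len(blob) - 8, 8))
# 2. Summary: which chunk holds /robot/status, and at what offset?
index = json.loads(ranged_get(blob, sum_off, len(blob) - 8 - sum_off))
off, length = index["/robot/status"]
# 3. Only that chunk; the GPS chunk is never touched.
status_chunk = ranged_get(blob, off, length)
```

Swap `ranged_get` for an S3 GetObject call with a Range header and the same three reads work against a multi-gigabyte file in a bucket.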
Part 3 - Vector Search (turbopuffer and LanceDB)
Vector Search and Vector Databases are the backbone of every production AI application. However, we seem to be repeating the same pattern that we adopted in the legacy data warehouse world, i.e., use proprietary databases (Pinecone, Weaviate). With the cost of LLM token consumption spiralling, it doesn’t make any economic sense to add yet another proprietary black-box database to the mix. So how can we implement the same “dumb storage, smart index” pattern in the vector search world? That’s exactly what I have tried to implement in the demo codebase I built for this post.
My approach was to store the embeddings in an object store, which solves the storage cost problem, then download all the vectors into system memory and perform a local cosine-similarity search. This works well with my demo dataset of around a thousand vector embeddings. However, I have scaled enough data pipelines in my career to know that the approach is naive and would break as soon as I hit even a few GBs of embeddings.
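A minimal pure-Python version of that naive brute-force search looks like this; the document IDs and 3-dimensional "embeddings" are toy stand-ins for whatever came out of the object store.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query, vectors, top_k=2):
    """Score every vector: fine for ~1k embeddings, hopeless at GB scale."""
    scored = sorted(vectors.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

vectors = {                      # toy 3-dim "embeddings"
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
results = brute_force_search([1.0, 0.0, 0.0], vectors)
```

Every query scores every vector, so both compute and the initial download grow linearly with the corpus; that is exactly the part that breaks first.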
The scalable, production solution is the same three-step GET sequence as MCAP: fetch a small index from S3 first (HNSW or IVF), use a ranged GET to find which embedding clusters match your query vector, and then issue ranged GETs for only those clusters.
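An IVF-style version of that flow can be sketched as follows, under stated assumptions: a tiny centroid index that would be fetched first, and per-cluster dicts standing in for the bytes a ranged GET would return. The cluster names, byte ranges, and vectors are all made up.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Small index you'd fetch first: one centroid per cluster, plus the byte
# range where that cluster's vectors live in the object.
index = {
    "cluster_0": {"centroid": [1.0, 0.0], "range": (0, 4096)},
    "cluster_1": {"centroid": [0.0, 1.0], "range": (4096, 8192)},
}
# Per-cluster vectors: stands in for what a ranged GET of that range returns.
clusters = {
    "cluster_0": {"doc_a": [0.95, 0.05], "doc_b": [0.9, 0.1]},
    "cluster_1": {"doc_c": [0.1, 0.9]},
}

def search(query, nprobe=1):
    # Step 2: which cluster centroids best match the query vector?
    best = sorted(index, key=lambda c: cosine(query, index[c]["centroid"]),
                  reverse=True)[:nprobe]
    # Step 3: "ranged GET" only those clusters, then score within them.
    candidates = {d: v for c in best for d, v in clusters[c].items()}
    return max(candidates, key=lambda d: cosine(query, candidates[d]))
```

With nprobe=1 a query touches one cluster's byte range instead of the whole corpus; raising nprobe trades extra GETs for recall, which is the standard IVF knob.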
I realized during building this demo that implementing a smart vector index is not a trivial problem to solve over a weekend. Thankfully, we have the likes of turbopuffer and LanceDB already solving for this.
Why Should I Care?
The “Dumb Storage, Smart Index” architecture is the strategic pivot that separates AI POCs from production-grade solutions.
Cursor pioneered this playbook. They were storing billions of vectors across millions of codebases, every index kept in memory, leading to astronomical costs. Moving to turbopuffer’s object-storage-native architecture cut their costs by 95%. But here’s the more interesting part: they didn’t just save money, they immediately started creating more vectors per user than before.
When infrastructure costs shrink, product ambition expands. Features that were shelved come back. That’s the real “so what” of this pattern. If you’re building on Pinecone, Weaviate, or Elasticsearch today, the question isn’t whether to move; it’s when. The architectural shift has already happened. The teams that adopt this pattern first build features that their competition can’t afford to build.
If you want to see the full pattern implemented across all three domains, Delta Lake, MCAP, and Vector Search, running against a local MinIO object store, the code is here: github.com/snudurupati/vectors-at-rest

