The practical 2026 guide to choosing a database for your product, from first principles, for startups and SMBs.
PostgreSQL is now the most-used database among professional developers at 55.6%, up from 48.7% a year earlier, the largest single-year jump in its history - Stack Overflow 2025 Developer Survey. That one number tells you most of what changed in the database market over the last eighteen months. The capital agrees with the developers: Databricks paid about $1 billion for serverless-Postgres startup Neon - CNBC, and database-platform Supabase raised $500 million at a $10.5 billion valuation, doubling in eight months on the back of AI coding tools - CNBC.
Here is the problem: the database market in 2026 has more good options than at any point in its history, and that abundance is the trap. A founder setting up a new product now faces dozens of credible managed databases, three different serverless billing models, a vector-database category that did not exist five years ago, and a wave of licensing changes that quietly moved the goalposts on what "open source" even means. Pick wrong and you either overpay for scale you will never use, or you build on a foundation that cannot follow you when you grow.
This guide breaks down exactly which databases are worth using in 2026, the real pricing behind each, who is winning and who is quietly dying, and a decision framework you can actually apply on day one. It assumes you are building a product, not writing a research paper, so it starts from the structural question (what does a database really do for a product, and what got cheaper) and reasons up from there. For deeper companion reading, we lean on our guide to building software with AI throughout.
Contents
- The 2026 database landscape: what actually changed
- First principles: how to choose a database for your product
- The scoreboard: databases ranked for product builders
- Managed Postgres: the default starting point
- The big clouds: RDS, Aurora, Cloud SQL, AlloyDB, Spanner
- MySQL and distributed SQL: PlanetScale, TiDB, CockroachDB
- SQLite at the edge: Turso, Cloudflare D1, Durable Objects
- NoSQL document databases: MongoDB, DynamoDB, Firestore
- In-memory and cache: Redis, Valkey, Upstash
- Vector databases: pgvector, Pinecone, Qdrant, turbopuffer
- Analytics and warehouses: ClickHouse, DuckDB, Snowflake
- Time-series and search: Tiger Data, Elasticsearch, Meilisearch
- Backend platforms: Supabase, Firebase, Convex, Appwrite
- Local-first and sync engines: PowerSync, Zero, Electric
- Lakehouse and open table formats: Iceberg, DuckLake, S3 Tables
- The AI-native shift: agents as the primary database user
- How to choose: a decision framework for startups and SMBs
1. The 2026 database landscape: what actually changed
The first thing to understand about databases in 2026 is that the market got bigger and more concentrated at the same time. The worldwide database management systems market reached $119.7 billion in 2024 and grew 13.4% year over year, and Gartner projects it past $200 billion by 2027 - Gartner via SiliconANGLE. Roughly two-thirds of that spend is now cloud, not on-premises, which is why almost every database company you will evaluate is really selling you a managed service, not software you install. Databases are the largest single category of infrastructure software, larger than security or analytics, because every application that stores state needs one.
The second thing to understand is why the money and the developers converged on the same answer. The structural cause is the AI coding boom. When a human writes an application, the choice of database is a deliberate architectural decision made once, early, by someone with opinions. When an AI agent writes the application, the database is just the most boring possible dependency it reaches for, and agents overwhelmingly reach for PostgreSQL because every frontier model has been trained on more Postgres than any other database. Supabase reported a 600% year-over-year increase in databases created on its platform, with Anthropic's Claude Code as the single largest contributor - Supabase. Neon, before its acquisition, found that over 80% of databases on its platform were created by AI agents, not humans - Databricks.
The official survey data shows just how decisively Postgres pulled ahead of the field. The image below is the Stack Overflow 2025 most-used databases chart, and the gap between PostgreSQL and second-place MySQL is the widest it has ever been. PostgreSQL sits at 55.6%, MySQL at 40.5%, SQLite at 37.5%, Microsoft SQL Server at 30.1%, and Redis at 28%, with MongoDB, MariaDB, and Elasticsearch filling out the rest of the top tier.
The third thing to understand is that the independents are consolidating fast, and the direction of travel matters for anyone betting a product on a vendor. Databricks bought Neon and turned it into Lakebase. Snowflake bought Crunchy Data for an estimated $250 million to ship Snowflake Postgres - TechCrunch. MongoDB bought embedding-model maker Voyage AI for $220 million to make vectors native - MongoDB. The independents that could not raise growth capital either shut down (FaunaDB) or went private in a buyout (Couchbase, sold for $1.5 billion to Haveli). The chart below maps the biggest database deals and rounds of the period, and the scale of the numbers explains why so many smaller names are disappearing.
What this means in practice is that the safe, boring, well-funded choices are concentrated, and the interesting, cheaper, more specialized choices are proliferating at the edges. A startup does not have to pick between those two worlds on day one, but it does need to know which world each option lives in. The rest of this guide walks both, with the startup and SMB builder (the audience we profile in our global founder data guide) squarely in mind. Before we get to specific products, though, you need a way to think about the decision that does not depend on whatever is trending, because the trend changes every quarter and your data lives for years.
2. First principles: how to choose a database for your product
Most database advice starts from the wrong place. It asks "which database is best" as if databases were interchangeable products competing on a single axis, like phones. They are not. A database is a machine for storing state and answering questions about it, and the entire field exists because different access patterns make radically different trade-offs cheap or expensive. The right first question is therefore not "which database" but "what shape are my reads and writes, and what am I willing to give up to make the common ones fast." Everything else is downstream of that.
There are really only a handful of fundamental workload shapes, and almost every database is optimized for one or two of them at the expense of the others. Transactional workloads (OLTP) are many small reads and writes of individual records, the heartbeat of any app: a user logs in, a row updates, an order is placed. Analytical workloads (OLAP) are few queries that each scan enormous numbers of rows to compute aggregates, the heartbeat of a dashboard. Caching workloads are blisteringly fast reads of recently-seen data where losing the data is survivable. Search and vector workloads rank documents by relevance or similarity rather than fetching them by key. A database tuned for one of these is usually mediocre at the others, which is why large products end up running several.
The single most useful heuristic in 2026 is the one the market itself adopted: start with Postgres and add specialized stores only when Postgres genuinely cannot do the job. This is not dogma, it is economics. A modern Postgres handles relational data, JSON documents, full-text search, geospatial queries, and vector similarity (through the pgvector extension) in one system with one set of backups, one access-control model, and one thing to operate. Every specialized database you add is a second system to secure, monitor, pay for, and keep in sync. The cost of that second system is almost always higher than founders expect, and the threshold at which it becomes worth paying is almost always further away than vendors imply. We frame this and the other early build decisions in our founder's guide to starting a company in 2026.
The second heuristic is about lock-in posture, and it is where most regret originates. A database choice is really a choice about how hard it will be to leave. Standard Postgres or MySQL is maximally portable: you can pg_dump your data and move to any of a dozen hosts. A proprietary serverless store like DynamoDB or Firestore is maximally convenient but binds you to one cloud and one data model. A reactive backend like Convex is wonderful to build on but stores your data in a model that is not raw SQL. None of these is wrong, but you should choose your lock-in deliberately rather than discover it during a painful migration two years later.
The third heuristic is about cost model, not headline price. In 2026 databases bill in at least four different ways, and the cheapest-looking option can become the most expensive depending on your traffic shape. The patterns below recur across every category in this guide.
- Per-instance (size a server, pay whether busy or idle): predictable, can waste money on idle capacity
- Serverless scale-to-zero (pay per unit of work, nothing when idle): great for spiky or dev workloads, harder to forecast
- Per-request (pay per read or write): punishing for high-volume or poorly-indexed apps
- Resource-based (pay for allocated CPU and RAM): predictable, decoupled from query volume
The practical implication is that your traffic shape should pick your billing model before any brand does. A steady, always-on production app with predictable load is cheapest on a right-sized per-instance plan. A side project, an internal tool, or a per-customer database that sits idle most of the day is cheapest on scale-to-zero serverless. A read-heavy app with unpredictable bursts wants resource-based or per-request pricing. Choosing a database whose billing model fights your traffic is the most common way founders end up with a surprise bill, and it is entirely avoidable. With those three lenses (workload shape, lock-in posture, cost model) you can evaluate any database in this guide, including the ones that have not launched yet.
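To make the traffic-shape argument concrete, here is a minimal sketch that prices the same month of traffic under three of the billing models above. Every rate in it is a made-up placeholder chosen to make the comparison visible, not any vendor's price, and the `Traffic` fields are a deliberately crude description of load.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # average hours in a month


@dataclass
class Traffic:
    """One month of load for one database (illustrative fields only)."""
    active_hours: float  # hours in the month with any load at all
    requests: int        # total reads plus writes in the month


# Hypothetical rates, NOT taken from any provider's price list.
INSTANCE_PER_HOUR = 0.05           # per-instance: billed every hour, busy or idle
SERVERLESS_PER_ACTIVE_HOUR = 0.12  # scale-to-zero: billed only while active
PER_MILLION_REQUESTS = 1.00        # per-request: billed per read or write


def monthly_costs(t: Traffic) -> dict[str, float]:
    """Price the same traffic under each billing model."""
    return {
        "per_instance": INSTANCE_PER_HOUR * HOURS_PER_MONTH,
        "scale_to_zero": SERVERLESS_PER_ACTIVE_HOUR * t.active_hours,
        "per_request": PER_MILLION_REQUESTS * t.requests / 1_000_000,
    }


def cheapest(t: Traffic) -> str:
    """Return the name of the billing model with the lowest monthly cost."""
    costs = monthly_costs(t)
    return min(costs, key=costs.get)


side_project = Traffic(active_hours=40, requests=10_000_000)
steady_app = Traffic(active_hours=730, requests=500_000_000)
quiet_bursty = Traffic(active_hours=300, requests=3_000_000)

for name, t in [("side project", side_project),
                ("steady app", steady_app),
                ("quiet but bursty", quiet_bursty)]:
    print(name, "->", cheapest(t), monthly_costs(t))
```

With these placeholder rates, each of the three traffic shapes lands on a different winner, which is the point: swap in the real rates from the providers you are comparing and your own traffic estimate before drawing any conclusion.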
3. The scoreboard: databases ranked for product builders
Before the detailed profiles, here is the master comparison. The table below scores the most decision-relevant databases for a single, specific question: how good a default is this as the primary or a key data store for a startup or SMB building a product in 2026? That lens matters. A pure cache or a vectors-only store can be excellent infrastructure and still score lower here, because it is a complement to a product's database rather than the database itself. Each option is scored 0 to 10 on five weighted criteria, and the final column is the weighted average, sorted highest first.
The five criteria, with weights, are: Cost and free tier (25%) because early-stage budgets are real and free tiers decide what gets prototyped; Versatility and workload fit (25%) because a database that covers many use cases saves you from running several; Developer experience and AI-readiness (20%) because in 2026 your database is increasingly operated by AI coding tools, not just people; Scale and reliability (15%) because you want a foundation that follows you up; and Openness and portability (15%) because your exit path is your insurance policy.
| # | Database | Category | Cost & Free Tier (25%) | Versatility & Fit (25%) | DX & AI-Ready (20%) | Scale & Reliability (15%) | Openness (15%) | Final |
|---|---|---|---|---|---|---|---|---|
| 1 | PostgreSQL | Relational | 9 - free OSS, runs on any host from $0 | 10 - relational, JSON, geo, full-text, pgvector in one | 8 - every LLM trained heavily on it | 8 - vertical scale plus read replicas to huge size | 10 - PostgreSQL License, zero lock-in | 9.1 |
| 2 | Supabase | Backend/Postgres | 9 - free 50,000 MAU, Pro $25/mo | 9 - DB, auth, storage, realtime, vector bundled | 9 - default of most AI app builders | 8 - one project vertical, Multigres sharding in preview | 9 - Apache 2.0, fully self-hostable | 8.9 |
| 3 | Neon / Lakebase | Serverless Postgres | 8 - free tier, usage-based, no minimum | 8 - serverless Postgres with branching | 10 - copy-on-write branches per agent/PR | 8 - autoscaling, now Databricks-backed | 7 - Neon core OSS, Lakebase managed | 8.2 |
| 4 | MongoDB Atlas | Document | 8 - perpetual M0 free, Flex from $8/mo | 8 - documents, native vector, search | 8 - mature drivers, Voyage AI embeddings | 9 - proven at massive scale, MongoDB 8 | 6 - SSPL core, Atlas proprietary | 7.9 |
| 5 | Convex | Backend platform | 8 - free tier, Pro $25/dev/mo | 7 - reactive document-relational backend | 9 - TypeScript end-to-end, AI-friendly | 7 - solid, younger track record | 7 - FSL converts to Apache 2.0 | 7.7 |
| 6 | Tiger Data | Time-series/Postgres | 7 - Tiger Cloud from $30/mo, OSS ext | 8 - Postgres plus time-series and vector | 8 - full Postgres tooling | 8 - 3M+ active databases | 7 - Timescale License plus Apache | 7.6 |
| 7 | Appwrite | Backend platform | 8 - generous free, Pro $25/project | 7 - DB, auth, storage, functions | 8 - clean SDKs, self-hostable | 6 - Cloud GA only in 2025 | 8 - BSD-3, fully self-hostable | 7.5 |
| 8 | PlanetScale | Relational (MySQL+PG) | 6 - no free tier, from $5/mo Postgres | 7 - MySQL and Postgres, NVMe Metal | 8 - branching, online schema change | 9 - Vitess sharding at YouTube scale | 7 - Vitess Apache, platform proprietary | 7.3 |
| 9 | ClickHouse | Analytics/OLAP | 7 - OSS core, Cloud from ~$66/mo | 6 - analytics, not transactional | 7 - SQL, agent-facing features | 9 - billions of rows, $250M ARR | 8 - Apache 2.0 core | 7.2 |
| 10 | Turso | Edge SQLite | 9 - free 500M reads/mo, $4.99/mo | 5 - read-heavy SQLite, single writer | 7 - embedded replicas, roadmap churn | 6 - Rust rewrite still beta | 8 - libSQL open source | 7.0 |
| 11 | Redis 8 / Valkey | Cache / KV | 8 - Valkey free, 20-33% cheaper managed | 4 - cache and KV, a complement | 7 - ubiquitous, simple model | 8 - battle-tested at scale | 9 - Valkey BSD-3, Redis 8 AGPLv3 | 6.9 |
| 12 | Firebase | Backend platform | 6 - Spark free, Blaze surprise bills | 7 - Firestore plus Data Connect Postgres | 8 - best-in-class mobile SDKs | 9 - Google-scale, mature realtime | 4 - proprietary backend | 6.8 |
| 13 | Cloudflare D1 | Edge SQLite | 8 - free 5M reads/day, paid from $5/mo | 5 - SQLite, 10GB per database cap | 8 - tight Workers integration | 7 - read replicas in beta | 5 - proprietary managed | 6.7 |
| 14 | CockroachDB | Distributed SQL | 6 - free under $10M revenue, then paid | 6 - Postgres-compatible distributed SQL | 7 - good DX, multi-region | 9 - strong consistency at scale | 5 - source-available, enterprise license | 6.5 |
| 15 | DynamoDB | Document / KV | 6 - 25GB free, on-demand cut 50% | 6 - KV/doc, access-pattern bound | 7 - zero-ops serverless | 10 - effectively unlimited scale | 3 - AWS-only, heavy lock-in | 6.4 |
| 16 | Aurora DSQL | Distributed SQL | 6 - free 100K DPU/mo, opaque billing | 6 - Postgres-compatible, not full PG | 6 - new, GA May 2025 | 9 - active-active multi-region | 5 - AWS-only managed | 6.3 |
| 17 | Qdrant | Vector | 7 - free tier, OSS core | 3 - vectors only, a complement | 7 - fast Rust engine, filtering | 7 - scales to billions of vectors | 8 - Apache 2.0 | 6.1 |
Read this table as a starting point, not a verdict. The scores reward general-purpose suitability, which is exactly why PostgreSQL and Postgres-based platforms dominate the top and why excellent but specialized tools like Qdrant or Valkey sit lower despite being best-in-class at their actual job. If your product genuinely is a high-concurrency analytics dashboard or a similarity-search engine, the specialist wins for you and the ranking inverts. The point of the table is to show that for the median product, the median right answer is a Postgres platform plus, at most, one or two specialized stores. The sections that follow explain each category in enough depth to know when you are the exception.
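The final column can be reproduced with a few lines of arithmetic, which also makes it easy to re-score the options with weights that match your own priorities. The sketch below uses the weights stated above and two rows copied from the table; the dictionary keys and the `openness_first` weighting are names invented here for illustration.

```python
# Weights from the scoreboard: they must sum to 1.
WEIGHTS = {
    "cost": 0.25,
    "versatility": 0.25,
    "dx": 0.20,
    "scale": 0.15,
    "openness": 0.15,
}


def weighted_score(scores: dict, weights: dict = WEIGHTS) -> float:
    """Weighted average of 0-10 criterion scores."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


# Criterion scores copied from the ClickHouse and Firebase rows.
clickhouse = {"cost": 7, "versatility": 6, "dx": 7, "scale": 9, "openness": 8}
firebase = {"cost": 6, "versatility": 7, "dx": 8, "scale": 9, "openness": 4}

print(round(weighted_score(clickhouse), 2))
print(round(weighted_score(firebase), 2))

# A hypothetical re-weighting for a team that cares most about portability.
openness_first = {
    "cost": 0.15,
    "versatility": 0.20,
    "dx": 0.15,
    "scale": 0.10,
    "openness": 0.40,
}
print(round(weighted_score(clickhouse, openness_first), 2))
print(round(weighted_score(firebase, openness_first), 2))
```

With the guide's weights this gives 7.2 for ClickHouse and 6.8 for Firebase, matching the table. Under the portability-heavy weighting the gap between the two widens, which is the kind of shift to look for when your priorities differ from the median product's.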
4. Managed Postgres: the default starting point
If you take one recommendation from this guide, it is this: for most new products, your first database should be a managed PostgreSQL, and the only real question is which flavor. Postgres won the developer mindshare war (55.6% usage, highest of any database) and the capital war simultaneously, and the result is a remarkably healthy ecosystem of managed providers competing on developer experience rather than raw engine differences, because the engine is the same open-source Postgres underneath. This is the best kind of market to buy in: the commodity is excellent and free, and vendors compete on convenience.
The reason Postgres is the right default is not popularity, it is range. A single Postgres instance is a relational database, a JSON document store, a full-text search engine, a geospatial database, and (with pgvector) a vector database. That range means a startup can ship its entire backend on one system and defer every "do we need a specialized database" decision until it has real usage data to answer the question. The proof point that matters most: OpenAI runs critical infrastructure on Postgres, scaling a single primary with dozens of replicas, as its engineers detailed at the POSETTE 2025 Postgres conference. If Postgres can carry OpenAI's load, it can carry yours for a very long time.
Among the managed providers, Supabase is the standout for founders who want a complete backend on day one. It is standard Postgres wrapped with authentication, file storage, realtime subscriptions, edge functions, and pgvector, all open source under Apache 2.0 and fully self-hostable. Its free tier is the most generous in the category at 50,000 monthly active users, and the Pro plan is a flat $25 per month with usage-based overages above the included quotas - Supabase pricing. The trade-off is that auth, storage, and realtime are coupled to the platform, so leaving for plain Postgres means re-implementing those layers. The image below is the chart Databricks itself used to argue Postgres is the fastest-growing operational database, the same trend that made Supabase a $10.5 billion company.
Neon takes a different angle: serverless Postgres that separates storage from compute, so it can scale to zero when idle and create instant copy-on-write branches of your entire database, like Git branches for data. That branching feature is why Neon became the default Postgres for AI workflows and why Databricks acquired it for around $1 billion in May 2025, relaunching it as Lakebase, which reached general availability in February 2026 - Databricks. Neon's pricing is genuinely pay-as-you-go with no monthly minimum, billed in compute-unit-hours plus storage. The catch is that usage-based billing is harder to forecast than a flat fee, and Neon is database-only: no built-in auth or storage like Supabase.
Beyond the two leaders, the managed-Postgres field has real depth, and the consolidation tells you who is serious. Prisma Postgres went GA in February 2025 with zero cold starts for teams already living in the Prisma ORM. Snowflake bought Crunchy Data to ship an enterprise Snowflake Postgres. Vercel wound down its own white-labeled Postgres and now routes its Marketplace to Neon, Supabase, and Prisma, so a Next.js team picking "Vercel Postgres" is really picking one of those three. The pricing table below shows the practical entry points.
| Provider | Free tier | Paid entry | Standout feature |
|---|---|---|---|
| Supabase | 50,000 MAU, 500 MB DB | $25/mo Pro | Auth + storage + realtime included |
| Neon / Lakebase | 0.5 GB, 100 CU-hrs | usage-based, no minimum | Copy-on-write branching |
| Prisma Postgres | 100K ops, 500 MB | $10/mo Starter | Zero cold starts, Prisma-native |
| Aiven for Postgres | 1 GB single-node | $5/mo Developer | Vendor-neutral multi-cloud |
The practical guidance is straightforward. Choose Supabase if you want a full backend (database plus auth plus storage) and value a clean open-source exit path. Choose Neon or Lakebase if you want serverless economics, database-per-branch workflows, or you are building on AI coding tools that spin databases up and down constantly. Choose Prisma Postgres if your team already standardizes on Prisma. Choose plain RDS or Cloud SQL (covered next) if you are already deep in one cloud and want the no-surprises option. In every one of these cases you are running real Postgres, so the decision is reversible, which is exactly why it is the safe default. For the broader build-with-AI context, our guide to coding and company-building with the latest AI tools covers how these backends slot into an AI-driven workflow.
5. The big clouds: RDS, Aurora, Cloud SQL, AlloyDB, Spanner
The three hyperscalers (AWS, Google Cloud, Microsoft Azure) sell managed relational databases that most startups will eventually touch, especially once they are already running compute on one of those clouds. The strategic thing to understand in 2026 is that each cloud now offers two distinct layers: classic per-instance managed Postgres and MySQL that you size by virtual CPU, and a newer wave of serverless or distributed engines billed by consumption units. Picking between those two layers is the real decision, and it again comes back to your traffic shape rather than the brand on the console.
The classic layer is the workhorse. Amazon RDS, Google Cloud SQL, and Azure Database for PostgreSQL all give you a managed Postgres or MySQL instance sized by vCPU and memory, with predictable per-hour billing. Google Cloud SQL charges roughly $0.0413 per vCPU-hour for its Enterprise edition, with committed-use discounts up to about 52% for a three-year commitment - Google Cloud. The appeal is total predictability: you know your bill, the tooling is mature, and the database is boring in the best way. The downside is the absence of scale-to-zero, so you pay for the instance whether or not anyone is using it, which makes this layer a poor fit for spiky or intermittent workloads.
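As a rough worked example of what that rate means per month, the sketch below multiplies out the vCPU component only. It assumes a 730-hour month and an always-on instance, and it leaves out memory, storage, and network charges, so treat the result as a lower bound rather than a quote.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month


def vcpu_monthly_cost(vcpus: int,
                      rate_per_vcpu_hour: float = 0.0413,
                      discount: float = 0.0) -> float:
    """vCPU-only monthly cost of an always-on instance, before other charges."""
    return vcpus * rate_per_vcpu_hour * HOURS_PER_MONTH * (1 - discount)


on_demand = vcpu_monthly_cost(2)
committed = vcpu_monthly_cost(2, discount=0.52)  # three-year commitment

print(f"2 vCPU on demand:  ${on_demand:.2f}/mo")
print(f"2 vCPU committed:  ${committed:.2f}/mo")
```

At the quoted rate, two vCPUs come to about $60.30 a month on demand and about $28.94 with the full 52% discount, and that bill is the same whether the instance serves a million queries or none.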
The serverless and high-performance layer is where the 2025 to 2026 launches concentrated. AWS Aurora Serverless gained the ability to scale all the way to zero in late 2024 and was renamed simply "Aurora serverless" in April 2026, billing in fine-grained Aurora Capacity Units at about $0.12 per ACU-hour - AWS. Google AlloyDB is a higher-performance Postgres-compatible engine with built-in AI vector acceleration, and its AlloyDB Omni edition can run the same engine anywhere, including on your laptop. The most consequential new entrant is Amazon Aurora DSQL, which reached general availability in May 2025 as a serverless, active-active, multi-region distributed SQL database with strong consistency, billed in Distributed Processing Units at $8 per million DPUs - AWS.
The distributed engines (Aurora DSQL, Google Spanner, Azure Cosmos DB) deserve a clear-eyed framing because their marketing oversells their relevance to most startups. They solve a genuine problem (multi-region strong consistency and effectively unlimited horizontal write scale) that the overwhelming majority of products do not have. AWS claims Aurora DSQL is 4x faster than Spanner on reads and writes, Spanner has TrueTime-backed global consistency, and both are remarkable engineering. But a single Postgres with read replicas is cheaper, simpler, and sufficient until you are operating across continents with write traffic that genuinely cannot fit on one primary. We unpack a parallel infrastructure decision in our guide to payment platforms for your business, where the same "do not over-buy" logic applies.
What this means for a startup is a simple sequencing rule. If you are already on a cloud, start with its classic managed Postgres (RDS, Cloud SQL, Azure Database for PostgreSQL) for steady production load, reach for the cloud's serverless tier (Aurora serverless, Azure SQL serverless) for spiky or dev workloads to capture scale-to-zero savings, and only adopt a distributed engine when you can articulate exactly why single-region Postgres has stopped working for you. The clouds want you to skip straight to the expensive distributed tier; first principles say earn your way there. The one genuinely new reason to consider the distributed tier early is multi-region active-active availability for a global product, which is a real requirement for some businesses and a vanity requirement for most.
6. MySQL and distributed SQL: PlanetScale, TiDB, CockroachDB
MySQL is still the second-most-used database in the Stack Overflow survey at 40.5%, and its ecosystem went through a genuinely strange 2024 to 2026 that every founder evaluating it should understand. The headline is that the MySQL world bifurcated: the best-known managed MySQL company (PlanetScale) pivoted toward Postgres, the open-source MySQL fork (MariaDB) went private in a fire sale, and the most interesting growth happened in distributed SQL engines that speak the MySQL or Postgres wire protocol while scaling horizontally underneath. If you are choosing MySQL in 2026, you are mostly choosing how you want it scaled, not the engine itself.
PlanetScale is the cautionary and instructive case. It built the best managed MySQL on the planet using Vitess, the sharding system that powers YouTube and Slack, then removed its free Hobby tier in April 2024, alienating exactly the hobbyists and bootstrappers who had championed it - PlanetScale. In 2025 it course-corrected hard: it launched PlanetScale for Postgres (its fastest-growing product), reset its entry price to $5 per month for single-node Postgres, and cut its NVMe Metal tier to $50 per month from a previous floor near $600 - PlanetScale. The lesson is not that PlanetScale is bad (the technology is excellent) but that a vendor's free-tier and pricing decisions are part of the product, and they can change under you.
The distributed-SQL category is where the most durable innovation sits, and it splits cleanly by wire protocol. TiDB Cloud from PingCAP is MySQL-compatible, horizontally scalable, and offers a genuinely large free tier: 250 million request units plus 25 GiB of storage per organization per month, with scale-to-zero serverless - PingCAP. It also does HTAP, serving analytics from columnar replicas without a separate warehouse. On the Postgres-compatible side, CockroachDB offers strong multi-region consistency, but its licensing changed in a way that matters: in November 2024 it retired its free, source-available Core edition and moved to a single enterprise license that is free only for companies under $10 million in annual revenue - SD Times.
That CockroachDB license change is worth dwelling on because it captures a pattern that recurs across this guide. A database that is free and open today can become a paid enterprise product tomorrow, and the trigger is often your own success: CockroachDB is free until your company crosses $10 million in revenue, at which point you owe per-CPU fees. For a startup that is a reasonable deal (free when you are small, pay when you can afford it), but it is a deal you should enter with eyes open, because the alternative, the genuinely open-source YugabyteDB under Apache 2.0, exists precisely for teams that refuse that future tax. Cockroach Labs raised a $278 million Series G at a roughly $5 billion valuation in October 2025, so the company is healthy; the question is whether you want your scaling milestone to also be a billing milestone.
The practical recommendation for the MySQL and distributed-SQL space is to match the engine to your migration reality and your scale ambition. Choose PlanetScale (MySQL or Postgres) if you need proven horizontal sharding and want managed Vitess without operating it yourself. Choose TiDB Cloud if you want MySQL compatibility plus a large free tier and built-in analytics. Reach for CockroachDB or Aurora DSQL or Spanner only when multi-region strong consistency is a hard requirement, and prefer YugabyteDB if open-source licensing is non-negotiable for you. For the vast majority of products that think they need distributed SQL, a single well-indexed Postgres with read replicas is the cheaper, simpler answer, and you can always migrate later because the wire protocol is compatible.
7. SQLite at the edge: Turso, Cloudflare D1, Durable Objects
SQLite is the most-deployed database engine on earth (it runs in every phone and browser), and in 2026 it finally escaped the embedded niche to become a real production substrate for web products. The structural insight that made this happen is that SQLite is a library, not a server, so it can be placed wherever the compute is: inside an edge worker, replicated to the user's device, or co-located with application code. That changes the latency math entirely, because a read that never leaves the machine is faster than any network round-trip to a central database can ever be.
The category split into two distinct shapes, and they solve different problems. The first is "ship the database to the edge", where the win is sub-millisecond local reads and per-row, scale-to-zero billing instead of an always-on instance. Turso pioneered this with libSQL and its embedded-replica model, which syncs a real local SQLite file for microsecond reads while forwarding writes to a primary. Its free tier is striking: 5 GB storage, 500 million row reads, and 10 million row writes per month, with a paid Developer plan at just $4.99 per month and unlimited databases - Turso. The catch is that Turso's ambitious Rust rewrite (originally Limbo, now Turso Database) is still in beta in mid-2026, so the production product relies on the older libSQL path, and the roadmap has churned.
The second shape is "co-locate the database with the compute", which is the Cloudflare strategy. Cloudflare D1 is managed SQLite for Workers, billed per row read and written: the free tier allows 5 million row reads per day and the paid tier includes 25 billion monthly row reads for a $5 minimum with no egress charges - Cloudflare. Even more interesting architecturally, SQLite-backed Durable Objects put a tiny SQLite database inside the same object as your code, giving strongly-consistent per-entity state (a chat room, a document, a game session) with zero network latency to its own data. Cloudflare also ships Hyperdrive for teams who want to keep a conventional Postgres but accelerate it from the edge, and R2 SQL for querying large Iceberg datasets in object storage.
The per-row billing model deserves a warning because it is the most common way edge SQLite produces a surprise bill. When you pay per row read, an unindexed query that scans a million rows costs you a million row reads, even if it returns one result. On a per-instance database that full scan is just slow; on D1 or Turso it is slow and expensive. This makes edge SQLite a superb fit for read-heavy, well-indexed, partitionable workloads (per-tenant databases, read-mostly content, lookup tables) and a poor fit for write-heavy relational cores or analytics that scan large tables. The single-writer nature of SQLite reinforces this: it shines when you can shard state across many small databases rather than concentrating writes on one.
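You can check whether a query will scan or seek before it ever reaches a per-row-billed service, because plain SQLite will tell you. The sketch below uses Python's built-in `sqlite3` module and an in-memory database; the `orders` table, its columns, and the index name are invented for illustration, and how a given provider meters rows should be confirmed against its own billing documentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
# 10,000 orders spread evenly over 1,000 customers (10 orders each).
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, float(i)) for i in range(10_000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def plan(sql: str) -> str:
    """Return SQLite's plan description for the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
    return rows[0][3]  # the fourth column holds the human-readable detail


before = plan(QUERY)  # no index on customer_id yet: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(QUERY)   # with the index: a targeted search

print("without index:", before)
print("with index:   ", after)

matching = conn.execute(QUERY, (42,)).fetchall()
print("rows returned:", len(matching))
```

The plan starts with `SCAN` before the index exists and with `SEARCH ... USING INDEX` afterwards, while the query returns the same 10 rows either way. On a per-row-billed store, that is the difference between paying for roughly the whole table and paying for roughly the rows you asked for.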
The recommendation here is narrower than for Postgres, because the technology is narrower. Use Turso when you want a real local SQLite copy with cheap remote sync, especially for database-per-customer architectures or local-first apps. Use Cloudflare D1 and Durable Objects when you are already building on Workers and want a zero-ops relational store with scale-to-zero economics, particularly for per-tenant or per-entity state. Use Litestream (free and open source) if you just want to stream-backup a single self-hosted SQLite file to object storage. And do not reach for edge SQLite as your primary store if your product is write-heavy or needs complex multi-table transactions across a large dataset; that is still Postgres territory.
8. NoSQL document databases: MongoDB, DynamoDB, Firestore
Document databases store flexible, nested records (usually JSON) rather than rigid relational tables, and they remain the right choice for a real set of products: those with rapidly-evolving schemas, deeply nested data, or access patterns that map cleanly to documents rather than joins. The 2026 reality, though, is that this category consolidated hard around a few well-capitalized survivors, and the independents that could not raise growth capital are gone. That consolidation actually simplifies the decision, because the credible options are now few and each is clearly differentiated.
MongoDB Atlas is the default document database for most products, and it has spent aggressively to stay there. It has the best free tier in the category (a perpetual M0 shared cluster with no credit card) and a serverless Flex tier starting at $8 per month. After acquiring embedding-model maker Voyage AI for $220 million, it also offers native vector search with built-in embeddings - MongoDB. MongoDB 8.0 brought up to 32% higher throughput and 56% faster bulk writes - MongoDB. The trade-off is that dedicated clusters get expensive quickly: the Flex tier caps at $30 per month, after which you must jump to a roughly $57-per-month dedicated instance, a noticeable cost step.
Amazon DynamoDB is the serverless document and key-value store for teams already all-in on AWS, and it became materially more competitive when AWS cut on-demand pricing by 50% effective November 2024: in US East, write requests dropped from $1.25 to $0.625 per million and reads from $0.25 to $0.125 per million - AWS. The chart below shows that cut, which made true serverless DynamoDB pricing competitive with provisioned capacity for the first time.
DynamoDB's strength (true serverless scale to zero, predictable single-digit-millisecond latency, effectively unlimited scale) comes paired with its defining weakness: total AWS lock-in and an access-pattern-first data model that punishes you if your queries evolve. You design DynamoDB tables around the exact questions you will ask, and changing those questions later is painful. That is the opposite of Postgres, where you can ask new questions of existing data at will. Google Firestore rounds out the category as the fast-prototyping choice for mobile and web apps, and in 2025 it added a MongoDB-compatible API plus Firebase Data Connect (managed Postgres via Cloud SQL), giving Google a relational path alongside its NoSQL roots.
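The access-pattern-first trade-off can be sketched with a toy in-memory table. This is not the DynamoDB API; `ToyKeyedTable`, its methods, and the `CUSTOMER#`/`ORDER#` key scheme are hypothetical names chosen for illustration. The point is only that a question the keys were designed for touches a handful of items, while a new question has to examine everything.

```python
from collections import defaultdict

class ToyKeyedTable:
    """Toy partition-key / sort-key table. Illustrative only, not DynamoDB."""

    def __init__(self):
        self._partitions = defaultdict(dict)  # pk -> {sk: item}
        self.items_examined = 0

    def put(self, pk, sk, item):
        self._partitions[pk][sk] = item

    def query(self, pk, sk_prefix=""):
        """Planned access pattern: reads matching items from one partition."""
        partition = self._partitions.get(pk, {})
        matches = [
            item for sk, item in sorted(partition.items())
            if sk.startswith(sk_prefix)
        ]
        self.items_examined += len(matches)
        return matches

    def scan(self, predicate):
        """Unplanned question: every item in the table must be examined."""
        results = []
        for partition in self._partitions.values():
            for item in partition.values():
                self.items_examined += 1
                if predicate(item):
                    results.append(item)
        return results

table = ToyKeyedTable()
for customer in range(100):
    for order in range(10):
        status = "shipped" if order % 2 else "pending"
        table.put(f"CUSTOMER#{customer}", f"ORDER#{order:03d}",
                  {"customer": customer, "order": order, "status": status})

# The question the keys were designed for: "orders for customer 7".
orders = table.query("CUSTOMER#7", "ORDER#")
examined_by_query = table.items_examined

# A question nobody planned for: "all pending orders".
table.items_examined = 0
pending = table.scan(lambda item: item["status"] == "pending")
examined_by_scan = table.items_examined
```

In a real key-value store the usual fix for the second question is a secondary index or a redesigned key, which is exactly the rework the paragraph above warns about.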
The clearest signal in this category is what died, and it is instructive. FaunaDB shut down its hosted service in May 2025, with the company stating plainly that it could not raise the capital required to run a global operational database independently - The Register. Couchbase went private in a $1.5 billion buyout. The structural lesson is that operating a database at global scale is enormously capital-intensive, so a startup betting its product on a document database should prefer the well-funded survivors (MongoDB, the hyperscaler offerings) over technically-elegant independents that may not have the balance sheet to last. For most products that genuinely need a document model, MongoDB Atlas is the safe default, with DynamoDB the right call only if you are committed to AWS and have stable, well-understood access patterns.
9. In-memory and cache: Redis, Valkey, Upstash
Caching is the one category in this guide where the database is almost never your product's primary store; it is a speed layer in front of one. An in-memory key-value store like Redis holds hot data (sessions, rate-limit counters, computed results, queues) and serves it in microseconds, accepting that losing the data is survivable because the source of truth lives elsewhere. Nearly every product at scale runs one. The 2026 story here is not about features, it is about a licensing earthquake that permanently reshaped who you should actually use, and it is the clearest example in the whole guide of why licensing belongs in your evaluation.
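The "speed layer in front of a source of truth" idea is usually implemented as a cache-aside read. Below is a minimal sketch with an in-process dictionary standing in for Redis or Valkey; `TTLCache`, `get_profile`, and `load_from_db` are hypothetical names, and a real deployment would use a network client rather than a local dict.

```python
import time

class TTLCache:
    """In-process stand-in for a Redis/Valkey-style cache with expiry."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)

def get_profile(user_id, cache, load_from_db, ttl_seconds=60):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(user_id)
    cache.set(key, value, ttl_seconds)
    return value

db_calls = []

def load_from_db(user_id):
    """Pretend database read; records each call so we can count them."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}

now = [0.0]  # fake clock so the example does not need to sleep
cache = TTLCache(clock=lambda: now[0])

first = get_profile(1, cache, load_from_db)   # miss: reads the database
second = get_profile(1, cache, load_from_db)  # hit: served from the cache
now[0] = 61.0
third = get_profile(1, cache, load_from_db)   # expired: reads the database again
```

Because every value can be rebuilt from the database, losing the cache only costs latency, which is what makes it safe to treat the cache engine as a commodity.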
The sequence matters. In March 2024 Redis abandoned its permissive BSD license for the restrictive SSPL and RSALv2, a move aimed at the cloud providers who resold it. The community responded within days by forking the last open version into Valkey, placed under the neutral governance of the Linux Foundation and backed by AWS, Google, and Oracle - InfoQ. Faced with the cloud giants defecting en masse, Redis reversed course: Redis 8 returned to open source under AGPLv3 in May 2025, adding a new Vector Sets data type and claiming up to 87% faster commands. But the retreat came too late to win back the providers, who now ship Valkey by default.
The practical consequence is a genuine cost difference, not just a philosophical one. AWS ElastiCache for Valkey is priced 33% lower than other engines for serverless and 20% lower for node-based clusters, enabling caching-cost reductions of up to 60% - AWS. Valkey crossed 100 million Docker pulls at its two-year mark, up 17x year over year - AWS. For a new startup spinning up a cache in 2026, Valkey is the default: it is Redis-compatible, genuinely open under BSD-3, cheaper on every major cloud, and carries no relicensing risk because no single vendor controls it.
There are good reasons to choose something other than Valkey, and they are specific. Pick Redis 8 if you specifically want its bundled stack (Search, JSON, Time Series, and the new Vector Sets) under one open license and you are comfortable with AGPLv3. Pick Upstash if you are building serverless or edge applications and want per-request pricing with zero idle cost and an HTTP-callable API that works from edge runtimes where TCP Redis clients cannot reach; its free tier covers 500,000 commands per month and pay-as-you-go is $0.20 per 100,000 commands - Upstash. Pick DragonflyDB if you have outgrown a single Redis node and want vertical scale-up throughput without managing a cluster, or Momento if you want a fully serverless, zero-ops cache billed purely on data transferred. The default, though, is Valkey, and the reason is that the licensing drama handed the community a better-governed, cheaper, fully-compatible option, and there is rarely a reason to take on single-vendor risk for a commodity cache.
10. Vector databases: pgvector, Pinecone, Qdrant, turbopuffer
Vector databases store high-dimensional embeddings and answer similarity queries (find the most semantically-similar documents to this one), and they are the infrastructure under retrieval-augmented generation, semantic search, and AI memory. This category did not meaningfully exist five years ago and is now, per Gartner, the fastest-growing segment of the entire database market, driven entirely by generative AI. It is also the category where founders most often over-buy, because the marketing implies you need a dedicated vector database the moment you touch AI, and for most products that is simply false.
The first-principles question is not "which vector database" but "do I need a separate system at all." In 2026 the answer for the majority of products is no, because pgvector (the Postgres vector extension) plus its companions pgvectorscale and VectorChord make "just add vectors to the database you already have" viable up to roughly 10 to 50 million vectors. The benefits are enormous: your vectors live next to your relational data, share transactional consistency and backups and access control, and cost nothing beyond the Postgres host. Supabase benchmarks have shown pgvector with HNSW indexing matching or beating dedicated vector databases at the million-vector scale, which is why "just use the database you already have" is the right starting answer for most AI features.
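For readers new to the category, it may help to see what a similarity query actually computes. The sketch below is a brute-force cosine search in plain Python over three made-up, three-dimensional "embeddings"; the document names and vectors are invented for illustration. pgvector exposes the same idea inside Postgres through its cosine distance operator (`<=>`), with an HNSW index to avoid comparing against every row.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, documents, k=2):
    """Exact nearest neighbours: score every document, keep the best k."""
    scored = [
        (cosine_similarity(query, vector), doc_id)
        for doc_id, vector in documents.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Made-up embeddings; real ones have hundreds or thousands of dimensions.
documents = {
    "pricing-page": [0.9, 0.1, 0.0],
    "billing-faq": [0.8, 0.3, 0.1],
    "careers": [0.0, 0.2, 0.9],
}

nearest = top_k([1.0, 0.2, 0.0], documents, k=2)
print(nearest)
```

Brute force like this is exact but linear in the number of vectors; an approximate index trades a little recall for speed, which is the knob you tune as the collection grows.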
When you genuinely outgrow Postgres-native vectors, the dedicated engines split by cost model, and 2026 introduced a structural cost breakthrough worth understanding. The old generation (Pinecone, Weaviate, Qdrant in their default modes) keeps vectors in memory, which is fast but expensive at billions of vectors. The new generation is object-storage-first: engines like turbopuffer and LanceDB store vectors on cheap object storage (around $0.02 per GB versus $2 or more per GB in memory), trading a little cold-read latency for up to 100x lower storage cost. The chart below shows why that matters at scale.
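A rough worked version of that storage arithmetic, under loudly stated assumptions: 1,536-dimensional float32 embeddings, the two per-GB prices quoted above treated as monthly rates, and no allowance for index overhead, replication, or query charges. The function names are illustrative.

```python
def vector_storage_gb(num_vectors, dimensions, bytes_per_value=4):
    """Raw size of the embeddings alone, in decimal gigabytes."""
    return num_vectors * dimensions * bytes_per_value / 1e9

def monthly_storage_cost(num_vectors, dimensions, price_per_gb):
    """Storage-only cost; ignores indexes, replicas, and query charges."""
    return vector_storage_gb(num_vectors, dimensions) * price_per_gb

NUM_VECTORS = 10_000_000_000  # ten billion vectors
DIMENSIONS = 1536             # assumption: a common embedding size

raw_gb = vector_storage_gb(NUM_VECTORS, DIMENSIONS)
object_storage_cost = monthly_storage_cost(NUM_VECTORS, DIMENSIONS, 0.02)
in_memory_cost = monthly_storage_cost(NUM_VECTORS, DIMENSIONS, 2.00)

print(f"raw embeddings: {raw_gb:,.0f} GB")
print(f"object storage: ${object_storage_cost:,.0f} per month")
print(f"in memory:      ${in_memory_cost:,.0f} per month")
```

Under these assumptions the raw embeddings come to about 61 TB, so the same data costs on the order of a thousand dollars a month on object storage versus six figures in memory. Real bills differ, but the two-orders-of-magnitude gap is the structural point.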
That cost difference is not theoretical; it is why specific large products migrated. turbopuffer powers Cursor and Notion: Cursor runs one namespace per codebase across tens of millions of namespaces, and Notion stores over 10 billion vectors, reportedly saving millions versus an in-memory approach - pmf.show. Among the more conventional options, Pinecone is the zero-ops managed default (Standard from $50/month minimum, fully closed-source), Qdrant is the high-performance open-source Rust engine with predictable resource-based pricing, Chroma is the simplest path from prototype to RAG demo, and Zilliz/Milvus is the choice once you exceed roughly 100 million vectors. The pricing table below frames the entry points.
| Engine | Architecture and billing model | Entry pricing | Best for |
|---|---|---|---|
| pgvector | Postgres extension | Free (your Postgres host) | Under ~10-50M vectors, default |
| Qdrant | Open-source, resource-based | Free tier, then per-resource | Predictable cost, self-host option |
| Pinecone | Closed, usage-based | $50/mo Standard min | Hands-off managed, bursty traffic |
| turbopuffer | Object-storage-first | $64/mo minimum | Billions of vectors, multi-tenant |
The recommendation is the most contrarian in this guide and the most likely to save you money: start with pgvector in the Postgres you already run, and only adopt a dedicated vector database when you hit a specific wall (scale beyond tens of millions of vectors, latency-at-recall requirements, or multimodal data that does not belong in Postgres). When you do graduate, choose by cost model rather than headline QPS: object-storage-first (turbopuffer, LanceDB) for huge multi-tenant vector volumes where storage dominates, resource-based open-source (Qdrant) for predictable cost, and managed usage-based (Pinecone) for bursty traffic you do not want to operate. The embedding models that feed these systems also matter; the current production options include OpenAI's text-embedding-3, Voyage AI's voyage-3.5 and voyage-4 (now owned by MongoDB), and Cohere Embed v4.
11. Analytics and warehouses: ClickHouse, DuckDB, Snowflake
Analytical databases answer the opposite kind of question from transactional ones: instead of fetching one user's record, they scan billions of rows to compute aggregates for a dashboard or report. You need one once your product has enough data that running analytical queries against your primary Postgres would slow down the app. The 2026 market splits along a clear axis: capital-rich incumbents (Snowflake, BigQuery, Databricks) optimizing for governed enterprise lakehouses, and an open-source wave led by ClickHouse winning latency-sensitive, customer-facing, and increasingly agent-facing analytics at a fraction of the cost.
ClickHouse is the standout, and its trajectory tells the story of the category. It is an open-source columnar engine built for blistering scan performance, and the market rewarded it: it raised a $350 million Series C at a $6.35 billion valuation in May 2025, then a $400 million Series D at a $15 billion valuation in January 2026, more than doubling in under a year - TechCrunch. By May 2026 it had crossed $250 million ARR with 4,000 customers and launched ClickHouse Agents, a Claude-powered analytics service. ClickHouse Cloud is consumption-based, with a small service running around $66 per month at light usage, and it is aggressively cheaper than Snowflake or BigQuery for high-volume scan workloads.
The most startup-friendly entry point, though, is the DuckDB stack, and it is genuinely cheap. DuckDB is an in-process analytical engine (think "SQLite for analytics") that is free, MIT-licensed, and runs embedded in your app, scripts, or notebooks with zero infrastructure. For small-to-medium analytics it is often all you need. When you want managed, multi-user analytics, MotherDuck offers a serverless DuckDB cloud with a genuinely usable free Lite tier and per-hour compute from $0.60 per hour - MotherDuck. And in 2026 DuckDB Labs shipped DuckLake, a new open lakehouse format that reached production-ready v1.0 in April 2026 and solves the "small files" problem by keeping table metadata in a plain SQL catalog rather than scattered files - DuckLake.
The incumbents remain the right answer for a specific buyer. Snowflake charges per-credit (roughly $2 to $4 per credit by edition) and wins when you need a governed, multi-cloud warehouse with deep data-sharing and enterprise trust. Google BigQuery charges $6.25 per TB scanned on-demand with the first terabyte free each month, and is the no-ops default for teams on Google Cloud. Databricks, now valued at $134 billion after a December 2025 round, is the unified lakehouse for data-and-AI-heavy companies that want engineering, warehousing, and ML in one platform. Tinybird sits in between, offering managed ClickHouse as a developer-friendly way to ship real-time analytics APIs without operating ClickHouse yourself.
The decision framework for analytics mirrors the rest of the guide: start small and embedded, scale to managed only when you must. Begin with DuckDB (or just analytical queries against a read replica of your Postgres) while your data is small. Move to MotherDuck or ClickHouse Cloud when you need managed multi-user analytics or sub-second customer-facing dashboards. Reach for Snowflake, BigQuery, or Databricks when you need enterprise governance, multi-cloud data sharing, or a unified ML platform and have the budget to match. The single most expensive mistake here is loading data into a proprietary warehouse before you need one, because warehouse credits compound and the cheapest query is the one that runs against open Parquet files in object storage, which is exactly what the lakehouse section covers.
12. Time-series and search: Tiger Data, Elasticsearch, Meilisearch
Two specialized categories sit alongside the analytics world and follow similar logic: time-series databases for metrics, IoT, and monitoring data, and search engines for full-text and relevance ranking. In both, the 2025 to 2026 story is consolidation around governance and engine rewrites rather than feature wars, and in both, the Postgres-or-open-source default holds. You reach for these specialists when your product has a genuine time-series or search-relevance need that a general database serves poorly, not as a reflex.
On the time-series side, the notable move is that TimescaleDB rebranded to Tiger Data in June 2025, repositioning from "time-series extension" to "the modern PostgreSQL for the analytical and agentic era" - Tiger Data. That repositioning reflects reality: the majority of workloads on its cloud are no longer time-series, because customers run entire applications on it. Tiger Cloud starts at $30 per month and, crucially, it is still PostgreSQL, so you get time-series superpowers without adopting a separate database. The main dedicated alternative, InfluxDB, completed a full Rust-and-Apache-Arrow rewrite with InfluxDB 3 reaching general availability in April 2025, and QuestDB offers an Apache-2.0 engine for maximum ingest performance. For most products with time-series needs, though, Tiger Data's "it's just Postgres" pitch is the path of least resistance.
On the search side, the licensing drama that hit Redis also hit search, and it resolved more cleanly. Elastic returned Elasticsearch to OSI-approved open source under AGPLv3 in August 2024, reversing its 2021 move away from open source - Elastic. The following month, AWS handed OpenSearch to the Linux Foundation, establishing vendor-neutral governance with premier members AWS, SAP, and Uber - TechCrunch. That double move removed the single-vendor risk that had originally forked the ecosystem, so a startup choosing search in 2026 can pick Elasticsearch or OpenSearch without the licensing anxiety that once clouded the choice.
For most startups, though, the heavyweight search platforms are overkill, and the cost-model distinction is what should drive the choice. Algolia is the managed, AI-capable search default for e-commerce and content, but it bills per search (Grow charges $0.50 per 1,000 searches, rising to $1.75 with AI features), which means your cost scales directly with traffic and can spike. The open-source alternatives flip that model: Typesense and Meilisearch bill on resources, not queries, so traffic spikes do not create surprise bills. Meilisearch Cloud starts at $20 per month and is MIT-licensed with built-in AI hybrid search; Typesense Cloud starts around $7 per month for a small cluster.
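The billing-model difference reduces to a break-even calculation. The sketch below uses only the list prices quoted above and deliberately ignores free allowances, plan limits, and whether a $20 cluster could actually serve a given query volume, so treat the outputs as orders of magnitude rather than quotes. The function names are illustrative.

```python
def per_search_cost(searches_per_month, price_per_thousand):
    """Monthly bill on a per-search plan, with no free allowance."""
    return searches_per_month / 1000 * price_per_thousand

def break_even_searches(flat_monthly_price, price_per_thousand):
    """Monthly search volume at which a per-search plan costs the same as a flat plan."""
    return flat_monthly_price / price_per_thousand * 1000

PER_1K_SEARCHES = 0.50     # per-search plan, base rate quoted above
PER_1K_SEARCHES_AI = 1.75  # per-search plan with AI features
FLAT_PLAN = 20.0           # resource-based entry plan quoted above

break_even = break_even_searches(FLAT_PLAN, PER_1K_SEARCHES)
break_even_ai = break_even_searches(FLAT_PLAN, PER_1K_SEARCHES_AI)
cost_at_one_million = per_search_cost(1_000_000, PER_1K_SEARCHES)

print(f"break-even (base): {break_even:,.0f} searches per month")
print(f"break-even (AI):   {break_even_ai:,.0f} searches per month")
print(f"1M searches on the per-search plan: ${cost_at_one_million:,.0f}")
```

On these numbers the per-search plan is cheaper only below roughly 40,000 searches a month, and a single viral day can move you well past that.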
The guidance is to match the billing model to your query volume and the engine to your operational appetite. Choose Algolia if you want fully-managed search with minimal ops and your query volume is low-to-moderate, accepting the per-search cost. Choose Meilisearch or Typesense if you have high or spiky query volume and want predictable resource-based pricing, or if you want to self-host on a permissive license. Choose Elasticsearch or OpenSearch if you need search that doubles as a logging, observability, and security-analytics platform, now that both are safely open source. And remember the recurring theme: Postgres full-text search is genuinely good, so for a product with modest search needs you may not need a dedicated search engine at all until relevance ranking becomes a core feature.
13. Backend platforms: Supabase, Firebase, Convex, Appwrite
For a large and growing share of founders, the real question is not "which database" but "which complete backend," because they want a database plus authentication plus file storage plus an API in one place rather than wiring those pieces together themselves. This is the backend-as-a-service category, and in 2026 it matters more than ever because the integration tax (the engineering time to stitch a managed database to a separate auth provider to a separate storage bucket to an API layer) usually costs more than any per-service savings, and because AI coding agents work dramatically better against a single typed backend than against five separate SDKs.
The decisive insight is that pricing has largely converged (a $25-per-month paid floor is now standard across almost every vendor), so the real differentiator is not cost but lock-in posture: where does your data live, and how hard is it to leave? The options sort cleanly into three architectural camps. The first is a hosted convenience layer over open Postgres: Supabase and Nhost give you batteries-included backends where the data is portable SQL and the whole stack is self-hostable, so your exit path stays clean. The second is a proprietary-but-open-source reactive runtime: Convex and InstantDB offer beautiful developer experience and automatic reactivity, but store data in their own model rather than raw SQL. The third is a self-hostable single binary you own outright: PocketBase and Appwrite. This same assemble-versus-all-in-one tension recurs across your stack, for instance when picking email sending tools for your platform.
Each camp has a clear standout, and the choice follows from how much you value portability versus developer experience. Supabase leads the open-Postgres camp and is the overall default, for the reasons covered in section 4. Convex leads the reactive camp: its queries auto-update with zero manual websocket code, it is TypeScript-native end-to-end (an excellent fit for AI code generation), and it raised $24 million from a16z and Spark Capital in November 2025 - Convex. Appwrite leads the own-it-outright camp: genuinely open source under BSD-3, fully self-hostable, with per-project rather than per-seat pricing and an unusually generous Pro tier at $25 per month. Firebase remains the mobile-first incumbent, now offering both Firestore NoSQL and a managed-Postgres path via Data Connect, though its Blaze per-operation billing is notorious for surprise bills.
The practical sorting is the cleanest in any category. Choose Supabase if you want an all-in-one backend but refuse proprietary lock-in, which describes most teams. Choose Convex if you are a TypeScript team building collaborative, realtime apps and you value developer experience over raw SQL portability. Choose Appwrite or PocketBase if self-hosting and total ownership are priorities, with PocketBase (a single Go binary with SQLite) being the cheapest possible option for small apps and internal tools. Choose Firebase if you are mobile-first and already in the Google ecosystem. The deeper point is that for a product builder, a backend platform often is the database decision, and the same lock-in logic from section 2 applies: pick your exit path deliberately, because the convenience you buy today is the migration you pay for tomorrow.
14. Local-first and sync engines: PowerSync, Zero, Electric
Local-first is the most architecturally interesting database trend of 2026, and it inverts the classic client-server model entirely. Instead of every read and write traveling to a central server, data lives on the device first (in WASM Postgres, SQLite, or a conflict-free replicated data type) and syncs in the background. The payoff is dramatic: zero-latency reads and writes, automatic offline support, and built-in real-time multiplayer, all without loading spinners. This is not a niche; it is how the best collaborative and mobile apps increasingly feel, and the infrastructure to build it that way finally matured.
The category bifurcated into two architectural camps, and they suit different products. The first is sync engines that mirror a server database to an embedded client store. PowerSync syncs Postgres, MongoDB, MySQL, or SQL Server to on-device SQLite, with a free tier and Pro from $49 per month. Zero (from Rocicorp, successor to Replicache) reached its 1.0 release in June 2026 as an open-source, Apache-2.0 sync engine for Postgres-backed apps - InfoQ. Electric (formerly ElectricSQL) combines two pieces: a CDN-cached sync engine that scales to over a million concurrent readers, and PGlite, a full Postgres compiled to WebAssembly that runs in the browser. These let you keep a normal server Postgres while delivering instant-feeling clients.
The second camp is CRDT-native stores that resolve conflicts mathematically for true peer-to-peer collaboration. Yjs is the production default with the largest ecosystem (around 920,000 weekly downloads and bindings for every major editor), Automerge offers the best document version history (and its 3.0 release cut memory usage by over 10x, dropping a Moby-Dick-sized document from 700MB to 1.3MB), and Loro is the fastest newcomer with a Rust core and 2-to-5x smaller documents. These are libraries you embed rather than services you pay for, and they are the foundation under collaborative editors, whiteboards, and design tools.
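"Resolving conflicts mathematically" is easiest to see in the simplest CRDT, a grow-only counter. The sketch below is a textbook G-Counter, not how Yjs, Automerge, or Loro are implemented (those use far richer sequence and document CRDTs); the class and replica names are illustrative. What it shares with them is the key property: merges can happen in any order, any number of times, and every replica ends up in the same state.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merge takes the per-slot max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount=1):
        # Each replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other):
        # Taking the max per slot is commutative, associative, and idempotent.
        for replica, count in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), count)

    @property
    def value(self):
        return sum(self.counts.values())

# Two devices edit while offline...
phone = GCounter("phone")
laptop = GCounter("laptop")
phone.increment(3)
laptop.increment(2)

# ...then sync in whatever order the network allows.
phone.merge(laptop)
laptop.merge(phone)
laptop.merge(phone)  # merging the same state twice changes nothing

print(phone.value, laptop.value)
```

No server had to arbitrate, which is why this family of data structures suits peer-to-peer and offline-first collaboration.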
Local-first is genuinely powerful but it is not free of cost, and knowing when it wins is the whole game. It wins decisively for collaborative editors, offline-capable mobile and field apps, and instant-feeling UIs, where the latency and offline benefits are transformative. It is overkill, and adds real complexity, for simple CRUD apps, anything needing strong server-side consistency and authority (financial transactions, inventory with hard constraints), or backends where the data model is inherently centralized. The architectural shift (data on the client, sync in the background) is a paradigm change that not every product needs. The practical advice is to reach for local-first when your product's value depends on collaboration, offline capability, or sub-perceptible latency, and to stay with a conventional client-server database otherwise, because you can always add a sync layer later but you cannot easily remove the complexity once it is load-bearing.
15. Lakehouse and open table formats: Iceberg, DuckLake, S3 Tables
The lakehouse layer is where your data goes when there is a lot of it and you want to query it with multiple engines without paying a warehouse to hold it hostage. The core idea is simple and powerful: store your data as open Parquet files in cheap object storage (S3, GCS, R2) with an open table format on top that adds database-like features (transactions, schema evolution, time travel). This decouples storage from compute, kills vendor lock-in, and can cut cost by 5 to 20x versus loading the same data into a proprietary warehouse. For a data-heavy startup, understanding this layer is how you avoid a five-figure monthly warehouse bill you did not need.
The format war that defined this space for years is effectively over: Apache Iceberg won. The decisive moment was Databricks acquiring Tabular (the company founded by Iceberg's creators) for a reported $1 billion or more in June 2024 - CNBC. Iceberg now has the broadest multi-engine support (Spark, Trino, Flink, Snowflake, Databricks, DuckDB, BigQuery all read it), and even Delta Lake added UniForm to write Iceberg-compatible metadata over the same Parquet files, making the old Iceberg-versus-Delta choice increasingly moot. The battleground moved up a layer to catalogs (Apache Polaris versus Databricks Unity Catalog versus managed services), which track which tables exist and where.
Two 2025-to-2026 developments matter most for startups specifically. The first is AWS S3 Tables, launched at re:Invent 2024 as fully-managed Iceberg tables that handle compaction and maintenance for you, removing the operational burden of self-managing Iceberg at the cost of a premium over raw S3. The second is DuckLake, DuckDB Labs' radically simpler approach: put all table metadata in a plain SQL database (SQLite, Postgres, or DuckDB) rather than in scattered metadata files, which solves the "small files" problem that makes Iceberg inefficient for frequent small writes. DuckLake reached production-ready v1.0 in April 2026, and for a small team already in the DuckDB stack it is the simplest path to a real lakehouse.
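The "metadata in a plain SQL catalog" idea can be sketched with two tables. The schema below is hypothetical and much simpler than DuckLake's real catalog; it uses Python's built-in `sqlite3` only to show how snapshots and time travel fall out of ordinary rows and transactions, with the Parquet file paths as made-up placeholders.

```python
import sqlite3

catalog = sqlite3.connect(":memory:")
catalog.executescript("""
CREATE TABLE snapshots (snapshot_id INTEGER PRIMARY KEY, note TEXT);
CREATE TABLE data_files (
    path TEXT,
    table_name TEXT,
    added_snapshot INTEGER,
    removed_snapshot INTEGER   -- NULL while the file is still live
);
""")

def commit(note, add=(), remove=(), table_name="events"):
    """Record one snapshot: files added and files retired, in one transaction."""
    with catalog:  # commits on success, rolls back on error
        snapshot_id = catalog.execute(
            "INSERT INTO snapshots (note) VALUES (?)", (note,)
        ).lastrowid
        catalog.executemany(
            "INSERT INTO data_files VALUES (?, ?, ?, NULL)",
            [(path, table_name, snapshot_id) for path in add],
        )
        catalog.executemany(
            "UPDATE data_files SET removed_snapshot = ? WHERE path = ?",
            [(snapshot_id, path) for path in remove],
        )
    return snapshot_id

def files_at(snapshot_id, table_name="events"):
    """Time travel: which files made up the table as of a given snapshot?"""
    rows = catalog.execute(
        """SELECT path FROM data_files
           WHERE table_name = ? AND added_snapshot <= ?
             AND (removed_snapshot IS NULL OR removed_snapshot > ?)
           ORDER BY path""",
        (table_name, snapshot_id, snapshot_id),
    )
    return [row[0] for row in rows]

s1 = commit("first load", add=["events/a.parquet"])
s2 = commit("small append", add=["events/b.parquet"])
s3 = commit("compaction", add=["events/ab.parquet"],
            remove=["events/a.parquet", "events/b.parquet"])

print(files_at(s2))  # the table before compaction
print(files_at(s3))  # the table after compaction
```

A small write here is one short SQL transaction rather than a new set of metadata files in object storage, which is the intuition behind the "small files" claim.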
The practical takeaway is that the lakehouse is no longer just an enterprise concern, and adopting open table formats early is cheap insurance against warehouse lock-in. If you expect to accumulate large analytical datasets, store them as Iceberg (or DuckLake) on object storage from the start, and point whatever query engine you like at them, rather than loading everything into a proprietary warehouse that charges you to read your own data. Use S3 Tables if you are AWS-native and want managed Iceberg without operating compaction yourself, DuckLake if you are a small team in the DuckDB ecosystem who wants the simplest possible lakehouse, and a vendor-neutral catalog like Apache Polaris if you want to keep your options open across engines. The structural win is the same one that runs through this entire guide: open formats and cheap storage decouple you from any single vendor, and that optionality compounds in your favor over the years your data lives.
16. The AI-native shift: agents as the primary database user
The deepest change in the 2026 database market is not a product; it is a change in who, or what, uses the database. For fifty years databases were designed for two consumers: applications written by humans, and analysts asking questions. In 2026 a third consumer became dominant in new-database creation: the AI agent. Neon's telemetry, which shows that over 80% of databases on its platform were created by agents rather than humans, is the single most important data point in this guide, because it explains the acquisitions, the product roadmaps, and where the puck is going - Databricks. When agents are the primary creators of databases, the features that matter change.
The features that win in an agent-first world are different from the ones that won in a human-first world. Raw query speed matters less; agent-friendly primitives matter more. The clearest example is instant copy-on-write branching: an AI coding tool building your app needs to spin up a throwaway database for each build, each preview, each checkpoint, and tear it down seconds later, which is exactly what Neon's branching enables and why Replit's time-travel feature is built on it. The second is native vector and embedding pipelines, so an agent can store and retrieve memories without a second system. The third is MCP servers for databases, which let an agent provision and query a database in natural language rather than hand-written SQL.
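Copy-on-write branching is cheap because a branch stores only what it changes. The toy below shows the idea with a key-value overlay; it is not Neon's implementation (which works at the storage layer), the `Branch` class is hypothetical, and it assumes the parent is not modified after the branch is taken.

```python
class Branch:
    """Toy copy-on-write branch: keeps only its own changes, reads fall through to the parent."""

    _DELETED = object()  # marker for keys removed on this branch

    def __init__(self, parent=None):
        self._parent = parent
        self._changes = {}

    def branch(self):
        """Create a child branch in constant time: nothing is copied."""
        return Branch(parent=self)

    def put(self, key, value):
        self._changes[key] = value

    def delete(self, key):
        self._changes[key] = Branch._DELETED

    def get(self, key, default=None):
        node = self
        while node is not None:
            if key in node._changes:
                value = node._changes[key]
                return default if value is Branch._DELETED else value
            node = node._parent
        return default

main = Branch()
main.put("users:1", "ada")
main.put("users:2", "grace")

# An agent takes a throwaway branch for a preview build and experiments freely.
preview = main.branch()
preview.put("users:2", "grace-renamed")
preview.delete("users:1")

print(main.get("users:1"), main.get("users:2"))
print(preview.get("users:1"), preview.get("users:2"))
```

Discarding the preview is just dropping the object, and the main branch never saw the experiment, which is the property that makes per-build and per-checkpoint databases practical.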
This is precisely why Databricks built Lakebase on Neon and why it is the clearest expression of the agent-first thesis: a fully-managed Postgres designed for AI applications, with instant zero-copy branching, point-in-time recovery, and native pgvector, which reached general availability in February 2026. The video below is Databricks introducing it.
The architecture diagram below, from the same Data + AI Summit keynote, shows the core idea: an operational Postgres layer that automatically syncs with the lakehouse, so the same data serves an application, a model, and an agent without a separate pipeline.
The same logic drove MongoDB to acquire Voyage AI and ship Automated Embedding (in public preview as of May 2026), so embeddings are generated and kept in sync inside the operational database, removing a whole class of RAG plumbing. It is also creating an entirely new category of agent memory layers: tools like Mem0 (a free tier of 10,000 memories, Pro at $249/month), Letta (the production evolution of Berkeley's MemGPT), and Zep (temporal knowledge-graph memory with graph features at $25/month) give agents persistent memory that outlives a single conversation. And the MCP server pattern, where Supabase, Neon, and others expose their databases to AI assistants over the Model Context Protocol, is making "the agent manages the database" a real workflow rather than a demo. This same agentic shift, riding the steadily improving capability and cost of the latest models, is the backdrop to our Claude Opus 4.8 benchmarks and guide.
This shift is something operators building autonomous companies have leaned into directly. Yuma Heymans (@yumahey), founder of O-mega (the company behind the autonomous-company builder Founden) and co-founder of the recruitment engine HeroHunt.ai (which searches across roughly a billion profiles), has written about running a company where AI agents, not people, perform the daily work and spin up the infrastructure, which is the practical face of the agents-as-database-users trend this section describes. The structural implication for your own database choice is that the ability to be driven by an agent is now a real evaluation criterion: branching, MCP support, native vectors, and clean typed schemas are not nice-to-haves if AI coding tools will be operating your backend, and they increasingly will be. The platforms that built for this (Supabase, Neon/Lakebase, Convex) are the same ones at the top of the scoreboard, which is not a coincidence.
17. How to choose: a decision framework for startups and SMBs
Having walked every category, the synthesis is simpler than the breadth suggests, and it compresses into a sequence you can apply on the day you start building. The structural truth underneath all of it is that intelligence and storage both got cheap, and the integration work between systems is now the expensive part. That single fact explains why the all-in-one Postgres platforms won, why agents reach for the most-trained-on database, and why every additional specialized system you bolt on has to clear a higher bar than founders instinctively apply. The framework below is built from that truth, not from whatever is trending.
Start with the default and earn your way off it. Pick a managed Postgres platform (Supabase if you want batteries included, Neon or Lakebase if you want serverless branching, plain RDS or Cloud SQL if you are already in a cloud) as your primary store, because it covers relational, JSON, full-text, geospatial, and vector workloads in one portable system. Add a cache (Valkey, or Upstash for serverless) only when you have measured a real read-latency problem. Add a vector database only when pgvector genuinely stops scaling. Add an analytics database (DuckDB, then ClickHouse) only when analytical queries threaten your primary. Each addition is a second system to secure, monitor, and pay for, so each one needs a specific, measured reason.
This balanced, measure-first posture is what good practitioners advocate, and it cuts both ways: there are situations where "just use Postgres" becomes its own trap and a specialized store really is the right call.
Then layer in the three lenses from section 2 to make each specific choice. Use the cost-model lens to match billing to your traffic: per-instance for steady load, scale-to-zero serverless for spiky or dev workloads, resource-based for predictable read-heavy apps, and avoid per-request billing for anything with high query volume or poorly indexed queries. Use the lock-in lens to choose your exit path deliberately: open Postgres and standard SQL for maximum portability, a proprietary serverless store only when its convenience clearly outweighs the binding. Use the workload-shape lens to know when you are the exception who genuinely needs a specialist on day one (a real-time analytics product needs ClickHouse early; a similarity-search product needs a vector engine early; a collaborative editor needs local-first early).
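The cost-model lens in particular reduces to a small lookup, sketched below in Python. The enum, the function names, and the boolean inputs are hypothetical labels invented for this illustration; only the traffic-to-billing pairings come from the paragraph above. Treat it as a checklist in code form, not a pricing calculator.

```python
from enum import Enum


class Traffic(Enum):
    """Hypothetical labels for the traffic shapes named in the text."""
    STEADY = "steady"
    SPIKY_OR_DEV = "spiky_or_dev"
    PREDICTABLE_READ_HEAVY = "predictable_read_heavy"


# Pairings taken from the cost-model lens described above.
BILLING_FOR_TRAFFIC = {
    Traffic.STEADY: "per-instance",
    Traffic.SPIKY_OR_DEV: "scale-to-zero serverless",
    Traffic.PREDICTABLE_READ_HEAVY: "resource-based",
}


def pick_billing_model(traffic: Traffic) -> str:
    """Return the billing model that matches a given traffic shape."""
    return BILLING_FOR_TRAFFIC[traffic]


def per_request_billing_is_risky(high_query_volume: bool, poorly_indexed: bool) -> bool:
    """Per-request billing is worth avoiding if either warning sign is present."""
    return high_query_volume or poorly_indexed


if __name__ == "__main__":
    print(pick_billing_model(Traffic.SPIKY_OR_DEV))
    print(per_request_billing_is_risky(high_query_volume=False, poorly_indexed=True))
```

Lock-in posture and workload shape resist this kind of lookup because they depend on judgment about your product, which is why they are left as prose.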
There is one more option worth naming honestly, because it sits at the highest level of abstraction: not choosing a database at all. AI app builders and autonomous-company platforms increasingly provision and manage the database for you as a byproduct of building the application, so the founder never touches a connection string. This is the category that Founden sits in: you describe the company or product you want, and the AI builds and runs the whole stack, picking and provisioning the database (almost always Postgres, because that is what the agents reach for) without you ever choosing one. For a non-technical founder shipping a first product, that is a legitimate and increasingly common answer, with the same caveat as any convenience layer: know what is underneath, so you can take ownership of it when you outgrow the abstraction. We rank the broader field of these tools in our guide to the top AI app builders, and map the website-first options in our AI website builders market map.
The final word is the most reassuring one. In 2026, for the median startup or SMB building a product, the right database is a managed PostgreSQL, and almost every wrong turn comes from adding complexity before you have the usage data to justify it. The market spent a billion dollars and several years arriving at the same conclusion that a careful first-principles analysis reaches in an afternoon: start simple, stay portable, add specialists only when measured need demands it, and let the data, not the trend, tell you when you have become the exception. Do that, and your database will be the least of your worries, which is exactly what a database should be.
This guide reflects the database landscape as of June 2026. Pricing, licensing, funding, and product availability in this market change frequently (often monthly), so verify current details on each provider's official pricing and documentation pages before committing.