Hacker News

> From 01979 until about 02000, Oracle's RDBMS software was probably the best in the world, and definitely better than the free-software alternatives like Postgres. [...] For several years after that, it was still the best database for some purposes.

Which RDBMS software has been the best in the world since then?






I don't know. Oracle was written to run on a VAXCluster with a shared disk with a seek time in the tens of milliseconds, and things like Postgres are kind of architected for that world. The world has changed a lot. Anything you could fit on disk in 02000 fits in RAM today, most of our programmable computing power is in GPUs instead of CPUs, website workloads are insanely read-heavy (favoring materialized views and readslave farms), SSDs can commit transactions durably in 0.1ms and support random reads while imposing heavy penalties on nonsequential writes, and spinning up ten thousand AWS Lambda functions to process a query in parallel is now a reasonable thing to do.

I think you could make reasonable arguments for SQLite, Postgres, MariaDB, Impala, Hive, HSQLDB, Spark, Drill, or even NumPy, TensorFlow, or Unity's ECS, though those last few lack the "internal representation independence" ("data independence") so central to Codd's conception.

What's your opinion?


If you want high availability and scalability (and damn the expense), then Oracle is probably still number one, especially for write-heavy workloads. But not everyone can afford to burn money.

MS SQL Server, but certainly not an opinion shared among many HNers.

Postgres presumably

As far as I am aware (I may be wrong, or things may have changed in recent years), Postgres is still worse at horizontal scaling than Oracle and Microsoft SQL Server (and likely also DB2).

Maybe, but I think MariaDB beats all of them at that.

Also, though, horizontal scaling is a lot less important now than it was 20 or 30 years ago. https://www.servethehome.com/2025-server-starting-point-inte... says AMD has 192 cores per socket, and you can get two-socket motherboards, so 384 cores total. And you can stick 12 128GiB DDR5-6000 DIMMs in it, so 1.5 tebibytes of RAM, and a single SSD is 30 terabytes, and SSDs can commit a transaction group durably in typically 0.1 milliseconds. And those 384 cores (EPYC 9005, so Zen 5c, https://www.servethehome.com/amd-epyc-9005-turin-turns-trans...) are 2.25GHz and typically about 2.2 instructions per clock (https://chipsandcheese.com/p/zen-5-variants-and-more-clock-f...), and they support AVX512.

As one rough estimate, 2.25GHz with AVX512 (at 1 IPC) means you can do 36 billion column-oriented 32-bit integer operations per core per second, which with 384 cores means about 13.8 trillion 32-bit integer operations per second. On one server. So if you have a query that needs to do a linear scan of a column in a 13-million-row table, the query might take 300μs on one core, but you should be able to do a million such queries per second. But normally you index your tables so that most queries don't need to do such inefficient things, so you should be able to handle many more queries per second than that!
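The scan estimate above can be sanity-checked with a NumPy sketch (the row count and the predicate are illustrative, not from any real benchmark):

```python
import numpy as np

# Back-of-envelope: 2.25 GHz * 16 32-bit lanes per AVX512 op = 36e9 ops/core/s,
# so a 13-million-row scan is roughly 13e6 / 36e9 ~ 360 microseconds on one core.
ops_per_core_per_s = 2.25e9 * 16

# Hypothetical 13-million-row table with one 32-bit integer column.
N = 13_000_000
col = np.arange(N, dtype=np.int32)

# A "linear scan" query: count rows matching a predicate. NumPy executes this
# as a vectorized pass over the contiguous column, which is the access pattern
# the estimate assumes.
matches = int(np.count_nonzero(col % 1000 == 0))
print(matches)  # 13000: one match per thousand rows
```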

(Each socket has 12 DDR5 channels, totaling 576 gigabytes per second to DRAM per socket or 1.13 terabytes per second across the two of them, so you'll get worse performance if you're out of cache. And apparently you can use 512GiB DIMMs and get 6 tebibytes of RAM!)
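The bandwidth figures in the parenthetical fall straight out of the channel arithmetic; a quick sketch (assuming DDR5-6000 at 8 bytes per transfer per channel, as the comment does):

```python
# DDR5-6000: 6000 million transfers/s, 8 bytes per transfer per channel.
bytes_per_channel = 6000e6 * 8            # 48 GB/s per channel
per_socket = 12 * bytes_per_channel       # 12 channels -> 576 GB/s per socket
total = 2 * per_socket                    # two sockets -> 1.152 TB/s
print(per_socket / 1e9, total / 1e12)     # 576.0 GB/s, 1.152 TB/s
```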

So, if you need more than one server for your database, it's probably because it's tens or hundreds of terabytes, or because you're handling tens of millions of queries per second, or because your database software is designed for spinning rust. Spinning rust is still the best way to store large databases, but now the cutoff for "large" is approaching the petabyte scale.

I think the space of databases that are under ten terabytes and under ten million queries per second is large enough to cover almost everything that most people think of when they think of "databases".


> So, if you need more than one server for your database

Servers fail, any serious business obviously needs more than one server. And many need more than one data centre, just for redundancy. The more modern Oracle clusters act in a similar manner to RAID arrays, with virtual databases replicated across clusters of physical servers in such a way that a loss of a physical server doesn’t impact the virtual databases at all.


Most databases aren't running a business. It isn't 01973 anymore. My web browser on my phone has a bunch of SQL databases.

I was talking about horizontal scalability, not failover. You don't need horizontal scalability, or for that matter any special software features, for failover. (Though PITR is nice, especially if your database is running a business.) With cloud computing vendors, you may not even need a second server to fail over to; you can bring it up when the failure happens and pay for it by the hour.

The features you're talking about made a lot of sense 30 years ago, maybe even 20 years ago. They still make a lot of sense if you need to handle tens of millions of queries per second or if you have hundreds of terabytes of data. But, for everything else, you can get by with vertical scaling, which is a lot less work. Unlike backups or rearchitecting your database, it's literally a product you can just buy.

(A lot of the open-source relational databases I listed also support clustering for both HA and scalability; I just think those features are a lot less important when you can buy an off-the-shelf server with tebibytes of RAM.)


> But, for everything else, you can get by with vertical scaling

I’m not arguing with that, I’m just pointing out why very large organisations still use Oracle.


That's not why.

The kinds of databases where your business stops working if the database does are OLTP databases (often, databases of transactions in the financial sense), and they aren't hundreds of terabytes or tens of millions of transactions per second. Oracle itself doesn't scale to databases so big or with such high transaction rates that they can't run on one server in 02025. (It does scale to databases so big or with such high transaction rates that Oracle can't run them on one server in 02025, but that's just because Oracle's VAXCluster-optimized design is inefficient on current hardware.)

People run Oracle because it's too late for them to switch.


This is a fantastic comment, thank you.

You're welcome! I'm glad you liked it.


