Monday January 24, 2022
In November 2017, I was invited by Andy Pavlo to give a talk at CMU about the internals of QuasarDB.
During that talk, I briefly mentioned why we don’t use memory-mapped files and why it’s generally a mistake to do this.
To be clear, I only had a strong intuition, backed by my experience working on the memory managers of the Windows NT and FreeBSD kernels, that it was a poor fit. I didn’t have any hard numbers or facts to back that up.
Andy told me that he was super happy someone said it, and that it was a pet peeve of his to see memory-mapped files used as a persistence layer for a DBMS.
“I want to work on a paper about this!”
A little over four years later, that paper is finally here, and its conclusions are unambiguous. I highly recommend you read it, and I’m not saying that just because it confirms a bias I had (I promise).
You could see this post as a continuation of this one, where I explained why we didn’t write a custom persistence layer: it’s much, much more complicated than you think it is.
This specific passage is worth highlighting:
For example, you could decide to memory map tables and write them individually in separate files. However, you will quickly run into reliability and performance problems because the paging algorithms are unsuitable for database workloads (And the thing you think is stored on disk? Hope you like gambling).
Many databases use memory-mapped files to store data on disk, and this passage was seen as an unsubstantiated attack on those engines.
Nothing could be further from the truth. Although I can think of two very famous DBMSs that use (or used to use) this approach, my comment wasn’t aimed at anything in particular.
I was saying that you can (and should) do better than the operating system for I/O management because the needs of the OS are not the needs of a DBMS.
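To illustrate what “doing better than the OS” looks like in practice, here is a minimal C sketch. It is not QuasarDB’s actual I/O path (“wal.log”, “table.dat” and append_durably are made-up names for illustration): the engine itself decides when data becomes durable and in which order files are touched, by pairing write() with fsync() and only updating the data file once the log record is on disk. A transparently paged mapping cannot promise that ordering.

```c
/* Sketch only: explicit write ordering, the way a WAL-style engine wants it.
 * Not QuasarDB code; "wal.log" and "table.dat" are illustrative names. */
#include <fcntl.h>
#include <unistd.h>

static int append_durably(int fd, const void *buf, size_t len) {
    /* write() puts the bytes in the page cache... */
    if (write(fd, buf, len) != (ssize_t)len) return -1;
    /* ...and fsync() blocks until the device acknowledges them.
     * Only after this returns is the record durable. */
    return fsync(fd);
}

int main(void) {
    int wal = open("wal.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    int dat = open("table.dat", O_WRONLY | O_CREAT, 0644);
    if (wal < 0 || dat < 0) return 1;

    const char record[] = "insert key=42";

    /* The engine enforces the "log before data" invariant itself:
     * the data file is not touched until the log record is on disk. */
    if (append_durably(wal, record, sizeof record) != 0) return 1;
    if (append_durably(dat, record, sizeof record) != 0) return 1;

    close(wal);
    close(dat);
    return 0;
}
```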
Memory-mapped files are not designed for database workloads. They work great for many problems, but a database engine that relies primarily on memory-mapped files as its persistence mechanism cannot be considered a reliable storage option.
As time goes by, the probability of losing or corrupting data in a DBMS that uses memory-mapped files converges to 1.
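And here is the flip side, a minimal sketch (again, not QuasarDB code; “table.dat” is an illustrative name) of why the mmap route is a gamble: storing into the mapping only dirties pages in the page cache, the kernel flushes them whenever and in whatever order its page-out policy decides, and the only durability point you get is an explicit msync(), which operates on page ranges rather than on the logical record you care about.

```c
/* Minimal sketch of an mmap-backed "write" and why its durability is fuzzy.
 * Not QuasarDB code; file name and sizes are illustrative. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    const size_t len = 4096;
    int fd = open("table.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0 || ftruncate(fd, len) != 0) return 1;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return 1;

    /* "Write" a record by storing into the mapping. At this point the data
     * lives in dirty page-cache pages: the kernel may flush them now, later,
     * or partially, in whatever order its page-out policy picks. The DBMS
     * has no say in that ordering, which breaks write-ahead logging
     * assumptions. */
    memcpy(p, "record-1", 8);

    /* Only an explicit msync() gives a durability point, and it syncs page
     * ranges, not the logical record you care about. */
    if (msync(p, len, MS_SYNC) != 0) perror("msync");

    munmap(p, len);
    close(fd);
    return 0;
}
```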
The paper goes into concrete details about why this is a bad idea.
I encourage you to read the paper, and I hope that this approximate summary will give you further motivation to do so:
In other words, if you run simple ingestion benchmarks, you can get the illusion that memory-mapped files perform well and reliably, but as soon as you add the chaos and intensity of an actual production setup (multiple data sources, out-of-order updates, queries running at the same time), pain ensues.
Don’t say we didn’t warn you!
Curious about what Quasar is? Learn more here!
Want to take Quasar for a spin? Try our free community edition!