Wednesday October 21, 2020
QuasarDB is the fastest timeseries database in the world by ingestion speed (and probably by querying speed, though that is much harder to establish objectively), by a significant margin. In the 3.9 branch, we extended that lead, and we are working to deliver another major performance boost in upcoming releases.
This performance level is possible because, when using the batch writer, the client API and the server cooperate to distribute the workload optimally. However, this means that the client API must be used appropriately to match your insertion pattern.
In this blog post, we’ll explore the three main writing modes available from the batch writer, their pros and cons, and guidelines for choosing them.
QuasarDB splits timeseries into shards of fixed size. The default shard size is 24 hours. That means that data belonging to the same time bucket will be written to the same location. Shard size is decided at table creation and is set at the table level.
Choosing an appropriate shard size is essential. The sweet spot is when a shard contains between 100,000 and 1,000,000 rows (data is stored in columns, and thus, the number of columns has no impact on write speed).
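As a back-of-the-envelope illustration, you can derive a shard duration from your ingestion rate so that each shard lands in the recommended 100,000 to 1,000,000 row range. The helper below is hypothetical, not part of the QuasarDB API:

```python
# Hypothetical helper: pick a shard duration (in hours) so that a shard
# holds roughly `target_rows_per_shard` rows, the middle of the
# recommended 100,000 - 1,000,000 row range.
def recommended_shard_hours(rows_per_hour: int,
                            target_rows_per_shard: int = 500_000) -> int:
    return max(1, round(target_rows_per_shard / rows_per_hour))

# A sensor emitting one row per second produces 3,600 rows/hour:
hours = recommended_shard_hours(3_600)
print(hours)  # 139 -> shards of several days keep row counts in range
```

With a default 24-hour shard, that same sensor would only accumulate 86,400 rows per shard, just below the sweet spot, which is why shard size should be chosen from the expected ingestion rate rather than left at the default.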
An inappropriate shard size is the number one write bottleneck. The second most frequent bottleneck is using the wrong writing mode.
QuasarDB uses Multiversion Concurrency Control (MVCC) to ensure transaction isolation. When updating a bucket within a transaction, a copy of the old bucket containing the new data is created. As a result, the amount of data written to disk for an update is twice the bucket's previous size, plus the new data added.
In theory, you want to minimize the number of updates to a bucket when inserting data, ideally writing the whole bucket at once (this is what Railgun does). Unfortunately, that is not always possible in practice, as it may result in substantial update lag or greatly increase the complexity of the client's code.
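A quick sketch of the arithmetic above (plain Python, independent of QuasarDB) shows why incremental updates are so costly: filling a bucket in n small MVCC updates multiplies the total disk I/O by roughly n.

```python
# Disk I/O for one transactional (MVCC) update: the old bucket is
# copied, so the cost is twice its current size plus the new data.
def mvcc_update_cost(bucket_bytes: int, new_bytes: int) -> int:
    return 2 * bucket_bytes + new_bytes

# Total disk I/O to fill a bucket of `total` bytes in `n` equal updates.
def fill_cost(total: int, n: int) -> int:
    chunk = total // n
    return sum(mvcc_update_cost(i * chunk, chunk) for i in range(n))

print(fill_cost(1_000_000, 1))    # 1000000   - one big write: I/O = data size
print(fill_cost(1_000_000, 100))  # 100000000 - 100 small writes: 100x the I/O
```

The closed form is simple: writing a bucket in n equal increments costs n times the bucket's size in disk I/O, which is exactly why writing the whole bucket at once is the ideal.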
The QuasarDB batch writer has three modes of operation:
By default, writes from the batch writer are synchronous and transactional: when the call returns, the data has been written to disk, and concurrent reads see either all of the batch or none of it.
Synchronous, transactional writes are the safest, the most comfortable to work with, and the easiest to understand. This is why they are the default. They are also the slowest.
In write-heavy timeseries scenarios, you rarely need transactional guarantees. Consider using synchronous, non-transactional writes instead.
The next mode of operation for the batch writer, named "fast push", is synchronous and non-transactional. Fast push bypasses MVCC and updates the bucket in place. This roughly doubles write speed and dramatically reduces pressure on the LSM tree, further decreasing I/O usage. However, a concurrent read transaction may see rows "appear" during its execution, which may give inaccurate results.
Fast push with batches of reasonable size can deliver outstanding write performance, satisfactory for most use cases. Railgun, by default, uses fast pushes.
The last mode of operation, “async push”, is asynchronous, non-transactional. Instead of writing the data directly to disk, data is sent to an asynchronous queue. All writes in the queue are then merged into a single large write and committed to disk, at a specified, configurable interval.
The exact write delay, size of the queue, and the number of threads dedicated to asynchronous writes are configurable.
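To make the mechanism concrete, here is a toy model of such a queue (purely illustrative, not the actual server implementation): small writes accumulate in memory and are merged per bucket into a single disk commit when the flush interval elapses.

```python
from collections import defaultdict

# Toy model of an asynchronous push queue: incoming small writes are
# buffered, then merged per bucket and committed as one large write
# when the (configurable) flush interval elapses.
class AsyncQueue:
    def __init__(self):
        self.pending = []       # (bucket, rows) pairs awaiting flush
        self.disk_commits = 0   # number of actual disk writes performed

    def push(self, bucket, rows):
        self.pending.append((bucket, rows))  # returns fast: memory only

    def flush(self):
        merged = defaultdict(list)
        for bucket, rows in self.pending:
            merged[bucket].extend(rows)
        self.disk_commits += len(merged)     # one write per touched bucket
        self.pending.clear()
        return merged

q = AsyncQueue()
for i in range(1000):            # 1,000 tiny writes to the same bucket...
    q.push("2020-10-21", [i])
q.flush()
print(q.disk_commits)            # 1 - ...merged into a single disk commit
```

This is where the large speedup for small writes comes from, and also where the visibility delay comes from: rows pushed between two flushes are not readable until the next commit.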
When the call returns, the data has been sent to the server but has not yet been written to disk. This is nevertheless more reliable than buffering data on the client: when replication is active, the asynchronous data exists in the memory of multiple servers at once.
Asynchronous writes can yield up to 100x write speed gains because they agglomerate many small writes into a single large write. However, they may incur a data visibility delay of several seconds and bring no performance benefit when batches are already large.
Asynchronous writes are the perfect solution for ingesting out-of-order data that arrives in small packets. Keep in mind, however, that they add a risk of data loss: queued data lives in memory until it is committed to disk.
The batch writer maintains an internal structure before sending the data to the server. When you update a row in the batch writer, the data you pass is copied into this internal structure. This copy can represent a significant overhead when passing large values or updating many tables in the same batch.
Pinned columns give direct access to the underlying structure. They enable you to skip a copy and a lookup on each row update, which, as the number of tables grows, can result in tremendous speed gains.
Pinned columns have no impact on how data is written to the database but can make a difference if you see batch row updates slow down writes.
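The effect is analogous to the following plain-Python contrast (purely illustrative; the names and structures are hypothetical, not the real batch writer internals):

```python
# Row-oriented updates: look up each column by name and copy the value
# on every single row.
def row_oriented(n):
    table = {"temperature": [], "pressure": []}
    for i in range(n):
        table["temperature"].append(float(i))  # lookup + copy per row
        table["pressure"].append(float(i))
    return table

# "Pinned" columns: grab a direct reference to each column once, then
# append to it directly, skipping the per-row lookup.
def pinned(n):
    table = {"temperature": [], "pressure": []}
    temp, pres = table["temperature"], table["pressure"]  # pin once
    for i in range(n):
        temp.append(float(i))
        pres.append(float(i))
    return table

assert row_oriented(1_000) == pinned(1_000)  # same data, fewer lookups
```

Both variants produce identical data, which mirrors the point above: pinned columns change how fast the batch is built on the client, not what ends up in the database.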
One final word of caution: never mix write modes on the same table. Why? Because it would be bad! Mixing different write modes can result in undefined behavior.
For example, mixing asynchronous and synchronous writes can lead to very surprising results and is not officially supported.
If QuasarDB isn’t delivering first-class write speed, we hope this blog post will help you troubleshoot what may be getting in the way of performance.