We decided to knock the dust off our copies of Designing Data-Intensive Applications to learn about transactions while Michael is full of solutions, Allen isn’t deterred by Cheater McCheaterton, and Joe realizes wurds iz hard.
Great statement from one of the creators of Google’s Spanner where the general idea is that it’s better to have transactions as an available feature even if it has performance issues and let developers decide if the performance is worth the tradeoff, rather than not having transactions and putting all that complexity on the developer.
Number of things that can go wrong during database interactions:
DB software or underlying hardware could fail during a write,
An application that uses the DB might crash in the middle of a series of operations,
Network problems could arise,
Multiple writes to the same records from multiple places causing race conditions,
Reads could happen to partially updated data which may not make sense, and/or
Race conditions between clients could cause weird problems.
“Reliable” systems can handle those situations and ensure they don’t cause catastrophic failures, but making a system “reliable” is a lot of work.
Transactions are what have been used for decades to address those issues.
A transaction is a way to group all related reads and writes into a single operation.
Either a transaction as a whole completes successfully as a “commit” or fails as an “abort, rollback”.
If the transaction fails, the application can choose what to do, like retry for example.
In general, transactions make error handling much simpler for an application.
That was their purpose, to make developing against a database much simpler.
Not all applications need transactions.
In some cases, it makes sense not to use transactions for performance and/or availability reasons.
How do you know if you need a transaction?
What are the safety guarantees?
What are the costs of using them?
Concepts of a transaction
Most relational DBs support transactions and some non-relational DBs support transactions.
The general idea of a transaction has been around mostly unchanged for over 40 years, originally introduced in IBM System R, the first relational database.
With the introduction of a lot of the NoSQL (non-relational) databases, transactions were left out.
In some NoSQL implementations, they redefined what a transaction meant with a weaker set of guarantees.
A popular belief was put out there that transactions meant anti-scalable.
Another popular belief was that to have a “serious” database, it had to have transactions.
The book calls out both as hyperbole.
The reality is there are tradeoffs for both having or not having transactions.
ACID is the acronym to describe the safety guarantees of databases and stands for Atomicity, Consistency, Isolation, and Durability.
Coined in 1983 by Theo Harder and Andreas Reuter.
The reality is that each database’s implementation of ACID may be very different.
Lots of ambiguity for what Isolation means.
Because ACID doesn’t specify the actual guarantees, it’s basically a marketing term.
Systems that don’t support ACID are often referred to as BASE, BAsically available, Soft state, and Eventual consistency.
Even more vague than ACID! BASE, more or less, just means anything but ACID.
Atomicity
Atomicity refers to something that can not be broken into smaller parts.
In terms of multi-threaded programming, this means you can only see the state of something before or after a complete operation and nothing in-between.
In the world of database and ACID, atomicity has nothing to do with concurrency. For instance, if multiple actions are trying to processes the same data, that’s covered under Isolation.
Instead, ACID describes what should happen if there is a fault while performing multiple related writes.
For example, if a group of related writes are to be performed in an operation and there is some underlying error that occurs before the transaction of writes can be committed, then the operation is aborted and any writes that occurred during that operation must be undone, i.e. rolled back.
Without atomicity, it is difficult to know what part of the operation completed and what failed.
The benefit of the rollback is you don’t have to have any special logic in your application to figure out how to get back to the original state. You can just simply try again because the transaction took care of the cleanup for you.
This ability to get rid of any writes after an abort is basically what the atomicity is all about.
Consistency
In ACID, consistency just means the database is in a good state.
But consistency is a property of the application as it’s what defines the invariants for its operations.
This means that you must write your application transactions properly to satisfy the invariants that have been defined.
The database can take care of certain invariants, such as foreign key constraints and uniqueness constraints, but otherwise it’s left up to the application to set up the transactions properly.
The book suggests that because the consistency is on the application’s shoulders, the C shouldn’t be part of ACID.
Isolation
Isolation is all about handling concurrency problems and race conditions.
The author provided an example of two clients trying to increment a single database counter concurrently, the value should have gone from 3 to 5, but only went to 4 because there was a race condition.
Isolation means that the transactions are isolated from each other so the previous example cannot happen.
The book doesn’t dive deep on various forms of isolation implementations here as they go deeper in later sections, however one that was brought up was treating every transaction as if it was a serial transaction. The problem with this is there is a rather severe performance hit for forcing everything serially.
The section that describes the additional isolation levels is “Weak Isolation Levels”.
Durability
Durability just means that once the database has committed a write, the data will not be forgotten, even if a database failure or hardware failure occurs.
This notion of durability typically means, in a single node database, that the data has been written to the drive, typically to a write-ahead log or similar implementation.
The write-ahead log ensures if there is any data corruption in the database, that it can be rebuilt, if necessary.
In a replicated database, durability means that the data has been written to the other nodes successfully.
The performance implication here is that for the database to guarantee that it’s durable, it must wait for those distributed writes to complete before committing the transaction.
PERFECT DURABILITY DOES NOT EXIST.
If all your databases and backups somehow got destroyed at the same time, there’s absolutely nothing you could do.
Longevity of Recordable CDs, DVDs and Blu-rays – Canadian Conservation Institute (CCI) Notes 19/1 (canada.ca)
Tip of the Week
The Bad Plus is an instrumental band that makes amazing music that’s perfect for programming. It’s a little wild, and a little strange. Maybe like Radiohead, but a saxophone instead of Thom Yorke? Maybe? (YouTube)
Correction, Piano Rock will quickly become your new favorite channel. (YouTube)
docker builder is a command prefix that you can use that specifically operates against the builder. For example you can prune the builder’s cache without wiping out your local cache. It can really save your bacon if you’re working with a lot of images. (docs.docker.com)
Ever want to convert YAML to JSON so you can see nesting issues easier? There’s a VSCode plugin for that! Search for hilleer.yaml-plus-json or find it on GitHub. (GitHub)
Spotify has a great interface, but Apple Audio has lossless audio, sounds great, and pays artists more. Give it a shot! If you sign up for Apple One you can get Apple Music, Apple TV+, Apple Arcade, Apple News+ and a lot more for one unified price. (Apple)