Some of the high-value low-risk items being piloted
Cloud Data Warehousing
High-value low-risk items deployed or being deployed
Security Orchestration Automation and Response
Digital Experience Monitoring
Robotic Process Automation
Virtual Machine Backup and Recovery
Integration Platform as a Service
SD-WAN (software-defined WAN)
Network Detection and Response
High-value high risk
Zero Trust Network Access
Artificial Intelligence IT Operations - AIOps
Cloud Application Discovery
Hybrid Cloud Computing
AI Cloud Services
Cloud Managed Networks - CMNs
Who have you partnered with?
Cloud Storage (Dropbox, OneDrive, iCloud, etc)
Backups (Do you still need them!?)
Tip of the Week
Have a presentation to do? Slidev is a VueJs and markdown-based way to create slides. Because it's web based you can do cool interactive type stuff, and it's portable. Bonus: recording and camera view support built in. Thanks Dave! (sli.dev)
There are a lot of great resources for Kubernetes on the official Kubernetes Certifications and Training page (kubernetes.io)
Notes in iOS are pretty good now! Did you know you can use it for inline images, videos, along with note taking…. (youtube.com)
Use Docker? Check out dive, it's a tool for exploring a docker image, layer contents, and discovering ways to shrink the size of your Docker/OCI image. (github.com)
We've got a smorgasbord of delights for you this week, ranging from mechanical switches to the cloud and beyond. Also, Michael's cosplaying as Megaman, Joe learns the difference between Clicks and Clacks, and Allen takes no prisoners.
Leave us a review if you have a chance! (/reviews)
Why are mechanical keyboards so popular with programmers?
Is it the sound? Is it the feel? What are silent switches? Are they missing the point?
You can buy key switches for good prices (drop.com)
Cloud Costs Every Programmer should know (vantage.sh) (Thanks Mikerg!)
List of static analysis tools, so you can get with the times! (GitHub) (Thanks Mikerg!)
"I’d love a breakdown of what each of you think are your key differences in philosophies or approaches to software development. Could be from arguments or debates on older episodes, whether on coding, leadership, startups, AI, whatever - just curious about how best to tell everyone’s voices apart based on what they’re saying. I know one of you is Jay Z (JZ?), but slow to pick up on which host is which based on accents alone."
How do you center a div? Within a div? With right-align text? What about centering 3 divs? What if you want to space them out evenly? If you've been away from CSS for a while, you may be a bit rusty on the best ways to do this. Not sure if it's "the best" but an easy solution to these problems is to use Flexbox, and lucky for you there is a fun little game designed to teach you how to use it. (flexboxfroggy.com)
Drop.com is a website focused on computer gear, headphones, keyboards, desk accessories etc. It's got a lot of cool stuff! (drop.com)
Have you ever accidentally deleted a file? Recovering files in git doesn't have to be hard with the "restore" command (rewind.com)
Have trouble with your hands and want to limber up? Also doubles as a cool retro Capcom Halloween costume. It's a LifePro Hand Massager! (amazon)
In this episode, we are talking all about GitHub Actions. What are they, and why should you consider learning more about them? Also, Allen terminates the terminators, Outlaw remembers the good ol' days, and Joe tries his hand at sales.
GitHub Actions is a CI/CD platform launched in 2018 that lets you define and automate workflows
It's well integrated into Github.com and fits nicely with git paradigms - repository, branches, tags, pull requests, hashes, immutability (episode 195)
The workflows can run on GitHub-hosted virtual machines, or on your own servers
GitHub Actions are free for standard Github runners in public repositories and self-hosted runners, private repositories get a certain amount of "free" minutes and any overages are controlled by your spending limits
2000 minutes and 500MB for free, 3000 minutes and 1Gb for Pro, etc (docs.github.com)
Examples of things you can do
Automate builds and releases whenever a branch is changed
Run tests or linters automatically on pull requests
Automatically create or assign Issues, or labels to issues
Publish changes to your gh-pages, wiki, releases,
Check out the "Actions" tab on any github repository to check if a repository has anything setup (github.com)
The "Actions" in GitHub Actions refers to the most atomic action that takes place - and we'll get there, but let us start from the top
Workflow is the highest level concept, you see any workflows that a repository has set up (learn.microsoft.com)
A workflow is triggered by an event: push, pull request, issue being opened, manual action, api call, scheduled event, etc (learn.microsoft.com)
CI - Runs linting, checking, builds, and publishes changes for all supported versions of Node on pull request or push to main or release-* branches
Close Issues - Looks for stale issues and closes them with a message (using gh!)
Code Scanning - Runs CodeQL checks on pull request, push, and on a weekly schedule
Publish Nightly - Publishes the last set of successful builds every night
Workflows can call other workflows in your repository, or in a repository you have access to
Special note about calling other workflows - when embedding other workflows you can specify a specific version with either a tag or a commit # to make sure you're running exactly what you expect
In the UI you'll see a filterable history of workflow runs on the right
The workflow is associated with a yaml file located in ./github/workflows
Clicking on a workflow in the left will show you a history of that workflow and a link to that file (cli.github.com)
Workflows are made up of jobs, which are associated with a "runner" (machine) (cli.github.com)
Jobs are mainly just a container for "Steps" which are up next, but the important bit is that they are associated with a machine (virtual or you can provide your own either via network or container)
Jobs can also be dependent on other jobs in the workflow - Github will figure out how to run things in the required order and parallelize anything it can
You're minutes are counted by machine time, so if you have 2 jobs that run in parallel that each take 5 minutes…you're getting "charged" for 10 minutes
Jobs are a group of steps that are executed in order on the same runner
Data can easily be shared between steps by echoing output, setting environment variables or mutating files
An action is a custom application written for the GitHub Actions platform
GitHub provides a lot of actions and other 3p (verified or not) providers do as well in the "Marketplace", you can use other people's actions (as long as they don't delete it!), and you can write your own
We glossed over a lot of the details about how things work - such as various contexts where data is available and how it's shared, how inputs and outputs are handled…just know that it's there! (docs.github.com)
You grant job permissions, default is content-read-only but you must give fine-grained permissions to the jobs you run - write content, gh-pages, repository, issues, packages, etc
There is a section under settings for setting secrets (unretrievable and masked in output) and variables for your jobs. You have to explicitly share secrets with other jobs you call
There is support for "expressions" which are common programming constructions such as conditionals and string helper functions you can run to save you some scripting (docs.github.com)
GitHub Actions is amazing because it's built around git!
Great features comparable (or much better) than other CI/CD providers
Great integration with a popular tool you might already be using (docs.github.com)
Works well w/ the concepts of Git By default, workflows cannot use actions from GitHub.com and GitHub Marketplace. You can restrict your developers to using actions that are stored on your GitHub Enterprise Server instance, which includes most official GitHub-authored actions, as well as any actions your developers create. Alternatively, to allow your developers to benefit from the full ecosystem of actions built by industry leaders and the open-source community, you can configure access to other actions from GitHub.com.
Great free tier
Great documentation https://docs.github.com/en/actions/using-containerized-services/creating-postgresql-service-containers
Working via commits can get ugly…make your changes in a branch and rebase when you're done!
If you are interested in getting started with DevOps, or just learning a bit more about it, then this is a great way to go! It's a great investment in your skillset as a developer in any case.
Build your project on every pull request or push to trunk
Run your tests, output the results from a test coverage tool
Run a linter or static analysis tool
Post to X, Update LinkedIn whenever you create a new release
There is a GitHub Actions plugin for VSCode that provides a similar UI to the website. This is much easier than trying to make all your changes in Github.com or bouncing between VSCode and the website to see how your changes worked. It also offers some integrated documentation and code completion! It's definitely my preferred way of working with actions. (marketplace.visualstudio.com)
Did you know that you can cancel terminating a terminating persistent volume in Kubernetes? Hopefully you never need to, but you can do it! (github.com)
How are the Framework Wars going? Check out Google trends for one point of view. (trends.google.com)
Rebasing is great, don't be afraid of it! A nice way to get started is to rebase while you are pulling to keep your commits on top. git pull origin main --rebase=i
There's a Dockerfile Linter written in Haskell that will help you keep your Docker files in great shape. (docker.com)
Allen made the video on generating a baseball lineup application just by chatting with ChatGPT (youtube)
What is OpenTelemetry?
An incubating project on the CNCF - Cloud Native Computing Foundation (cncf.io)
What does incubating mean?
Projects used in production by a small number of users with a good pool of contributors
Basically you shouldn't be left out to dry here
So what is Open Telemetry? A collection of APIs, SDKs and Tools that's used to instrument, generate, collect and export telemetry data
This helps you analyze your software's performance and behavior
It's available across multiple languages and frameworks
It's all about Observability
Understanding a system "from the outside"
Doesn't require you to understand the inner workings of the system
The goal is to be able to troubleshoot difficult problems and answer the "Why is this happening?" Question
To answer those questions, the application must be properly "Instrumented"
This means the application must emit signals like metrics, traces, and logs
The application is properly instrumented when you can completely troubleshoot an issue with the instrumentation available
That is the job of OpenTelemetry - to be the mechanism to instrument applications so they become observable
List of vendors that support OpenTelemetry: https://opentelemetry.io/ecosystem/vendors/
Reliability and Metrics
Telemetry - refers to the data emitted from a system about its behavior in the form of metrics, traces and logs
Reliability - is the system behaving the way it's supposed to? Not just, is it up and running, but also is it doing what it is expected to do
Metrics - numeric aggregations over a period of time about your application or infrastructure
Application error rates
Number of requests per second
SLI - Service Level Indicator - a measurement of a service's behavior - this should be in the perspective of a user / customer
Example - how fast a webpage loads
SLO - Service Level Objective - the means of communicating reliability to an organization or team
Accomplished by attaching SLI's to business value
To truly understand what distributed tracing is, there's a few parts we have to put together first
Logs - a timestamped message emitted by applications
Different than a trace - a trace is associated with a request or a transaction
Heavily used in all applications to help people observe the behavior of a system
Unfortunately, as you probably know, they aren't completely helpful in understanding the full context of the message - for instance, where was that particular code called from?
Logs become much more useful when they become part of a span or when they are correlated with a trace and a span
Span - represents a unit of work or operation
Tracks the operations that a request makes - meaning it helps to paint a picture of what all happened during the "span" of that request/operation
Contains a name, time-related data, structured log messages, and other metadata/attributes to provide information about that operation it's tracking
Some example metadata/attributes are: http.method=GET, http.target=/urlpath, http.server_name=codingblocks.net
Distributed trace is also known simply as a trace - record the paths taken for a user or system request as it passes through various services in a distributed, multi-service architecture, like micro-services or serverless applications (AWS Lambdas, Azure Functions, etc)
Tracing is ESSENTIAL for distributed systems because of the non-deterministic nature of the application or the fact that many things are incredibly difficult to reproduce in a local environment
Tracing makes it easier to understand and troubleshoot problems because they break down what happens in a request as it flows through the distributed system
A trace is made of one or more spans
The first span is the "root span" - this will represent a request from start to finish
The child spans will just add more context to what happened during different steps of the request
Some observability backends will visualize traces as waterfall diagrams where the root span is at the top and branching steps show as separate chains below - diagram linked below (opentelemetry.io)
Attention Windows users, did you know you can hold the control key to prevent the tasks from moving around in the TaskManager. It makes it much easier to shut down those misbehaving key loggers! (verge.com)
Does your JetBrains IDE feel sluggish? You can adjust the heap space to give it more juice! (blogs.jetbrains.com)
Beware of string interpolation in logging statements in Kotlin, you can end up performing the interpolation even if you're not configured to output the statement types! IntelliJ will show you some squiggles to warn you. Use string templates instead. Also, Kotlin has "use" statements to avoid unnecessary processing, and only executes when it's necessary. (discuss.kotlinlang.org)
Thanks to Tom for the tip on tldr pages, they are a community effort to simplify the beloved man pages with practical examples. (tldr.sh)
Looking for some new coding music? Check out these albums from popular guitar heroes!
In this episode, we're talking about the history of "man" pages, console apps, team leadership, and Artificial Intelligence liability. Also, Allen's downloading the internet, Outlaw has fallen in love with the sound of a morrvair, and Joe says TUI like two hundred times as if it were a real word.
DevFest Florida is a community-run one-day conference aimed to bring technologists, developers, students, tech companies, and speakers together in one location to learn, discuss and experiment with technology. (devfestfl.org)
What are (were?) man pages?
"man" is a command-line "pager" similar to "more" or "less" that was designed specifically to display documentation - ahem, "manuals"
"man" pages would show you documentation for many apps in a (mostly) consistent manner that was available offline
Do people still use them?
People would print these out in the 70's and beyond!
Want to learn something new while also making your life easier? Why not try writing a TUI!? Here's an article that will kindly introduce you to terminal user interfaces, libraries like "Clap", "TUI", and "Crossterm" that people are using to write them, and…you can get some XP with Rust while you're at it! (blog.logrocket.com)
Are you looking to upgrade your Kubernetes cluster? Check for API problems first!
Are you a browser tab fiend? Did you know you can reload all your tabs simultaneously with a simple shortcut? (groups.google.com)
No more nasty wiring jobs, get yourself to the hardware store website and pick up some wire and splicing connectors. Keep things nice, tidy, and organized. (wago.com)
In this episode, we're talking about lessons learned and the lessons we still need to learn. Also, Michael shares some anti-monetization strategies, Allen wins by default, and Joe keeps it real 59/60 days a year!
In this sequence of sound, we compute Joe's unexpected pleasure in commercial-viewing algorithms, Michael's intricate process of slicing up the pizza, and Allen's persistent request for more cheese data augmentation. Will you engage in this data streaming session?
MusicLM lets you create music from descriptive text, similar to Dalle-2. The output is a little strange, but could still potentially be really useful and inspiring with a little bit of effort. It's in private beta now, as part of the "AI Test Kitchen" but you can sign up to join the waitlist today.
In this episode we talk about several things that have been on our mind. We find that Joe has been taken over by AI's, Michael now understands our love of Kotlin, and Allen wants to know how to escape supporting code you wrote forever.
Have any experience with Twilio? It's work! (twilio.com)
Resources we like
docker init is a tool (in beta) built into the latest Docker Desktop that you can use to get a leg up on your next project. It makes it easy to create docker files with best practices, as well as a docker-compose file to get you up and running. (docker.com)
screen is an open-source powerful terminal multiplexer that allows users to create, manage, and switch between multiple terminal sessions, enabling seamless multitasking and persistent remote connections in a single window.
The VIVO Universal Treadmill Desk Riser is an adjustable, ergonomic workspace solution designed to fit most treadmills, allowing users to seamlessly combine their work and exercise routines for a healthy, productive lifestyle. (amazon.com)
The LifeSpan Fitness Under Desk Walking Treadmill is a compact, low-profile treadmill designed to fit under standing desks, enabling remote workers to maintain an active lifestyle by seamlessly integrating walking or light jogging into their daily work routine, promoting better health and increased productivity. (amazon.com)
Kubernetes Network Policies are a set of rules that define how pods within a cluster can communicate with each other and with external resources, allowing administrators to enforce fine-grained access control and enhance the security of their containerized applications. (kubernetes.io)
What are lost updates, and what can we do about them? Maybe we don't do anything and accept the write skew? Also, Allen has sharp ears, Outlaw's gort blah spotterfiles, and Joe is just thinking about breakfast.
Last episode we talked about weak isolation, committed reads, and snapshot isolation
There is one major problem we didn't discuss called "The Lost Update Problem"
Consider a read-modify-write transaction, now imagine two of them happening at the same time
Even with snapshot isolation, it's possible that read can happen for transaction A before B, but the write for A happens first
Incrementing/Decrementing values (counters, bank accounts)
Updating complex values (JSON for example)
CMS updates that send the full page as an update
Atomic Writes - Some databases support atomic updates that effectively combine the read and write
Cursor Stability - locking the read object until the update is performed
Single Threading - Force all atomic operations to happen serially through a single thread
The application can be responsible for explicitly locking objects, placing responsibility in the devs hands
This makes sense in certain situations - imagine a multiplayer game where multiple players can move a shared object. It's not enough to lock the data and then apply both updates in order since the shared game world can react. (ie: showing that the item is in use)
Detecting Lost Updates
Locks can be tricky, what if we reused the snapshot mechanism we discussed before?
We're already keeping a record of the last transactionId to modify our data, and we know our current transactionId. What if we just failed any updates where our current transaction id was less than the transactionId of the last write to our data?
This allows for naive application code, but also gives you fewer options…retry or give up
Note: MySQL's InnoDB's Repeatable Read feature does not support this, so some argue it doesn't qualify as snapshot isolation
What if you didn't have transactions?
If you didn't have transactions, let alone a snapshot number, you could get similar behavior by doing a compare-and-set
Example: update account set balance = 10 where balance = 9 and id = ABC
This works best in simple databases that support atomic updates, but not great with snapshot isolation
Note: it's up to the application code to check that updates were successful - Updating 0 records is not an error
Conflict resolution and replication
We haven't talked much about replicas lately, how do we handle lost updates when we have multiple copies of data on multiple nodes?
Compare-and-Set strategies and locking strategies assume a single up-to-date copy of the data….uh oh
The options are limited here, so the strategy is to accept the writes and have an application process to decide what to do
Merge: Some operations, like incrementing a counter, can be safely merged. Riak has special datatypes for these
Last Write Wins: This is a common solution. It's simple but inaccurate. Also the most common solution.
Write Skew and Phantoms
Write skew - when a race condition occurs that allows writes to different records to take place at the same time that violates a state constraint
The example given in the book is the on-call doctor rotation
If one record had been modified after another record's transaction had been completed, the race condition would not have taken place
write-skew is a generalization of the lost update problem
Atomic single-object locks won't work because there's more than one object being updated
Snapshot isolation also doesn't work in many implementations - SQL Server, PostgreSQL, Oracle, and MySQL won't prevent write skew
Requires true serializable isolation
Most databases don't allow you to create constraints on multiple objects but you may be able to work around this using triggers or materialized views as your constraint
They mention if you can't use serializable isolation, your next best option may be to lock the rows for an update in a transaction meaning nothing else can access them while the transaction is open
Phantoms causing write skew
The query for some business requirement - ie there's more than one doctor on call
The application decides what to do with the results from the query
If the application decides to go forward with the change, then an INSERT, UPDATE, or DELETE operation will occur that would change the outcome of the previous step's Application decision
They mention the steps could occur in different orders, for instance, you could do the write operation first and then check to make sure it didn't violate the business constraint
In the case of checking for records that meet some condition, you could do a SELECT FOR UPDATE and lock those rows
In the case that you're querying for a condition by checking on records to exist, if they don't exist there's nothing to lock, so the SELECT FOR UPDATE won't work and you get a phantom write - a write in one transaction changes the search result of a query in another transaction
Snapshot isolation avoids phantoms in read-only queries, but can't stop them in read-write transactions
The problem we mentioned with phantom is there'd no record/object to lock because it doesn't exist
What if you were to have a set of records that could be used for locking to alleviate the phantom writes?
Create records for every possible combination of conflicting events and only use those to lock when doing a write
"materializing conflicts" because you're taking the phantom writes and turning them into lock records that will prevent those conflicts
This can be difficult and prone to errors trying to create all the combinations of locks AND this is a nasty leakage of your storage into your application
Docker's Buildkit is their backend builder that replaces the "legacy" builder by adding new non-backward compatible functionality. The way you enable buildkit is a little awkward, either passing flags or setting variables as well as enabling the features per Dockerfile, but it's worth it! One of the cool features is the "mount" flag that you can pass as part of a RUN statement to bring in files that are not persisted past that layer. This is great for efficiency and security. The "cache" type is great for utilizing Docker's cache to save time in future builds. The "bind" type is nice for mounting files you only need temporarily. like source code in for a compiled language. The "secret" is great for temporarily bringing in environment variables without persisting them. Type "ssh" is similar to "secret", but for sharing ssh keys. Finally "tmpfs" is similar to swap memory, using an in-memory file system that's nice for temporarily storing data in primary memory as a file that doesn't need to be persisted. (github.com)
Did you know Google has a Google Cloud Architecture diagramming tool? It's free and easy to use so give it a shot! (cloud.google.com)
ChatGTP has an app for slack. It's designed to deliver instant conversation summaries, research tools, and writing assistance. Is this the end of scrolling through hundreds of messages to catch up on whatever is happening? /chatgpt summarize (salesforce.com)
Have you heard about ephemeral containers? It's a convenient way to spin up temporary containers that let you inspect files in a pod and do other debugging activities. Great for, well, debugging! (kubernetes.io)
There's this thing called ChatGPT you may have heard of. Is it the end for all software developers? Have we reached the epitome of mankind? Also, should you write your own or find a FOSS solution? That and much more as Allen gets redemption, Joe has a beautiful monologue, and Outlaw debates a monitor that is a thumb size larger than his current setup.
This probably isn't the first time and it won't be the last we ask the question - should you write your own version of something if there's a good Free Open Source Software alternative out there?
Typed vs Untyped Languages
Another topic that we've touched on over the years - which is better and why?
Any considerations when working with teams of developers?
What are the pros and cons of each?
If you're spending a good amount of money in the cloud, you should probably talk to a sales rep for your given cloud and try to negotiate rates. You may be surprised how much you can save. And...you never know until you ask!
Outlaw has the Itch to get a new Monitor
Is it worth upgrading from a 34" ultrawide to a 38" ultrawide?
Did you know that the handy, dandy application jq is great for formatting json AND it's also Turing complete? You can do full on programming inside jq to make changes - conditionals, variables, math, filtering, mapping...it's Turing Complete! https://stedolan.github.io/jq/
Want to freshen up your space, but you just don't have the vision? Give interiorai.com a chance, upload a picture of your room and give it a description. It works better than it should.
You can sort your command line output when doing something like an ls sort -k2 -b
On macOS you can drag a non-fullscreen window to a fullscreen desktop
When using the ls -l command in a terminal, that first numeric column shows the number of hard links to a file - meaning the number of names an inode has for that file
Ever wonder how database backups work if new data is coming in while the backup is running? Hang with us while we talk about that, while Allen doesn't stand a chance, Outlaw is in love, and Joe forgets his radio voice.
It’s time we learn about multi-object transactions as we continue our journey into Designing Data-Intensive Applications, while Allen didn’t specifically have that thought, Joe took a marketing class, and Michael promised he wouldn’t cry.
Multi-object transactions need to know which reads and writes are part of the same transaction.
In an RDBMS, this is typically handled by a unique transaction identifier managed by a transaction manager.
All statements between the BEGIN TRANSACTIONandCOMMIT TRANSACTION are part of that transaction.
Many non-relational databases don’t have a way of grouping those statements together.
Single object transactions must also be atomic and isolated.
Reading values while in the process of writing updated values would yield really weird results.
It’s for this reason that nearly all databases must support single object atomicity and isolation.
Atomicity is achievable with a log for crash recovery.
Isolation is achieved by locking the object to be written.
Some databases use a more complex atomic setup, such as an incrementer, eliminating the need for a read, modify, write cycle.
Another operation used is a compare and set.
These types of operations are useful for ensuring good writes when multiple clients are attempting to write the same object concurrently.
Transactions are more typically known for grouping multiple object writes into a single operational unit
Need for multi object transactions
Many distributed databases / datastores don’t have transactions because they are difficult to implement across partitions.
This can also cause problems for high performance or availability needs.
But there is no technical reason distributed transactions are not possible.
The author poses the question in the book: “Do we even need transactions?”
The short answer is, yes sometimes, such as:
Relational database systems where rows in tables link to rows in other tables,
In non-relational systems when data is denormalized for “object” reasons, those records need to be updated in a single shot, or
Indexes against tables in relational databases need to be updated at the same time as the underlying records in the tables.
These can be handled without database transactions, but error handling on the application side becomes much more difficult.
Lack of isolation can cause concurrency problems.
Handling errors and aborts
ACID transactions that fail are easily retry-able.
Some systems with leaderless replication follow the “best effort” basis. The database will do what it can, and if something fails in the middle, it’ll leave anything that was written, meaning it won’t undo anything it already finished.
This puts all the burden on the application to recover from an error or failure.
The book calls out developers saying that we only like to think about the happy path and not worry about what happens when something goes wrong.
The author also mentioned there are a number of ORM’s that don’t do transactions proud and rather than building in some retry functionality, if something goes wrong, it’ll just bubble an error up the stack, specifically calling out Rails ActiveRecord and Django.
Even ACID transactions aren’t necessarily perfect.
What if a transaction actually succeeded but the notification to the client got interrupted and now the application thinks it needs to try again, and MIGHT actually write a duplicate?
If an error is due to “overload”, basically a condition that will continue to error constantly, this could cause an unnecessary load of retries against the database.
Retrying may be pointless if there are network errors occurring.
Retrying something that will always yield an error is also pointless, such as a constraint violation.
There may be situations where your transactions trigger other actions, such as emails, SMS messages, etc. and in those situations you wouldn’t want to send new notifications every time you retry a transaction as it might generate a lot of noise.
When dealing with multiple systems such as the previous example, you may want to use something called a two-phase commit.
Tip of the Week
Manything is an app that lets you use your old devices as security cameras. You install the app on your old phone or tablet, hit record, and configure motion detection. A much easier and cheaper option than ordering a camera! (apps.apple.com, play.google.com)
The Linux Foundation offers training and certifications. Many great training courses, some free, some paid. There’s a nice Introduction to Kubernetes course you can try, and any money you do spend is going to a good place! (training.linuxfoundation.org)
Kubernetes has recommendations for common-labels. The labels are helpful and standardization makes it easier to write tooling and queries around them. (kubernetes.io)
Markdown Presentation for Visual Studio Code, thanks for the tip Nathan V! Marp lets you create slideshows from markdown in Visual Studio Code and helps you separate your content from the format. It looks great and it’s easy to version and re-use the data! (marketplace.visualstudio.com)
We decided to knock the dust off our copies of Designing Data-Intensive Applications to learn about transactions while Michael is full of solutions, Allen isn’t deterred by Cheater McCheaterton, and Joe realizes wurds iz hard.
Great statement from one of the creators of Google’s Spanner where the general idea is that it’s better to have transactions as an available feature even if it has performance issues and let developers decide if the performance is worth the tradeoff, rather than not having transactions and putting all that complexity on the developer.
Number of things that can go wrong during database interactions:
DB software or underlying hardware could fail during a write,
An application that uses the DB might crash in the middle of a series of operations,
Network problems could arise,
Multiple writes to the same records from multiple places causing race conditions,
Reads could happen to partially updated data which may not make sense, and/or
Race conditions between clients could cause weird problems.
“Reliable” systems can handle those situations and ensure they don’t cause catastrophic failures, but making a system “reliable” is a lot of work.
Transactions are what have been used for decades to address those issues.
A transaction is a way to group all related reads and writes into a single operation.
Either a transaction as a whole completes successfully as a “commit” or fails as an “abort, rollback”.
If the transaction fails, the application can choose what to do, like retry for example.
In general, transactions make error handling much simpler for an application.
That was their purpose, to make developing against a database much simpler.
Not all applications need transactions.
In some cases, it makes sense not to use transactions for performance and/or availability reasons.
How do you know if you need a transaction?
What are the safety guarantees?
What are the costs of using them?
Concepts of a transaction
Most relational DBs support transactions and some non-relational DBs support transactions.
The general idea of a transaction has been around mostly unchanged for over 40 years, originally introduced in IBM System R, the first relational database.
With the introduction of a lot of the NoSQL (non-relational) databases, transactions were left out.
In some NoSQL implementations, they redefined what a transaction meant with a weaker set of guarantees.
A popular belief was put out there that transactions meant anti-scalable.
Another popular belief was that to have a “serious” database, it had to have transactions.
The book calls out both as hyperbole.
The reality is there are tradeoffs for both having or not having transactions.
ACID is the acronym to describe the safety guarantees of databases and stands for Atomicity, Consistency, Isolation, and Durability.
Coined in 1983 by Theo Harder and Andreas Reuter.
The reality is that each database’s implementation of ACID may be very different.
Lots of ambiguity for what Isolation means.
Because ACID doesn’t specify the actual guarantees, it’s basically a marketing term.
Systems that don’t support ACID are often referred to as BASE, BAsically available, Soft state, and Eventual consistency.
Even more vague than ACID! BASE, more or less, just means anything but ACID.
Atomicity refers to something that can not be broken into smaller parts.
In terms of multi-threaded programming, this means you can only see the state of something before or after a complete operation and nothing in-between.
In the world of database and ACID, atomicity has nothing to do with concurrency. For instance, if multiple actions are trying to processes the same data, that’s covered under Isolation.
Instead, ACID describes what should happen if there is a fault while performing multiple related writes.
For example, if a group of related writes are to be performed in an operation and there is some underlying error that occurs before the transaction of writes can be committed, then the operation is aborted and any writes that occurred during that operation must be undone, i.e. rolled back.
Without atomicity, it is difficult to know what part of the operation completed and what failed.
The benefit of the rollback is you don’t have to have any special logic in your application to figure out how to get back to the original state. You can just simply try again because the transaction took care of the cleanup for you.
This ability to get rid of any writes after an abort is basically what the atomicity is all about.
In ACID, consistency just means the database is in a good state.
But consistency is a property of the application as it’s what defines the invariants for its operations.
This means that you must write your application transactions properly to satisfy the invariants that have been defined.
The database can take care of certain invariants, such as foreign key constraints and uniqueness constraints, but otherwise it’s left up to the application to set up the transactions properly.
The book suggests that because the consistency is on the application’s shoulders, the C shouldn’t be part of ACID.
Isolation is all about handling concurrency problems and race conditions.
The author provided an example of two clients trying to increment a single database counter concurrently, the value should have gone from 3 to 5, but only went to 4 because there was a race condition.
Isolation means that the transactions are isolated from each other so the previous example cannot happen.
The book doesn’t dive deep on various forms of isolation implementations here as they go deeper in later sections, however one that was brought up was treating every transaction as if it was a serial transaction. The problem with this is there is a rather severe performance hit for forcing everything serially.
The section that describes the additional isolation levels is “Weak Isolation Levels”.
Durability just means that once the database has committed a write, the data will not be forgotten, even if a database failure or hardware failure occurs.
This notion of durability typically means, in a single node database, that the data has been written to the drive, typically to a write-ahead log or similar implementation.
The write-ahead log ensures if there is any data corruption in the database, that it can be rebuilt, if necessary.
In a replicated database, durability means that the data has been written to the other nodes successfully.
The performance implication here is that for the database to guarantee that it’s durable, it must wait for those distributed writes to complete before committing the transaction.
PERFECT DURABILITY DOES NOT EXIST.
If all your databases and backups somehow got destroyed at the same time, there’s absolutely nothing you could do.
Longevity of Recordable CDs, DVDs and Blu-rays – Canadian Conservation Institute (CCI) Notes 19/1 (canada.ca)
Tip of the Week
The Bad Plus is an instrumental band that makes amazing music that’s perfect for programming. It’s a little wild, and a little strange. Maybe like Radiohead, but a saxophone instead of Thom Yorke? Maybe? (YouTube)
Correction, Piano Rock will quickly become your new favorite channel. (YouTube)
docker builder is a command prefix that you can use that specifically operates against the builder. For example you can prune the builder’s cache without wiping out your local cache. It can really save your bacon if you’re working with a lot of images. (docs.docker.com)
Ever want to convert YAML to JSON so you can see nesting issues easier? There’s a VSCode plugin for that! Search for hilleer.yaml-plus-json or find it on GitHub. (GitHub)
Spotify has a great interface, but Apple Audio has lossless audio, sounds great, and pays artists more. Give it a shot! If you sign up for Apple One you can get Apple Music, Apple TV+, Apple Arcade, Apple News+ and a lot more for one unified price. (Apple)