Allen made the video on generating a baseball lineup application just by chatting with ChatGPT (youtube)
What is OpenTelemetry?
An incubating project of the CNCF - Cloud Native Computing Foundation (cncf.io)
What does incubating mean?
Projects used in production by a small number of users with a good pool of contributors
Basically you shouldn't be hung out to dry here
So what is OpenTelemetry? A collection of APIs, SDKs and tools that are used to instrument, generate, collect and export telemetry data
This helps you analyze your software's performance and behavior
It's available across multiple languages and frameworks
It's all about Observability
Understanding a system "from the outside"
Doesn't require you to understand the inner workings of the system
The goal is to be able to troubleshoot difficult problems and answer the "Why is this happening?" question
To answer those questions, the application must be properly "Instrumented"
This means the application must emit signals like metrics, traces, and logs
The application is properly instrumented when you can completely troubleshoot an issue with the instrumentation available
That is the job of OpenTelemetry - to be the mechanism to instrument applications so they become observable
List of vendors that support OpenTelemetry: https://opentelemetry.io/ecosystem/vendors/
Reliability and Metrics
Telemetry - refers to the data emitted from a system about its behavior in the form of metrics, traces and logs
Reliability - is the system behaving the way it's supposed to? Not just, is it up and running, but also is it doing what it is expected to do
Metrics - numeric aggregations over a period of time about your application or infrastructure
CPU Utilization
Application error rates
Number of requests per second
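As a rough illustration of what "numeric aggregations over a period of time" means, the sketch below rolls a handful of made-up request records into a requests-per-second figure and an error rate. The Request record and the summarize function are hypothetical names for this example only; they are not part of OpenTelemetry, and the sample data is invented.

```python
from dataclasses import dataclass


@dataclass
class Request:
    timestamp: float  # seconds since the start of the window
    status: int       # HTTP status code


def summarize(requests, window_seconds):
    """Aggregate raw request records into two metrics for one time window."""
    total = len(requests)
    # Assumption for this sketch: only 5xx responses count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "requests_per_second": total / window_seconds,
        "error_rate": errors / total if total else 0.0,
    }


# Invented sample: four requests observed in a 10 second window.
sample = [
    Request(0.5, 200),
    Request(1.2, 200),
    Request(3.8, 500),
    Request(7.1, 200),
]
metrics = summarize(sample, window_seconds=10)
print(metrics)
```

The individual requests are gone after aggregation; what remains is a small set of numbers that is cheap to store and graph over time.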
SLI - Service Level Indicator - a measurement of a service's behavior - this should be from the perspective of a user / customer
Example - how fast a webpage loads
SLO - Service Level Objective - the means of communicating reliability to an organization or team
Accomplished by attaching SLIs to business value
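To make the SLI/SLO relationship concrete, here is a minimal sketch that uses the page-load example: the SLI is the fraction of page loads that finish within a threshold, and the SLO is a target for that fraction. The function names, the 1000 ms threshold, the 95% target and the load times are all assumptions chosen for illustration.

```python
def page_load_sli(load_times_ms, threshold_ms):
    """SLI: fraction of page loads that finished within the threshold."""
    if not load_times_ms:
        return 1.0  # assumption: no traffic counts as meeting the objective
    fast = sum(1 for t in load_times_ms if t <= threshold_ms)
    return fast / len(load_times_ms)


def meets_slo(sli, target):
    """SLO: the SLI must be at or above the agreed target."""
    return sli >= target


# Invented page-load measurements, in milliseconds.
load_times_ms = [120, 340, 95, 2100, 410, 180, 220, 150, 990, 130]
sli = page_load_sli(load_times_ms, threshold_ms=1000)
print(f"SLI: {sli:.0%} of page loads finished within 1000 ms")
print("SLO met:", meets_slo(sli, target=0.95))
```

The measurement is what the user experiences (how fast the page loads); the target is what the team and the business have agreed is good enough.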
Distributed Tracing
To truly understand what distributed tracing is, there are a few parts we have to put together first
Logs - a timestamped message emitted by applications
Different from a trace - a trace is associated with a request or a transaction
Heavily used in all applications to help people observe the behavior of a system
Unfortunately, as you probably know, they aren't completely helpful in understanding the full context of the message - for instance, where was that particular code called from?
Logs become much more useful when they become part of a span or when they are correlated with a trace and a span
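One way to picture log correlation is to stamp every log line with the IDs of the trace and span that were active when it was written. The sketch below does that with Python's standard logging module and contextvars. It is an illustration of the idea only, not the OpenTelemetry API; the ID values, the logger name and the TraceContextFilter class are made up for the example.

```python
import contextvars
import io
import logging

# Hypothetical holders for the IDs of the trace and span currently in flight.
current_trace_id = contextvars.ContextVar("trace_id", default="-")
current_span_id = contextvars.ContextVar("span_id", default="-")


class TraceContextFilter(logging.Filter):
    """Copy the current trace/span IDs onto every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        record.span_id = current_span_id.get()
        return True  # never drop the record, only enrich it


stream = io.StringIO()  # captured in memory so the output is easy to inspect
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s trace=%(trace_id)s span=%(span_id)s %(message)s")
)
handler.addFilter(TraceContextFilter())

logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("no request in flight")

# Pretend a request has started and a span is now active.
current_trace_id.set("trace-abc")
current_span_id.set("span-001")
logger.info("charging card")

log_output = stream.getvalue()
print(log_output, end="")
```

With the IDs on each line, a log message can be looked up alongside the rest of the request it belongs to, which answers the "where was this called from?" question.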
Span - represents a unit of work or operation
Tracks the operations that a request makes - meaning it helps to paint a picture of everything that happened during the "span" of that request/operation
Contains a name, time-related data, structured log messages, and other metadata/attributes to provide information about that operation it's tracking
Some example metadata/attributes are: http.method=GET, http.target=/urlpath, http.server_name=codingblocks.net
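As a sketch of what a span carries, the hypothetical Span class below holds the pieces listed above: a name, start and end times, structured log messages (called events here), and attributes. It is a simplified stand-in written for these notes, not the span type from any OpenTelemetry SDK.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    name: str
    attributes: dict = field(default_factory=dict)
    events: list = field(default_factory=list)  # structured log messages
    start_time: float = field(default_factory=time.time)
    end_time: Optional[float] = None

    def add_event(self, message, **fields):
        """Attach a timestamped, structured message to this span."""
        self.events.append({"time": time.time(), "message": message, **fields})

    def end(self):
        """Mark the unit of work as finished."""
        self.end_time = time.time()


# Attribute values taken from the examples in these notes.
span = Span(
    "GET /urlpath",
    attributes={
        "http.method": "GET",
        "http.target": "/urlpath",
        "http.server_name": "codingblocks.net",
    },
)
span.add_event("cache miss", key="episode-list")  # invented event
span.end()
print(span.name, span.attributes, span.events[0]["message"])
```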
Distributed trace - also known simply as a trace - records the path taken by a user or system request as it passes through various services in a distributed, multi-service architecture, like micro-services or serverless applications (AWS Lambdas, Azure Functions, etc)
Tracing is ESSENTIAL for distributed systems because of the non-deterministic nature of the application or the fact that many things are incredibly difficult to reproduce in a local environment
Tracing makes it easier to understand and troubleshoot problems because it breaks down what happens in a request as it flows through the distributed system
A trace is made of one or more spans
The first span is the "root span" - this will represent a request from start to finish
The child spans will just add more context to what happened during different steps of the request
Some observability backends will visualize traces as waterfall diagrams where the root span is at the top and branching steps show as separate chains below - diagram linked below (opentelemetry.io)
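The root span / child span relationship can be sketched with a parent ID on each span. The example below builds a small, invented trace and renders it as a text waterfall, with the root span at the top and child spans indented beneath their parents. The Span record, the render_waterfall function, the service names and the timings are all hypothetical; real backends draw this from exported trace data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    name: str
    start_ms: int
    end_ms: int
    parent_id: Optional[str] = None  # None marks the root span


def render_waterfall(spans):
    """Return text lines: root span first, children indented under their parent."""
    children = {}
    for s in spans:
        children.setdefault(s.parent_id, []).append(s)
    for group in children.values():
        group.sort(key=lambda s: s.start_ms)

    lines = []

    def walk(span, depth):
        label = "  " * depth + span.name
        # One '#' per 10 ms of duration, offset by the span's start time.
        bar = " " * (span.start_ms // 10) + "#" * max(1, (span.end_ms - span.start_ms) // 10)
        lines.append(f"{label:<28}|{bar}")
        for child in children.get(span.span_id, []):
            walk(child, depth + 1)

    for root in children.get(None, []):
        walk(root, 0)
    return lines


# Invented trace: one request fanning out to three services.
trace = [
    Span("a", "GET /checkout", 0, 120),
    Span("b", "auth-service: verify", 10, 30, parent_id="a"),
    Span("c", "cart-service: load", 30, 80, parent_id="a"),
    Span("d", "db: SELECT cart", 40, 70, parent_id="c"),
    Span("e", "payment-service: charge", 80, 110, parent_id="a"),
]
waterfall = render_waterfall(trace)
print("\n".join(waterfall))
```

The root span covers the request from start to finish, and each indented bar shows where inside that window a child step ran.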
Attention Windows users, did you know you can hold the Control key to prevent the tasks from moving around in Task Manager? It makes it much easier to shut down those misbehaving key loggers! (verge.com)
Does your JetBrains IDE feel sluggish? You can adjust the heap space to give it more juice! (blogs.jetbrains.com)
Beware of string interpolation in logging statements in Kotlin: the interpolation can run even when that log level isn't enabled, so you pay for building a message that is never written. IntelliJ will show you some squiggles to warn you. Pass the values as parameters to a message template with placeholders instead. Kotlin logging libraries also commonly accept a lambda, so the message is only built when it's actually needed. (discuss.kotlinlang.org)
Thanks to Tom for the tip on tldr pages, they are a community effort to simplify the beloved man pages with practical examples. (tldr.sh)
Looking for some new coding music? Check out these albums from popular guitar heroes!