Jon couldn’t make it for this episode, he’ll be back next time!
Al mentions our last episode with Ewan, and how the focus on Observability fits with his current focus at work.
Stuart mentions a few books to read including the Google SRE book, Google SRE Workbook and Alex Hidalgo’s Implementing Service Level Objectives. One not mentioned in the show but also of interest is Observability Engineering.
Jerry talks about his new job, that uses Azure and .NET. He mentions using Terraform and Azure DevOps. He also does some freelance work, and is trying to build “platforms” rather than just managing servers manually.
Stuart mentions a push in the industry to build easily consumable platforms for developers, allowing them to consume it themselves (Platform Engineering).
Al talks about using multiple regions within Cloud providers. Stuart mentions that sometimes using multiple regions can add redundancy but significantly increase complexity, at which point there is a trade off to consider.
Stuart talks about database technologies that allow multiple “writers” (e.g. Apache’s Cassandra, AWS’s DynamoDB, Azure’s CosmosDB), compared to those with a single writer and multiple readers (e.g. default MySQL and PostgreSQL).
Al starts a discussion around Containers.
Al talks about Azure Serverless (“function-as-a-service” like AWS’s Lambda and OpenFAAS), and Jerry mentions that these often are running as containers in the background. He also mentions AWS’s Fargate as a “serverless” container platform.
The conversation then moves onto Kubernetes.
Stuart mentions that when using a Cloud’s managed Kubernetes service, you often still manage the worker nodes, with the Cloud provider managing the control plane. It is possible to use technologies like AWS’s Fargate as Kubernetes nodes.
Al asks about how you would go about viewing splitting up Kubernetes clusters (i.e. one big cluster? multiple app specific clusters? environment-specific clusters?). Jerry and Stuart talk about this, as well as how to use multi-tenancy/access control and more. Stuart mentions concerns in terms of quite large clusters, in terms of rolling upgrades of nodes.
Stuart mentions Openshift, a Kubernetes distribution (similar to how Ubuntu, Debian, and Red Hat are distributions of Linux), and talks more about how it differs from “vanilla” Kubernetes. Stuart also mentions Rancher as another Kubernetes distribution.
Stuart also mentions the Kubernetes reconciliation loop, which is a really powerful concept within Kubernetes.
Stuart briefly mentions Chaos Engineering, inducing “chaos” to prove that your infrastructure and applications can handle failure gracefully.
Stuart talks about the Kubernetes Cluster Autoscaler.
Stuart and Jerry talk about how Kubernetes is not far off being a unified platform to aim for, although not entirely. Differences in how Clouds implement access control/service accounts is a good example of this.
Jerry mentions Alpine Linux as a good base for Container images, to reduce the size of containers and not including unneeded dependencies.
Jerry talks about Multi-Stage container images, as a way of removing build dependencies from a Production container. Stuart also mentions “Scratch” containers, which are effectively an image with nothing in it.
Stuart mentions running the built container within a Continuous Integration Pipeline with some tests, to make sure that your container doesn’t even get published until it meets the requirements of running the application inside of it.
Al and Stuart talks about running init systems (e.g. systemd) in Containers, and how it usually isn’t the way you run applications within Containers.
Jerry mentions viewing containers as immutable (e.g. don’t install packages that are required in an already running container, add them to the base image before starting it).
Stuart talks about viewing Containers as stateless, avoiding the need to persist data when a new container is deployed.