
Admin Admin Podcast #105 – Show Notes: The one without the musician

In this episode, Al, Jon, and Jerry catch up on some major life changes, dive deep into the world of Model Context Protocols (MCP), and discuss the practicalities of moving into IT consultancy. Plus, we explore why Docker might be your best friend for CLI tools and how to keep your AWS environment secure with temporary tokens.

Community Update

A Bit of News: The team shares an update regarding Stu, who is taking a step back from the show to focus on outside commitments. We wish him the absolute best! The podcast continues with Al, Jon, and Jerry at the helm.

The Racing Clock: This episode was recorded in a “sprint” style, thanks to Jon’s laptop battery giving a strictly enforced 46-minute deadline.

Development & AI

MCP Server Progress: Jon discusses building Model Context Protocol (MCP) servers, specifically focusing on a Grafana server for incident management. By using an adapter to bridge Slack and Grafana via Claude, the team can correlate network events with real-time conversations.

The MCP Proxy Idea: A look inside the “internal lab” at a potential proxy for MCP traffic, providing better visibility into LLM requests and responses.

Spec-Driven Development: Al breaks down the workflow of using LLMs for spec-driven development. By using tools like GitHub’s Spec Kit, the process moves from markdown standards to LLM-generated tests, ensuring software is built in small, verifiable increments.

The Business of IT

Contracting vs. Consulting: Jerry shares his transition from the contract market toward a full-scale IT Consultancy model.

Market Gap: Discussion on the demand for high-level IT support for small companies that don’t need a full five-day-a-week commitment.

Defining the Roles: A breakdown of the nuances between permanent employment (benefits/stability), contracting (freedom/higher risk), and consulting (multi-client management/marketing).

Tools & Security

Dockerized CLI Tools: Why Jon is moving toward wrapping command-line scripts in Docker containers. This approach eliminates the “it works on my machine” headache by bundling dependencies like Python versions and LibSSL directly into the image.
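As a sketch of the approach (the image name and tool are hypothetical, not Jon’s actual setup), a thin Python wrapper can assemble the `docker run` invocation so users never touch the dependency stack themselves:

```python
import subprocess
from pathlib import Path

def dockerised_cli(image, args, workdir="/work"):
    """Build a `docker run` command that wraps a CLI tool.

    Mounts the current directory into the container so the tool can see
    local files, and removes the container when it exits.
    """
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{Path.cwd()}:{workdir}",  # share local files with the tool
        "-w", workdir,                    # run from the mounted directory
        image,
        *args,
    ]
    return cmd

# A wrapper script would then simply run the assembled command:
# subprocess.run(dockerised_cli("example/mytool:1.0", ["--help"]), check=True)
```

The Python version, LibSSL, and everything else the tool needs live inside the image, so the only host dependency left is Docker itself.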

AWS Security: A shallow-ish dive into the AWS Security Token Service (STS) and its short-lived tokens. We discuss why temporary tokens are a superior security choice over long-lived IAM access keys and how they compare to Azure’s SAS keys.
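As an illustration of the idea (a sketch, not the episode’s exact workflow; the role ARN would be your own), with boto3 an STS `assume_role` call returns credentials that expire on their own, and the client only has to notice when expiry is near and re-assume:

```python
from datetime import datetime, timedelta, timezone

def needs_refresh(expiration, margin_minutes=5):
    """True when temporary credentials are expired or about to expire."""
    return datetime.now(timezone.utc) >= expiration - timedelta(minutes=margin_minutes)

def assume_role(role_arn, session_name="cli-session"):
    """Swap long-lived IAM credentials for short-lived session credentials."""
    import boto3  # deferred import: the expiry helper works without the SDK
    sts = boto3.client("sts")
    resp = sts.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    # resp["Credentials"] holds AccessKeyId, SecretAccessKey,
    # SessionToken and an Expiration datetime
    return resp["Credentials"]
```

A leaked session token is only useful until its `Expiration`, which is the core of the argument against long-lived access keys.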

Events & Links

OggCamp 2026: Mark your calendars for April 25th! The crew plans to be there, and Jon is already prepping at least one talk.

Get in Touch

  • Email: mail@adminadminpodcast.co.uk
  • Community: Join our Telegram group to keep the conversation going.

Credits: Huge thanks to Dave Lee for the audio production. We are a proud member of the Other Side Podcast Network.

Admin Admin Podcast #104 – Show Notes: He’s been talking to himself again

In this episode, Jon flies solo to tackle the “scheduling gremlins” affecting the team before diving deep into two powerful tools for the modern SysAdmin: Server Spec for infrastructure unit testing and MCP Servers for integrating LLMs into your monitoring workflow.

Bringing Unit Testing to the Server

Jon breaks down why unit testing isn’t just for software developers anymore. By using Serverspec (an extension of RSpec), admins can verify their infrastructure just like code. Jon explains – with examples – how to define spec files to check whether files exist, are executable, or contain specific strings. He also talks about how to use Vagrant to verify virtual machine upgrades and service states before they hit production.
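Serverspec specs themselves are written in Ruby; as a rough Python analogue (not Jon’s actual examples), the same three checks he describes boil down to:

```python
import os

def file_exists(path):
    """Analogue of Serverspec's check that a file exists."""
    return os.path.isfile(path)

def is_executable(path):
    """Analogue of checking a file is executable by the current user."""
    return os.path.isfile(path) and os.access(path, os.X_OK)

def contains(path, needle):
    """Analogue of checking a file contains a specific string."""
    with open(path) as f:
        return needle in f.read()
```

Wrapped in a test runner and pointed at a Vagrant box, assertions like these catch a broken upgrade before it reaches production.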

The Power of MCP (Model Context Protocol)

The episode shifts to the cutting edge of AI in the data center. Jon discusses how MCP servers allow Large Language Models (LLMs) like Claude and Gemini to interact directly with your infrastructure. Instead of writing complex SQL or PromQL queries, Jon explains how – using natural language – his team uses MCP to ask Grafana: “What was the impact of the incident between 2 PM and 4 PM?” He also explains how MCP can be used for complex connections, like linking CloudWatch metrics to customer support claims in real time.
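Under the hood, MCP is JSON-RPC 2.0, so a tool invocation like that Grafana query ultimately boils down to a small message. The tool name and arguments below are hypothetical (not the real Grafana MCP server’s schema), but the envelope shape is what the protocol defines:

```python
import json

def mcp_tool_call(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request, as MCP clients send."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# e.g. a hypothetical query tool asking about a two-hour incident window:
msg = mcp_tool_call(1, "query_prometheus", {
    "expr": "sum(rate(http_requests_total[5m]))",
    "start": "2024-01-01T14:00:00Z",
    "end": "2024-01-01T16:00:00Z",
})
```

The LLM’s job is translating “what was the impact between 2 PM and 4 PM?” into a call like this; the MCP server does the actual querying.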

Community & Events

Jon will be attending OggCamp in Manchester on Saturday 25 and Sunday 26 April this year, and hopes to see Al, Jerry, and Stu there!

Connect with Us

We want to hear from you! How are you using AI in your daily admin tasks?
Contact us via email or Telegram!

Admin Admin Podcast #103 – Show Notes: That’s how I role

In this episode:

Cloud Outages and Incident Reviews

We mention recent service outages involving AWS DNS and Azure Front Door, discussing how both were triggered by minor misconfigurations, such as an empty configuration array and a faulty DNS record.

We highlight Azure’s practice of sharing detailed post-incident reviews on YouTube to boost transparency, similar to what GitLab once did. The need for improved input validation by cloud providers is emphasized following these outages. Also, a brief explanation of HugOps.

Migration and Modernization Projects

Jerry describes his current gig involving the migration of legacy on-premises infrastructure to modern cloud solutions, using AWS Transfer Family for SFTP services and migrating SQL Server databases to Azure SQL Managed Instance. SQL Server Management Studio (SSMS) and AWS Database Migration Service are mentioned as typical tools for these migrations, though both are noted for occasional reliability issues.

Linux Laptop Setup and Configuration Management

The discussion shifts to strategies for configuring Linux systems, especially as Windows 10 becomes unsupported.

Different configuration management tools are discussed: Al recently restarted with Ansible (after using Puppet), noting how Ansible playbooks can provision a system from scratch efficiently using APT, Flatpaks, and Ansible’s local connection.

Playbooks, dotfile management (using solutions like chezmoi), and over-engineered Vim configurations are recurring themes, with mentions of Ansible configs supporting distributions like Debian, RHEL and Arch (but not NixOS yet – someone would have said something).

Jerry belatedly realises he should sort something out in this respect, though all he really needs to get going is SSH/GPG keys (for pass) and ssh-keychain for WSL. Jerry and Stu discuss Vim and the VS Code Vim plugin.

Shells, Package Managers, and Dotfiles

We discuss oh-my-zsh and its productivity-boosting plugins, offering Git aliases and improved history searching using fzf. We compare bash, zsh, and fish; zsh is preferred for its better completion, command history, and ability to run Bash one-liners. We also look into the role of package managers for managing dev environments: Homebrew (also available on Linux, which already has a package manager!), pip, NPM, Cargo, etc.

Coding and Tools

We discuss recent experiences (vibe-)coding in Go (Golang) to replace some dodgy PowerShell scripts, and touch on Go’s learning curve and the fact that it’s a compiled language.

We touch on SST (Serverless Stack Toolkit), which is based on TypeScript and offers opinionated AWS resource deployment.

We touch on AI/LLMs again – OpenCode and Claude Code are referenced for their ability to support coding workflows, either by making direct changes or providing guidance, and we discuss the trade-offs involved in using them to get stuff done.

Sysadmin and SRE Roles

We discuss the differences and overlaps between the various roles associated with our work: System Administrator (sysadmin), DevOps, Platform Engineering, and Site Reliability Engineering (SRE).

  • Jerry defines sysadmin as a Windows or Linux engineer, perhaps someone from less of a programming background
  • We dive a bit deeper into SRE, which is defined as focusing on reliability to a level that meets business and customer needs, balancing automation, reducing toil (work that could be automated), and monitoring user experience

SLOs (Service Level Objectives) and SLIs (Service Level Indicators) come up, and the importance of observability is highlighted – referencing logs, metrics, traces, and (sometimes) profiling.

Observability, Monitoring, and OpenTelemetry

We discuss logs, metrics, and distributed tracing (especially via OpenTelemetry and hosted services such as Datadog and Honeycomb). Jerry mentions an excerpt from Observability Engineering by the Honeycomb engineers. We also touch on the practical need for monitoring both at the system level and deeper into the data being collected, with analogies like a pain in the foot turning out to be a broken toe upon further investigation.

The pillars of observability (metrics, logs, and traces) come up again and Stu breaks down their roles in incident investigation and maintaining SLOs. We define a real-world example of a 99.5% SLO.
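The arithmetic behind such an SLO is simple enough to sketch: a 99.5% availability target leaves a 0.5% error budget, which over a 30-day window works out to 216 minutes of allowed downtime.

```python
def error_budget_minutes(slo_percent, window_days=30):
    """Minutes of allowed downtime for an availability SLO over a window."""
    total_minutes = window_days * 24 * 60          # 43,200 for 30 days
    return total_minutes * (100 - slo_percent) / 100

# 99.5% over 30 days -> 216.0 minutes (3.6 hours) of error budget
```

Once the budget is spent, the usual SRE response is to slow feature work and focus on reliability until it recovers.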

We go on about SRE so much that we run out of time, only briefly touching on how these role names have changed over time (plus new roles that are popping up, e.g. “FinOps”). Stay tuned for further discussions…

Get in touch with us at mail@adminadminpodcast.co.uk or via our Telegram channel.

 

Admin Admin Podcast #102 – Show Notes: Getting the band back together

In this episode:

The team shared career updates, including Jon’s new SRE role, Jerry’s transition to freelance work, Stu’s move to a principal software engineer position, and Al’s lead role in a DevOps team.

Key discussions revolved around AI, with Jerry sharing his positive experience using Light LM and AI for design documents, while Stu expressed ethical concerns about AI’s energy consumption. Al raised concerns about AI hindering learning for new developers, and Jon highlighted the issue of “AI slop” affecting projects like curl.

Jon mentioned:

  • Defensive Security Podcast
  • TinyOIDC: https://tinyoidc.authenti-kate.org/ and https://github.com/authenti-kate/tiny-oidc
  • Open Source Security Podcast: LLM finding bugs in curl
  • Human Resources book: https://www.amazon.co.uk/dp/B0DZWKGZGN and https://torpublishinggroup.com/human-resources/?isbn=9781250375933&format=ebook

Jerry mentioned:
A YouTube video about AI slop.

Admin Admin Podcast #101 – Show Notes: It’s not like riding a bike

We are back for episode 101:

  • Jon has started a new job as an SRE.
  • Stu has been at the same place for over a year now.
  • Jerry just finished a contract job using vSphere and Rancher (Enterprise Kubernetes Management).
  • Al has been using Puppet and Pulumi.

Jon has been using Puppet to configure his work laptop. We talk about manifest files, the PDK, and link to Stu’s blog post on Puppet.

Jerry mentions the k9s CLI.

Jon mentions the Self-Hosted podcast. He has been playing with Proxmox and GlusterFS to make Home Assistant highly available.

Al mentions Borg Backup, which he uses to back up his Linux configs.

Jon mentions Vaultwarden.

Admin Admin Podcast #100 Show Notes – Branching out at 100

We reached the century! This episode was recorded via a live stream, so you get to hear a lot more of what happens behind the scenes than usual.

Jon is back in full after a short hiatus (due to being busy at his job). He has been visiting his clients in person, rather than meeting over conference calls.

Stuart has been at his current role for a while now, and is enjoying reliability being a primary focus rather than an afterthought.

Jerry is working with a client who runs Rancher (a Kubernetes distribution) on-premises using VMware’s vSphere. He mentions working across timezones with colleagues, which Al and Stuart also have experience with.

Jerry also mentions that his freelance work is increasing, meaning he may have to look at bringing other people in to help. He mentions the challenges of building his code/infrastructure to be utilised/managed by other people.

Jerry and Jon talk about using GitHub Actions to deploy code/changes.

Al brings up using Continuous Integration with Terraform (referencing Ned In The Cloud). He then asks the question about Git and branching strategies.

Jerry and Stuart talk about trunk-based development, and some of the downsides of long-lived branches. Stuart talks about tagged commits, which can be a good way of managing how and where code runs.

Jerry mentions some of the challenges of working with long-lived branches and divergence from the primary/default Git branch.

Stuart brings up the point that sometimes long-lived branches are useful, but more for deprecated versions/features (e.g. supporting older versions of Terraform).

Jon mentions a useful GitHub Action for working with Terraform.

Al brings up linting, which Jon gives a brief explanation of. Everyone talks briefly about pre-commit as well. Listener Yannick also mentions that Anthony Sottile (who has written a lot of pre-commit hooks) is doing Twitch streams on Python code, which are worth looking at.

We all talk about the podcast from the early days, meeting at different editions of OggCamp, and how the conference landscape has changed in recent years.

A big thank you to Dave Lee for supporting us in the Podcast, editing, and keeping us honest! Another big thank you to the Otherside Network for supporting us too.

A massive thank you to all of our listeners too!

Admin Admin Podcast #099 Show Notes – Making it all a bit modular

Jon recently passed his AWS Certified Security – Specialty exam, congratulations!

Jon mentions how we’re starting to go to more in-person meetings again. Stu and Jerry have been to a few more in-person meetings recently, whereas Al has transitioned to working from home more.

Al mentions how his team and current workplace are trying to adopt a more SRE mindset.

Stu mentions how he is working very heavily with SLIs, SLOs and Error Budgets. He also mentions that a couple of the people on his team come from development primarily, which means he is starting to pick up new ways of doing development (e.g. TDD).

Al mentions how it’s interesting working alongside developers in your team, especially when you come from an infrastructure/networking/sysadmin background. Al is also starting to learn .NET.

Jerry mentions his new role includes Kubernetes and Rancher.

Al talks about Terraform. Al mentions how they are starting to consider adopting/refactoring their current codebase to use Terraform Modules.

Stu talks about using Modules to enforce requirements (e.g. tags for costing resources), consistency and turning business logic into code. Stu also mentions versioning your modules, like using Git tags to reduce breakage but improve the modules.
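A hedged sketch of what Stu describes (the module source, tag, and tag keys below are invented for illustration): consuming teams pin the module to a Git tag, and the module itself insists on the tags needed for costing.

```hcl
module "network" {
  # Pin to a tagged release so module changes roll out deliberately,
  # rather than breaking every consumer at once
  source = "git::https://example.com/org/terraform-modules.git//network?ref=v1.4.0"

  # The module requires these, so costing data is never missing
  tags = {
    team        = "platform"
    cost_centre = "1234"
  }
}
```

Bumping `ref` is then an explicit, reviewable change, which is the versioning benefit Stu mentions.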

Stu also mentions his views on Community/Public Terraform modules (i.e. using ones created and open sourced), compared to creating your own. Jon mentions similar views on community Ansible Galaxy modules.

Jon mentions how to structure your Terraform code so that plans/applies do not take a long time to complete. The structure can also help with permissions/access for other teams.

Stu mentions using Terraform Data Sources or Remote State for separating concerns within Terraform code.

Jerry mentions that it is possible to abstract far enough so that a team just needs to define a configuration file to create their app, and the Terraform code and modules provide this to them, without them needing to understand Terraform.

Jerry mentions Terragrunt, a Terraform wrapper to abstract Terraform code. It makes code "DRY" (Don’t Repeat Yourself), allowing even less code to be defined within your Terraform codebase.

Stu talks about pipelines and Git strategy, especially with Terraform. Some examples are available here (including GitHub Actions and CircleCI).

Jon mentions an option for testing Terraform in pipelines could be creating ephemeral environments that the Terraform code runs against, so it shows real infrastructure changes.

Al and Stu talk about linting code. Jon mentions pre-commit for running checks before a commit finishes (meaning code cannot be committed to a Git repository until the pre-commit hooks pass).

Al and Jon talk about public versus private endpoints (i.e. exposing web services to the internet by default, or having it private by default).

Jon mentions HTTP Request Smuggling as a way of making a request reach an endpoint that isn’t necessarily exposed to the internet directly.

Jon also mentions some Bastion-style techniques for accessing infrastructure without needing to expose the bastion to the internet (e.g. AWS’s SSM).

Admin Admin Podcast #098 Show Notes – Contain Your Enthusiasm

Jon couldn’t make it for this episode, he’ll be back next time!

Al mentions our last episode with Ewan, and how the focus on Observability fits with his current focus at work.

Al references the Golden Signals of monitoring, as well as Azure’s App Insights.

Stuart mentions a few books to read, including the Google SRE Book, the Google SRE Workbook, and Alex Hidalgo’s Implementing Service Level Objectives. One not mentioned in the show but also of interest is Observability Engineering.

Jerry talks about his new job, that uses Azure and .NET. He mentions using Terraform and Azure DevOps. He also does some freelance work, and is trying to build “platforms” rather than just managing servers manually.

Stuart mentions a push in the industry to build easily consumable platforms for developers, allowing them to consume it themselves (Platform Engineering).

Al talks about using multiple regions within Cloud providers. Stuart mentions that sometimes using multiple regions can add redundancy but significantly increase complexity, at which point there is a trade off to consider.

Stuart talks about database technologies that allow multiple “writers” (e.g. Apache Cassandra, AWS’s DynamoDB, Azure’s Cosmos DB), compared to those with a single writer and multiple readers (e.g. default MySQL and PostgreSQL).

Jerry talks about CPU Credits in Cloud providers, Stuart references AWS’s T-series of instances which make use of CPU Credits.

Al starts a discussion around Containers.

Stuart mentions the primitives that Containers are based around, like cgroups. They also use network namespaces (not covered in the show).

Al mentions a container image he is looking at currently which includes a huge amount of dependencies (including Xorg and LibreOffice!) that are probably not required.

Al talks about Azure Serverless (“function-as-a-service”, like AWS’s Lambda and OpenFaaS), and Jerry mentions that these are often running as containers in the background. He also mentions AWS’s Fargate as a “serverless” container platform.

The conversation then moves onto Kubernetes.

Stuart mentions that when using a Cloud’s managed Kubernetes service, you often still manage the worker nodes, with the Cloud provider managing the control plane. It is possible to use technologies like AWS’s Fargate as Kubernetes nodes.

Al asks about how you would go about viewing splitting up Kubernetes clusters (i.e. one big cluster? multiple app specific clusters? environment-specific clusters?). Jerry and Stuart talk about this, as well as how to use multi-tenancy/access control and more. Stuart mentions concerns in terms of quite large clusters, in terms of rolling upgrades of nodes.

Stuart mentions Openshift, a Kubernetes distribution (similar to how Ubuntu, Debian, and Red Hat are distributions of Linux), and talks more about how it differs from “vanilla” Kubernetes. Stuart also mentions Rancher as another Kubernetes distribution.

Stuart also mentions the Kubernetes reconciliation loop, which is a really powerful concept within Kubernetes.

Stuart briefly mentions Chaos Engineering, inducing “chaos” to prove that your infrastructure and applications can handle failure gracefully.

Stuart talks about the Kubernetes Cluster Autoscaler.

Stuart and Jerry talk about how Kubernetes is not far off being a unified platform to aim for, although not entirely. Differences in how Clouds implement access control/service accounts is a good example of this.

Al mentions using a Container Registry, which Jerry and Stuart go into more detail about. Jerry talks about Container Images and only including what is required in it.

Jerry mentions Alpine Linux as a good base for Container images, to reduce the size of containers and not including unneeded dependencies.

Al mentions slim.ai, and Stuart mentions how it is aiming to be like minify but for Containers.

Jerry talks about Multi-Stage container images, as a way of removing build dependencies from a Production container. Stuart also mentions “Scratch” containers, which are effectively an image with nothing in it.
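A minimal sketch of the pattern Jerry and Stuart describe (assuming a Go application, purely for illustration): the toolchain lives only in the build stage, and the final image is built `FROM scratch` with nothing but the binary.

```dockerfile
# Build stage: has the full toolchain, never ships to production
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app ./cmd/app

# Final stage: scratch contains nothing except what we copy in
FROM scratch
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

The resulting image is a few megabytes and has no shell or package manager for an attacker to use.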

Stuart mentions running the built container within a Continuous Integration Pipeline with some tests, to make sure that your container doesn’t even get published until it meets the requirements of running the application inside of it.

Al and Stuart talk about running init systems (e.g. systemd) in Containers, and how it usually isn’t the way you run applications within Containers.

Jerry mentions viewing containers as immutable (e.g. don’t install packages that are required in an already running container, add them to the base image before starting it).

Stuart talks about viewing Containers as stateless, avoiding the need to persist data when a new container is deployed.

Admin Admin Podcast #097 Show Notes – Through the Logging Glass

In this episode, Jon’s colleague Ewan joins us, to talk about Observability.

Stu explains that Observability is how you monitor requests across microservices.

Microservices (which we foolishly don’t describe during the recording) is the term given to an application architectural pattern where, rather than having all your application logic in a single “monolith” application, it is instead a collection of small applications, executed as required when triggered by a request to a single application entry point (like a web page).

These small applications are built to scale horizontally (across many machines or environments), rather than vertically (by providing them with more RAM or CPU on a single host), which means that if you have a function that takes a long time to execute, this doesn’t slow down the whole application loading. It also means that you can theoretically develop your application with less risk, as you don’t need to remove your version 1 microservice when you develop your version 2 microservice, so if your version 2 microservice doesn’t operate the way you’re expecting, you can easily roll back to version 1.

This, however, introduces more complexity in the code you’ve written, as there’s no single point for logs, and it can be much harder to identify where slowdowns have occurred.

Stu then explains that observability often refers to the “three pillars“, which are: Metrics, Logs and Tracing. He also mentions that there’s a fourth pillar being mentioned now about “Continuous Profiling“. Jerry talks about some of the products he’s used before, including Data Dog and Netdata, and compares them to Nagios.

Ewan talks about his history with Observability, and some of the pitfalls he’s had with them.

Stu talks about being a “SRE” – Site Reliability Engineer, and how that influences his view on Observability. Stu and Ewan talk about KPIs (Key Performance Indicators), SLI (Service Level Indicators) and SLO (Service Level Objectives), and how to determine what to monitor, and where history might make you monitor the wrong things. Jerry asks about Error Budgets. Stu talks about using SLI, SLO and error budgets to determine how quickly you can build new features.

Jerry asks about tooling. Stu and Ewan talk about products they’ve used. Jon asks about injecting tracing IDs. Ewan and Stu talk about how a tracing ID can be generated and how having that tracing ID can help you perform debugging, not just of general errors, but even on specific issues in specific contexts.

Jon asks about identifying outliers with tooling, but the consensus is that this is down to specific tools. Ewan mentions that observability is just tracing events that occur across your systems, and that metrics, logs and tracing can all be considered events.

Jon asks what a “Log”, a “Metric” and a “Trace” are, and Ewan describes these. Stu talks about profiling and how this might also weigh into the conversation, and mentions Parca, a profiling project.

Ewan talks about the impact of Observability on the “industry as a whole” and references “The Phoenix Project“. Jerry talks about understanding systems by using observability.

We talk about being on-call and alert fatigue, and how you can be incentivised to be called out, or to proactively monitor systems. The DevOps movement’s impact on on-call is also discussed.

Ewan talks about structured logging and what it means and how it might be implemented. Stu talks about not logging everything!
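One common implementation of structured logging (a Python sketch, not necessarily what Ewan uses) is a formatter that emits one JSON object per log line, so fields like a request ID stay machine-parseable instead of being buried in free text:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit each log record as one JSON object per line."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # populated via logging's `extra=` mechanism, if present
            "request_id": getattr(record, "request_id", None),
        })
```

A log aggregator can then filter on `request_id` or `level` directly, rather than grepping prose.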

We’re a member of the Other Side Podcast Network. The lovely Dave Lee does our Audio Production.

We want to remind our listeners that we have a Telegram channel and email address if you want to contact the hosts. We also have Patreon, if you’re interested in supporting the show. Details can all be found on our Contact Us page.

Admin Admin Podcast #096 Show Notes – Tech With A Cup Of Tea

Jon couldn’t make it for this episode again, however he should be back next time!

Jerry mentions that he is using NetData, for monitoring his own infrastructure and also for his clients. He mentions how it can be used as a Prometheus Exporter, as a standalone package, and also has a Cloud/SaaS offering.

He mentions how it can pick up services automatically (if Netdata supports them – Integrations). RPM-based packages are available in EPEL and a third-party Debian repository (more information here).

Jerry mentions that it can run effectively as an agent to send metrics back to Netdata Cloud, which is different from how Prometheus has worked traditionally.

Stuart mentions that Prometheus are now adding a new feature called Agent mode. This is to solve the issue of needing to get access to Prometheus on a site, without necessarily wanting to open up every site in firewalls/security groups or running VPNs.

Jerry mentions issues he’s having with Let’s Encrypt currently, with Apache Virtual Hosts, specifically in how to automate it with Ansible.

Stuart mentions moving away from Apache and starting to use Caddy, as he is moving to containers for deploying his publicly available services. Caddy comes out of the box with Let’s Encrypt support, removing one of the challenges in automation.
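For comparison, the whole certificate-management problem reduces to a couple of lines in a Caddyfile (the hostname and upstream below are placeholders); Caddy obtains and renews the certificate itself:

```
example.com {
    # Caddy fetches and renews the Let's Encrypt certificate automatically
    reverse_proxy app:8080
}
```

There is no certbot cron job or Ansible play to maintain, which is the appeal Stuart describes.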

He also uses Traefik at home, as not everything is container-based and Traefik makes a mixed environment quite straightforward to use. Traefik is more complex than Caddy, but does have some extra features that Stuart makes use of.

Jerry mentions Dehydrated, a Bash implementation of an ACME client (the protocol that Let’s Encrypt is built upon).

Stuart mentions that he has been overhauling his home infrastructure. His aim was to move to using Git to define his infrastructure more, rather than the mixture of some configuration management, some adhoc, some scripts, with no consistency.

He mentions using Gitea for source control, and finding the awesome-gitea repository for what can be used alongside Gitea. He mentions using Drone for continuous integration, which has allowed him to move most tasks from manually-triggered to triggered on changes in his Git repositories.

He has put a series of posts about it on his blog.

More posts on this are still to come!

Jerry asks about running Drone agents on something like Spot Instances or Spot Virtual Machines.

A discussion was had around our preferences between an open-source product with great documentation and a commercial/SaaS offering with a support contract.

Stuart brought up the example of running something like Prometheus for monitoring (i.e. running a monitoring stack yourself) compared to something like Datadog that runs the monitoring stack for you.

Jerry mentions it is entirely dependent upon the service.

Stuart mentions that it can be nice to look through code to see where an issue might be that you are facing (and even contributing fixes).

Admin Admin Podcast #094 Show Notes – Observe closely

Jon couldn’t make it for this podcast due to a recent job change, but will be back soon.

Stuart and Jerry talk some about their new jobs.

Stuart is a Site Reliability Engineer for a VoIP/Communications company. He talks about using Puppet, Terraform, Nomad and Kubernetes. Jerry and Stuart both talk about the move to containers in both their jobs.

Jerry mentions learning Amazon AWS’s ECS (AWS managed Docker/Container solution) using Fargate. Stuart mentions using ECS previously, but using AWS EC2s rather than Fargate. Stuart also mentions that ECS is a lot simpler than Kubernetes, but the simplicity does have some trade offs.

Al mentions he has recently recertified his Azure Administrator Associate certification. He mentions how the certifications are “point-in-time”, in that they don’t reflect some of the newer features.

Al also mentions the Late Night Linux Extra podcast episode featuring Martin Wimpress (of Ubuntu MATE and ex-Canonical fame), on Docker Slim.

Al mentions Azure Web Apps, which are effectively Docker containers in the background.

Al asks an open question about monitoring and how it changes in the world of cloud, PaaS (Platform-as-a-Service) and microservices. He mentions how throwing machine resources at a problem doesn’t always fix an issue.

Stuart talks about the idea of contention in the cloud being desirable, compared to being avoided in on-premises environments. He mentions his issues with using purely thresholds for monitoring. He refers to distributed tracing to get insights into requests/services (especially when running across a number of microservices).

Stuart mentions the Golden Signals method of monitoring. He also refers to the Site Reliability Engineering handbook from Google.

Jerry mentions about using Prometheus for metrics, specifically the node_exporter as a lightweight agent for monitoring node metrics.

Stuart mentions OpenMetrics (which is the Prometheus metrics format but as an open standard) which can be exposed by any application, not just a specific exporter. He mentions adding this to his own applications, and writing exporters as well.
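The exposition format is plain text, which is why adding it to your own application is cheap. A minimal Python sketch (the metric name and labels are invented for illustration; real applications typically use a client library instead):

```python
def render_metric(name, value, labels=None, help_text=""):
    """Render one gauge in the Prometheus text exposition format."""
    label_str = ""
    if labels:
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    lines = []
    if help_text:
        lines.append(f"# HELP {name} {help_text}")
    lines.append(f"# TYPE {name} gauge")           # declare the metric type
    lines.append(f"{name}{label_str} {value}")     # the sample itself
    return "\n".join(lines) + "\n"
```

Serve the rendered text on an HTTP `/metrics` endpoint and Prometheus (or anything speaking OpenMetrics) can scrape it.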

Stuart talks about eBPF, how it relates to monitoring, as well as tracing and forwarding packets. He mentions eBPF programs that are allowed to sit alongside the kernel itself, allowing direct kernel tracing or taking actions on network packets before they reach the kernel.

Stuart references Brendan Gregg and his website for information on eBPF usage and examples. He also later mentions Liz Rice for great information and tutorials on eBPF, having started learning eBPF because of her great tutorials.

Stuart mentions starting to learn C to be able to write eBPF programs. He also mentions that you can interact with eBPF programs using Go, Python, C and Rust, whereas the eBPF programs themselves are written in C or, more recently, Rust.

Al mentions that Azure Web Apps for PHP include Apache for PHP 7, and Nginx for PHP 8.

Jerry brings up Terragrunt, which is a thin wrapper for Terraform. Terragrunt extends Terraform with some useful features like being able to run Terraform across multiple directories, and to make Terraform DRY (Don’t Repeat Yourself). It can also show a graph of dependencies too. Stuart mentions why separating Terraform files into different directories is desirable, but comes with a trade off that Terragrunt can help resolve.

Jerry mentions how using Terragrunt to separate environments and parameterise Terraform helps significantly with keeping repetition of code lower.

Al talks about Terraform Workspaces as a way of separating environments.

Al brings up the subject of other podcasts we listen to, including: –

  • Ship It – About deployment, infrastructure and the operation of software
  • Rent, Buy, Build – About the cloud native world and whether to use a managed solution, an off-the-shelf solution, or building it yourself for different technologies
  • Al’s Code Snippets Podcast – About Al’s journey into coding and his learnings along the way