Author Archives: mralcadmin

Admin Admin Podcast #091 Show Notes – A Comedy of Errors

Jon brought Nick "Mohclips" into the podcast to talk to us about some of the things he does.

Nick talks about "Gold Images" – and mentions that he’s created his own images because of issues of provenance. He mentions Docker images that contained cryptocurrency miners. We agree that you should check that the images you’re downloading come from the vendors of those images. This isn’t just an issue with Docker, as there are similar problems with at least AWS (Amazon Web Services) public AMIs (Amazon Machine Images) and Azure public VM images too.

We also discuss CIS (Center for Internet Security) hardening guides and Nick mentions that he uses Ansible to implement the controls. Jon refers to an interview with Jeff Geerling to quote some numbers on how many Ansible modules there are.

We talk a bit about Ansible 3, and Collections which are formally introduced in this release.

We talk about Semantic Versioning, and explain how the part of the version number that changes (major, minor or patch) should tell you what kind of change to expect, and so why you would move from one version to the next.
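
As a rough illustration of that idea (not something from the show), here is a minimal Python sketch that reports which part of a version number moved. It assumes plain "major.minor.patch" strings with no pre-release or build suffixes, and the function names and the MEANING table are made up for this example.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a plain 'major.minor.patch' string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def kind_of_change(old: str, new: str) -> str:
    """Report the most significant part of the version number that changed."""
    old_parts = parse_version(old)
    new_parts = parse_version(new)
    for name, before, after in zip(("major", "minor", "patch"), old_parts, new_parts):
        if before != after:
            return name
    return "none"


# What each kind of change is meant to signal under Semantic Versioning.
MEANING = {
    "major": "incompatible changes, read the upgrade notes first",
    "minor": "new backwards-compatible features",
    "patch": "backwards-compatible bug fixes",
    "none": "same version",
}
```

Calling kind_of_change("2.9.1", "2.10.0") reports a minor change, because the parts are compared as numbers rather than as text.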

Next, Nick talks about ServerSpec, a set of RSpec tests for servers, and Jerry suggests that Nick might be talking about InSpec instead. Jerry also mentions Molecule, which is similar. Jerry asks whether Nick uses a CI/CD (Continuous Integration and Continuous Delivery or Continuous Deployment) system. Nick explains why he doesn’t.

Nick mentions he’s a "Lazy Engineer". Nick also mentions Kanban boards in passing.

Jerry talks about Netdata. Stu talks about Pulumi. Jerry talks again about Tinkerbell which was linked to from DevOps Weekly. Stu mentions that Tinkerbell was also mentioned on an Equinix Metal blog post which also covers quite a bit of Pulumi too.

Admin Admin Podcast #090 Show Notes – Rise and Shine


Al is using remote state in Azure using Azure Blob Storage.

Jon mentions Terraformer, and using it to import infrastructure into the tfstate file.

Jerry mentions using “terraform import” to import Azure Resources.

Al ask about

Jon mentions his blog post.

Al mentions the YouTube channel he’s been following for tutorials about Terraform.

Jerry mentions that CentOS 6 went EOL (end of life) on November 30th 2020 and that Ubuntu 16.04 will do the same on April 30th 2021.

Al mentions the Naming vs Tagging blog post.

Al mentions that he is now using Unraid for his NAS. Jon mentions following this 2.5 Admins podcast episode.

Admin Admin Podcast #088 Show Notes – Speculative execution

This is a predictions show. To avoid spoiling the predictions, there will just be some links to terms and articles mentioned in the show. The rules are inspired by the Bad Voltage accumulation of prediction rules revealed in episode 2×62. We make reference to the fact that in the most recent predictions review show (episode 3×19) the haggling for fractions of a point is unbelievable. It’s amazing 🙂

This led to some of the predictions being walked back…

So, with that, on to the terms of note:

Wrap up

Admin Admin Podcast #079 Show notes – A conversation with the coolest nerd in the room

In this episode, Al and Jon (no Jerry this time, sadly) have a conversation with Reggie from The Coolest Nerds in the Room Podcast.

Reggie is a Site Reliability Engineer (SRE). SRE is a term coined at Google in the early 2000s, and popularised by the Site Reliability Engineering book Google published in 2016. SREs will often perform operations roles, similar to those performed by “DevOps” or Operations teams, but are also responsible for reliability by monitoring the health of a service, an application or a node, and reacting to issues with a longer term view on solving those issues.

Reggie went into how he moved into an SRE role, and went into some details on the platforms he’s used in the past, including AWS, Azure and Google Cloud.

Reggie mentions the following terms:

  • Kubernetes (sometimes abbreviated to K8s) – A container orchestration tool, run by the Cloud Native Computing Foundation. Jon mentions MiniKube, which is a way to run Kubernetes on your local machine.
  • Stackdriver – a monitoring tool.
  • SLI – Service Level Indicator. An SLI is an indicator which is observed on a service component, like remaining storage capacity, CPU utilization by a specific application, number of errors returned by the application, response time to retrieve a specific page element, and so on.
  • SLO – Service Level Objective. An SLO is the target for the SLI items on the host. For example, you might be looking for an SLO of < 5 non-OK HTTP responses in 1 hour, or perhaps that the login service returns a response in less than 3 seconds. This is typically a stricter threshold than the SLA, and is the point where an SRE would be engaged to identify *why* the service was degraded before it becomes an issue.
  • SLA – Service Level Agreement. An SLA is a contractual agreement between the service provider and the service consumer, for example between a website and its users, or between a microservice and the overarching service it’s trying to deliver. The SLA might refer to SLO-like components, for example “logging in must take less than 5 seconds” or “no more than 10 minutes of outage time in a given month”.
  • Error Budget. This wasn’t explored particularly in the show, but it is the “acceptable” level of failure that the SLO allows. Crossing that threshold should trigger the engagement of the SRE.
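
To make the relationship between these terms a bit more concrete, here is a minimal Python sketch. It wasn’t discussed in the show, and the function names, the request counts and the 99.9% target are all assumptions made for illustration. It treats the SLI as the fraction of successful requests in a window, the SLO as a target for that fraction, and the error budget as the number of failures the SLO still allows.

```python
def availability_sli(total_requests: int, failed_requests: int) -> float:
    """SLI: the fraction of requests that succeeded in the window."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing failed
    return (total_requests - failed_requests) / total_requests


def slo_met(sli: float, slo_target: float) -> bool:
    """SLO: the target the SLI is expected to meet, e.g. 0.999."""
    return sli >= slo_target


def error_budget_remaining(
    total_requests: int, failed_requests: int, slo_target: float
) -> float:
    """Error budget: failures the SLO allows, minus failures already seen.

    A negative result means the budget has been overspent.
    """
    allowed_failures = total_requests * (1.0 - slo_target)
    return allowed_failures - failed_requests
```

With 10,000 requests and a 99.9% target, roughly 10 failures are allowed, so 4 failures leaves about 6 in the budget and 20 failures overspends it.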

Next, we go into how Reggie started his podcast with Steph. We talk about how the podcast developed and how they keep their momentum in tech. This turns into a wider conversation about working in IT.

Reggie talks about how Kubernetes works, and how this has changed his workflow. We mention “Pets versus Cattle”, Microservices and Containers.

Reggie talks about how he learned about Kubernetes, and things he feels you need to understand about Kubernetes to be able to use it well. We mention that it’s worth learning about how Docker works (as a Container primitive), and then growing out to using Kubernetes. We mention that all the major cloud providers (AWS, Azure, Google) have Kubernetes platforms, that you can host Kubernetes in your hosting environment, and that you can also run MiniKube to learn Kubernetes on a small number of machines.

Reggie suggests that the Velocity Conference was very worthwhile getting to!

Reggie goes into more detail on what being an SRE is about, and talks about why Google and other large companies are moving towards using the SRE roles.

Reggie talks about bringing more diversity into tech, and that nerds are frequently very harsh about excluding people based on their choices and preferences. He also endorses bringing new people into your environments, and mentions that these can be good opportunities to examine why you do things and to ask if how they’re done is the right way to do them.

Reggie mentions that he puts videos on Instagram about tech basics, and encourages people to let him know when there’s something they don’t understand!

Admin Admin Podcast #078 Show notes – Unrolling OggCamp 2019

For this week’s episode we are sitting in a hotel lobby discussing OggCamp 19, with special guest Gary Williams. Special thanks to Joe Ressington for standing in with his recording gear to record the podcast.

Al gave a talk, “How I use wireguard to connect to my VPS”, with a live demo that did not work thanks to the demo gods, but he got it working after the event. More info can be found here.

We all agree this was the best talk at OggCamp: “The power of change – learning to live as a ‘weirdo’” by Rachel Morgan-Trimmer.

The OggCamp kids’ track continues to grow.

Al, Jerry and Gary mention the talk “The MQTT, InfluxDB, NodeRED and Grafana stack, and natural intelligence” by Julian Todd and his @wheeliepad.

Al and Gary have a go at lock-picking.

Gary talks to us about how he migrated from being a SysAdmin to a DevOps engineer.

Jon talks about the “Noobs on Ubs (Ubuntu for Beginners)” talk by Anna Dodson.

Admin Admin Podcast #077 Show notes – The one about monitoring

We introduce our guest – Lucy McGrother.

Lucy is a colleague of Jon’s, who worked in Windows Support, Enterprise Management and now SOAR (Security Orchestration, Automation and Response).

Jon explains what SOAR is, and Lucy improves his answer.

We introduce the question of Monitoring, as raised by our Telegram group.

Lucy explains that you need to start by asking “What do you want to monitor?”, and the answer shouldn’t be “everything”. We also talk about how you can respond to monitoring events. Lucy makes a sensible point: “When you get an alarm from a monitor, it’s just telling you there’s something wrong to be looking at, and it’s up to you to add the intelligence to it”.

We discuss what enterprise monitoring tools we’ve used, including SCOM (System Center Operations Manager – a Microsoft product, part of the System Center suite) and CA OIM (previously known as “NSM”, “TNG”, “NISM”). We also mention some open source tools, like Zabbix, Nagios, Monit and Grafana, and a free/paid product, PRTG.

There’s also a conversation about how you can monitor processes running on a machine to reduce the amount of “noise”. Jon mentions writing content to a log file and capturing the output, but that won’t capture all the updates. Lucy mentions you can just monitor whether a log file has been touched in X hours!
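
Lucy’s suggestion of checking whether a log file has been touched in X hours could look something like this minimal Python sketch. The function name and parameters are made up for illustration, and it assumes that the file’s modification time is a good enough signal that the process is still writing to it.

```python
import os
import time
from typing import Optional


def log_is_stale(path: str, max_age_hours: float, now: Optional[float] = None) -> bool:
    """Return True if the file at `path` has not been modified recently.

    A missing or unreadable file is treated as stale, since that is
    usually worth an alert too.
    """
    current_time = time.time() if now is None else now
    try:
        last_modified = os.path.getmtime(path)
    except OSError:
        return True
    return (current_time - last_modified) > max_age_hours * 3600
```

The optional `now` argument is only there to make the check easy to test without waiting for real hours to pass.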

Jerry talks about Nagios monitoring plugins, and how they would report issues using error codes.
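
For anyone who hasn’t seen one, Nagios plugins conventionally report their state through the process exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN, plus a one-line message on standard output. Here is a minimal Python sketch of a disk usage check along those lines. The function names, the message format and the 80%/90% thresholds are assumptions for illustration, not anything Jerry described.

```python
import shutil
from typing import Tuple

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(percent_used: float, warn: float, crit: float) -> Tuple[int, str]:
    """Map a usage percentage onto an exit code and a status line."""
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def check_disk(path: str, warn: float = 80.0, crit: float = 90.0) -> Tuple[int, str]:
    """Check how full the filesystem holding `path` is."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as error:
        # If the check itself cannot run, report UNKNOWN rather than guessing.
        return UNKNOWN, f"DISK UNKNOWN - {error}"
    if usage.total == 0:
        return UNKNOWN, "DISK UNKNOWN - filesystem reports zero size"
    return evaluate(usage.used / usage.total * 100, warn, crit)
```

A wrapper script would print the message and then call sys.exit() with the code, which is what the monitoring server actually reads.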

Al mentions the podcast “Self Hosted Show”.

Jerry talks about the difference between metrics and polling. Lucy mentions that she did a Microsoft Statistics and Analytics course, and that your polling tool should be feeding metrics data for later use.

Jon and Lucy draw on their past experience of dealing with incidents, and how difficult it is to pull logs from boxes, especially when there’s a need to resume service as soon as possible. We also discuss the difficulty of constantly transferring logs to other devices. Examples include carrier grade equipment that might be processing many gigabytes per second, a proxy for a large company that might produce tens of thousands of log files every 24 hours, and cloud providers that charge for egress traffic. There’s also the case where someone malicious inside your network is trying to hide their actions, and spams the monitoring solution with valid or invalid log entries to frustrate investigators.

Jerry talks about how application developers he’s worked with frequently embed log collection features into their applications so that you have a known API point you can ask for the status of that application, and use that from your polling system.

Jon brings up a point made in the Telegram group from Stuart, who mentions that his workloads are frequently ephemeral, and that he really needs something that handles service discovery, like Prometheus and Consul.

Jon went on a Wireshark Webinar which he’d strongly endorse people watch (he’s waiting on approval to post the link), and ideally get training from the creator of the course!

Jon is also reading “Analogue Network Security” by Winn Schwartau.

Jerry mentions a weekly podcast, “The Pod Delusion”, which has restarted. Jon mentions “The Coolest Nerds In The Room” podcast. Al talks about the “Lost Connections” audio book and connected podcast – “Uncovering the Real Causes of Depression with Johann Hari”. Lucy mentions the school in Salford which is teaching all its pupils BSL (British Sign Language) to ensure that deaf students at the school are included.

We thank Dave Lee for his continuing work in fixing up our audio. Jerry non-ironically mentions that he hopes our audio will be better this episode. Dave has advised us that he laughed extensively when he heard this.

