In this episode we talk about Terraform and Azure AKS
Show Notes: https://www.adminadminpodcast.co.uk/ep090sn/
Al is using remote state in Azure using Azure Blob Storage.
John mentions Terraformer, a tool that imports existing infrastructure into the tfstate file.
Jerry mentions using “terraform import” to import Azure Resources.
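For example, bringing an existing Azure resource group under Terraform management might look like this (the subscription ID and resource names are placeholders):

```shell
# First declare a matching resource block in your .tf files, e.g.:
#   resource "azurerm_resource_group" "example" {}
# Then import the real resource into state by its Azure resource ID:
terraform import azurerm_resource_group.example \
  /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/my-rg
```

After the import, `terraform plan` shows any drift between the real resource and your configuration.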
Al asks about output.tf.
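For context, an output.tf file declares values Terraform should print (or expose to other configurations) after an apply; a minimal sketch for an AKS cluster might be (the resource name "example" is illustrative):

```hcl
# Assumes an azurerm_kubernetes_cluster resource named "example" exists elsewhere.
output "kube_config" {
  value     = azurerm_kubernetes_cluster.example.kube_config_raw
  sensitive = true # keep cluster credentials out of plain console output
}
```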
John mentions his blog post.
Al mentions the YouTube channel he’s been following for tutorials about Terraform.
Jerry mentions that CentOS 6 reached EOL on November 30th 2020, and that Ubuntu 16.04 will reach EOL on April 30th 2021.
Al mentions the Naming vs Tagging blog post.
This is a predictions show. To avoid spoiling the predictions, the show notes are just links to terms and articles mentioned in the show. The rules are inspired by the Bad Voltage accumulated prediction rules revealed in episode 2×62. We mention that in the most recent predictions review show (episode 3×19) the haggling over fractions of a point was unbelievable. It’s amazing 🙂
This had the effect of some of the predictions being walked back…
So, with that, on to the terms of note:
We want to remind our listeners that we have a Telegram channel and email address if you want to contact the hosts. We also have Patreon, if you’re interested in supporting the show. Details can all be found on our Contact Us page.
The Admin Admin guys host their first predictions show
Show Notes: https://www.adminadminpodcast.co.uk/ep088sn/
Reggie is a Site Reliability Engineer (SRE). SRE is a term coined at Google and popularised by their 2016 SRE book. SREs will often perform operations roles, similar to those performed by “DevOps” or Operations teams, but are also responsible for reliability: monitoring the health of a service, an application or a node, and reacting to issues with a longer-term view on solving them.
Reggie went into how he moved into an SRE role, and went into some details on the platforms he’s used in the past, including AWS, Azure and Google Cloud.
Reggie mentions the following terms:
Next, we go into how Reggie started his podcast with Steph. We talk about how the podcast developed and how they keep their momentum in tech. This turns into a wider conversation about working in IT.
Reggie talks about how he learned about Kubernetes, and things he feels you need to understand about Kubernetes to be able to use it well. We mention that it’s worth learning about how Docker works (as a Container primitive), and then growing out to using Kubernetes. We mention that all the major cloud providers (AWS, Azure, Google) have Kubernetes platforms, that you can host Kubernetes in your hosting environment, and that you can also run MiniKube to learn Kubernetes on a small number of machines.
Reggie suggests that the Velocity Conference was well worth getting to!
Reggie goes into more detail on what being an SRE is about, and talks about why Google and other large companies are moving towards using the SRE roles.
Reggie talks about bringing more diversity into tech, and that nerds are frequently very harsh about excluding people based on their choices and preferences. He also endorses bringing new people into your environments, and mentions that these can be good opportunities to examine why you do things and to ask if how they’re done is the right way to do them.
Reggie mentions that he puts videos on Instagram about tech basics, and encourages people to let him know when there’s something they don’t understand!
Wrapping up, we thank our Patreons, Dave for being our superproducer, and invite you to chat with our audience on Telegram, or directly to the team by email, especially asking any questions you want the podcast to answer!
For this week’s episode we are sitting in a hotel lobby discussing OggCamp 19, with special guest Gary Williams. Special thanks to Joe Ressington for standing in with his recording gear to record the podcast.
We all agree the best talk at OggCamp was “The Power of Change – Learning to Live as a ‘Weirdo’” by Rachel Morgan-Trimmer.
The OggCamp kids’ track continues to grow.
Al, Jerry and Gary mention the talk “The MQTT, InfluxDB, NodeRED and Grafana stack, and natural intelligence” by Julian Todd and his @wheeliepad.
Al and Gary have a go at lock-picking.
Gary talks to us about how he migrated from being a sysadmin to a DevOps engineer.
In this show we discuss OggCamp 2019, and we have Gary on the show to talk about changing jobs from sysadmin to DevOps engineer.
Show Notes: https://www.adminadminpodcast.co.uk/ep078sn/
We introduce our guest – Lucy McGrother.
Lucy is a colleague of Jon’s, who worked in Windows Support, Enterprise Management and now SOAR (Security Orchestration, Automation and Response).
Jon explains what SOAR is, and Lucy improves his answer.
We introduce the question of Monitoring, as raised by our Telegram group.
Lucy explains that you need to start by asking “What do you want to monitor”, and the answer shouldn’t be “everything”. We also talk about how you can respond to monitoring events. Lucy makes a sensible point “When you get an alarm from a monitor, it’s just telling you there’s something wrong to be looking at, and it’s up to you to add the intelligence to it”.
We discuss the enterprise monitoring tools we’ve used, including SCOM (System Center Operations Manager – a Microsoft product, part of the System Center suite) and CA OIM (previously known as “NSM”, “TNG”, “NISM”). We also mention some open source tools, like Zabbix, Nagios, Monit and Grafana, and the free/paid product PRTG.
There’s also a conversation about how you can monitor processes running on a machine to reduce the amount of “noise”. Jon mentions writing content to a log file and capturing the output, but that won’t capture all the updates; Lucy mentions you can just monitor whether a log file has been touched in the last X hours!
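Lucy’s “has the log been touched recently” check is easy to script; a minimal sketch (the path and the four-hour threshold are illustrative):

```shell
#!/bin/sh
# Warn if the log file has not been modified in the last 240 minutes.
# find prints the path only if the file was modified within the window.
LOG=/var/log/myapp.log
if [ -z "$(find "$LOG" -mmin -240 2>/dev/null)" ]; then
  echo "WARNING: $LOG has not been updated in the last 4 hours"
fi
```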
Jerry talks about Nagios monitoring plugins, and how they would report issues using error codes.
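Nagios plugins signal status purely through their exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) plus one line of output; a minimal sketch of such a check, using root filesystem usage with illustrative thresholds:

```shell
#!/bin/sh
# Minimal Nagios-style plugin: warn/critical on root filesystem usage.
usage=$(df -P / | awk 'NR==2 { gsub(/%/, ""); print $5 }')
if [ -z "$usage" ]; then
  echo "UNKNOWN - could not read disk usage"; exit 3
elif [ "$usage" -ge 95 ]; then
  echo "CRITICAL - / is ${usage}% full"; exit 2
elif [ "$usage" -ge 85 ]; then
  echo "WARNING - / is ${usage}% full"; exit 1
else
  echo "OK - / is ${usage}% full"; exit 0
fi
```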
Al mentions the podcast “Self Hosted Show“.
Jerry talks about the difference between metrics and polling. Lucy mentions that she did a Microsoft Statistics and Analytics course, and that your polling tool should be feeding metrics data for later use.
Jon and Lucy draw on their pasts about dealing with incidents, and about how difficult it is to pull logs from boxes, especially when there’s a need to resume service as soon as possible. We also discuss the difficulty of constantly transferring logs to other devices: carrier-grade equipment might be processing many gigabytes per second; a proxy for a large company might produce tens of thousands of log files per 24 hours; cloud providers charge for egress traffic; and someone malicious inside your network trying to hide their actions might spam the monitoring solution with valid or invalid log entries to frustrate investigators.
Jerry talks about how application developers he’s worked with frequently embed log collection features into their applications so that you have a known API point you can ask for the status of that application, and use that from your polling system.
Jon brings up a point made in the Telegram group from Stuart, who mentions that his workloads are frequently ephemeral, and that he really needs something that handles service discovery, like Prometheus and Consul.
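A sketch of that pattern in Prometheus configuration, using Consul for service discovery (the Consul server address and job name here are assumptions):

```yaml
# prometheus.yml fragment: scrape whatever services register in Consul,
# so ephemeral workloads are picked up and dropped automatically.
scrape_configs:
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: 'localhost:8500'
    relabel_configs:
      - source_labels: [__meta_consul_service]
        target_label: job
```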
Jon went on a Wireshark Webinar which he’d strongly endorse people watch (he’s waiting on approval to post the link), and ideally get training from the creator of the course!
Jerry mentions a weekly podcast “The Pod Delusion” which has restarted. Jon mentions “The Coolest Nerds In The Room” podcast. Al talks about the “Lost Connections” audio book and connected podcast – “Uncovering the Real Causes of Depression with Johann Hari“. Lucy mentions the school in Salford who are teaching all their pupils BSL (British Sign Language) to ensure that deaf students at the school are included.
We thank Dave Lee for his continuing work in fixing up our audio. Jerry non-ironically mentions that he hopes our audio will be better this episode. Dave has advised us that he laughed extensively when he heard this.
Dave is also one of our Patreons – if you also want to be a Patreon, please follow this link: https://www.patreon.com/adminadminpodcast.
In this episode, we go through your questions and feedback. Keep it coming! For example via our Telegram group
– Meaty, a sysadmin in education
First a touch of background to add some context: I work as a team lead & sysadmin (+ “hack” of all trades) in education on a fairly large Windows network. Low budget, high demand, and besides some legal stuff and, contrary to what all the teachers and admin staff believe, no overly urgent requirements (no intellectual property, no critical systems, no four-9’s uptime requirements, but we do have lots of personal and sensitive data). We have an old, mostly unchanging network but due to the nature of teaching, many departments change up their location and/or software (which is often cheap, poorly made and has incredibly specific requirements) on a termly or yearly basis. Lots of “last minute this is urgent do it now” stuff, and even more projects where we’re not consulted and have to hack together solutions at the 11th hour after the majority of work has been done without anyone communicating with us.
We’re small enough that we don’t have much available extra capacity people or resource-wise, but complex enough to have a couple dozen servers (mostly VMs) running on old hardware and nearly 100 switches across a dozen buildings on four campuses, on top of other random infrastructure that is becoming digitised, such as boilers, CCTV and access control. Small team, too, so time is tight. No overtime and no out-of-hours work (9-5 only), which is nice, but it causes problems as we have no maintenance windows to make changes!
Q1: In order to make our lives easier I’m beginning to embrace more automation. We’ve got the big stuff out of the way, but to proceed we’re looking into using lots of custom PowerShell scripts for a lot of this, given the random requirements and poor quality of our software. We’ve run into a small issue but I’m not sure what the best practice and most practical solution is. We often need to run scripts overnight. So far we’ve run them off a random server that also does other things during the day (hosts a few end user applications), but we know there’s a better way. What is it? Dedicated server? Does something exist that’ll manage this for us instead of using Task Scheduler on a 2016 box?
Q2: We deal with a lot of sensitive data across a lot of systems involving many different types of person – students, staff, parents, visitors, governors, contractors, etc. We know that if an incident/breach occurs and we need to investigate, we’ll be on the phone to an expensive third party to come in and investigate for us, as we just don’t know what to look for or where to find it. We need some kind of centralised logging, which we can deploy in time. For now, though, what are the essentials to enable and where can we find them? (e.g. logging in AD)
Jerry suggests Ansible for Windows: it speaks WinRM and runs PowerShell scripts on the node. Jon suggests Ansible Tower/AWX, an Ansible job scheduler and credential store. He also suggests keeping those PowerShell scripts and Ansible code in version control, e.g. with GitLab. Advantages include the ability to run config management from a single place – a “single pane of glass”.
He warns that running GitLab and AWX on one machine can be resource-heavy. Jon refers to his Vagrant machine for GitLab and AWX.
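A minimal sketch of the Ansible-for-Windows approach (the host group, connection settings and script path are illustrative):

```yaml
# Playbook: run an existing PowerShell script on Windows hosts over WinRM.
- hosts: windows
  vars:
    ansible_connection: winrm
    ansible_winrm_transport: ntlm
  tasks:
    - name: Run nightly maintenance script
      ansible.windows.win_shell: C:\Scripts\nightly-maintenance.ps1
```

Scheduled from AWX/Tower, this replaces Task Scheduler on a random server with a central, credential-managed job runner.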
Al reckons that on the Windows side, SCCM is good and in-depth but expensive. He notes that charities and educational institutions can get it cheaper.
On Windows, the auditing service is something that can be enabled on the domain controller. It logs events like user logons; searching can be a challenge due to the amount of data created.
Al mentions that you can enable these with some scripts.
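One scriptable way to do this on a domain controller is the built-in auditpol tool; the subcategories below are a common starting point, not an exhaustive list:

```powershell
# Run in an elevated prompt on the domain controller.
auditpol /set /subcategory:"Logon" /success:enable /failure:enable
auditpol /set /subcategory:"Account Lockout" /success:enable /failure:enable
auditpol /set /subcategory:"User Account Management" /success:enable /failure:enable
```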
Jerry mentions that good versioned backups help with ransomware attacks.
Make your servers disposable (cattle vs. pets)
– Andy, deploying Windows Desktops
“Is there an affordable way to image Windows desktops that is less insanely complex than Microsoft’s deployment thing?”
I’ve already had a few suggestions here on Telegram but perhaps other listeners face the same challenge.
– Stuart wonders what to do in the case of a significant outage at a cloud provider
Suppose AWS/GCP/Azure fall off the face of the planet overnight, and you are now faced with either choosing smaller providers (with probably a much smaller feature set) or moving back to on-prem.
In that situation, what would you choose?
If the former, how would you deal with the limitations? Would you mix and match workloads across multiple providers or would you stick with one or two and work with the limitations?
If the latter, would your workflow and choice of infrastructure change based upon how you work with the cloud now? Would you steer more towards hyperconverged and/or private cloud in a box solutions, or would it be VMware/KVM/Hyper-V with config management, or just revert to how it was pre-cloud days?
I suppose in a sense it’s a question partly about reliance on the big clouds, but also about how you think on-prem has improved (if at all) to keep up with the cloud providers.
Jon thinks losing all the big cloud providers is pretty unlikely; Jerry thinks that if that happened, we would have bigger problems.
Do we count DigitalOcean? They don’t have things like autoscaling and key management, but it should be possible to build these yourself and use smaller providers. If the big three disappeared, smaller providers might rush to fill that space. Jon points out that there isn’t really a framework for running Functions-as-a-Service (e.g. AWS Lambda).
Jerry says that a Lambda function is essentially just a container, so the gap can be closed if you have an easy way to get containers up and running.
Jerry mentions he has been working with on-prem for most of the last year. In that environment it’s still worth thinking in terms of cloud workflows to inform the on-prem work. The other thing is that on-prem environments can be made easier to manage by using the tooling that has grown up around managing infra on cloud providers.
Jon mentions VMware.
– VMware NSX-T can run in AWS (and others, including bare metal)
Jerry mentions oVirt.
Al is still 50/50 between running on-prem stuff and running stuff in the cloud. He doesn’t think on-prem is going anywhere 🙂 He would also be using modern tooling to get things done.
Thank you for your podcast.
In episode 075, you asked about tools to check whether a web page had
changed. You might like to try Silas Brown’s WebCheck program:
http://ssb22.user.srcf.net/setup/webcheck.html [Note: we were contacted by the author of this app to note that the URL had changed. This link is now the accurate one.]
Just wanted to say thanks for a fantastic episode 75.
I gotta be honest, a lot of what you guys talk about goes over my head as I’ve never used Selenium, Terraform, Ansible, etc… but I still enjoy listening because I can often pick up some utter gems.
I’d heard much talk about SyncThing on t’interwebs, but it wasn’t until I heard about it on this episode and actually looked into it more that I realised how powerful it actually is. I’m currently using it to perform a one-way backup of key folders on my phone and tablet to my laptop. But I also have a two-way sync (kinda like a Dropbox or NextCloud shared folder) in place so that I can transfer files to my phone seamlessly.
Having heard about Al’s experiences of spinning up a NextCloud instance on a $5 Digital Ocean droplet, I decided to do the same as a test… and ended up shifting over to it permanently. All I had to do was spin up the droplet, snap install nextcloud, enter some information, run a single command to apply a Let’s Encrypt certificate, and that was it. 5 minutes, tops. And moving all my stuff between instances was really straight forward too. So thanks for the confidence to make the move, Al!
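For reference, the whole process is roughly this (the admin username, password and use of a fresh Ubuntu host are assumptions):

```shell
# On a fresh Ubuntu droplet, as root:
snap install nextcloud
nextcloud.manual-install admin 'a-strong-password'  # create the admin account
nextcloud.enable-https lets-encrypt                 # needs a DNS name pointing at the host
```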
At the moment, I have 3 VPSes (costing over £36/month) that I could quite easily replace with a number of DO droplets. A $5 droplet, with backup, plus VAT is just under £6, so I could theoretically spin up six $5 droplets (or fewer if I spin up a $10 one, which I might do for some of the smaller services I’m running). I don’t think I’ll need that many, though, which will save me money in the long run – win!
Again, thanks for a great episode, and congratulations on the audio quality… you should give your producer a pay rise #JustSaying
As gathered from the Iron Sysadmin Slack:
XenoPhage (Jason) [12:59 AM]
Hey @JonTheNiceGuy … Was listening to AdminAdmin 75 .. (Yeah, I’m behind a bit) .. Tell Al to take a look at webinject.pl .. Works great with monitoring systems like nagios/icinga2/etc. for monitoring versions of software.. I’ve used it for years to let me know when updates come out for things i can’t just add a yum repo for. :slightly_smiling_face:
Al seems to have dropped off the recording!
Jon is involved with the lug.org.uk infrastructure, where they have the following problems:
Jerry’s instinct is to decouple services; Jon is interested in using Docker or something similar.
Docker has a way to glue the networking of individual containers together; more complex deployments would probably require something like Kubernetes, which is much more complicated.
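The simplest form of that glue is a user-defined bridge network, on which containers can reach each other by name (the network, container and image names here are illustrative):

```shell
docker network create appnet
docker run -d --name db  --network appnet postgres:15
docker run -d --name web --network appnet -p 8080:80 nginx
# "web" can now resolve and reach "db" by container name on appnet.
```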
Any suggestions from listeners?
Al is back!
Thanks Dave! 🙂 We agree to a pay rise on-air.
Welcome to new listeners! Give us feedback…
Sadly, we’ve no Al this time, it’s just Jon and Jerry.
Want to join the community talking about this podcast on Telegram? Join us!
In this episode, we talk about: