In this episode, we go through your questions and feedback. Keep it coming, for example via our Telegram group.
The first question is from Meaty:
– Meaty, a sysadmin in education
First a touch of background to add some context: I work as a team lead & sysadmin (+ “hack” of all trades) in education on a fairly large Windows network. Low budget, high demand, and besides some legal stuff and, contrary to what all the teachers and admin staff believe, no overly urgent requirements (no intellectual property, no critical systems, no four-9’s uptime requirements, but we do have lots of personal and sensitive data). We have an old, mostly unchanging network but due to the nature of teaching, many departments change up their location and/or software (which is often cheap, poorly made and has incredibly specific requirements) on a termly or yearly basis. Lots of “last minute this is urgent do it now” stuff, and even more projects where we’re not consulted and have to hack together solutions at the 11th hour after the majority of work has been done without anyone communicating with us.
We’re small enough that we don’t have much available extra capacity people or resource-wise, but complex enough to have a couple dozen servers (mostly VMs) running on old hardware and nearly 100 switches across a dozen buildings on four campuses, on top of other random infrastructure that is becoming digitised, such as boilers, cctv, access control. Small team, too, so time is tight. No overtime and no out-of-hours work (9-5 only) which is nice, but causes problems as we have no maintenance windows to make changes!
q1: in order to make our lives easier I’m beginning to embrace more automation. We’ve got the big stuff out of the way but to proceed we’re looking into using lots of custom powershell scripts for a lot of this given the random requirements and poor quality of our software. We’ve run into a small issue but I’m not sure what the best practice and most practical solution is. We often need to run scripts over night. So far we’ve run them off a random server that also does other things during the day (hosts a few end user applications) but we know there’s a better way. What is it? Dedicated server? Does something exist that’ll manage this for us instead of using task scheduler on a 2016 box?
q2: We deal with a lot of sensitive data across a lot of systems involving many different types of person – students, staff, parents, visitors, governors, contractors, etc. We know that if an incident/breach occurs and we need to investigate, we’ll be on the phone to an expensive third party to come in and investigate for us as we just don’t know what to look for or where to find it. We need some kind of centralised logging, which we can deploy in time. For now, though, what are the essentials to enable and where can we find them? (eg: logging in AD)
Running scripts on machines
Jerry suggests Ansible for Windows: it talks to the node over WinRM and runs PowerShell scripts there. Jon suggests Ansible Tower/AWX, which is an Ansible job scheduler and a credential store. He also suggests keeping those PowerShell scripts and Ansible code in version control, e.g. with GitLab. Advantages include the ability to run config management from a single place – a “single pane of glass”.
He warns that running GitLab and AWX on a machine can be resource heavy. Jon refers to his Vagrant machine for GitLab and AWX.
Al reckons that on the Windows side, SCCM is good and in-depth but expensive. He notes that charities and educational institutions can get it cheaper.
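As a rough illustration of what a scheduler such as AWX takes off your hands (run each job, capture its output, record whether it failed), here is a minimal Python sketch. It is not Ansible or AWX. The names `run_job`, `run_all` and `JobResult` are hypothetical, and the commented PowerShell command line is only an assumption about how a script might be launched on a Windows host.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class JobResult:
    name: str
    returncode: int
    output: str


def run_job(name, command, timeout=3600):
    """Run one command and capture its exit code and combined output."""
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # Use -1 to mark a job that was stopped for running too long.
        return JobResult(name, -1, f"timed out after {timeout}s")
    return JobResult(name, proc.returncode, proc.stdout + proc.stderr)


def run_all(jobs):
    """Run every job in order; a non-zero exit code does not stop the rest."""
    return [run_job(name, command) for name, command in jobs.items()]


if __name__ == "__main__":
    jobs = {
        # Stand-in job so the sketch runs anywhere. On a Windows host this
        # might instead be (hypothetical path):
        # ["powershell.exe", "-NoProfile", "-File", r"C:\scripts\cleanup.ps1"]
        "hello": [sys.executable, "-c", "print('nightly job ran')"],
    }
    for result in run_all(jobs):
        status = "ok" if result.returncode == 0 else "FAILED"
        print(f"{result.name}: {status}")
```

Keeping the job list in a file under version control, as Jon suggests for the scripts themselves, means a change to what runs overnight is reviewed like any other change.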
Centralised logging/data security
On Windows, the Auditing Service is something that can be enabled on the Domain Controller. It logs events such as user logons; searching can be a challenge because of the amount of data created.
Al mentions that you can enable these with some scripts.
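To make the searching problem a little more concrete, here is a minimal Python sketch that narrows an exported list of Security log events down to a handful of well-known event IDs and flags accounts with repeated failed logons. It assumes the events have already been exported as a list of dictionaries; the field names `EventID` and `Account` and the threshold of 3 are assumptions for illustration, not a fixed export format.

```python
from collections import Counter

# A few commonly watched Windows Security log event IDs.
EVENTS_OF_INTEREST = {
    4624: "successful logon",
    4625: "failed logon",
    4720: "user account created",
    4728: "member added to a security-enabled global group",
    4740: "user account locked out",
    1102: "audit log cleared",
}


def summarise(events):
    """Count events of interest by ID, ignoring everything else."""
    return Counter(
        e["EventID"] for e in events if e["EventID"] in EVENTS_OF_INTEREST
    )


def failed_logons_by_account(events, threshold=3):
    """Return accounts with at least `threshold` failed logons (ID 4625)."""
    counts = Counter(e["Account"] for e in events if e["EventID"] == 4625)
    return {account: n for account, n in counts.items() if n >= threshold}
```

Feeding it a day's worth of exported events gives a short summary to read instead of the raw log.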
Jerry mentions that good versioned backups help with ransomware attacks.
Make your servers disposable (cattle vs. pets)
Encryption at rest
- BitLocker (Windows)
- LUKS (Linux)
- VeraCrypt (cross-platform, but beware that there’s no VeraCrypt device driver for Win10 install environments, which can cause an issue with quarterly Win10 upgrades)
Next question is from Andy
– Andy, deploying Windows Desktops
“Is there an affordable way to image Windows desktops that is less insanely complex than Microsoft’s deployment thing?”
I’ve already had a few suggestions here on Telegram but perhaps other listeners face the same challenge.
- MDT with SCCM on top
- Licensing requires a Volume License Key to image a Windows machine, although it’s technically possible to do it without one
- MDT builds a “golden image”, which then gets pushed to the server
- Initial setup is a big effort, but it makes life easier once it’s done.
- Sysprep resets the machine’s SID to make sure the image can be put on different machines
- PXE (Legacy & UEFI)
Our last question comes from Stuart
– Stuart, wondering what to do in the case of a significant outage at a cloud provider
AWS/GCP/Azure fall off the face of the planet overnight, and you are now faced with either choosing smaller providers (with probably a much smaller feature set) or moving back to on-prem
In that situation, what would you choose?
If the former, how would you deal with the limitations? Would you mix and match workloads across multiple providers or would you stick with one or two and work with the limitations?
If the latter, would your workflow and choice of infrastructure change based upon how you work with the cloud now? Would you steer more towards hyperconverged and/or private cloud in a box solutions, or would it be VMware/KVM/Hyper-V with config management, or just revert to how it was pre-cloud days?
I suppose in a sense it’s a question partly about reliance on the big clouds, but also how do you think on prem has improved (if at all) to keep up with the cloud providers
Jon thinks losing all the big cloud providers is pretty unlikely; Jerry thinks that if it did happen, we would have bigger problems.
Do we count DigitalOcean? They don’t have things like autoscaling and key management, but it should be possible to build these yourself and use smaller providers. If the big 3 disappeared, smaller providers might rush to fill that space. Jon points out that there isn’t really a framework for running Functions-as-a-Service (e.g. AWS Lambda).
Jerry says that a Lambda function is just a container, so it is manageable if you have an easy way to get those up and running.
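One way to read Jerry’s point: a “function” can be an ordinary handler wrapped in a small HTTP server, and that wrapper is what you would package into a container. The Python sketch below uses only the standard library; `handler`, `invoke`, `serve` and port 8080 are hypothetical names and values chosen for illustration, and this is not a description of how AWS Lambda itself is implemented.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handler(event):
    """The 'function': takes an event dict and returns a result dict."""
    name = event.get("name", "world")
    return {"message": f"hello {name}"}


def invoke(body):
    """Decode a JSON request body, call the handler, encode the result."""
    event = json.loads(body or b"{}")
    return json.dumps(handler(event)).encode()


class FunctionServer(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes the client said it sent.
        length = int(self.headers.get("Content-Length", 0))
        result = invoke(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(result)))
        self.end_headers()
        self.wfile.write(result)


def serve(port=8080):
    """Blocking entry point; this is what the container would run."""
    HTTPServer(("0.0.0.0", port), FunctionServer).serve_forever()
```

Calling `serve()` as the container’s start command and POSTing JSON to it covers the invocation side; scaling and scheduling would still need something else on top.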
Jerry mentions he has been working with on-prem for most of the last year. In that environment it’s still worth thinking in terms of cloud workflows to inform the on-prem work. The other thing is that on-prem environments can be made easier to manage by using the tooling that has grown up around managing infra on cloud providers.
Jon mentions VMware.
– VMware NSX-T can run in AWS (and others, including bare metal)
Jerry mentions oVirt.
Al is still 50/50 between running on-prem stuff and running stuff in the cloud. He doesn’t think on-prem is going anywhere 🙂 He would also be using modern tooling to get things done.
We got some feedback from David:
Thank you for your podcast.
In episode 075, you asked about tools to check whether a web page had
changed. You might like to try Silas Brown’s WebCheck program:
We also got feedback from Producer Dave:
Just wanted to say thanks for a fantastic episode 75.
I gotta be honest, a lot of what you guys talk about goes over my head as I’ve never used Selenium, Terraform, Ansible, etc… but I still enjoy listening because I can often pick up some utter gems.
I’d heard much talk about SyncThing on t’interwebs, but it wasn’t until I heard about it on this episode and actually looked into it more that I realised how powerful it actually is. I’m currently using it to perform a one-way backup key folders on my phone and tablet to my laptop. But I also have a two-way sync (kinda like a Dropbox or NextCloud shared folder) in place so that I can transfer files to my phone seamlessly.
Having heard about Al’s experiences of spinning up a NextCloud instance on a $5 Digital Ocean droplet, I decided to do the same as a test… and ended up shifting over to it permanently. All I had to do was spin up the droplet, snap install nextcloud, enter some information, run a single command to apply a Let’s Encrypt certificate, and that was it. 5 minutes, tops. And moving all my stuff between instances was really straight forward too. So thanks for the confidence to make the move, Al!
At the moment, I have 3 VPSes (costing over £36/month) that I could quite easily replace with a number of DO droplets. A $5 droplet, with backup, plus VAT is just under £6, so I could theoretically spin up to 6 $5 droplets (or fewer if I spin a $10 one up, which I might do for some of the smaller services I’m running), but I don’t think I’ll need that many, which will save me money in the long run – win!
Again, thanks for a great episode, and congratulations on the audio quality… you should give your producer a pay rise #JustSaying
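Dave’s droplet sums can be checked with a couple of lines of Python. The figures are rounded from his message (a current spend of £36 a month, where he says “over £36”, and £6 per $5 droplet with backup and VAT, where he says “just under £6”), so the real numbers are slightly more favourable. The function names are hypothetical.

```python
def max_droplets(budget, cost_each):
    """How many droplets fit inside the current monthly spend."""
    return int(budget // cost_each)


def monthly_saving(budget, cost_each, count):
    """Money left over each month after paying for `count` droplets."""
    return budget - cost_each * count


print(max_droplets(36, 6))       # 6
print(monthly_saving(36, 6, 3))  # 18
```

So six droplets is the break-even point, and every droplet he does not need is roughly £6 a month saved.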
Lastly, we got feedback from Jason:
As gathered from the Iron Sysadmin Slack:
XenoPhage (Jason) [12:59 AM]
Hey @JonTheNiceGuy … Was listening to AdminAdmin 75 .. (Yeah, I’m behind a bit) .. Tell Al to take a look at webinject.pl .. Works great with monitoring systems like nagios/icinga2/etc. for monitoring versions of software.. I’ve used it for years to let me know when updates come out for things i can’t just add a yum repo for. 🙂
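For anyone wondering what “checking whether a web page has changed” boils down to, here is a minimal Python sketch of the general idea: fingerprint the page and compare it with the fingerprint from the previous check. This is not how WebCheck or webinject.pl work internally; the function names are hypothetical and the approach (a SHA-256 hash of the raw page) is an assumption chosen for simplicity.

```python
import hashlib
import urllib.request
from typing import Optional, Tuple


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest of the page content."""
    return hashlib.sha256(content).hexdigest()


def fetch(url: str) -> bytes:
    """Download the raw bytes of a page."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def check(content: bytes, previous: Optional[str]) -> Tuple[bool, str]:
    """Compare content with the previous fingerprint.

    Returns (changed, current_fingerprint). The first check, where there
    is no previous fingerprint, is never reported as a change.
    """
    current = fingerprint(content)
    changed = previous is not None and current != previous
    return changed, current
```

In use, you would call `check(fetch(url), stored_fingerprint)` from a scheduled job and store the returned fingerprint for next time. Pages with timestamps or adverts change on every load, so a real check usually hashes only the part of the page you care about.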
Al seems to have dropped off the recording!
Consolidating services chat:
Jon is involved with the lug.org.uk infrastructure, where they have the following problems:
- x86 build – becoming unsupported by modern OSes
- Too many machines – looking for a way to reduce the number of physical machines.
Jerry’s instinct is to decouple services; Jon is interested in using Docker or something similar.
Docker has a way to glue the networking of individual containers together. More complex deployments would probably require e.g. Kubernetes – which is much more complicated.
Any suggestions from listeners?
Al is back!
Thanks Dave! 🙂 We agree to a pay rise on air.
- OggCamp – We’re all going – see you there? 🙂
Welcome to new listeners! Give us feedback…