I decided not to stay on the monthly IT department catch-up call this afternoon. Luckily, I had plenty to keep me occupied a few minutes later.
> "Hey, does anyone know why the site's down?"
< "It's not down", "Ooh, it's slow, what's that about"
....
Long story short, somebody updated a security group config so one of the apps couldn't reach its cache. A few years ago it would have been a firewall, or iptables, or ipchains; now that we're in the cloud, it's a security group.
Before we could get around to identifying and fixing that, we had a bigger problem: users started to see server errors instead of just slow responses and timeouts.
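A blocked path like this shows up quickly if something regularly tries to open a TCP connection from the app host to the cache. Here is a minimal sketch of such a probe; the function name `can_connect` and the host and port in the usage comment are made up for illustration, not taken from the real setup.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures and
        # timeouts. Traffic dropped by a firewall-style rule usually shows
        # up as a timeout rather than a refusal.
        return False


# Hypothetical usage, run from the app host:
#   can_connect("cache.internal.example", 6379)
```

It only proves the port is reachable, not that the cache is healthy, but reachability is exactly what a bad rule change takes away.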
Second long story short: there was an expired SSL certificate on another back-end service.
This is 2019, and we're still making the same types of mistakes I used to see in the 1990s.
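Certificate expiry is one of the easier failures to see coming, because the expiry date is right there in the certificate. A rough sketch using Python's standard `ssl` module follows; the function names, the host name and the 14-day threshold are assumptions for illustration only.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Days from `now` to a certificate's notAfter time (negative once expired).

    `not_after` is the string format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT". `now` should be a timezone-aware datetime.
    """
    now = now or datetime.now(timezone.utc)
    expires = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch
    return (expires - now.timestamp()) / 86400


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect to host:port over TLS and return the certificate's notAfter string."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


# Hypothetical usage, e.g. from a daily cron job:
#   if days_until_expiry(fetch_not_after("backend.internal.example")) < 14:
#       print("certificate expires soon")
```

With the default context, a certificate that has already expired fails verification during the handshake and raises an `ssl.SSLError` subclass, so the check is mainly useful as an early warning before that happens.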