One day, a change went live in our AWS environment that no one remembered making.
Security groups had shifted. Auto-scaling thresholds were off. What we had in Terraform did not match reality.
I built a drift detector while working at Apex Analytix, where even minor infrastructure changes could impact supplier portals and audit workflows. We needed early warnings, not postmortems.
The Challenge
AWS Config tells you what changed, eventually. But I needed visibility within minutes, not hours later and not after something broke. I wanted something that could:
- ✅ Compare live infra against baseline configs
- 🔔 Alert on changes immediately
- ♻️ Roll back if needed with a toggle
The Fix: Detect + Alert + Revert
I built a system using:
- 🧠 AWS Lambda + CloudWatch Events: trigger checks every 10 minutes
- 🗂️ Baseline stored as JSON snapshot in S3
- 📩 Drift alert email with change details
- ♻️ Optional auto-revert if critical resources are touched
GitHub: github.com/chinmaya-chhatre/configuration-drift-detector
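To make the detect step concrete, here is a minimal sketch of the comparison a scheduled Lambda could run. It is not taken from the repository: the snapshot shape, the resource names (`sg-web`, `asg-api`), and the functions `flatten`, `detect_drift`, and `format_alert` are all assumptions for illustration. Fetching the baseline from S3 and the live state from AWS is left out so the example runs with the standard library alone.

```python
import json
from typing import Any

MISSING = "<missing>"  # placeholder for a key present on only one side


def flatten(config: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into dotted paths so values compare one by one."""
    flat: dict[str, Any] = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def detect_drift(baseline: dict[str, Any], live: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one entry per setting whose live value differs from the baseline."""
    base_flat, live_flat = flatten(baseline), flatten(live)
    drift = []
    for path in sorted(base_flat.keys() | live_flat.keys()):
        expected = base_flat.get(path, MISSING)
        actual = live_flat.get(path, MISSING)
        if expected != actual:
            drift.append({"path": path, "expected": expected, "actual": actual})
    return drift


def format_alert(drift: list[dict[str, Any]]) -> str:
    """Build a plain-text body for the drift alert email."""
    return "\n".join(
        f"{d['path']}: expected {d['expected']!r}, found {d['actual']!r}" for d in drift
    )


if __name__ == "__main__":
    # Hypothetical baseline, in the JSON form the S3 snapshot might take.
    baseline = json.loads(
        '{"sg-web": {"ingress_ports": [443]}, "asg-api": {"max_size": 6, "cpu_target": 60}}'
    )
    # Hypothetical live state, as a describe call might be normalized.
    live = {"sg-web": {"ingress_ports": [443, 22]}, "asg-api": {"max_size": 6, "cpu_target": 85}}
    print(format_alert(detect_drift(baseline, live)))
```

Flattening to dotted paths keeps the alert readable: each line of the email names one setting with its expected and observed value, which is the "change details" a reviewer needs.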
What Changed
- 🔒 Fewer surprise changes in prod
- 📉 Mean Time to Detect config issues dropped by 60%
- 💬 Helped SRE and Security stay in sync with dev infra updates
Config drift is invisible until it is not. Detecting it is just as critical as preventing it.
Tradeoffs I Made
- 📸 Used JSON snapshot in S3 instead of live Terraform state: faster, easier to trigger via Lambda
- 🤝 Approval-based auto-revert keeps humans in control
- 🔧 Manual baseline updates add ops overhead but enforce conscious versioning
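The approval gate can be sketched as a small decision function. This is an illustration under stated assumptions, not the repository's logic: the `CRITICAL_PREFIXES` tuple, the `plan_action` function, and the three action names are hypothetical, and resources are assumed to be identified by a prefix such as `sg-`.

```python
# Hypothetical: resource-name prefixes treated as critical.
CRITICAL_PREFIXES = ("sg-", "iam-")


def plan_action(drift_path: str, auto_revert_enabled: bool, approved: bool) -> str:
    """Decide what to do with one drifted setting.

    drift_path is a dotted path such as "sg-web.ingress_ports"; the first
    segment is taken to be the resource name.
    """
    resource = drift_path.split(".", 1)[0]
    is_critical = resource.startswith(CRITICAL_PREFIXES)
    if not is_critical or not auto_revert_enabled:
        return "alert"  # notify only, change nothing
    # Critical resource with the toggle on: revert only after a human approves.
    return "revert" if approved else "await_approval"


if __name__ == "__main__":
    for path, enabled, ok in [
        ("sg-web.ingress_ports", True, False),
        ("sg-web.ingress_ports", True, True),
        ("asg-api.cpu_target", True, True),
    ]:
        print(path, "->", plan_action(path, enabled, ok))
```

Keeping the decision separate from the code that actually reverts a resource means the policy can be tested without touching AWS, and a revert never runs without both the toggle and an approval.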
What I Would Add Next
- 📊 Dashboard to visualize drift over time
- 🔐 Role-based control over what is allowed to drift
- 🏷️ Auto-tagging for anything outside baseline