Zero Downtime in 90 Days: A FinTech Deployment Transformation
I helped a FinTech client replace a 4-hour maintenance window with zero-downtime deployments in 3 months.
The Problem with Scheduled Maintenance
Many companies in this sector still rely on scheduled downtime, usually on a Sunday. Every release means engineers on call, customers seeing maintenance pages, and the risk of further revenue loss if the rollout doesn't go to plan.
For a 24/7 financial platform, this isn't acceptable long term. The causes behind the maintenance window are nearly always the same:
- Deployments are tightly coupled, so everything has to ship at once.
- There's no proper rollback strategy in place. The only option is usually to re-deploy the old version manually.
- Database updates are coupled to application deployments.
- Deployments happen less frequently as confidence drops, which in turn makes each deployment riskier.
What We Changed
In reviewing their architecture, I introduced the following:
- A blue-green deployment pipeline (in AWS) that shifted traffic in under 60 seconds.
- Backward-compatible schema changes (expand/contract) that decoupled database changes from application deployments.
- Feature flags that allowed code to ship independently of releasing new features.
- Automated deployment tests that run before traffic cut-over.
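To make the expand/contract idea concrete, here is a minimal sketch of renaming a column without a coordinated release. It uses SQLite from Python's standard library purely for illustration. The table, the column names, and the v1/v2 helper functions are all hypothetical and are not taken from the client's system. A production database would use its own migration tooling.

```python
import sqlite3

# Hypothetical scenario: rename customers.name to customers.full_name
# without ever requiring the app and the schema to change in the same release.

def write_v1(conn, name):
    """Old app version: only knows about the original column."""
    conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))

def expand(conn):
    """Expand: add the new column as nullable so old writers keep working."""
    conn.execute("ALTER TABLE customers ADD COLUMN full_name TEXT")

def write_v2(conn, name):
    """New app version: dual-writes so old readers still see the data."""
    conn.execute(
        "INSERT INTO customers (name, full_name) VALUES (?, ?)", (name, name)
    )

def read_v2(conn):
    """New app version: prefers the new column, falls back to the old one."""
    rows = conn.execute(
        "SELECT COALESCE(full_name, name) FROM customers ORDER BY id"
    )
    return [r[0] for r in rows]

def backfill(conn):
    """Copy old values into the new column for rows written by v1."""
    conn.execute("UPDATE customers SET full_name = name WHERE full_name IS NULL")

def contract(conn):
    """Contract: drop the old column once no running version reads it.
    A table rebuild is used here so the sketch works on older SQLite builds."""
    conn.executescript(
        """
        CREATE TABLE customers_new (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
        INSERT INTO customers_new (id, full_name) SELECT id, full_name FROM customers;
        DROP TABLE customers;
        ALTER TABLE customers_new RENAME TO customers;
        """
    )

def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    write_v1(conn, "Ada")      # before any migration
    expand(conn)               # schema change ships on its own
    write_v1(conn, "Grace")    # old app still works against the expanded schema
    write_v2(conn, "Linus")    # new app rolls out later, at its own pace
    during = read_v2(conn)     # both versions coexist at this point
    backfill(conn)
    contract(conn)             # only after the old app version is retired
    after = [r[0] for r in conn.execute("SELECT full_name FROM customers ORDER BY id")]
    conn.close()
    return during, after

if __name__ == "__main__":
    print(run_demo())
```

The point of the ordering is that every intermediate schema is compatible with whichever application versions are running at that moment, so neither the migration nor the application release needs a maintenance window.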
The Results
After 3 months we had achieved zero-downtime releases. That's not to say we didn't have deployment failures. We did. But none of them took the platform offline: every deployment, successful or not, completed with zero downtime.
Deployment frequency went from once a week, on a Sunday, to multiple times a day. Rollback time dropped from a 4-hour full re-deployment to 45 seconds.
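The reason a blue-green rollback can be this fast is that the previous version is still running on the idle environment, so rolling back is a traffic switch rather than a re-deploy. The sketch below is a toy in-memory model of that idea, with the cut-over gated by a smoke test. The class, its method names, and the version strings are hypothetical. In AWS the switch itself would be a load balancer or DNS change, which this model does not attempt to reproduce.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

@dataclass
class BlueGreenRouter:
    """Toy model: two environments behind a single traffic switch."""
    environments: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"blue": None, "green": None}
    )
    live: str = "blue"

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def serving(self) -> Optional[str]:
        """Version currently receiving customer traffic."""
        return self.environments[self.live]

    def deploy(self, version: str, smoke_test: Callable[[str], bool]) -> bool:
        """Deploy to the idle side, test it, and only then switch traffic."""
        target = self.idle
        self.environments[target] = version   # live traffic is untouched
        if not smoke_test(version):
            self.environments[target] = None  # tear down the failed release
            return False                      # a failed deployment, but no downtime
        self.live = target                    # the cut-over is just a pointer flip
        return True

    def rollback(self) -> None:
        """Switch traffic back to the previous version, if it is still running."""
        if self.environments[self.idle] is None:
            raise RuntimeError("no previous version available on the idle side")
        self.live = self.idle

if __name__ == "__main__":
    router = BlueGreenRouter()
    router.environments["blue"] = "v1"
    router.deploy("v2", smoke_test=lambda v: True)
    print(router.serving())
    router.rollback()
    print(router.serving())
```

A failed smoke test leaves the live environment exactly as it was, which is how deployments can fail without customers ever seeing it.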
Infrequent Deployments Don't Reduce Risk
If there is one takeaway from the above, it's this:
Infrequent deployments don't reduce risk. They increase it. The longer you wait between releases, the more changes pile up, and the harder it is to isolate what broke when things do go wrong (and they will).
Zero-downtime deployments are not only an infrastructure win. They're a culture shift that lets your teams move faster without fear.