Blue-Green Deployments: How and What
A blue-green deployment is a software delivery technique that’s been around for a little while. It’s meant to help you minimize the risks of downtime and/or errors associated with deploying new code to production. It works in tandem with feature flags, our favorite topic here.
Let’s dive in.
What is blue-green deployment?
The idea of a blue-green deployment is as follows: At any given point in time, there are two instances of your production environment running.
One instance runs your software and receives all your traffic, and the other is sitting idle, ready to be used to test and validate your next versions. This setup aims to give you a safety net if things go wrong.
The live deployment is called the “blue deployment,” and the idle instance is called the “green deployment.”
Overview: the blue-green deployment process
Once the two environments are set up, a blue-green deployment starts by deploying new code to the idle green instance. Then a load balancer directs a portion of traffic to the green environment, where the new version is tested thoroughly. If everything looks good, you gradually shift more traffic from blue to green.
If all goes well, you eventually move all your traffic to the green environment and can retire the old blue environment. Your clients then see the new version with no downtime during the switch from the blue instance running your old code to the green instance running your new code.
If you find a problem, you can quickly switch back to the blue environment while you work on a fix. Once the fixed version has been tested in the green environment and proven stable, it can be deployed to the blue environment and the process can be repeated for future releases.
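To make the traffic shift and the rollback concrete, here is a minimal Python sketch. `BlueGreenRouter`, `rollout`, the step sizes, and the `is_healthy` check are hypothetical names chosen for illustration. In a real setup, the weights would live in your load balancer's configuration rather than in application code.

```python
import random


class BlueGreenRouter:
    """Toy load balancer that splits traffic between two environments."""

    def __init__(self, rng=None):
        self.green_weight = 0.0  # fraction of requests sent to green
        self.rng = rng or random.Random()

    def route(self):
        # Send a request to green with probability green_weight.
        return "green" if self.rng.random() < self.green_weight else "blue"

    def shift(self, green_weight):
        if not 0.0 <= green_weight <= 1.0:
            raise ValueError("green_weight must be between 0 and 1")
        self.green_weight = green_weight

    def rollback(self):
        # Rolling back is only a routing change: blue never stopped running.
        self.green_weight = 0.0


def rollout(router, is_healthy, steps=(0.1, 0.5, 1.0)):
    """Shift traffic step by step; roll back on the first failed health check."""
    for weight in steps:
        router.shift(weight)
        if not is_healthy():
            router.rollback()
            return False
    return True
```

Calling `rollout(router, is_healthy)` with a check that always passes ends with all traffic on green. A single failed check sends everything back to blue.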
The advantages of blue-green deployment
As we’ve just determined, blue-green deployments are great for testing new versions in a real production environment, and they let you quickly roll back if something goes wrong. That ability to quickly roll back to a known good state is a big safety net. But there are other advantages.
Some teams use it as part of a disaster recovery strategy because it gives you a production-ready standby environment.
Some teams make use of the idle green deployment as part of their load testing strategy.
You can copy all production traffic and send it to the green environment (also known as traffic shadowing). As green receives the copied traffic, you can observe how it handles the load, while its responses are not sent back to your actual users. The blue environment continues to serve all real user requests. This is where some teams would historically use a “staging” environment. That leads to an interesting observation: with blue-green deployments, you may not need a staging environment anymore.
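As a rough illustration of traffic shadowing, the Python sketch below serves each request from blue and mirrors a copy to green. The function name `handle_with_shadow`, the handler callables, and the `shadow_log` list are assumptions made for this example. Real mirroring usually happens in the load balancer or proxy, and asynchronously, so green cannot slow down user requests.

```python
def handle_with_shadow(request, blue_handler, green_handler, shadow_log):
    """Serve the request from blue and mirror a copy to green."""
    response = blue_handler(request)  # the only response users ever see

    try:
        # Green gets its own copy of the request; its response is only recorded.
        shadow_response = green_handler(dict(request))
        shadow_log.append(
            {"request": request, "blue": response, "green": shadow_response}
        )
    except Exception as exc:
        # A failure in green must never affect real users.
        shadow_log.append(
            {"request": request, "blue": response, "green_error": repr(exc)}
        )

    return response
```

Comparing the `blue` and `green` entries in the log afterwards is one way to spot behavior differences before any real user is routed to green.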
The challenges with blue-green deployment
Like everything in software, there are no solutions, only trade-offs. Here are some of the challenges with this approach.
- Replicating a production environment can get complicated, especially when working with microservices.
- There can be costs associated with maintaining a copy of your production environment.
- Changes to databases. First, most applications can only have one version of a database at a time, which makes two environments with two separate databases challenging. There are a couple of options if you want to use blue-green deployments, but neither is straightforward. You could synchronize database changes after they’ve been committed to either the blue or the green environment, essentially running batch updates to keep everything aligned. Or you could put a multi-master database behind both environments.
Schema changes are also a challenge. If your database schema has to change between deployments, a good practice to follow is the expand/contract pattern. Here’s how it works:
- Expand: Add new schema elements (e.g., columns, tables) without removing old ones.
- Migrate: Gradually move data and switch application code to use the new schema.
- Contract: Remove old, unused schema elements once the transition is complete.
This approach supports both old and new code versions during the transition, making it compatible with blue-green deployments and allowing for safe rollbacks if needed. We have more tips to help you deal with this in our documentation.
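Here is a small sketch of the three phases using Python's built-in `sqlite3` module. The `users` table and the split of `name` into `first_name` and `last_name` are made up for illustration. The contract step rebuilds the table so that it also works on older SQLite versions, and your database will likely have its own preferred way to drop a column.

```python
import sqlite3


def expand(conn):
    # Add the new columns; old code that reads `name` keeps working.
    conn.execute("ALTER TABLE users ADD COLUMN first_name TEXT")
    conn.execute("ALTER TABLE users ADD COLUMN last_name TEXT")


def migrate(conn):
    # Backfill the new columns from the old one for rows not yet migrated.
    rows = conn.execute(
        "SELECT id, name FROM users "
        "WHERE first_name IS NULL AND name IS NOT NULL"
    ).fetchall()
    for user_id, name in rows:
        first, _, last = name.partition(" ")
        conn.execute(
            "UPDATE users SET first_name = ?, last_name = ? WHERE id = ?",
            (first, last, user_id),
        )


def contract(conn):
    # Remove the old column once no deployed code reads it anymore.
    conn.execute(
        "CREATE TABLE users_new "
        "(id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
    )
    conn.execute(
        "INSERT INTO users_new SELECT id, first_name, last_name FROM users"
    )
    conn.execute("DROP TABLE users")
    conn.execute("ALTER TABLE users_new RENAME TO users")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('Ada Lovelace')")
expand(conn)    # safe while blue (old code) is still live
migrate(conn)   # both old and new columns are populated
contract(conn)  # only after blue no longer needs `name`
```

The point of the ordering is that between `expand` and `contract`, both the old and the new code can read the data they expect, so switching traffic back to blue stays safe.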
Using feature flags instead of blue-green deployment
A blue-green deployment lets you test a new version of your code on a subset of users. That sounds a lot like a feature flag. So it raises the question: when should I use a blue-green deployment, and when should I use a feature flag?
Here are some scenarios where using feature flags makes more sense than using blue-green deployments. Like everything else in software, it’s about trade-offs, but feature flags make your life a lot easier when you need to deal with the following:
- Fine-grained control: When you need to enable or disable specific features at a very granular level, down to individual users, subsets of users or requests.
- Long-term toggles: When operational features such as log levels, rate limits, and kill switches need to be switchable over extended periods.
- Complex rollback scenarios: In cases where rolling back involves more than just switching traffic, like reversing migrations.
- Complex testing: For complex A/B or multivariate testing that requires you to maintain multiple variants of a feature at the same time.
- Frequent production updates: When you’re releasing small changes multiple times a day, and want to control their release without having to copy your production environment.
- Restricting features by geographies: It’s significantly easier to enable features only for users in specific countries or regions with feature flags. It’s possible at the load balancer level, but feature flags make this process more transparent to other people in your company.
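Several of these scenarios come down to evaluating a flag per user. The Python sketch below shows one possible shape for that check, covering a kill switch, per-user targeting, country restrictions, and a percentage rollout. The flag dictionary layout and the names `bucket` and `is_enabled` are assumptions for illustration, not the API of any particular feature flag service.

```python
import hashlib


def bucket(flag_name, user_id):
    """Map a user to a stable bucket in [0, 100) for percentage rollouts."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) % 100


def is_enabled(flag, user):
    """Decide whether a flag is on for one user."""
    if not flag.get("enabled", False):
        return False  # kill switch: off for everyone
    if user["id"] in flag.get("allow_users", ()):
        return True  # fine-grained targeting of individual users
    countries = flag.get("countries")
    if countries is not None and user.get("country") not in countries:
        return False  # geographic restriction
    # Gradual rollout: the same user always lands in the same bucket.
    return bucket(flag["name"], user["id"]) < flag.get("rollout_percent", 100)


checkout_flag = {
    "name": "new-checkout",
    "enabled": True,
    "allow_users": {"qa-1"},
    "countries": {"NO", "SE"},
    "rollout_percent": 100,
}
```

With this flag, `is_enabled(checkout_flag, {"id": "u1", "country": "NO"})` is true, the same user in another country gets the old behavior, and setting `enabled` to `False` turns the feature off for everyone without a deployment.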
Using feature flags with blue-green deployment
Feature flags can also be used as an enhancement to blue-green deployments.
The idea is to connect your feature management service to both environments. You can then enable or disable feature flags in either environment.
That lets you gradually activate and test new features in the green environment before the full traffic switch. If there are problems, you can quickly disable specific flags without rolling back the entire deployment.
This combined approach lets you:
- Maintain the safety net of blue-green deployments
- Get the flexibility of feature flags
- Reduce the risk associated with releasing multiple changes simultaneously
- Perform more granular testing in production with real users
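As a sketch of the combined approach, the Python example below keeps a separate flag state for each environment in one shared service. `FlagService` is an in-memory stand-in invented for this example, and the flag name `new-checkout` is hypothetical. A real feature management service would expose its own SDK.

```python
class FlagService:
    """In-memory stand-in for a feature management service shared by both environments."""

    def __init__(self):
        self._state = {}  # (environment, flag_name) -> bool

    def set(self, environment, flag_name, enabled):
        self._state[(environment, flag_name)] = enabled

    def is_enabled(self, environment, flag_name):
        # Unknown flags default to off, so new code paths stay dark.
        return self._state.get((environment, flag_name), False)


def render_checkout(flags, environment):
    """The same code is deployed everywhere; the flag decides which path runs."""
    if flags.is_enabled(environment, "new-checkout"):
        return "new checkout"
    return "old checkout"


flags = FlagService()
flags.set("green", "new-checkout", True)   # activate the feature in green only
green_view = render_checkout(flags, "green")
blue_view = render_checkout(flags, "blue")

flags.set("green", "new-checkout", False)  # problem found: disable the flag,
after_disable = render_checkout(flags, "green")  # green itself stays deployed
```

Disabling the flag brings green back to the old behavior without switching traffic back to blue, which is the difference between turning off one feature and rolling back a whole deployment.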
Conclusion
With or without feature flags, blue-green deployments help you release while maintaining production stability. They’re not a silver bullet, and they come with trade-offs, but they’re one more tool in your toolbelt for deploying code to users.