In this post, we will reflect on the practice of using feature toggles. As more and more development organizations are moving towards continuous delivery, we see them using feature toggles more and more. It is, once again, not a best practice, but one that comes with trade-offs. We are not implying feature toggles are bad, but we urge you to be aware of the trade-offs and risks involved, and to take a critical look at how you are using them.
Feature toggles defined
A feature toggle is a separate mechanism for giving access to a feature, e.g. enabling it in a specific environment, or for specific users or organizations. Usually it works through a configuration flag, sometimes via a special administrator UI where you can switch a feature on or off in a specific environment, such as test, acceptance or production.
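To make the definition concrete, here is a minimal sketch of a configuration-based toggle check. The `Toggle` class, the `TOGGLES` table, the `new-checkout` feature and the user names are all hypothetical; a real setup would likely load this configuration from a file or an admin UI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Toggle:
    """One feature toggle: where and for whom a feature is switched on."""
    environments: frozenset = frozenset()  # e.g. {"test", "acceptance"}
    users: frozenset = frozenset()         # users who get the feature everywhere


# Hypothetical configuration, hard-coded here only for illustration.
TOGGLES = {
    "new-checkout": Toggle(
        environments=frozenset({"test", "acceptance"}),
        users=frozenset({"alice"}),
    ),
}


def is_enabled(name: str, environment: str, user: str) -> bool:
    """Return True when the feature is on for this environment or this user."""
    toggle = TOGGLES.get(name)
    if toggle is None:
        return False  # unknown toggles default to off
    return environment in toggle.environments or user in toggle.users
```

In this sketch the feature is on for everyone in test and acceptance, and in production only for the explicitly listed user.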
Feature toggles facilitate continuous delivery
As developers we want to ship small pieces and get early feedback. Users do not always want this - “where did that button move?” - even when they are agile developers or UX people.
Feature toggles can help decouple delivery - moving changes to production - from release - enabling it for users - for example because what is meaningful for a user or customer comprises a bigger set of features. Another example is rolling out gradually.
Feature toggles enable you to deliver at will, removing the release to users or customers as a bottleneck. As a result of this decoupling, releasing something to specific users or stakeholders becomes a separate, independent business decision.
Feature toggles can help reduce the risk of ‘bad’ changes and fear of releasing. If you can reduce the impact of releasing a feature to just one or a few users, it becomes less risky. It even allows a kind of rollback when necessary - by switching off the feature toggle.
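One common way to limit a release to a few users is a percentage rollout. The sketch below is only an illustration under our own assumptions: the function names, the feature name and the choice of SHA-256 for bucketing are not prescribed by any particular tool.

```python
import hashlib


def bucket(feature: str, user_id: str) -> int:
    """Map a (feature, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_rolled_out(feature: str, user_id: str, rollout_percentage: int) -> bool:
    """Enable the feature for roughly `rollout_percentage` percent of users."""
    return bucket(feature, user_id) < rollout_percentage
```

Because the bucket is derived from the user id, a user who has the feature at 10% keeps it at 50%, and setting the percentage back to 0 acts as the 'rollback' for everyone.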
Feature toggles hinder continuous integration
Feature toggles come with trade-offs. They delay integration, so they tend to move you away from continuous integration. A risk of this is late feedback - you don’t catch issues fast, because they are hidden behind the toggle.
Creating automated tests for the system with and without the feature toggle enabled can help a lot. This does increase system complexity however, as it is more difficult to juggle two variations of the system in your mind.
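As a rough sketch of what testing both variations could look like, the example below runs the same test once per toggle state. The `checkout_total` function and its 10% discount are made up for illustration.

```python
import unittest


def checkout_total(prices, new_discount_enabled: bool):
    """Hypothetical feature: a 10% discount hidden behind a toggle."""
    total = sum(prices)
    if new_discount_enabled:
        return round(total * 0.9, 2)
    return total


class CheckoutTotalTest(unittest.TestCase):
    def test_both_toggle_states(self):
        # One expectation per toggle state, so neither variation goes untested.
        expected = {False: 100, True: 90.0}
        for enabled, want in expected.items():
            with self.subTest(toggle=enabled):
                self.assertEqual(checkout_total([60, 40], enabled), want)
```

Note how even this tiny example needs two expectations for one behaviour, which is exactly the extra mental load described above.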
Feature toggles introduce extra complexity in the code, either by adding extra conditionals and behaviour variations or by having extra routing rules and configuration logic.
Feature toggles that span more than one component sharply increase cognitive load. It is much harder to reason about two or more components and how their different configurations interact than it is to reason about one component.
Once teams start using feature toggles, sooner or later there will be many. Some feature toggle management is highly recommended, so that you know when to remove specific toggles and the corresponding code and tests.
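Toggle management does not have to be elaborate. A minimal sketch, assuming you keep a small registry with an owner and a removal date per toggle (the `ToggleRecord` fields and the example entries are hypothetical):

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ToggleRecord:
    name: str
    owner: str       # team or person responsible for removing the toggle
    remove_by: date  # agreed date by which toggle, code and tests should be gone


def stale_toggles(records, today: date):
    """Return the names of toggles that are past their removal date."""
    return [record.name for record in records if record.remove_by < today]


# Hypothetical registry entries.
REGISTRY = [
    ToggleRecord("new-checkout", "team-payments", date(2024, 3, 1)),
    ToggleRecord("dark-mode", "team-web", date(2024, 9, 1)),
]
```

A check like `stale_toggles(REGISTRY, date.today())` could run in the build and remind the owning team that a toggle is overdue for removal.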
Another nasty effect of multiple feature toggles occurs when they start interacting. You could get into a situation of a combinatorial explosion of toggle states. Effectively, you are creating a multitude of system configurations. Having automated tests for each individual configuration becomes much harder.
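The growth is easy to quantify: n independent on/off toggles give 2^n possible configurations. The small sketch below enumerates them; the function name and the toggle names are just examples.

```python
from itertools import product


def toggle_configurations(toggle_names):
    """Yield every on/off combination as a dict of toggle name -> state."""
    for states in product([False, True], repeat=len(toggle_names)):
        yield dict(zip(toggle_names, states))


# Three toggles already give 2**3 = 8 system configurations.
configs = list(toggle_configurations(["new-checkout", "dark-mode", "fast-search"]))
```

With ten toggles the same function would yield 1024 configurations, which shows why testing every combination quickly becomes impractical.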
Once you have a toggle mechanism in place, it becomes your hammer and everything starts looking like a nail - or a thumb. We encounter teams using toggles to switch off parts of the code that are not yet fully working or still a mess. Such toggles are driven by technical motives, so we tend to call these tech toggles.
Tech toggles can be useful in your journey towards continuous delivery, for example to get rid of long release cycles and code freezes.
Once you are able to deliver at will, introducing tech toggles is a slippery slope. You start moving away from having continuously integrated software, because you increase your batch size. Tech toggles hide deeper issues in your software development process. We recommend looking at your way of working and finding a way to deliver code that is tested and works first, and building the capability to quickly deploy and roll back changes, before resorting to tech toggles. We want to respond to change, not live in fear of additional ifs and buts we added.
Reduce the scope of a toggle: if a toggle changes behaviour in many places in the code, it becomes much harder to test both configurations of the system.
Prefer using your existing authorisation mechanism to enable/disable access to features. Having a single, well understood mechanism reduces the risk of mistakes. The feature toggle might temporarily need more fine grained permissions. Remember to refactor and clean up when the feature toggle related permissions are not needed any more.
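As an illustration, here is a sketch in which the 'toggle' is nothing more than a temporary permission in a role-based authorisation check. The roles, the permission names and the `ROLE_PERMISSIONS` table are hypothetical stand-ins for whatever authorisation system you already have.

```python
# Hypothetical role-to-permission mapping standing in for an existing system.
ROLE_PERMISSIONS = {
    "customer": {"orders:read"},
    # Temporary, finer-grained permission that acts as the feature toggle.
    "beta-tester": {"orders:read", "orders:export"},
}


def has_permission(roles, permission: str) -> bool:
    """Return True when any of the user's roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


def can_export_orders(roles) -> bool:
    """The new export feature is guarded by an ordinary permission check."""
    return has_permission(roles, "orders:export")
```

Releasing the feature to everyone then means granting `orders:export` to the regular role and removing the temporary `beta-tester` role, rather than deleting toggle code.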
If a toggle affects multiple components, preferably let one component take the lead while the rest follow, i.e. one component uses the toggle to show different behaviour, and the others react correctly to what this component returns. A back-end component for instance can use a feature toggle to expose or hide specific data. The front-end component shows what it gets from that back end and does not need to check the toggle.
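The back-end and front-end example can be sketched as follows. Both functions, the order fields and the tracking feature are hypothetical; the point is only that the toggle is checked in one place.

```python
def get_order(order: dict, tracking_enabled: bool) -> dict:
    """Back end: the only place that checks the toggle."""
    response = {"id": order["id"], "status": order["status"]}
    if tracking_enabled:
        response["tracking_url"] = order["tracking_url"]
    return response


def render_order(response: dict) -> str:
    """Front end: renders whatever fields are present, without a toggle check."""
    lines = [f"Order {response['id']}: {response['status']}"]
    if "tracking_url" in response:
        lines.append(f"Track: {response['tracking_url']}")
    return "\n".join(lines)
```

The front end only reacts to the presence of data, so it behaves correctly in both toggle states without knowing that a toggle exists.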
Feature toggles will only work well when one can separate all the functionality under the toggle. It is also easy to fool oneself into believing that this separation is complete.
Using feature toggles is a practice that facilitates continuous delivery, but it comes with trade-offs. We find it useful to take a critical stance and see feature toggles more as a symptom than a solution. This can help us find better ways of delivering continuously without postponing integration or getting stuck in complexity.
- Twitter thread by Pete Hodgson on Piranha: an open source tool to automatically delete stale code. There is also an academic paper on this topic. The Twitter thread contains a more detailed categorisation beyond ‘tech toggles’ and ‘other’.
- Categories of Feature Toggles on martinfowler.com
- LaunchDarkly, a service to manage flags at various states in a product's lifecycle, for various audiences.