To Kill Code
Economics is a great frame for understanding problems, but it can cloud many issues as well. The software development community has fought for years against the idea that IT should be viewed as a cost center. We’d like to see it as a value-creating investment. Yet, as much as we’d like to see it that way, we speak about code as a liability. Experienced developers know that if we can deliver value with less code, we’re better off.
Much of the discussion around Technical Debt has this same issue. We see code, particularly entropy-scarred code, as a liability, but we can’t get away from the fact that it produces value. Sometimes I feel that we might be better off adopting the attitude that the problem with code is that it has too much value. After a fixed investment, you can keep it in production for years, making money. It can stay in production far longer than any developer can remember how it works, and that’s where the problem starts. When you modify things you don’t understand, there’s a good chance you’ll introduce bugs - and that’s after overcoming the fear and stress of being uncertain about what you are doing.
Poorly understood code is the norm today. We’ve spent decades concentrating on ways to write better code, and ways to refactor it, but we can’t evade the fact that we keep writing more of it, and it doesn’t go away - unless we actively kill it.
What does it take to kill code?
It’s harder than it looks. First of all, we need to know that a particular area of code is dead. In software development, we use the term code coverage to describe the degree to which source code is covered by a test suite. What we need to do is the equivalent of running coverage in production to discover, by implication, the areas of code that are not exercised as the system runs. You can’t use this information to “prove” that an area of code is dead, but you can find areas of code that are executed only once or twice per week despite your system getting millions of requests. At that point, you have choices. You can decide whether the work that code does is really necessary. Maybe the system can be changed to transition users to a different code path in those cases. If it can, you can delete that code.
Concentrating on rare cases is powerful. First of all, they may be lower risk. If you choose to work on the code that supports those areas, by refactoring, rewriting, or eliminating those paths, you’re unlikely to bring the system to a screeching halt. You’re also choosing to address the problem of code that isn’t quite paying for itself. Code has very real carrying costs. If we weight every line of code in a system by its use, we can see that some code is overhead. Whatever we can do to eliminate it or make it less likely to take up our time is a win for us.
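One way to picture weighting code by its use is a small script that compares how often each area runs against how much code it carries. The sketch below is only an illustration: the Area record, the overhead_candidates function, the one-hit-per-million threshold, and the sample numbers are all hypothetical, and real hit counts would have to come from your own production measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    name: str
    lines: int  # lines of code in this area (hypothetical measure of carrying cost)
    hits: int   # executions observed in production over the sampling window


def overhead_candidates(areas, total_requests, max_hits_per_million=1.0):
    """Return areas that run rarely relative to traffic, largest first.

    A rare area is a candidate for review, not proof that the code is dead.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    rare = [
        area
        for area in areas
        if area.hits * 1_000_000 / total_requests <= max_hits_per_million
    ]
    # Bigger areas carry more cost, so list them first.
    return sorted(rare, key=lambda area: area.lines, reverse=True)


# Made-up sample data for one week of traffic.
sample_areas = [
    Area("checkout", lines=1200, hits=4_000_000),
    Area("legacy_export", lines=800, hits=2),
    Area("fax_invoice", lines=300, hits=0),
]
candidates = overhead_candidates(sample_areas, total_requests=5_000_000)
```

With this sample data, checkout is excluded because it runs constantly, while legacy_export and fax_invoice come back as candidates, with the larger area first. Sorting by size is just one choice; you might rank by recent change frequency or bug counts instead.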
Coverage in production doesn’t necessarily mean running a code coverage tool in production, although you can do that for some subset of your traffic if you don’t mind the execution costs. You can get much of the same information with some clever logging. Make sure you have logging in areas of code that you are suspicious of - areas that you don’t think run very often. Then, look for their absence from reports. The insights you gain can help you prune your system.
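As a minimal sketch of that logging idea, assuming a Python service, you could mark suspect functions with a decorator that counts calls, then report the marked functions that never showed up. The names suspect and never_hit, and the two example functions, are hypothetical; a real system would more likely send these counts to its existing logging or metrics pipeline than hold them in memory.

```python
import collections
import functools
import threading

_hits = collections.Counter()  # label -> number of calls observed
_registered = set()            # every label we are watching
_lock = threading.Lock()


def suspect(label):
    """Mark a function as possibly dead and count each call to it."""
    def decorate(func):
        _registered.add(label)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with _lock:
                _hits[label] += 1
            return func(*args, **kwargs)

        return wrapper
    return decorate


def never_hit():
    """Return the watched labels that have not been called at all."""
    with _lock:
        return sorted(label for label in _registered if _hits[label] == 0)


# Hypothetical suspect code paths.
@suspect("legacy_csv_export")
def legacy_csv_export(rows):
    return "\n".join(",".join(str(cell) for cell in row) for row in rows)


@suspect("fax_invoice")
def fax_invoice(invoice_id):
    return f"faxed {invoice_id}"


legacy_csv_export([(1, 2), (3, 4)])
unused = never_hit()
```

Here only fax_invoice is reported, because legacy_csv_export was called once. The important part is the registry of watched labels: absence can only be noticed if you know what you expected to see.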