After working with many different software projects (has it been thirty years?), I have found one programming defect almost incurable: unused generality. Someone tried to anticipate future needs and implemented extra abstractions and infrastructure in advance. That does not sound so bad, does it? Yet it can be fatal. Anticipating how code might evolve is very different from committing to a particular evolution.
There are an infinite number of ways to make code hard to maintain. You can use eccentric algorithms, strange idioms, and fragile frameworks. You can hide dependencies and side-effects as a trap for the unwary. Let us assume, however, that you use standard algorithms, style guides, and design patterns. Why does unused generality make it unusually difficult for someone else to modify the code?
An easily modified program solves the problem it needs to solve and no more. If so, a newcomer can quickly appreciate the problem and the solution. A newcomer can see how to make changes without spoiling the existing design.
Computer science sometimes treats programs as a form of mathematical proof, with a machine to confirm the logic. I find this idea a powerful analogy as well. Imagine that you are required to extend someone else's theorem and proof. What would make your job harder than necessary?
A mathematical proof must begin with a clear statement of the problem to be solved. For a program, test code is that statement. Martin Fowler argues convincingly in "Refactoring" that test code is essential before code can evolve safely. Only test code can tell you what the program was intended to do. Only test code can tell when you have broken something.
An unfinished, undebugged computer program is as bad as an unfinished mathematical proof. You have inadequate tests and no clear definition of the problem to be solved. The original authors understood what needed to be done, but you do not. They had not yet discovered where they were on the wrong track. Completing a defective line of reasoning can be much more difficult than starting from scratch. Under what conditions, and with what data, was the program intended to work? I have yet to see a case where abandoned code was easier to finish, without the original authors, than a total rewrite. (See The_annoying_TODO_tag.html.)
Think of a mathematical proof that detours with irrelevant arguments and derivations of unused results. You must understand the entire proof before you can safely ignore steps that did not really matter. Until then, you will be afraid that you are missing something.
Unused code creates enormous delays in code maintenance. When you read someone else's program for the first time, you assume every class, every method, and every line has a reason. Before you can move on, you must understand the problem being solved. When you encounter unused code, you work hard to try to find a reason for its existence. You cannot easily prove a negative. Without understanding the whole, you may have simply overlooked an indirect reason. Without adequate test code, you cannot safely remove the unnecessary logic and see if anything breaks. You are likely to work around the mystery, by trying not to change the behavior in any way, even if that behavior might be worthless. I call this undead code, like a vampire that drains your life force.
Think of a mathematical proof that seems to address several general problems, but reaches a narrow conclusion. Unused generality is a combination of unfinished code and unused code, with the problems of both. Yet this defect is often defended with pride.
The original author tried to anticipate future upgrades and added extra layers of abstraction, extra methods, extra parameters, and extra branches to the logic. A newcomer assumes that the additional complexity solves an existing problem. Like other dead code, the complexity is unused and necessarily untested. Like unfinished code, there is a good chance it would not work anyway, without a major rewrite. It is hard to perceive what problems the original author intended to solve. As undead code, the behavior must be preserved when new enhancements are added. Unused generality actually makes future generality harder to add.
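A small sketch may make the cost concrete. The function names, parameters, and data below are all hypothetical, invented only for illustration. The first function carries a plugin hook, a delimiter option, and a second format that no caller has ever requested; the second function solves only the problem that exists today.

```python
from typing import List


def export_rows_general(rows, fmt="csv", delimiter=",", plugin=None) -> str:
    """Over-general version: callers only ever ask for fmt="csv" with ","."""
    if plugin is not None:
        # Never exercised: no plugin exists anywhere in this sketch.
        return plugin.render(rows)
    if fmt == "csv":
        return "\n".join(delimiter.join(row) for row in rows)
    if fmt == "json":
        # Never exercised and never finished.
        raise NotImplementedError("json export was planned but not written")
    raise ValueError(f"unknown format: {fmt}")


def export_rows(rows: List[List[str]]) -> str:
    """Solves the one problem that actually exists today."""
    return "\n".join(",".join(row) for row in rows)


rows = [["id", "name"], ["1", "Ada"]]
# Both versions agree on the only path that is actually used.
same = export_rows_general(rows) == export_rows(rows)
```

A newcomer reading the first version must ask what a plugin is, who passes one, and whether the unfinished branch matters. None of those questions arise with the second version.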
If you know you will need more generality right away, then add it. Just make sure that the extra complexity is actually used and tested now. Tests should not pass unless the new generality is honored. Otherwise, you are writing unfinished code that you may be forced to abandon. For example, if you abstract a service with an interface or class factory, then provide more than one implementation. Otherwise, others may discover your abstraction needs to be rewritten anyway.
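Here is one way that advice might look in practice. This is a minimal sketch under my own assumptions: the KeyValueStore interface, both implementations, and the contract check are hypothetical names, not taken from any real library. The point is only that the abstraction has two working implementations, and that the same test runs against both of them.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class KeyValueStore(Protocol):
    """Hypothetical service abstraction, for illustration only."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class DictStore:
    """First implementation: a plain dictionary."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._items[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._items.get(key)


class LogStore:
    """Second implementation: an append-only log where the newest entry wins."""

    def __init__(self) -> None:
        self._log: List[Tuple[str, str]] = []

    def put(self, key: str, value: str) -> None:
        self._log.append((key, value))

    def get(self, key: str) -> Optional[str]:
        for logged_key, value in reversed(self._log):
            if logged_key == key:
                return value
        return None


def check_store_contract(store: KeyValueStore) -> bool:
    """The same expectations are applied to every implementation."""
    store.put("a", "1")
    store.put("a", "2")
    return store.get("a") == "2" and store.get("missing") is None


# Every implementation must be listed here, so the generality is exercised now.
ALL_STORES = [DictStore, LogStore]
results = [check_store_contract(factory()) for factory in ALL_STORES]
```

If the second implementation cannot be written without changing the interface, you have learned that the abstraction was wrong while it was still cheap to fix.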
Ask what is the least amount of abstraction necessary to solve a particular problem. Abstraction is too valuable to waste.
There are other common forms of unmaintainable code. The most familiar is spaghetti code, or entangled code. Let us reconsider exactly why.
Imagine a proof that does not distinguish one step from another. Arguments are dropped and resumed later, perhaps with different terminology. You are not told when an earlier or later result is used. A machine could follow the logic, but not a person with limited short-term memory. Rarely can you grasp the whole of a mathematical proof at once. You can check only that each step in an argument is sound, and that no steps are missing.
Why does entangled code appear even when programmers know better? I think it is because they urgently need to fix a problem or add an enhancement, and they just do not have time to grasp enough of the program to do it correctly. So they code around all the features that they do not understand, particularly any undead code.
Scoping allows others to understand each part of your program without an exhaustive search for side-effects and dependencies. A user should never be surprised that a change to X changes the state and behavior of Y, when neither is visibly responsible for the other. You prefer to understand X without Y, Y without Z, and so on.
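As a rough illustration, compare hidden shared state with state that is visibly owned. Every name below is hypothetical. In the first half, a call to one function silently changes the behavior of another; in the second half, the formatter carries its own setting and nothing else can reach it.

```python
from dataclasses import dataclass

# Hidden shared state: nothing in the function signatures reveals it.
_settings = {"precision": 2}


def set_precision_hidden(digits: int) -> None:
    _settings["precision"] = digits


def format_price_hidden(amount: float) -> str:
    return f"{amount:.{_settings['precision']}f}"


@dataclass(frozen=True)
class PriceFormatter:
    """Scoped alternative: the precision belongs visibly to this object."""

    precision: int = 2

    def format(self, amount: float) -> str:
        return f"{amount:.{self.precision}f}"


before = format_price_hidden(3.14159)
set_precision_hidden(4)  # a change to X ...
after = format_price_hidden(3.14159)  # ... quietly changes Y
scoped = PriceFormatter(precision=2).format(3.14159)
```

To understand the hidden version, you must search the whole program for callers of set_precision_hidden. To understand the scoped version, you read one class.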
Unused generality often introduces extra unused dependencies and services.
Avoid specifying rules on how your code should be used. Instead enforce such logic in your code. Instead of requiring a client to call your methods in a certain order, have your class call the client's methods in the correct order. Something is wrong if you must provide a cut-and-paste template of how to use your code. Instead provide an interface that a client can implement. Each method can be understood separately, without anticipating how all are used together.
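The following sketch shows one possible shape for this, assuming a hypothetical Job interface with three steps that must happen in order. The names are mine, chosen for illustration. The client implements the interface; the run_job function owns the ordering, so there is no usage template to cut and paste.

```python
from typing import List, Protocol


class Job(Protocol):
    """Hypothetical client interface: each method stands alone."""

    def prepare(self) -> str: ...

    def execute(self) -> str: ...

    def cleanup(self) -> str: ...


def run_job(job: Job) -> List[str]:
    """Calls the client's methods in the required order, cleanup included."""
    log = [job.prepare()]
    try:
        log.append(job.execute())
    finally:
        # Cleanup runs even if execute raises; the client cannot forget it.
        log.append(job.cleanup())
    return log


class CopyFileJob:
    """A client supplies only the steps, not the sequencing."""

    def prepare(self) -> str:
        return "opened"

    def execute(self) -> str:
        return "copied"

    def cleanup(self) -> str:
        return "closed"


steps = run_job(CopyFileJob())
```

The ordering rule now lives in exactly one place, where it can be tested, instead of in every client.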
Unused generality requires programmers to acknowledge future constraints that have no effect on current behavior. It may be tempting to enforce these constraints with rules on usage. But the tax on client code will be high.
All defects are easier to overlook in your own code. You can consult your memory, although others cannot. You may believe some unneeded complexity in the code makes you more productive. But few modern software projects can be completed without a team. You want others to choose you for their teams in the future. They will not want to sacrifice their productivity to yours.
I assume that you write code as a service. You have customers that expect continuing upgrades and support over the years. If so, then you want to keep the support as inexpensive as possible. If you are in a startup that throws together flashy abandon-ware to unload on a larger company, then good luck. That scam gets harder as more companies get burned.
Difficult code also provides a certain kind of job security because no one else will dare touch it. But you will be trapped maintaining code that everyone else in your organization sees as a burden -- clumsy, under-featured, and buggy. They will rejoice when your work can be superseded entirely. And you? You never had the opportunity to move on to something more technically interesting. You too will have been superseded.
Or maybe you simply plan to move on to new projects before the cost of maintenance kicks in. Do you think that your reputation will never catch up with you? Colleagues have much longer memories than corporations. Instead, imagine being able to brag about a program that is still used ten years after you worked on it.
Bill Harlan, 2000, 2008