Orange Code
What a humble citrus fruit can tell us about software
Whenever I work on unfamiliar code I start extracting methods. I look for chunks of code that I can name - then I extract them. Even if I inline the methods I’ve extracted later, I have a way of temporarily hiding details so that I can see the overall structure.
When I’m working with someone, often they point out that I’m actually adding more code when I do these extractions. In terms of line count, they are right. An expression with 10 tokens on one line becomes three lines in most languages. There’s a line for the declaration, a line for the expression, and a line for the syntax that terminates a function definition.
public void run() {
    adapter.tier().run();
}
A few languages spare us that extra line at the end to complete functions. Python is one of them.
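# A method of a test class; uses the Hypothesis library.
from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_reversing(self, xs):
    ys = list(xs)
    ys.reverse()
    ys.reverse()
    assert xs == ys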
So, in the process of extracting methods, we add more code. Is that bad?
Superficially, it seems like it should be. Line count correlates with complexity, but the lines that we add aren’t quite the same as the lines they surround.
Let’s talk about oranges.
Oranges are good fruit. Plenty of vitamin C. They’re also well packaged. Cut one open and you see that oranges have sections.
Each section is covered with white fiber called fascia. This fiber is very different from the pulp inside of it — the part that makes the orange orange.
Not surprisingly, nature uses the same playbook for other structures. Muscles have fascia too. It’s the surface layer around muscle. Fascia is like skin or the membrane of a cell. It’s all the same pattern.
Can we see the declaration of a function and the function’s enclosing braces as fascia? I think we can. The signature, start and end of a function can be seen as surface. The function’s body is the interior. Refactoring that breaks code down into smaller pieces increases the surface area of the system relative to its volume.
Let’s compare apples to oranges. (It was inevitable. Humor me.)
Apples are just a lot of undifferentiated mass. We can bite into one, but our teeth might get stuck. Apples are sort of like monoliths that way.
Oranges, on the other hand, are sectional, modular.
Modularity is something we all know about in software, but one thing we don’t think about very often is something that comes along with modularity — surface area.
Let's go back to apples. The surface area of an apple is its skin. Oranges, in addition to their skin, have fascia surrounding their sections. This vastly increases their surface area.
It turns out that surface area is very useful in software most of the time, except (perhaps) in the context of security. When your ratio of surface area to volume is high you have more interfaces for testing. Less relative volume means that there are fewer places where non-instrumentable complexity can hide.
Look at this snippet of Java code. It's only a piece of a large function.
There is no doubt that a function that large hides inscrutable logic and state that we can’t approach easily with our minds — or our tests.
With a better ratio of surface area to volume, we’d be able to.
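Here is a small sketch of what that extra surface buys us, in Python. Everything in it is hypothetical: the invoice domain, the function names, and the 100-unit discount threshold are invented for illustration and are far smaller than the kind of function described above. The first version keeps the subtotal and the discount rule inside one body. The second wraps each piece in its own fascia.

```python
# Hypothetical example: every name and rule here is invented for illustration.

def invoice_total_monolith(items, coupon_rate):
    """One body: the discount rule is only reachable through the final total."""
    subtotal = 0
    for price, quantity in items:
        subtotal += price * quantity
    discount = subtotal * coupon_rate if subtotal >= 100 else 0
    return subtotal - discount


def subtotal_of(items):
    """Sum of price * quantity over all (price, quantity) pairs."""
    return sum(price * quantity for price, quantity in items)


def discount_for(subtotal, coupon_rate):
    """The coupon applies only once the subtotal reaches 100."""
    return subtotal * coupon_rate if subtotal >= 100 else 0


def invoice_total(items, coupon_rate):
    """Same behavior as the monolith, but with two extra surfaces to test."""
    subtotal = subtotal_of(items)
    return subtotal - discount_for(subtotal, coupon_rate)
```

With the second version, a test can probe the discount boundary by calling discount_for(99, 0.5) and discount_for(100, 0.5) directly, instead of constructing item lists that happen to land on either side of the threshold.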
We can turn to mathematics and try to capture surface area and volume in a metric. The math is simple. In terms of line count, the surface area of a function is fixed by the language: it’s either 2 lines (a declaration plus a closing line) or 1 (a declaration alone). Volume is the sum of all of the lines of code in the bodies of methods. Total surface area is that fixed per-method count times the number of methods, so the ratio of volume to surface area is proportional to the total number of body lines in a project divided by the number of methods. When you ignore declarations and other code that isn’t in methods, that is roughly the same as average method length.
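A rough sketch of that metric for Python source, using the standard library's ast module. The function names and the SAMPLE text are my own inventions for illustration, and the counting is deliberately crude: blank lines and comments inside a body count as volume, nested functions are counted both on their own and as part of their parent, and each function is assumed to have a one-line declaration.

```python
import ast

# Hypothetical sample input, invented for illustration.
SAMPLE = '''
def short():
    return 1

def longer(x):
    y = x + 1
    z = y * 2
    return z
'''


def function_body_lengths(source):
    """Return the body length, in lines, of every function in `source`."""
    lengths = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            first_body_line = node.body[0].lineno
            # end_lineno is available on AST nodes from Python 3.8 onward.
            lengths.append(node.end_lineno - first_body_line + 1)
    return lengths


def volume_to_surface(source, surface_lines_per_function=1):
    """Total body lines divided by total surface lines."""
    lengths = function_body_lengths(source)
    if not lengths:
        raise ValueError("source contains no functions")
    volume = sum(lengths)
    surface = surface_lines_per_function * len(lengths)
    return volume / surface


print(volume_to_surface(SAMPLE))  # 2.0: four body lines over two declarations
```

For a brace language, passing surface_lines_per_function=2 would model the declaration line plus the closing line, which halves the ratio without changing the ranking between codebases.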
Suppose that we find out that the average method length on a project is 7. What does that tell us? It’s likely that the codebase is better than if it were, say, 15, but this is where statistics fails us. The average doesn’t tell you much about the distribution, and even then, radically changing the distribution on a large project isn’t easy. The useful thing about the orange metaphor is the perspective that it gives you.
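To see how little the average says on its own, here is a sketch with two made-up sets of method lengths. The numbers are invented purely for illustration and are not measurements from any project. Both sets average 7 lines per method, yet one of them is hiding a 52-line method.

```python
from statistics import mean, median


def summarize(method_lengths):
    """Mean, median, and longest method for a list of method lengths."""
    return {
        "mean": mean(method_lengths),
        "median": median(method_lengths),
        "longest": max(method_lengths),
    }


# Hypothetical data: ten methods in each imaginary codebase.
uniform = [7] * 10          # every method is 7 lines long
skewed = [2] * 9 + [52]     # nine tiny methods and one very large one

print(summarize(uniform))
print(summarize(skewed))
```

The mean is identical for both lists. Only the median and the longest method reveal where the undifferentiated mass is.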
When you look at a system, look for surface area. Get a feel for when and where it is lacking. This can happen at all different scales.
I think that, if nothing else, shifting our perspective from the volume of methods, classes, and services to their surface area can lead to richer conversations. It’s a perspective that gives us design insight.
Look at some code and imagine saying “that code needs to be more like an orange.”