Private Languages in Code

Michael Feathers
December 10, 2013

There are many ways of writing incomprehensible code. Most of us have tried many, but every once in a while I see a new one.

About a year ago, I looked at a large method that set a series of boolean flags up at the top and referred to them about 300 lines further along in its body. It's easy to look at this and say that the root cause is having 500+ line methods - we all know we shouldn't - but there is a deeper problem lurking inside of this scenario and it's something I haven't seen people talk about.

In general, it's a bad idea to have private languages in code. You have a private language when you encode information in data only to decode it further along in processing. Setting flags is a classic example, but there are others. Occasionally, I find people formatting strings in one part of their code only to parse them later. You might suspect that I'm talking about serialization formats like JSON or XML, but no. People do use strings as universal data structures at times. Alan Perlis had a wonderful quote about this: "The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information."
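
Here is a minimal sketch of the string case, with the alternative next to it. The names (make_order_string, shipping_days_from_string, Order, shipping_days) and the "|"-separated format are made up for illustration; they don't come from any real codebase.

```python
from dataclasses import dataclass


# Private language: pack the fields into a string now, parse them back later.
def make_order_string(customer: str, quantity: int, rush: bool) -> str:
    return f"{customer}|{quantity}|{'R' if rush else 'N'}"


def shipping_days_from_string(order: str) -> int:
    # Every consumer has to know the separator, the field order,
    # and what the flag letters mean.
    _customer, _quantity, rush_code = order.split("|")
    return 1 if rush_code == "R" else 5


# Alternative: keep the information in a structure that needs no decoding.
@dataclass(frozen=True)
class Order:
    customer: str
    quantity: int
    rush: bool


def shipping_days(order: Order) -> int:
    return 1 if order.rush else 5
```

Both versions answer the same question, but the string version quietly depends on its own format: a customer name containing "|" makes the split produce an extra field and the unpacking fail, while the Order version doesn't care.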

Private languages fall into the category of making work for ourselves. When we code, we should be doing things that make downstream work easier. This is why, for instance, switch statements in code are often a bad idea. The problem isn't the switch statement itself but the fact that we create a private language of type codes that has to be decoded later. The remedy in OO is to create objects early and pass them along. Once an object is created we don't have to decode it; we send it messages and it does what it needs to do. We've avoided creating work for ourselves.
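
As a rough sketch of the difference, assuming a hypothetical pair of shapes (CIRCLE, SQUARE, area_by_code, Shape, Circle and Square are all invented for this example), the type-code version decodes at the point of use, while the object version decides once, at creation:

```python
import math
from abc import ABC, abstractmethod

# Type codes: a tiny private language that every consumer has to decode.
CIRCLE, SQUARE = 0, 1


def area_by_code(kind: int, size: float) -> float:
    if kind == CIRCLE:
        return math.pi * size * size
    if kind == SQUARE:
        return size * size
    raise ValueError(f"unknown shape code: {kind}")


# Objects: the decision is made when the object is created.
class Shape(ABC):
    @abstractmethod
    def area(self) -> float:
        """Return the area of this shape."""


class Circle(Shape):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius * self.radius


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


def total_area(shapes: list[Shape]) -> float:
    # No decoding here: each object already knows what to do.
    return sum(shape.area() for shape in shapes)
```

Every new function that needs to know about shapes repeats the if-chain in the first version. In the second, downstream code like total_area just sends the message.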

Now, having said this, are there cases where private languages are useful? Yes. A parse tree generated by a compiler front end can be seen as a private language that is interpreted by a code generator. But that's an example of using a private language for leverage and separation of concerns. It should be a conscious design choice. The pathological ones I run into typically aren't.
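
To make the compiler case concrete, here is a toy sketch, not a real compiler. The node types (Num, Add), the parse and generate functions, and the PUSH/ADD instructions for an imaginary stack machine are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add]


def parse(tokens: list[str]) -> Expr:
    """Front end: turn tokens like ["1", "+", "2"] into a left-leaning tree."""
    tree: Expr = Num(int(tokens[0]))
    for i in range(1, len(tokens), 2):
        if tokens[i] != "+":
            raise ValueError(f"unexpected token: {tokens[i]!r}")
        tree = Add(tree, Num(int(tokens[i + 1])))
    return tree


def generate(expr: Expr) -> list[str]:
    """Back end: walk the tree and emit instructions for the imaginary machine."""
    if isinstance(expr, Num):
        return [f"PUSH {expr.value}"]
    return generate(expr.left) + generate(expr.right) + ["ADD"]
```

The tree is still something the back end has to interpret, but here that is the whole point: the front end knows nothing about instructions, and the back end knows nothing about tokens.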

© 2014 Michael Feathers