Posted by: jsonmez | April 14, 2013

What Makes Code Readable: Not What You Think

You often hear about how important it is to write “readable code.”

Developers have pretty strong opinions about what makes code more readable. The more senior the developer, the stronger the opinion.

But, have you ever stopped to think about what really makes code readable?

The standard answer

You would probably agree that the following things, regardless of programming language, contribute to the readability of code:

  • Good variable, method and class names
  • Variables, classes and methods that have a single purpose
  • Consistent indentation and formatting style
  • Reduction of the nesting level in code

There are many more standard answers and pretty widely held beliefs about what makes code readable, and I am not disagreeing with any of these.

(By the way, two excellent resources on “good code” are Robert Martin’s Clean Code and Steve McConnell’s Code Complete, a book all developers should read.)

Instead, I want to point you to a deeper insight about readability…

The vocabulary and experience of the reader

I can look at code and in 2 seconds tell you whether it is well written and highly readable.  (At least in my opinion.)

At the same time, I can take a sample of my best, well written, highly readable code and give it to a novice or beginner programmer, and they don’t spot how it is different from any other code they are looking at.

Even though my code has nice descriptive variable names, short well named methods with few parameters that do one thing and one thing only, and is structured in a way that clearly groups the sections of functionality together, they don’t find it any easier to read than they do code that has had no thought put into its structure whatsoever.

In fact, the complaint I get most often is that my code has too many methods, which makes it hard to follow, and the variable names are too long, which is confusing.

There is a fundamental difference in the way an experienced coder reads code versus how a beginner does

An experienced developer reading code doesn’t pay attention to the vocabulary of the programming language itself.  An experienced developer is more focused on the actual concept being expressed by the code—what the purpose of the code is, not how it is doing it.

A beginner or less experienced developer reads code much differently.

When a less experienced developer reads code, they are trying to understand the actual structure of the code.  A beginner is more focused on the actual vocabulary of the language than what the expression of that language is trying to convey.

To them, a long variable name isn’t descriptive, it’s deceptive.  The long name personifies the variable as something more than it is and hides the fact that NumberOfCoins is just an integer.  They’d rather see the variable named X or Number, because it’s confusing enough to remember what an integer is.

An experienced developer doesn’t care about integers versus strings and other variable types.  An experienced developer wants to know what the variable represents in the logical context of the method or system, not what type the variable is or how it works.

Example: learning to read

Think about what it is like to learn to read.

When kids are learning to read, they start off by learning the phonetic sounds of letters.

When young kids are reading books for the first time, they start out by sounding out each word.  When they are reading, they are not focusing on the grammar or the thought being conveyed by the writing, so much as they are focusing on the very structure of the words themselves.

Imagine if this blog post were written in the form of an early reader.

Imagine if I constrained my vocabulary and sentence structure to that of a “See Spot Run” book.

Would you find my blog to be highly “readable?”  Probably not, but kindergarteners would probably find it much more digestible.  (Although they would most likely still snub the content.)

You’d find the same scenario with experienced musicians, who can read sheet music easily, versus beginners, who would probably much prefer tablature.

An experienced musician would find sheet music much easier to read and understand than a musical description that said what keys on a piano to press or what strings on a guitar to pluck.

Readability constraints

Just as you are limited in the elegance with which you can express thoughts and ideas using the vocabulary and structure of an early reader book, you are limited in the same way by both the programming language you use and the context in which you use it.

This is better seen in an example though.  Let’s look at some assembly language.

.model small
.stack 100h

.data
msg     db      'Hello world!$'

.code
start:
        mov     ax, @data  ; Point DS at the data segment (DOS starts an .EXE with DS = PSP)
        mov     ds, ax
        mov     ah, 09h    ; Display the message
        lea     dx, msg
        int     21h
        mov     ax, 4C00h  ; Terminate the executable
        int     21h

end start

This assembly code will print “Hello world!” to the screen in DOS.

With x86 assembly language, the vocabulary and grammar of the language are quite limited.  It isn’t easy to express complex code in the language and make it readable.

There is an upper limit on the readability of x86 assembly language, no matter how good a programmer you are.

Now let’s look at Hello World in C#.

public class Hello1
{
   public static void Main()
   {
      System.Console.WriteLine("Hello, World!");
   }
}

It’s not a like-for-like comparison, because this version is using the .NET Framework in addition to the C# language, but for the purposes of this post we’ll consider C# to include the base class libraries as well.

The point, though, is that with C#’s much larger vocabulary and more complicated grammar comes the ability to express more complex ideas in a more succinct and readable way.

Want to know why Ruby got so popular for a while?  Here is Hello World in Ruby.

puts "Hello, world"

That’s it, pretty small.

I’m not a huge fan of Ruby myself, but if you understand the large vocabulary and grammar structure of the Ruby language, you’ll find that you can express things very clearly in the language.

Now, I realize I am not comparing apples to apples here and that Hello World is hardly a good representation of a programming language’s vocabulary or grammar.

My point is, the larger the vocabulary you have, the more succinctly ideas can be expressed, thus making them more readable, BUT only to those who have a mastery of that vocabulary and grammar.
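
As a rough sketch of that trade-off (not code from any real project), here is the same small task written twice in Ruby.  The coin values and both method names are made up for illustration.  The first version sticks to a tiny vocabulary of loops and conditions; the second leans on Array#select and Array#sum, which read almost like a sentence, but only if you already know those words.

```ruby
# Hypothetical data for illustration: coin values in cents.
coins = [25, 10, 25, 5, 1, 25]

# Small-vocabulary version: only a loop, a condition, and addition.
def total_of_quarters_with_loop(coins)
  total = 0
  for coin in coins
    if coin == 25
      total = total + coin
    end
  end
  total
end

# Large-vocabulary version: relies on Array#select and Array#sum.
def total_of_quarters_with_vocabulary(coins)
  coins.select { |coin| coin == 25 }.sum
end

puts total_of_quarters_with_loop(coins)        # prints 75
puts total_of_quarters_with_vocabulary(coins)  # prints 75
```

Both methods give the same answer.  Which one counts as “more readable” depends on whether the reader has to stop and look up select and sum.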

What can we draw from all this?

So, you might be thinking “oh ok, that’s interesting… I’m not sure if I totally agree with you, but I kind of get what you’re saying, so what’s the point?”

Fair question.

There is quite a bit we can draw from understanding how vocabulary and experience affect readability.

First of all, we can target our code for our audience.

We have to think about who is going to be reading our code and what their vocabulary and experience level is.

In C#, developers commonly argue about whether or not the conditional operator should be used.

Should we write code like this:

var nextAction = dogIsHungry ? Actions.Feed : Actions.Walk;

Or should we write code like this:

var nextAction = Actions.None;
if(dogIsHungry)
{
   nextAction = Actions.Feed;
}
else
{
   nextAction = Actions.Walk;
}

I used to be in the camp that said the second way was better, but now I find myself writing the first way more often.  And if someone asks me which is better, my answer will be “it depends.”

It depends because if your audience isn’t used to the conditional operator, they’ll probably find code that uses it confusing.  (They’ll have to parse the vocabulary rather than focusing on the story.)  But, if your audience is familiar with the conditional operator, the long version with an if statement will seem drawn out and like a complete waste of space.

The other piece of information to gather from this observation is the value of having a large vocabulary in a programming language and having a solid understanding of that vocabulary and grammar.

The English language is a large language with a very large vocabulary and a ridiculous number of grammatical rules.  Some people say that it should be easier and have a reduced vocabulary and grammar.

If we made the English language smaller, and reduced the complex rules of grammar to a much simpler structure, we’d make it much easier to learn, but we’d make it harder to convey information.

What we’d gain in reduction of time to mastery, we’d lose in its power of expressiveness.

One language to rule them all?

It’s hard to think of programming languages in the same way, because we typically don’t want to invest in a single programming language and framework with the same fervor as we do a spoken and written language.  But as repugnant as it may be, the larger we make programming languages and the more complex we make their grammars, the more expressive they become.  Ultimately, for those who achieve mastery of the vocabulary and grammar, they also become more readable.  (At least the potential for higher readability is greater.)

Don’t worry though, I’m not advocating the creation of a huge complex programming language that we should all learn… at least not yet.

This type of thing has to evolve with the general knowledge of the population.

What we really need to focus on now is programming languages with small vocabularies that can be easily understood and learned, even though they might not be as expressive as more complicated languages.

Eventually, when a larger base of the population understands coding and programming concepts, I do believe there will be a need for a language that is as expressive, to computers and humans alike, as English and the other written languages of the world are.

What do you think?  Should we have more complicated programming languages that take longer to learn and master in order to get the benefit of an increased power of expression, or is it better to keep the language simple and have more complicated and longer code?

Responses

  1. Small detail: it’s a ternary operator, not tertiary operator. (Tertiary means ‘third’, ternary means ‘composed of three components’).

    • Ah, you are right. Thanks. I always get that confused. Thanks for pointing that out. Fixed now.

  2. “An experienced developer wants to know what the variable represents in the logical context of the method or system, not what type the variable is or how it works.”

    I read this as an argument for programming languages that don’t force the programmer to annotate the types (Lisp, Haskell, and their derivatives).

  3. It’s *a* ternary operator (and the only one) but its name is the *conditional* operator. It’s a ternary operator because it has three operands – but that doesn’t say anything about what it does.

    More importantly, your usage of it is invalid (at least in C# and Java) because the conditional operator can’t be used as a stand-alone statement, only as an expression within a statement.

    I’d also argue that experienced developers often care very much about the types of variables… and the information that type implies can be used to make the variable name simpler without losing any context.

    • Thanks Jon! You are right of course.
      I have updated the wording and examples.
      That is what I get for writing code directly into the browser. I wrote the example how I wanted the conditional operator to work. 🙂

      I agree that experienced developers do care about the types of variables.

      I only meant to emphasize that when they are reading code, they are not focusing on the technical details of the language as much as a beginner would be; their focus is more on the concept the code is trying to convey.

      In many cases, it doesn’t require even knowing the variable type… Hence, the usage of var so often in C# that doesn’t seem to affect readability for many experienced developers, but seems to hamper readability to some degree for beginners.

      For example, if I write some code like this:

      var coins = sorter.sortQuarters();

      You do not need to know what actual type the variable coins represents. (At least at first glance, to understand the concept the code conveys.)

      Not sure how well I conveyed that point in the original post, so thanks for pointing that out as well.

  4. Interesting point of view, John. How do rather small but nonetheless expressive languages, like the new Go for example, fit into this reasoning?

  5. Programming languages are a little different from natural languages in that you are more or less constantly redefining the language. The more so in higher-level programming languages like Lisp or Haskell, where you can easily use metalinguistic abstraction, in addition to data abstraction, functional abstraction and syntactic abstraction (cf. SICP).

    Essentially, we are writing programs as if we started our books with a chapter or two defining the grammar and vocabulary of a new Esperanto or Volapük, in which the rest of the book is written. For each book!

    And this is a good thing, since it lets us write very nice books, in which the language used is the one best adapted to the ideas expressed in each book.

  7. “the complaint I get most often is that my code has too many methods, which makes it hard to follow” I’ve heard the same from more experienced developers as well. Some devs prefer having everything in one huge method, I guess so they don’t have to navigate around the code. Modern tools make navigation easy, so I don’t see that as a valid complaint. For me, long methods are harder to follow and are not reusable. They also result in duplicate code and more bugs.

    • Just because someone has been a developer for a long time does not make them an “experienced” developer. I would say the wisdom and knowledge gained by continuing to learn and evolve over time is the beginning of what makes a developer experienced. Knowing that you are not perfect, and knowing that in 6 months you will hate the code you are writing today and want to refactor it, are additional signs of a developer who has gained experience.

      I have seen developers who have been doing it for only 4 years be far more experienced than some who have been doing it for 12 years. I have worked with developers who truly believed they wrote perfect, bug-free code and that the code they wrote a year ago was just as perfect as the code they are writing today. These developers have been doing this for 10 to 15 years, but I would not call them experienced developers; I would call them misguided developers.

  8. Let’s not confuse *language* complexity with *development* complexity. The basics of DOS programming with 8086 assembly language are relatively simple, and ignoring the specific mnemonics, 8086 assembly is pretty trivial because the CPU is quite trivial by today’s standards. But if you don’t know that in DOS int 21h AH=09h wants a pointer to the $-terminated message to be printed as DS:DX and that in the small memory model DS must point to the one and only data segment, that simplicity doesn’t help; the code is utterly unreadable. (And I’m pretty sure you could save a byte by only setting AH=4Ch; unless the caller checks the return errorlevel, it won’t ever know the difference.) For that information, generally speaking, you need API documentation.

    That’s a bit like the difference between knowing the C# grammar and having at least a working understanding of the basic parts of the .NET BCL. `public`, `for`, `?:` and so on won’t help you much if you don’t know about `System.Console.WriteLine()` or the other I/O methods.

    Some languages have a complex grammar which comes with a lot of expressivity; Pascal comes to mind as a possible example. Other languages, such as assembler and C, make do with a much simpler language and instead rely on a complex standard library (one could argue that in assembler, you get all the pieces you need to build your very own standard library, IKEA style). Yet others pretty much require compiler- or system-specific extensions to get actual work done. Actually, try doing a “Hello World” in pure C without calling out to the standard library and without tying yourself to a specific hardware or software architecture; it probably won’t help much that with access to the standard library it’s just `printf("Hello world!");` plus a few lines of boilerplate for the compiler.

    I have to admit, sometimes I miss the days when directly fiddling with CPU registers and calling out to various interrupts, or working directly with memory locations, was commonplace.

  9. @John I understand your point and for the most part agree with it. However, my big takeaway from this is a reason not to hire beginning and junior developers, and to let them get “seasoned” on someone else’s dime. It seems you are making the point that beginners need to learn more before they mix too much with the “experienced” developers and try to play with them.

    I mean, there is a reason we don’t ask first graders to be CEOs and there is a reason we don’t ask CEOs to teach first graders.

    If you work for a large enough dev shop, have mentoring resources and are willing to spend the time to mentor, that is great; such shops make good farms for growing experienced developers. However, if you have a small budget and can only have a small team for a project, you want the best of the best. You want to get the biggest bang for your dollar and you don’t have too much time to mentor beginners.

