- Cosimo Orban

# More comments != Better Code

Comments provide a valuable opportunity to increase readability in your code, but instead often endup reducing readability. In this article I will explain why I believe blindly commenting code is an anti-pattern, and instead good organization along with properly named variables and functions is the key to readable code. I will be using python for coding examples, but this applies to any language.

**Novice Code**

When a novice programmer first starts really getting into programing longer stretches of code, their first instinct seems to be to comment all-the-things.

*#**!**/**usr**/**bin**/**env python3*
**def** main():
s = input('Number Array: ')
na = s.split(',')
sm = 0;
e = '';
**for** n **in** na:
**try**:
sm += float(n)
**except** (ValueError, OverflowError):
e = "Error parsing number"
**if** e != '':
**print**(e)
**else**:
**print**("Sum: {}".format(sm))
**if** __name__ == '__main__':
main()

It’s a simple program to add a few numbers together. After realizing it might be hard to follow, the novice decides to add some comments:

*#**!**/**usr**/**bin**/**env python3*
**def** main():
*# Prompt the user **for** an input array **of** numbers*
s = input('Number Array: ')
*# Split the string*
na = s.split(',')
*# Initialize some values*
sm = 0;
e = '';
*# Ok**,** so **this** is the main bit**.** It does the actual*
*# Adding **of** the numbers**.** It also parses the strings**.*
**for** n **in** na:
**try**:
*# parse and add*
sm += float(n)
**except** (ValueError, OverflowError):
*# Something went wrong**.*
*# This will happen **if** they put **in** something that isn't a number*
e = "Error parsing number"
**if** e != '':
*# We have an error**,** print it*
**print**(e)
**else**:
*# Yay**,** we made it**,** print the sum*
**print**("Sum: {}".format(sm))
**if** __name__ == '__main__':
main()

The exact same code, just with comments now. This may seem a bit over the top, but these style comments are what I often see junior developers proudly adding. The number of lines of code has almost doubled, and now there is a lot more dense text taking away from the actual content of the code. A person going through is almost forced to read the comments, instead of reading the code. It’s not necessarily worse, but it's not exactly better either.

Now let's say the requirements were changed. A manager requests to add functionality to the program which checks if the sum is 42, + or - 0.5, and if it is it needs to remove the 2nd number and recalculate the sum. It also should notify the user when this happens.

After adding in this requirement the novices code is:

*#**!**/**usr**/**bin**/**env python3*
**def** main():
*# Prompt the user **for** an input array **of** numbers*
s = input('Number Array: ')
*# Split the string*
na = s.split(',')
*# Initialize some values*
sm = 0;
e = '';
*# Ok**,** so **this** is the main bit**.** It does the actual*
*# adding **of** the numbers**.** It also parses the strings**.*
**for** n **in** na:
**try**:
*# parse and add*
sm += float(n)
**except** (ValueError, OverflowError):
*# Something went wrong**.*
*# This will happen **if** they put **in** something that isn't a number*
e = "Error parsing number"
**if** e != '':
*# We have an error**,** print it*
**print**(e)
**else**:
thrsh = 42
**if** sm >= thrsh-0.5 **and** sm <= thrsh+0.5:
*# The sum is **42.** We need to fix it**.*
**print**("Your sum is 42! Must re-compute!")
*# remove the **2**nd number **from** the array*
na.pop(1)
*# re**-**initialize*
sm = 0
**for** n **in** na:
**try**:
*# parse and add*
sm += float(n)
**except** (ValueError, OverflowError):
*# Something went wrong**.*
*# This will happen **if** they put **in** something that isn't a number*
e = "Error parsing number"
*# Yay**,** we made it**,** print the sum*
**print**("Sum: {}".format(sm))
**if** __name__ == '__main__':
main()

I hope you can see how bad this code is. There are large swaths of copy-pasta (largely copy-pasted code with slight changes), tersely named variables, and about half the "code" is actually comments. Now let's clean it up this mess.

**Cleaning it up**

First things first, let's refactor the summing into its own function:

*#**!**/**usr**/**bin**/**env python3*
**def** nsum(na):
*# Initialize some values*
sm = 0;
e = '';
*# Ok**,** so **this** is the main bit**.** It does the actual*
*# Adding **of** the numbers**.** It also parses the strings**.*
**for** n **in** na:
**try**:
*# parse and add*
sm += float(n)
**except** (ValueError, OverflowError):
*# Something went wrong**.*
*# This will happen **if** they put **in** something that isn't a number*
e = "Error parsing number"
**return** sm, e
**def** main():
*# Prompt the user **for** an input array **of** numbers*
s = input('Number Array: ')
*# Split the string*
na = s.split(',')
*# Sum the numbers*
sm, e = nsum(na)
**if** e != '':
*# We have an error**,** print it*
**print**(e)
**else**:
thrsh = 42
**if** sm >= thrsh-0.5 **and** sm <= thrsh+0.5:
*# The sum is **42.** We need to fix it**.*
**print**("Your sum is 42! Must re-compute!")
*# remove the **2**nd number **from** the array*
na.pop(1)
*# Sum the numbers*
sm, e = nsum(na)
*# Yay**,** we made it**,** print the sum*
**print**("Sum: {}".format(sm))
**if** __name__ == '__main__':
main()

This helps DRY-out the code a bit. Notice how it reduces the code reuse and actually improves the readability and flow of the main function.

Let’s keep going, this time with our new function’s error handling:

*#**!**/**usr**/**bin**/**env python3*
**class** **ParsingError**(Exception):
**pass**
**def** nsum(na):
*# Initialize some values*
sm = 0;
*# Ok**,** so **this** is the main bit**.** It does the actual*
*# Adding **of** the numbers**.** It also parses the strings**.*
**for** n **in** na:
**try**:
*# parse and add*
sm += float(n)
**except** (ValueError, OverflowError):
*# Something went wrong**.*
*# This will happen **if** they put **in** something that isn't a number*
**raise** ParsingError("Error parsing value: {}".format(n))
**return** sm
**def** main():
*# Prompt the user **for** an input array **of** numbers*
s = input('Number Array: ')
*# Split the string*
na = s.split(',')
*# Sum the numbers*
sm = nsum(na)
thrsh = 42
**if** sm >= thrsh-0.5 **and** sm <= thrsh+0.5:
*# The sum is **42.** We need to fix it**.*
**print**("Your sum is 42! Must re-compute!")
*# remove the **2**nd number **from** the array*
na.pop(1)
*# Sum the numbers*
sm = nsum(na)
*# Yay**,** we made it**,** print the sum*
**print**("Sum: {}".format(sm))
**if** __name__ == '__main__':
main()

This change again helps improve reability by moving the job of error handling entirely into the block of code which can cause the error. Notice how the output of the nsum function is now only the sum. We no longer have to worry about hanlding the possible error outside of the nsum function.

*Note: The code will now raise an error when it can't parse the input. This will cause the program to end abruptly, but this isn't any different than what was happening before, as we were just printing out the error and exiting. All we have done is deferred that functionality to python's internal error handling.*

Ok, now let's do a bit of refactoring:

*#**!**/**usr**/**bin**/**env python3*
**class** **ParsingError**(Exception):
**pass**
**def** nsum(na):
*# Initialize some values*
sm = 0;
*# Ok**,** so **this** is the main bit**.** It does the actual*
*# Adding **of** the numbers**.** It also parses the strings**.*
**for** n **in** na:
sm += n
**return** sm
**def** parse_input_to_numbers(raw_input):
*# Split the string*
na = raw_input.split(',')
numbers = []
*# Ok**,** so **this** is the main bit**.** It does the actual*
*# Adding **of** the numbers**.** It also parses the strings**.*
**for** n **in** na:
**try**:
*# parse and add*
numbers.append(float(n))
**except** (ValueError, OverflowError):
*# Something went wrong**.*
*# This will happen **if** they put **in** something that isn't a number*
**raise** ParsingError("Error parsing value: {}".format(n))
**return** numbers
**def** main():
*# Prompt the user **for** an input array **of** numbers*
s = input('Number Array: ')
numbers = parse_input_to_numbers(s)
*# Sum the numbers*
sm = nsum(numbers)
thrsh = 42
**if** sm >= thrsh-0.5 **and** sm <= thrsh+0.5:
*# The sum is **42.** We need to fix it**.*
**print**("Your sum is 42! Must re-compute!")
*# remove the **2**nd number **from** the array*
numbers.pop(1)
*# Sum the numbers*
sm = nsum(numbers)
*# Yay**,** we made it**,** print the sum*
**print**("Sum: {}".format(sm))
**if** __name__ == '__main__':
main()

These changes separate out the responsibility of parsing the input from a combination of the main function and the nsum function into a function called parse_input_to_numbers. Notice how the nsum fuction is now only responsible for summing our list of numbers?

Speaking of nsum, now that we have separated it out sufficiently, we can see that it is just duplicating the functionality of the python built-in sum. Let's use that instead:

*#**!**/**usr**/**bin**/**env python3*
**class** **ParsingError**(Exception):
**pass**
**def** parse_input_to_numbers(raw_input):
*# Split the string*
na = raw_input.split(',')
numbers = []
*# Ok**,** so **this** is the main bit**.** It does the actual*
*# Adding **of** the numbers**.** It also parses the strings**.*
**for** n **in** na:
**try**:
*# parse and add*
numbers.append(float(n))
**except** (ValueError, OverflowError):
*# Something went wrong**.*
*# This will happen **if** they put **in** something that isn't a number*
**raise** ParsingError("Error parsing value: {}".format(n))
**return** numbers
**def** main():
*# Prompt the user **for** an input array **of** numbers*
s = input('Number Array: ')
numbers = parse_input_to_numbers(s)
*# Sum the numbers*
sm = sum(numbers)
thrsh = 42
**if** sm >= thrsh-0.5 **and** sm <= thrsh+0.5:
*# The sum is **42.** We need to fix it**.*
**print**("Your sum is 42! Must re-compute!")
*# remove the **2**nd number **from** the array*
numbers.pop(1)
*# Sum the numbers*
sm = sum(numbers)
*# Yay**,** we made it**,** print the sum*
**print**("Sum: {}".format(sm))
**if** __name__ == '__main__':
main()

Whenever possible, use built-in functions and librarys. They generally are extensively tested and stable, and there is no need to do the extra work of re-inventing the wheel. A simple google search can endup saving hours or days worth of work.

Refactoring out the logic for checking if we are near 42 brings us to:

*#**!**/**usr**/**bin**/**env python3*
**class** **ParsingError**(Exception):
**pass**
**def** is_sum_close_to_threshold(numbers, threshold=42, delta=0.5):
numbers_sum = sum(numbers)
**return** abs(numbers_sum - threshold) <= delta
**def** parse_input_to_numbers(raw_input):
*# Split the string*
na = raw_input.split(',')
numbers = []
*# Ok**,** so **this** is the main bit**.** It does the actual*
*# Adding **of** the numbers**.** It also parses the strings**.*
**for** n **in** na:
**try**:
*# parse and add*
numbers.append(float(n))
**except** (ValueError, OverflowError):
*# Something went wrong**.*
*# This will happen **if** they put **in** something that isn't a number*
**raise** ParsingError("Error parsing value: {}".format(n))
**return** numbers
**def** main():
*# Prompt the user **for** an input array **of** numbers*
s = input('Number Array: ')
numbers = parse_input_to_numbers(s)
**if** is_sum_close_to_threshold(numbers):
*# The sum is invalid**.** We need to fix it**.*
**print**("Your sum is invalid! Must re-compute!")
*# remove the **2**nd number **from** the array*
numbers.pop(1)
*# Sum the numbers*
sm = sum(numbers)
*# Yay**,** we made it**,** print the sum*
**print**("Sum: {}".format(sm))
**if** __name__ == '__main__':
main()

Notice that I simplified the logic for checking the sum by using the abs python built-in.

Now to finish up our cleanup by dealing with the comments and variable names:

*#**!**/**usr**/**bin**/**env python3*
**class** **ParsingError**(Exception):
**pass**
**def** is_sum_close_to_threshold(numbers, threshold=42, delta=0.5):
numbers_sum = sum(numbers)
**return** abs(numbers_sum - threshold) <= delta
**def** parse_input_to_numbers(raw_input):
raw_numbers = raw_input.split(',')
numbers = []
**for** raw_number **in** raw_numbers:
**try**:
numbers.append(float(raw_number))
**except** (ValueError, OverflowError):
**raise** ParsingError("Error parsing value: {}".format(raw_number))
**return** numbers
**def** main():
raw_input = input('Number Array: ')
numbers = parse_input_to_numbers(raw_input)
**if** is_sum_close_to_threshold(numbers):
**print**("Your sum is invalid! Must re-compute!")
numbers.pop(1)
numbers_sum = sum(numbers)
**print**("Sum: {}".format(numbers_sum))
**if** __name__ == '__main__':
main()

Now we have some decent code. Notice that the main function does not contain much logic, instead it delegates to other functions. Also notice the names I’m using for functions and variables. The result is easy to read and follow code, yet is lacking any comments whatsoever.

**Naming Conventions**

In that final code I very deliberately picked concise, descriptive names. Let's go over some of the naming conventions I use and why.

**Variables**

A variable name should accurately *describe* what the variable represents. For example: in the above code numbers contains a list of numbers; raw_input is the user input from the input function; numbers_sum is the sum of the numbers obtained from our summing function. The variable names have meaning which can be easily seen.

*Note: I do not put the variable* **type***as part of the variable name. That is an anti-pattern.* *numbers_list* *is not any clearer than* *numbers;* *raw_input_string* *is not better than* *raw_input.*

**Functions**

Functions should *do* exactly what their name implies.

In my example, parse_inupt_to_numbers parses the program's raw input to a list of numbers, as would be expected by its name; is_sum_close_to_threshold returns a boolean, true if an input of parsed numbers have a sum close to the specified threshold, false if not. Most, if not all, of the functionality can now be inferred from the function names, without having to delve into their code.

Functions should be small enough that you can describe them easily in a few words. If you are struggling to come up with a name for your function, chances are you either don't really understand what the function is doing, or it is doing too much. The word "and" should be avoided; if you find yourself using "and" in your function names, then that function should probably be two separate functions. Using naming conventions like this helps to enforce the single responsibility principle, which is an entire topic by itself, but is generaly concidered a good thing.

**A Note On Comments**

Comments are not bad. When used properly they will clarify tricky to understand bits of code, but they should only be used as a last resort. They should only be used after refactoring and renaming of functions/variables has failed to clarify the code.

**Conclusion**

Every programmer should be striving for self-documenting code, or code which is able to easily convey its meaning using organization and language. It is an art form, in the same way writing a good novel is an art form. It is easy to come up with a good idea, it is conveying that good idea in an easy to understand way which is the hard part. This is one of the main traits which separates an average writer from a great one, regardless if you are writing in English, Spanish, Chinese, Python, C++, or JavaScript.