Concrete Examples: Comments

06 August 2022

‘Clean coders’ are constantly debating comments. For some, comments are never required, in favor of writing self-documenting code. For others, detailed comments are a must for any codebase. Most people (I’m guessing) are somewhere in the middle.

The conventional wisdom is to add comments explaining ‘why’ over ‘what’ and ‘how’, which sounds great, but what does it actually mean? As a lover of concrete examples, I'm going to share some real-life examples of comments I've made while working on real software, solving real problems for real people. For each example I will present the code first without a comment, then with the comment. Have a read through the uncommented code, have a think about what the code is doing, then have a guess at why.

1. No room for errors

In a small console application responsible for automatically deploying SQL migration scripts to a database:

    var pendingMigrations = Database.GetPendingMigrations();
    foreach (var pendingMigration in pendingMigrations)
    {
        Console.WriteLine("Script for migration " + pendingMigration.Name + ":");
        var migrationScriptToPrint = pendingMigration.Script;
        migrationScriptToPrint = migrationScriptToPrint.Replace("error", "err*r");
        Console.WriteLine(migrationScriptToPrint);
    }

Anything look funny? Here’s the explanation, including the comment that’s living inside the console app to this day.

    var pendingMigrations = Database.GetPendingMigrations();
    foreach (var pendingMigration in pendingMigrations)
    {
        Console.WriteLine("Script for migration " + pendingMigration.Name + ":");
        var migrationScriptToPrint = pendingMigration.Script;
        // our pipelines hilariously fail if the string "error" appears anywhere in the build log
        // so tables/columns with "error" in their name cause a build failure even if everything was OK.
        // Can't think of a better way to fix this...
        migrationScriptToPrint = migrationScriptToPrint.Replace("error", "err*r");
        Console.WriteLine(migrationScriptToPrint);
    }

Pipelines here refer to CI/CD pipelines, which capture any text dumped to the console throughout their execution. The pipelines in question all contained a setting causing them to fail if the string “error” occurred anywhere in the output, so as soon as we released a table containing a column called “errorMessage” the pipelines would report a ‘failure’. That failure condition turned out to be very useful for capturing unexpected errors, so a little bit of a fudge in some code that doesn’t run all too often is a small price to pay. That’s pragmatism for you.
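To make the trade-off concrete, here is a minimal sketch of the check and the workaround side by side. `PipelineWouldFail` is a hypothetical stand-in for the pipeline setting (the real check lives in the CI/CD tool, and treating it as a case-sensitive match is an assumption). `MaskForBuildLog` is an illustrative wrapper around the same `Replace` call used in the console app.

```csharp
using System;

public static class BuildLogMasking
{
    // Hypothetical stand-in for the pipeline setting: fail when "error" appears in the log.
    // Assumes a case-sensitive match; the real tool may behave differently.
    public static bool PipelineWouldFail(string buildLog) =>
        buildLog.Contains("error", StringComparison.Ordinal);

    // Same substitution as the console app: the text stays readable,
    // but no longer contains the exact string the pipeline looks for.
    public static string MaskForBuildLog(string script) =>
        script.Replace("error", "err*r");
}
```

A script such as `ALTER TABLE Orders ADD errorMessage NVARCHAR(200)` trips the check as written, but not after masking. The variable name `migrationScriptToPrint` suggests that only the printed copy is altered, not the script that runs against the database.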

2. Too much time

Here’s an example in a unit test of an “Export to Excel” feature for financial transactions. N.b. some

    [Test]
    public void ExcelExportCapturesTransactionTimestamp()
    {
        var transaction = new Transaction { Timestamp = DateTime.UtcNow };
        var transcationExportBuilderUnderTest = new TransactionExportBuilder();
        transcationExportBuilderUnderTest.AddTransaction(transaction);

        var excelExport = transcationExportBuilderUnderTest.BuildExcelFile();

        var actualTimestamp = (DateTime)excelExport
            .GetWorkbook("Export")
            .GetColumn("Timestamp")
            .GetDataInCellAtRow(1);

        Assert.AreEqual(transaction.Timestamp.Date, actualTimestamp.Date);
        Assert.AreEqual(transaction.Timestamp.Hour, actualTimestamp.Hour);
        Assert.AreEqual(transaction.Timestamp.Minute, actualTimestamp.Minute);
        Assert.AreEqual(transaction.Timestamp.Second, actualTimestamp.Second);
    }

All those assertions look a little dubious. “WHY!?!?” screams any future reader. Luckily, someone left a comment.

    [Test]
    public void ExcelExportCapturesTransactionTimestamp()
    {
        var transaction = new Transaction { Timestamp = DateTime.UtcNow };
        var transcationExportBuilderUnderTest = new TransactionExportBuilder();
        transcationExportBuilderUnderTest.AddTransaction(transaction);

        var excelExport = transcationExportBuilderUnderTest.BuildExcelFile();

        var actualTimestamp = (DateTime)excelExport
            .GetWorkbook("Export")
            .GetColumn("Timestamp")
            .GetDataInCellAtRow(1);

        // Assert on the different date parts to avoid rounding errors when converting to an excel date.
        Assert.AreEqual(transaction.Timestamp.Date, actualTimestamp.Date);
        Assert.AreEqual(transaction.Timestamp.Hour, actualTimestamp.Hour);
        Assert.AreEqual(transaction.Timestamp.Minute, actualTimestamp.Minute);
        Assert.AreEqual(transaction.Timestamp.Second, actualTimestamp.Second);
    }

Sure, this test breaks the ‘one assertion per test’ guideline, but at least it gives a good reason. Converting data to formats suitable for third-party applications is a common pain point, and small changes to the data might creep in. The unit test balances the complexity of transforming data against the practicalities of caring that the data looks correct.
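To see where the lost precision can come from: Excel stores a date as a floating-point serial number, with the time of day as a fraction of a day. A rough sketch using .NET's built-in OLE Automation date conversion, which uses the same style of representation, shows the effect. This is an illustration under that assumption, not the export library's actual conversion, and the class name is made up.

```csharp
using System;

public static class ExcelDateRoundTrip
{
    // Converts to a serial-number date and back, as a stand-in for an Excel round trip.
    // Assumption: the real export loses precision in a similar way.
    public static DateTime RoundTrip(DateTime value) =>
        DateTime.FromOADate(value.ToOADate());

    // Compares only the parts the unit test cares about: date, hour, minute and second.
    public static bool SameToTheSecond(DateTime expected, DateTime actual) =>
        expected.Date == actual.Date
        && expected.Hour == actual.Hour
        && expected.Minute == actual.Minute
        && expected.Second == actual.Second;
}
```

Take a timestamp with sub-millisecond precision, such as `new DateTime(2022, 8, 6, 13, 45, 30).AddTicks(1234567)`. After `RoundTrip` it no longer compares equal to the original, because the finest ticks are lost, yet `SameToTheSecond` still holds. That is the gap the part-by-part assertions step over.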

3. More export to Excel

This example is a simplified version of the real code (another part of the same Excel export process as in example 2), but the comment stays the same. Note the test includes a so-called ‘magic string’. Magic strings are another divisive topic. In my opinion (and in this unit test), directly using strings reduces cognitive load by removing a level of indirection. Directly looking at raw data allows us to think about the low-level representation of data where appropriate. Sometimes a comment and a magic string work like a charm, for example:

    [Test]
    public void ExcelExportCapturesCommentsWithFunnyCharacters()
    {
        var transaction = new Transaction { Comment = "\u0001" };
        var transcationExportBuilderUnderTest = new TransactionExportBuilder();
        transcationExportBuilderUnderTest.AddTransaction(transaction);

        var excelExport = transcationExportBuilderUnderTest.BuildExcelFile();

        var actualComment = excelExport
            .GetWorkbook("Export")
            .GetColumn("Comment")
            .GetDataInCellAtRow(1);

        Assert.AreEqual(actualComment, "\u0001");
    }

“Cool, but what’s so special about "\u0001"?” I can hear you ask. Here’s the supporting comment:

    [Test]
    public void ExcelExportCapturesCommentsWithFunnyCharacters()
    {
        var transaction = new Transaction { Comment = "\u0001" };
        var transcationExportBuilderUnderTest = new TransactionExportBuilder();
        transcationExportBuilderUnderTest.AddTransaction(transaction);

        var excelExport = transcationExportBuilderUnderTest.BuildExcelFile();

        var actualComment = excelExport
            .GetWorkbook("Export")
            .GetColumn("Comment")
            .GetDataInCellAtRow(1);

        // Unicode 0001 is not valid in the XML 1.0 grammar as per https://www.w3.org/TR/xml/#charsets
        Assert.AreEqual(actualComment, "\u0001");
    }

This unit test arose from a real-world scenario where funny characters were causing XML parse errors. On reflection, the comment I left might even be a little too cryptic - the code is exporting data to Excel, so why ought it care about XML characters? Excel files contain XML; if you didn’t know, now you know. I think this test is a great case study of the rabbit hole one can end up down when writing comments - should the comment explain that Excel files are made up of XML? Maybe it should include a note on what the heck an XML grammar is anyway? Should the comment explain what problem the \u0001 characters caused? These are all tough questions to answer - everyone has different levels of domain and technical knowledge, and we can’t cater for everyone.
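For anyone who wants to check the claim in the comment, .NET can answer the question directly. Here is a small sketch using `System.Xml.XmlConvert.IsXmlChar`, which tests a character against the XML 1.0 character rules. The wrapper class and method names are made up for illustration and are not part of the export code.

```csharp
using System.Linq;
using System.Xml;

public static class XmlCharacterCheck
{
    // True when every UTF-16 code unit in the text is an allowed XML 1.0 character.
    // Simplification: surrogate pairs are checked one half at a time here;
    // XmlConvert.IsXmlSurrogatePair is the dedicated check for those.
    public static bool IsSafeForXml(string text) =>
        text.All(XmlConvert.IsXmlChar);
}
```

`XmlCharacterCheck.IsSafeForXml("\u0001")` returns false, whereas ordinary text, tabs and newlines pass. That is why a single control character in a transaction comment can be enough to upset an XML parser.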

That’s three concrete examples of comments I’ve written in ‘self-documenting’ code. Hopefully it's clear that all three are ‘weird’ scenarios, where we write some code that looks bizarre, is sub-optimal, or just handles an edge case. I’ve never regretted sprinkling a few comments around the more dubious areas of a codebase. I don’t see myself stopping any time soon.