Write Code In English Part III
28 November 2021
Most written communication is in natural language, so why wouldn’t we want to try to mirror that in our code?
This series will present small concrete examples that might help your code read less like code, today's topic is...
Table Aliases
There seems to be an unofficial rule that table aliases should be cryptic. I think that’s a lame convention. Here’s a typical select clause - have a think what real life information the query could be returning given the column list:
Races at circuits? Rates for Currencies?? Records and Competitors??? All wrong… Click the spoiler to see the full query.
Full Query
The query is part of a fictitious hotel’s reporting system - listing reservation requests made by would-be customers that weren’t converted into bookings, so therefore did not make the hotel any money. None of that semantic information is embedded by the table aliases. We tend to avoid single letter identifiers for variables, so why should we treat our SQL queries any differently?
Here’s a version with more descriptive table aliases - easier to understand (in my opinion).
When quickly scanning the query one can infer more about why it was written, compared to the version with the semantically ambiguous aliases.
This technique is particularly applicable for aliasing tables whose names are generic due to the table storing multiple entities. As a concrete example, consider SQL Server’s sys.database_principals system view that holds information about database users, roles and more. Sometimes we only care about one type of principal, such as only roles. Example listing all roles created over a year ago.
Nice and obvious from our column list alone - we’re selecting data about roles rather than the more generic 'principals'.
Another shining sample usage is (if you’re unlucky enough to be) working with a one true lookup table. Often I have seen queries joining such tables multiple times, where each join to the lookup table is aliased with a different number to ensure a unique alias per join. As an example:
The selected aliases L1
, L2
and L3
provide information about what each join represents in the real world.
A reader has to put in the cognitive effort to comprehend ‘L1
is representing the room types,
L2
is representing booking status and L3
is representing the room sizes’. R
isn’t great either.
I wouldn’t want to impose this style on future readers of my code, so I would prefer to see:
It's obvious just by scanning the selected column list this query has something to do with upcoming bookings -
a reader can then quickly decide to delve into the query more or skip past it when skim-reading code.
There’s almost no cognitive load associating the RoomType
alias with ‘Room Types’ compared to associating L1
with ‘Room Types’.
The earlier obfuscated version would cause many readers to at least have a think before they can understand the query,
which is the last thing anyone wants to be doing.
In summary, using semantic names in table aliases is one more tool on our clean code toolbelts.