In our code base we frequently create descendant classes of the TSQLDatabaseScript class to override the Execute function for specific actions in a database script. Three types of naming conventions are now used for these subclasses:
- "TScript" plus ID suffix, for example: TScript231
- An action description, for example: TDoThisAction
- An action description plus a suffix "Script" or "DatabaseScript", for example: TDoThisActionDatabaseScript
I prefer the last option because it states what it does and what it is. The first convention misses any kind of description, and therefore is not very helpful. The second is much better, but the name does not give any hint that it concerns a database script. Any ideas/suggestions?
Berichten
Laatst gewijzigd op 2010-09-10 11:33
Good classes have a specific focused purpose. One indication of a class not being focused enough is that its name needs to be too long to describe its purpose. I think database scripts should be treated differently as their purposes do not need to be focused at all. They merely act as a container for all the actions necessary to get a database schema up to a new level.
Hopefully changes to the database schema are well thought through and occur in small batches directly related to the changes necessary in the code. Even for small changes the descriptive names of the scripts easily grow large and feel artificial. I would not want to restrict the actions of a script just to avoid a name that is too long. That is, I do not think the normal class naming conventions apply for database scripts in this way.
Finally, one might even argue that the purpose of TScript351 is to get the database schema up to version 351 when the current database version is 350. Continuing this line of thought one might say normal class naming rules do apply.
Without having given it too much thought I agree with the "noun" requirement.
Laatst gewijzigd op 2010-09-09 13:39
Option 2 has my preference. Since we are strictly following the 'code is documentation' mantra naming becomes extremely important.
Option 3 is also fine, but the fact that the scripts already extend DatabaseScript should be sufficient that the explicit naming is not necessary. Continuously adding the extended baseclass its name as suffix to child class name would result in classnames like BranchingLogicalSQLDatabaseConnectorObject when followed to the letter. Programmer discretion is advised when naming classes, if the Script suffix is required to make the classname a noun instead of a verb: so be it. If the functionality of the script can be conveyed without the superfluous suffix, that is in my opinion a better solution. Keep it as short and simple while still sufficiently describing what a script does.
IDs provide no useful information except for when comparing two different scripts to see which of the two will be executed before the other. And even that is not correct in the case of script reuse.
The message associated with an instance of a script should convey what the particular instance of a script does, and not the script class itself. This is important when a script is reused to perform the same action on different databases, tables, etc. An additional reason why this message is not a good documentation of the intent of a particular script is that the messages are actually specified through the constructor of scripts in the DatabaseScriptManager, thereby separating the documentation of what a script does and its code over two different files.
Laatst gewijzigd op 2010-09-10 11:35
I think your comment on long class names like TBranchingLogicalSQLDatabaseConnectorObject also deserves some attention.
I agree these long names should be avoided, however, I still believe standard naming rules are better than programmer's discretion. It just depends on what the rules are.
I think classes should be named to indicate both its purpose and its interface. The key thing is to leave out the implementation part. In the case of the example both "Logical" and "Object" are implementation details that should not be in the name. This would result in TBranchingSQLDatabaseConnector which I think is perfectly fine.
The code base also contains the class TBranchingDatabase. I think this name is not correctly chosen as it does not sufficiently indicate its interface, that it is a TSQLDatabase. The correct name would be TBranchingSQLDatabase. The reason I believe the "SQL" part should be in the name is that it is crucial to know that you can query the database with SQL and makes it stand out against other databases like TNetworkDatabase, TObjectDatabase and THierarchicalDatabase.
If following these rules still results in names that are ridiculous then probably something is wrong with the purpose and interface of that class. Abbreviating the class name will not be the solution. Instead, long names should still be used until the interfaces are refactored into something more sensible. Abbreviating the names in such cases would only hide the real problem.
Most scripts are singletons which are defined by their Execute() function and after which other script they will be executed. Therefore I believe there is no extra documentation value by either repeating the entire log message into the class name of these scripts or abbreviating the class name to something too generic to be accurate.
I do not mind people adding extra information to the class name of a database script. I do not mind to skip the duplicated information as well. With respect to the documentation issue I think one can easily test whether more precise class names are needed. Just ask yourself the question: "Do I really miss the class names of the scripts to understand what will happen?". Up till now my answer is that I do not miss them at all. When Marlies first introduced this standardized naming I had the same Pavlov reaction as Karsten and Gijs. However, I soon realized I could easily do without the specialized class names.
No, I do not have a clue what Script351.Execute() does. I know enough about TBIVASChangeRealsToDecimalsScript.Execute() its effect though. Without going into the source or the DatabaseScriptManager.
We do not need class/function/variable names either since in the end the bytecode generated defines them anyway. Code should document itself as best as possible as it is intended to be read by humans. And thus I repeat myself: IDs document nothing and thus have no place in class/function/variable names.
It took me a minute to discover the function of TScript351. A name describing that change for the script could be TAddLoadCapacityToTripsTableAndInitializeFromReizenbestandTableScript. Of course you now know exactly what the script does but this is a repetition of both the log message and the contents for the Execute() function.
I think the key sentence in your reply is "without going into the source of the TSQLDatabaseScriptManager". TScript351 only exists to be used at that specific location in the TSQLDatabaseScriptManager. From there I easily end up at the Execute() function to see its implementation. I, myself, have never been in the situation where I needed a descriptive class name for a script.
I feel that the word "Script" definitely should be used in the name.
I am not sure we should use the action in the name of the script. Of course it feels good to indicate the action that will be taken by the script, however, my experience is that it is very hard to find a name that is both unique and reasonably short, while clearly indicating what the script does. Usually, the names will turn out too generic or become long sentences. Both are not very useful. Finally, scripts are used only once so there does not seem to be much value in supplying a readable name to make reuse possible. The log message and the contents of Execute() should be enough to know what is going on. However, if a script is generic and used as a base class for an other script I certainly feel it should be named well.
One thing to think about is that scripts will be assigned UUIDs in the future. However, I think it is still very possible to use these IDs in the name of the script.
