Perl Programming Standards
"There's more than one way to do it". So says the motto on the cover of "The Perl Bible", also known as "The Camel Book" - "Programming Perl" by Larry Wall, Tom Christiansen and Jon Orwant. This is both a blessing and a curse - a blessing because you can choose the best way of doing something, and a curse because you can write code that's so obfuscated, so unusual, that it's difficult or impractical to maintain.
Do you want to write code that's cost effective to develop, robust, easy to test, maintain and (later) upgrade, is good for the user, and contains a structure that allows for re-use where appropriate? Good - then it is important (especially with Perl, given the flexibility) to work with some programming standards to help you achieve these ends.
Perl is a language that trusts the programmer to know what he/she is doing - to be used "by consenting adults" if you like. That's as opposed to a "Nanny" language such as Java, where everything must be declared and is checked and double checked by the compiler to ensure it's exactly right. It follows on from this philosophy that a standard should be more a set of guidelines and less a set of rules; with each guideline, ask yourself "why is this being suggested" and if you can find good contrary reason in your particular case, perhaps you've found an instance where the guideline may be broken.
The guidelines will differ from programming team to programming team - a team that's writing major applications thousands of lines long, and whose members are all highly trained in the nuances of Perl, will have a different ideal from a group of people who do some occasional Perl coding as part of a much wider-ranging job and write code that is typically a few lines of "glueware".
Guidelines for Perl
- Don't re-invent the wheel - if the right wheel isn't available elsewhere, adapt one to make it right. Is there a standard module, a CPAN module, something internal in your organisation or something you yourself have previously written that does the job or gets you off to a flying start?
And on that basis, these guidelines rely heavily on the perldoc manual page online at:
and also suggestions by others (such as the BBC) on how it should be modified for their particular use at:
- GOLDEN RULE 1 Consider WHY you're writing the code in the first place - design first. At the very least, produce a use-case diagram on a piece of paper. If you're coding a web application, also produce a state diagram showing the flow of the user (and administrator if you have one) through the site.
- GOLDEN RULE 2 Consider all parties in coding and decisions you make. The user (who will usually be the one who makes the biggest investment in your code), the code maintainer, the code tester, as well as yourself. You may want to adjust your coding standards to suit the testers and maintainers - if they're Perl Geeks, you can use a wider range of slightly more obscure constructs than you would if they're just occasional Perlers.
- The Guidelines are just that - and each is there for a reason. If you understand the reason and feel that it's honestly not applicable, you should be free not to follow it - "at your own risk".
Things that affect the user
Items in this section affect the user and should be given the greatest of attention.
The naming of blocks of code (which can be called by another programmer) and good user documentation are MUCH more important than the internal naming of variables, which will be largely hidden ("encapsulated") once the code is released.
- GOLDEN RULE 4 Think about reusability. Why waste brainpower on a one-shot when you might want to do something like it again? Consider generalizing your code. Consider writing a module or object class. Consider placing your code in a central library. Make your code sharable by using h2xs to create the modules.
- GOLDEN RULE 5 Consider making your code run cleanly with use strict and use warnings (or -w) in effect.
- For portability, when using features that may not be implemented on every machine, test the construct in an eval to see if it fails. If you know in which version or patchlevel a particular feature was implemented, you can test $] ($PERL_VERSION in English) to see if it will be there. The Config module will also let you interrogate values determined by the Configure program when Perl was installed. Alternatively, or in addition, use a require statement giving the minimum version needed.
- Package naming. Perl informally reserves lowercase module names for "pragma" modules like integer and strict. Other modules should begin with a capital letter and use mixed case, but probably without underscores due to limitations in primitive file systems' representations of module names as files that must fit into a few sparse bytes.
- You SHOULD use subroutines wherever possible, and each SHOULD be no longer than 100 lines. A subroutine longer than that SHOULD be refactored, as this is likely to make it clearer and easier to maintain.
- One function does one thing. For example, a function in a module should never print to standard output. Instead, it should return text which the calling script can print when it wants.
- No global variables.
- Be consistent. Be especially consistent with usability.
- GOLDEN RULE 6 Provide good comments AND good user documentation.
- Open source. Consider giving away your code.
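The strictness, portability and "one function does one thing" guidelines above can be combined in a short sketch. The subroutine name format_report and the choice of Time::HiRes as the optional feature are assumptions made purely for illustration; they are not part of the guidelines themselves.

```perl
#!/usr/bin/perl
use strict;
use warnings;

require 5.006;          # refuse to run on a Perl older than 5.6
use Config;             # values recorded by Configure at build time

# Test an optional feature inside an eval rather than assuming it exists.
# Time::HiRes is only an illustrative choice of optional module.
my $has_hires = eval { require Time::HiRes; 1 } ? 1 : 0;

# One function does one thing: build the text and return it.
# The caller decides when (and whether) to print it.
sub format_report {
    my ($perl_version, $os_name, $hires) = @_;
    my $timer = $hires ? "high resolution" : "whole seconds";
    return "Perl $perl_version on $os_name, timer: $timer\n";
}

my $report = format_report($], $Config{osname}, $has_hires);
print $report;
```

Because format_report returns its text, a test script can check the result without having to capture standard output.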
Within the code
This is the third of three sections of this standard document. You should consider the generalities and how your code affects others before you consider the following code-level issues:
You SHOULD write code that is easy to read and understand. Some considerations:
- Block structure and layout
- Closing curly brace on a multiline block should line up with the opening keyword
- 4-column indent
- Opening curly on same line as keyword, if possible, otherwise line up
- Space before the opening curly of a multi-line BLOCK
- One-line BLOCK may be put on one line, including curlies
- No space before the semicolon
- Semicolon omitted in "short" one-line BLOCK
- Space around most operators
- Space around a "complex" subscript (inside brackets)
- Blank lines between chunks that do different things
- Uncuddled elses
- No space between function name and its opening parenthesis
- Space after each comma
- Long lines broken after an operator (except "and" and "or")
- Space after last parenthesis matching on current line
- Omit redundant punctuation as long as clarity doesn't suffer
- Line up corresponding things vertically, especially if it'd be too long to fit on one line anyway.
While you can choose your own rules about bracing style, tab width, spaces and so forth, for new files, you MUST be consistent in applying the style to your code. When modifying existing code that has a well-ordered layout, you MUST use the same standard in your modification.
In the interest of portability, lines MUST be no more than 80 characters long.
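As a rough illustration of several of the layout points above (4-column indent, uncuddled else, vertical alignment, one-line BLOCK, lines under 80 characters), here is a minimal sketch. The subroutine classify and the %limit data are hypothetical and exist only to give the layout something to act on.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a value against a limit.
sub classify {
    my ($value, $limit) = @_;

    return "none" unless defined $value;

    if ($value > $limit) {          # opening curly on the keyword line
        return "high";
    }
    else {                          # uncuddled else
        return "ok";
    }                               # closing curly under the keyword
}

my %limit = (
    cpu    => 90,                   # corresponding items lined up
    memory => 75,
);

# One-line BLOCK kept on one line, semicolon omitted inside it
my @labels = map { classify($_, $limit{cpu}) } (50, 95);
print join(", ", @labels), "\n";
```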
- Write statements in the order that is most readable:
open(FOO,$foo) || die "Can't open $foo: $!";
is better than
die "Can't open $foo: $!" unless open(FOO,$foo);
because the second form hides the main point of the statement in a modifier.
- Don't omit brackets and operands and assume the default where it compromises clarity:
return print reverse sort num values %array;
could be more readably written as
return print(reverse(sort num (values(%array))));
- Use the last, next and redo operators within a loop in order to avoid the obfuscated code that would otherwise be necessary to test the flow of a loop only upon the completion of each iteration. Use loop labels where they help make the code more readable.
- Avoid using grep() (or map()) or `backticks` in a void context, that is, when you just throw away their return values. Those functions all have return values, so use them. Otherwise use a foreach() loop or the system() function instead.
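The two points above about loop control and void context might look like this in practice. It is a sketch only: the @lines data and the LINE label are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical input: config lines, some blank or commented out.
my @lines = ("# settings", "", "host=alpha", "port=80", "__END__", "x=1");

my %setting;
LINE:
foreach my $line (@lines) {
    next LINE if $line =~ /^\s*(#|$)/;   # skip comments and blank lines
    last LINE if $line eq "__END__";     # leave the loop from the middle
    my ($key, $value) = split /=/, $line, 2;
    $setting{$key} = $value;
}

# grep is used for its return value here, not in void context
my @short_keys = grep { length($_) <= 4 } sort keys %setting;
print "short keys: @short_keys\n";
```

Without next and last, the same loop would need nested if blocks or a flag variable tested at the end of each pass.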
- Variable naming. Choose mnemonic identifiers. If you can't remember what mnemonic means, you've got a problem.
While short identifiers like $gotit are probably ok, use underscores to separate words. It is generally easier to read $var_names_like_this than $VarNamesLikeThis, especially for non-native speakers of English. It's also a simple rule that works consistently with VAR_NAMES_LIKE_THIS.
The names of your variables and subroutines are a good way to communicate the meaning of your code to other developers who have to read and maintain it. Therefore all identifiers SHOULD be descriptive.
Use the case of variable names to show their scope:
Constants MUST be written in $ALL_CAPS with underscores to separate words.
Global/package scoped variables MUST begin with an upper-case letter, and use either underscores or studly caps (BiCapitalisation) to separate words (for example, $CustomerDBUser or $Customer_db_user are permitted).
Locally scoped variables (my() or local() variables) MUST begin with a lower case letter, and use either underscores or studly caps to separate words (for example, $indexFileName or $index_file_name are permitted).
Function and method names adhere to the same standard as local variables: they MUST begin with a lower case letter. You SHOULD also separate words with underscores or studly caps. Function names beginning with an underscore are considered private, and SHOULD NOT be called outside of the package in which they are defined.
- You may find it helpful to use letter case to indicate the scope or nature of a variable. For example:
$ALL_CAPS_HERE    constants only (beware clashes with perl vars!)
$Some_Caps_Here   package-wide global/static
$no_caps_here     function scope my() or local() variables
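A small sketch of these case conventions in use follows. The package name Inventory and all identifiers in it are hypothetical, and the constant is declared with the constant pragma rather than as a scalar, which is one possible reading of the "all caps" rule rather than the only one.

```perl
package Inventory;          # hypothetical package name, mixed case
use strict;
use warnings;

use constant MAX_ITEMS => 3;        # constant: all caps
our $Item_Count = 0;                # package-wide: leading capital

sub add_item {                      # public: lower case, underscores
    my ($item_name) = @_;           # local to the sub: no capitals
    return 0 unless _has_room();
    $Item_Count++;
    return 1;
}

sub _has_room {                     # leading underscore: private
    return $Item_Count < MAX_ITEMS;
}

package main;

Inventory::add_item($_) foreach qw(bolt nut washer spring);
print "stored $Inventory::Item_Count items\n";
```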
- Regular Expressions. If you have a really hairy regular expression, use the /x modifier and put in some whitespace to make it look a little less like line noise. Don't use slash as a delimiter when your regexp has slashes or backslashes.
You SHOULD also put line breaks and comments in regular expressions, to make them even more comprehensible.
Two shorter regular expressions are often more readable than one long one - and faster too.
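For instance, a pattern written with the /x modifier can carry its own layout and comments. The log line format matched here is an assumption made up for the example, and braces are used as the delimiter so that slashes would not need escaping.

```perl
use strict;
use warnings;

# Hypothetical log line format: "2006-06-14 ERROR 743 too many fields"
my $log_line = qr{
    ^ (\d{4}) - (\d{2}) - (\d{2})   # date as year, month, day
    \s+ ([A-Z]+)                    # severity, such as ERROR
    \s+ (\d+)                       # numeric error code
    \s+ (.*) $                      # free-text message
}x;

my $line = "2006-06-14 ERROR 743 too many fields";
if (my ($year, $month, $day, $level, $code, $text) = $line =~ $log_line) {
    print "$level $code on $day/$month/$year: $text\n";
}
```

Written on one line without /x, the same pattern is considerably harder to check by eye.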
- Use the new "and" and "or" operators to avoid having to parenthesize list operators so much, and to reduce the incidence of punctuation operators like && and ||. Call your subroutines as if they were functions or list operators to avoid excessive ampersands and parentheses.
- Use here documents instead of repeated print() statements. And if you find yourself cutting and pasting code (repeating things) that should be a really big clue that you should be using a sub!
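As a sketch of the here document advice, the following builds a block of text in one place inside a sub instead of a run of separate print() statements. The sub name usage_text and the option list are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sub: one place to build the text, instead of pasted prints
sub usage_text {
    my ($program, $version) = @_;
    return <<"END_USAGE";
$program version $version
Usage: $program [options] file ...
  -v    verbose output
  -h    show this help
END_USAGE
}

print usage_text("report", "1.0.1");
```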
- Always check the return codes of system calls. Good error messages should go to STDERR, include which program caused the problem, what the failed system call and arguments were, and (VERY IMPORTANT) should contain the standard system error message for what went wrong. Here's a simple but sufficient example:
opendir(D, $dir) or die "can't opendir $dir: $!";
Number your error messages (to help on reports to your support service, where a number can be asked for and quoted) and use a consistent format.
Error messages that relate to the user's data or environment should include as much information as possible to help the user locate the problem.
print STDERR "Error 743 - too many fields in line\nError is in file $filename at line $lineno: $line";
is much better than
print STDERR "Data Error";
- Commenting standards:
- Purpose, date, release, author, support contact, license terms and copyright
- One comment for every major block of 10 to 30 lines, highlighted by white space
- Major comment at the start of subs
- Comment whenever you write something "clever" in the code
- Provide a sample of the input data as comments
- English vs. short names, and the scope of $_. I personally recommend against the use of the English module as it can have a major detrimental effect on the operation speed of regular expressions. Better to use the short form variable names and comment those which aren't used day to day in your organisation.
$_ (which is the basis of topicalization in Perl 6) should be used within short areas of code, but do:
- Add a comment where you make a rare use of $_ as default
- Feel free to specify $_ explicitly.
- Avoid setting $_ and assuming it's still set many lines later; it's best in smaller scopes / areas of code
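The points above about $_ could be applied along these lines. The word list is invented for the example, and the split between "short" and "longer" code is a judgment call rather than a fixed rule.

```perl
use strict;
use warnings;

my @words = qw(camel llama alpaca);

# $_ as the implicit topic, kept to a single short statement
my @upper = map { uc } @words;          # uc defaults to $_

# Longer body: name the loop variable rather than rely on $_
my $total_length = 0;
foreach my $word (@words) {
    $total_length += length $word;
}

# Explicit $_ is fine when it makes the default visible
print "$_\n" foreach @upper;
print "total length: $total_length\n";
```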
- Rare structures like do, unless, until ... don't use these without good cause; no prizes are offered to the programmer who manages to get every possible construct into a hundred lines of code ;-)
This document is eclectic - collected from many sources and so each of the paragraphs differs in style and there is some overlap. Please feel free to copy and paste suggestions for your own standards document that meets your needs; a link back to here would be appreciated, and if you like our standards why not send your new staff on our Perl Courses so that they learn not only the language, but also the good application of the language.
Standard version no. 1.0.1, 14th June 2006