Strictly Speaking about "use strict"
Randal L. Schwartz
In many of my writings about Perl, I give the strong admonition
to place use strict at the beginning of the program. I've
often explained the line with a few short phrases, but I thought
it would be interesting to focus on this one construct in detail
for a change.
The use strict line is a pragma. The purpose of
a pragma is to regionally or globally alter the way the language
is translated for execution. For the strict pragma, we get
three sub-features enabled or disabled within a particular program
scope. The scope extends to the end of the curly-brace-delimited
block in which the pragma appears, or to the end of the file if the
pragma appears outside all blocks. Inner pragma controls override outer controls,
so we can get as specific as needed to process a particular chunk
of code.
The use strict pragma has three aspects: vars, subs,
and refs. Each aspect may be enabled or disabled individually
by explicit name but, most often, all three are enabled at once
with a simple use strict. For example, we can enable all
three aspects initially and disable just the vars aspect
for a portion of the code, like so:
use strict; # all enabled
...
sub marine {
    no strict 'vars'; # disable vars
    ...
}
# all enabled again
The vars aspect is probably the most useful of the three aspects
and is the one most likely to give trouble to a beginner. Scalar,
array, and hash variables are mapped into package and lexical variables
using one of five methods. The vars aspect disables one of
these methods, leaving the remaining four enabled.
For example, the variable $bammbamm might be referring
to a lexical variable named $bammbamm, introduced earlier
in the same scope through the use of the my declaration,
as in:
my $bammbamm = 5;
...
print $bammbamm; # lexical $bammbamm in scope
Or, it might be a package variable declared earlier by use vars
in the same package, such as:
package This::One;
use vars qw($bammbamm);
...
print $bammbamm; # same as $This::One::bammbamm
...
package That::One;
# $bammbamm no longer legal here
The variable name might also be declared through the our declaration,
which associates a simple name with a package variable in the current
package for the remainder of the scope. For example:
package This::One;
sub nominal {
    our $bammbamm; # $bammbamm is $This::One::bammbamm
    ...
    package That::One;
    print $bammbamm; # still prints $This::One::bammbamm
}
# $bammbamm is no longer permitted
Or, if the name contains a package delimiter (double colon), it's
an explicit use of a package variable:
package This::One;
print $This::One::bammbamm; # always permitted
Finally, the variable $bammbamm may be just a package variable
in the current package, if no prior declaration exists:
package This::One;
print $bammbamm; # $This::One::bammbamm;
package That::One;
print $bammbamm; # $That::One::bammbamm;
It is this particular method that is disabled by use strict,
because it can lead to the most errors in larger programs. By default,
any mention of any simple scalar, array, or hash name is simply accepted
as a package variable in that package, even if the name is a typo!
By enabling use strict 'vars', the troublesome automatic
acceptance of any variable name is prevented, forcing you to declare
your variables through one of the other methods. This isn't all
that important on a five-line program, but I have rarely seen any
program stay at only five lines unless it was a one-off task.
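To see the check in action, here is a minimal sketch. The names $total and $totl are invented for illustration, and the snippet goes through a string eval only so that the compile-time failure can be caught and printed instead of stopping the whole program:

```perl
use strict;
use warnings;

# Hypothetical snippet containing a typo: $totl instead of $total.
my $snippet = q{
    my $total = 10;
    $totl = $total + 1;    # never declared, so strict 'vars' rejects it
    1;
};

# A string eval compiles the snippet under the enclosing "use strict".
my $ok    = eval $snippet;
my $error = $@;
print "Rejected at compile time: $error" unless $ok;
```

The message names the offending variable, which is usually enough to spot the typo. In an ordinary program file the same error appears when the file is compiled, before any statement runs.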
The subs aspect of use strict disables the interpretation
of "barewords" as text strings. By default, a Perl identifier (a
sequence of letters, digits, and underscores, not starting with
a digit unless it is completely numeric) that is not otherwise a
built-in keyword or previously seen subroutine definition is treated
as a quoted text string:
@daynames = (sun, mon, tue, wed, thu, fri, sat);
However, this is considered a dangerous practice, because obscure
bugs may result:
@monthnames = (jan, feb, mar, apr, may, jun,
jul, aug, sep, oct, nov, dec);
Can you spot the bug? Yes, the 10th entry is not the string oct,
but rather an invocation of the built-in oct() function, returning
the numeric equivalent of the default $_ treated as an octal
number. And if you wrote this program in April, you might not even
notice that it breaks for six months. I'm not saying that this has
happened to anyone I know, because I believe I'm protected from self-incrimination.
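A short sketch of the failure, assuming a made-up value of "10" in $_ and a shortened list so the result is easy to read. Barewords are deliberately allowed inside the block, and the related warning is silenced:

```perl
use strict;
use warnings;

my @monthnames;
{
    no strict 'subs';          # permit barewords, as in the list above
    no warnings 'reserved';    # silence the "Unquoted string" warnings
    local $_ = "10";           # hypothetical contents of the default variable
    @monthnames = (sep, oct, nov);
}

# The middle entry is the result of oct("10"), not the string "oct".
print join(",", @monthnames), "\n";    # prints sep,8,nov
```

With a different value in $_, the middle entry changes, which is part of what makes this kind of bug hard to track down.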
Although the problem arises mostly from collisions with built-in
words, simply watching for built-ins is insufficient. Suppose we
added a sun function earlier in the same scope:
sub sun { ... }
Now our first day name is also messed up, being a call to the subroutine
instead of the three-character string. But it's not sufficient to
simply scan in the source text for a same-named subroutine. The name
can also be imported from other code by one of the earlier use
directives!
So, the proper method out of this madness is to avoid the use
of "bare" words in most circumstances. This list of day names can
be created easily with qw() instead:
my @daynames = qw(sun mon tue wed thu fri sat);
And now there's no possibility of conflict, because we're using a
quoted string instead of a bareword. The nifty part is that use
strict 'subs' (included as part of use strict) takes care
of enforcing this automatically. Once enabled, barewords will be flagged
while the program is being parsed, before execution even begins.
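As a rough illustration, the sketch below feeds the bareword list to a string eval so the parse-time complaint can be captured and printed. The eval is only a demonstration device; in a normal program the error simply stops compilation:

```perl
use strict;
use warnings;

# The bareword form of the list, compiled under the enclosing "use strict".
my $snippet = q{
    my @daynames = (sun, mon, tue, wed, thu, fri, sat);
    1;
};
my $ok    = eval $snippet;
my $error = $@;
print "Rejected while parsing:\n$error" unless $ok;

# The qw() form compiles cleanly under the same restrictions.
my @daynames = qw(sun mon tue wed thu fri sat);
print scalar(@daynames), " day names\n";
```

Each offending bareword gets its own line in the error text, so a long list of unquoted names is reported in one pass.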
Note that barewords are still permitted in a few specific locations.
For example, the key to a hash can always be specified as a bareword:
my $age = $data{age}; # same as $data{"age"}
The left side of a "fat arrow" is also automatically quoted
if it resembles a bareword:
my %data = (age => 19); # same as ("age", 19)
These two automatic quotings make working with hashes with program-significant
keys easier, presuming the keys you choose are all barewords.
Finally, a bareword that names a pre-declared subroutine is treated
as a subroutine call, even if the definition of the subroutine has
not yet been seen:
sub deeper; # declaration
...
my $result = deeper;
I don't recommend this practice, since it is just as easy (and clearer)
to follow the subroutine call with empty parens:
my $result = deeper(); # no declaration needed
The final aspect of the use strict pragma is the disabling
of soft references (or symbolic references). Normal
references (sometimes called hard references to distinguish
them from soft references) come from an explicit referencing operation:
my $ref = \@foo; # now $ref is a reference to @foo
or from one of the anonymous reference constructors:
my $ref2 = [3, 4, 5]; # array reference created
An auto-vivification will also create a hard reference:
my $ref3; # variable is undef initially
$ref3->[5] = 10; # $ref3 is now an array reference
Following this reference using a dereferencing operation gets us back
to the original data:
print $ref2->[2]; # prints 5, from the anon array
However, the dereferencing operation can also be
performed against a simple scalar string:
my $sref = "happy";
$sref->[3] = "hello"; # symbolic reference
This dereferencing is performed at execution time. Perl looks up the
value to be dereferenced, notes that it is not a hard reference, and
then examines the package variable symbol table for a same-named variable.
Because package variables spring into existence as needed, nearly
any name in $sref will be considered legal, causing new variables
to be created dynamically.
As if that weren't already scary enough, the variable name does
not need to be a standard Perl identifier. Any string will do:
my $sref = "A [variable] {name} !normally! *illegal*";
$$sref = 12;
We now have a scalar package variable in the current
package with a very crazy name.
Because of the likelihood of an accidental symbolic dereference
operation, the use strict 'refs' aspect is recommended for
every program that uses references.
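The following sketch shows the effect, reusing the "happy" string from the earlier example. Unlike the other two aspects, this check happens at run time, so a block eval is enough to catch it. The narrowly scoped no strict 'refs' is one way to permit a deliberate symbolic reference, offered here as an illustration rather than a recommendation:

```perl
use strict;
use warnings;

my $sref = "happy";

# Under strict 'refs', dereferencing a plain string dies at run time.
my $ok    = eval { $sref->[3] = "hello"; 1 };
my $error = $@;
print "Refused at run time: $error" unless $ok;

# A deliberate symbolic reference can still be allowed in a small scope.
{
    no strict 'refs';
    $sref->[3] = "hello";    # stores into the package array @happy
}

our @happy;                  # give the package array a simple name
print "happy[3] is $happy[3]\n";
```

Keeping the no strict 'refs' block as small as possible leaves the rest of the program protected from accidental symbolic dereferences.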
If all three of these restrictions are good, why are they not
enabled by default? The answer is "backward compatibility". Perl
version 4 (last updated more than a decade ago) permitted casual
variable naming (and didn't have any option for lexically declared
variables), didn't have the convenient qw() for defining
lists of short values, and used soft references for indirect subroutine
invocation. Thus, adding use strict by default would have
broken nearly every Perl version 4 program!
But Perl 4 is now long dead. Be sure to use strict in your
modern Perl 5 programs, and you'll get a guaranteed reduction in
development time or double your money back! Until next time, enjoy!
Randal L. Schwartz is a two-decade veteran of the software
industry -- skilled in software design, system administration, security,
technical writing, and training. He has coauthored the "must-have"
standards: Programming Perl, Learning Perl, Learning
Perl for Win32 Systems, and Effective Perl Programming.
He's also a frequent contributor to the Perl newsgroups, and has
moderated comp.lang.perl.announce since its inception. Since 1985,
Randal has owned and operated Stonehenge Consulting Services, Inc.