Most people who use Perl seriously have seen Damian Conway’s book Perl Best Practices. It was, and is, an important book and has a lot of interesting discussions in it.
I don’t find all the suggestions in PBP to be useful, though. Since I’ve just been thinking about them, I thought I’d write them up. Why not go through them all?
At $WORK, I’m currently on a team that’s using all the things in that book which can be handled automatically via the Perl::Critic toolset. That tool has several settings; we’re using one called “brutal”. We are allowed exceptions to that rule, but you have to comment them with a reasonable reason, and the code reviewer can push back on that reason. None of us have been too pushy on that, but it keeps us honest.
On new code, the strictest settings are achievable, and many have benefits. Having the team working all together and having the same expectations has a lot of benefits.
I’m not going to reiterate the practices in detail; they’re in the book. I’m going to post my observations and opinions. That’s all these are; my opinions. I don’t intend to start giant flame wars or heated arguments, nor am I saying anyone who disagrees with me is some kind of idiot. They are no kind of idiot; we merely disagree on a minor technical point.
I’ll begin with the Preface.
The Preface introduces the concept of the book, and encourages you as the reader to think about the suggestions carefully, even if they sound awkward at first. It also argues that, even if you don’t agree with every one of the suggestions, or even any of them, examining and considering them makes you a better engineer and helps you write better code.
I agree with this wholeheartedly and enthusiastically. This may be one of the most missed points in the book: think about and decide what you do, instead of falling into old habits or copying and pasting what you see in front of you. Knowing why you do something helps you understand what you are doing at every level.
At its heart, this is what the book is about, and I think it is a really important thing for Mr. Conway to have said. I found it enlightening when I first read it in 2005, and still find it useful today.
The weakness of the book is that many people don’t do this. They simply say, ‘That’s what it says in PBP!’ and do things mindlessly. They cling to tools and dogma too closely and don’t stop and consider what they’re doing or why. “It said so in The Book!” is not a good reason to do things, at least in my opinion.
There is, of course, nothing Mr. Conway, or Mr. Thalhammer, who wrote Perl::Critic, can do about this. This is human nature; some people want to be told The Right Way and will cling to it no matter what.
Chapter 1: Best Practices
This chapter goes into more detail about why style and consistency matter, and about ways to improve style in order to improve programs as a whole. This was the first place in the book I found myself agreeing with much of what Mr. Conway had to say, but disagreeing with many of his concrete examples.
For example, the book suggests that “appending _ref to the name of every variable that stores a reference makes it harder to accidentally write $array_ref[$n] instead of $array_ref->[$n], because anything except an arrow after _ref will soon come to look wrong.” I agree with the sentiment of what he’s saying there; habit helps you get things right. The example, however, I find weak.
I worked in a time when many C/C++ shops put the variable type in the name, including whether it was a pointer, and what kind of pointer. Code like this was common:
char far * lpszDirectoryName = "c:\\foo";
This was useful because you had to be able to tell what you were dealing with: editors were primitive, and the compiler was more so. As tools improved, and we could easily hover over something and see what type it was, and as the compiler got better at identifying pointer mismatches automatically, it became burdensome.
It also became wrong: soon all pointers were far pointers, and then we simply didn’t care any more, yet our variable names still had this soup of unpronounceable cruft at the start. Newer engineers actually didn’t know what the prefixes meant.
I don’t think there’s a reason to use _ref: if you have a reference, $array[$n] will fail very quickly, and you’ll fix it. The compiler can tell; you don’t have to burden yourself with looking at it every time you see the variable. I think it’s much more important to know what’s in that array than that it is a reference to an array.
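To make the "fails very quickly" point concrete, here is a minimal sketch. It assumes strict is enabled and that no array named @array_ref happens to be in scope; if one is, the typo compiles silently. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $array_ref = [ 'flour', 'sugar', 'eggs' ];

# Correct: dereference the scalar with an arrow.
my $second = $array_ref->[1];

# The typo: $array_ref[1] means "element 1 of @array_ref", and no such
# array was declared. Compile it in a string eval so this script can
# capture the compile-time error instead of dying.
my $ok = eval q{
    use strict;
    my $array_ref = [ 'flour', 'sugar', 'eggs' ];
    my $oops = $array_ref[1];
    1;
};
my $error = $ok ? '' : $@;

print "second element: $second\n";
print "strict rejected the typo: $error" if !$ok;
```

The error comes from the compiler rather than from a test run, which is why the mistake tends to be caught before the code ever executes.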
The advice I would offer instead is never to use arrays or hashes. Always use references. Then any reference is always with the -> notation, and you are always prepared to pass them to functions.
So: Good idea, not an example I agreed with.
The section on changing habits being hard is very well thought out, and I agree with most of it. Except the examples, but they’re not the point.
I’ll dig into a Practice next. Probably one a day, to keep the posts shorter, as this one’s gotten kind of long.
I actually used hashes for the reason of not accidentally passing them, or accidentally overwriting a reference. I once overwrote an object with a hash, and didn’t know because of how my test was written. So basically the test and code were both broken, and I didn’t know until it didn’t work in manual testing. It could have been avoided by not having the same dereferencing syntax for both.
I think if you’re going to use a suffix like “_ref” you should probably go all the way and make it “_aref” or “_href”. Myself, I don’t do this in general, but only when I think it improves readability in some way.
My take is to use arrays and hashes with the @ and % sigils when they’re used locally: otherwise you’re throwing away a useful feature of Perl (and the syntax highlighting of emacs cperl-mode does better with varied sigils), but if you need to pass the structures around, you should almost always use references.
If you use refs gratuitously, you may require closer reading to make sure there’s no action-at-a-distance going on: the @ and % sigils make local scope more apparent.
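A small sketch of the action-at-a-distance concern, using two hypothetical helper functions invented for this illustration: one receives a reference and changes the caller's array, the other receives a copy and cannot.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
sub add_salt_by_ref {
    my ($ingredients) = @_;          # points at the caller's array
    push @{$ingredients}, 'salt';
    return scalar @{$ingredients};
}

sub add_salt_to_copy {
    my @ingredients = @_;            # a local copy of the caller's list
    push @ingredients, 'salt';
    return scalar @ingredients;
}

my @list = ( 'flour', 'sugar' );

add_salt_to_copy(@list);
my $after_copy = scalar @list;       # caller's array is unchanged

add_salt_by_ref( \@list );
my $after_ref = scalar @list;        # caller's array grew by one

print "after copy: $after_copy, after ref: $after_ref\n";
```

With the @ sigil in the signature-style copy, a reader can see at a glance that the changes stay local; with the reference, they have to read the body to find out.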
Personally, I like @ and %, because they make it obvious what I’m manipulating.
For argument passing, in some projects I use Data::Alias:
sub foo {
    alias my @ingredients = @{ shift() };
    …
}
foo(\@list);
I have snippets in Vim that enter this line for a list or for a hash.
For personal projects at the moment, I’m enjoying Kavorka’s ref_alias:
fun foo (@ingredients is ref_alias) { … }
foo(\@list);
If I were always using references like you propose, I think I would indicate the nature of the variable through its name. For example:
– arrays are plurals: $ingredients->[0];
– hashes are singular with a preposition: $ingredient_of->{$recipe}
Or I might try this:
– $a_ingredients->[0];
– $h_ingredients->{$recipe}
But maybe it’s just because I’m used to looking for the type where the sigil is.
I certainly don’t think there’s any problem with using the sigils to help keep track of what is in the variable. The issue I have with that is that you can’t tell by the sigil what something is, because it changes. The sigil tells you how you’re handling it right now. The sigil and the notation after the variable together are how you actually know what it is.
my @odd = qw/ one three five seven /;
my $three = $odd[1];
my @numbers = @odd[1,2];
All valid Perl.
So, since I never found the sigil that reliable, I’ve come to pretty much ignore them. The variable’s type will be enforced by the interpreter anyway; if I call keys on a variable, it’ll complain.
I care more that this is the list of odd numbers, the table of zip codes, or whatever it is. The variable type is important, but not meaningful; the interpreter can always let me know what it is if I don’t recall.
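As a rough sketch of what "the interpreter can let me know" might look like for references: ref() reports the kind of thing a reference points at, and under strict a dereference of the wrong kind dies at run time. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $odd      = [ 'one', 'three', 'five', 'seven' ];
my $count_of = { one => 1, three => 3 };

# ref() names the underlying type of a reference.
my $odd_kind   = ref $odd;        # 'ARRAY'
my $count_kind = ref $count_of;   # 'HASH'

# Treating the array reference as a hash fails at run time;
# the block eval captures the error so the script can report it.
my $ok = eval {
    my @names = keys %{$odd};
    1;
};
my $error = $ok ? '' : $@;

print "odd is a reference to: $odd_kind\n";
print "count_of is a reference to: $count_kind\n";
print "wrong dereference: $error" if !$ok;
```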
Clearly, different people read code differently, and depend on different things to make it understandable. The question becomes what is needed to be generally understandable.
I think Perl would be a better language if there were only one value of $foo, and if it contained an arrayref, @foo would dereference it, and if it were a hash ref, %foo would dereference it. (I found Larry Wall’s discussion of it in http://www.perl.com/pub/2001/05/03/wall.html#rfc009 decidedly unconvincing.)
But that’s not the case. In the meantime, I think replacing %foo with %$foo and $foo{x} with $foo->{x} would be really noisy — I hate having to look at, and type, that even where it’s necessary to use references, much less where it isn’t. I think my code has a much higher ratio of arrays and hashes to arrayrefs and hashrefs.
When I first read PBP I decided to use _r instead of _ref, which is shorter. Slightly.
But in this way I’m luckier than most people, and now I can do
use 5.022;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';
\my %hash = function_returning_a_hashref();
and so I’ve started doing that. (I did get bit by the weird current behavior it has around aliasing variables declared in outer scopes, but I’m hopeful that will eventually be fixed.)
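For readers who haven't seen the feature, a minimal sketch of what the alias does, assuming Perl 5.22 or later. The hash %amount_of and the function amounts() are hypothetical stand-ins for function_returning_a_hashref(); the point is that the aliased hash is the same hash, not a copy.

```perl
use 5.022;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

# Hypothetical stand-in for function_returning_a_hashref().
my %amount_of = ( flour => 2 );
sub amounts { return \%amount_of }

\my %hash = amounts();        # %hash is now another name for %amount_of

$hash{sugar} = 1;             # writes through to the original hash
my $count = keys %amount_of;

say "entries in the original: $count";
```

Because refaliasing is still marked experimental, its behavior may change between Perl releases.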
Thanks for giving me your blog address last night! It’s interesting so far.