Anti-documentation in Perldoc « Laufeyjarson writes…

Found a section that annoyed me in perlsyn today, in the documentation for the foreach loop:

Here’s how a C programmer might code up a particular algorithm in Perl:

for (my $i = 0; $i < @ary1; $i++) {
    for (my $j = 0; $j < @ary2; $j++) {
        if ($ary1[$i] > $ary2[$j]) {
            last; # can't go to outer :-(
        }
        $ary1[$i] += $ary2[$j];
    }
    # this is where that last takes me
}

Whereas here’s how a Perl programmer more comfortable with the idiom might do it:

OUTER: for my $wid (@ary1) {
INNER:     for my $jet (@ary2) {
               next OUTER if $wid > $jet;
               $wid += $jet;
           }
        }

See how much easier this is? It’s cleaner, safer, and faster. It’s cleaner because it’s less noisy. It’s safer because if code gets added between the inner and outer loops later on, the new code won’t be accidentally executed. The next explicitly iterates the other loop rather than merely terminating the inner one. And it’s faster because Perl executes a foreach statement more rapidly than it would the equivalent for loop.

First, the smug, “See how much easier this is?” makes me want to slap someone. No, I don’t. I see equally bad code. One has a goto in the middle, and one uses awkward loops. Both have no documentation, no explanation of what they’re doing, and indecipherable yet different variable names. I have little to no confidence these snippets do the same thing because I cannot tell what either of them is supposed to do.

The concerns about why the code is bad seem to be vague as well. “Cleaner because it’s less noisy.” Really? What does “noisy” mean? I find the labels to be a fairly unusual construct that increases my cognitive load on reading the code pretty significantly. Is that noisy? I can almost see the argument for ‘safer’, but ‘safer’ code that’s unreadable isn’t a good trade-off in my mind.

If I’ve learned anything about Perl, I’ve learned that comments about speed are often wrong. I think this is actually correct, but I don’t think it matters. This isn’t an advantage of for or foreach, but is a comment about choosing a better algorithm. That really shouldn’t be a surprise, nor should it be part of the discussion of the for loop!

Worst, this makes me feel stupid. I don’t see how this is “easier”, and I can’t read it at all. The text, by assuming that I can and that this is crystal clear to every reader, seems to say to me, “You aren’t smart enough to write Perl. Go away.”

This section of perldoc has been unhelpful since Perl 5.8.8 – over six years – and damaging people’s opinion of Perl that entire time. Why do we keep writing things like this?

This entry was posted on Saturday, July 28th, 2012 at 9:15 am and is filed under Perl.

6 Responses to “Anti-documentation in Perldoc”

Michael Roberts says:

July 28, 2012 at 11:42 am

Not to take away from your central point that both examples are pretty crappy, I think the *main* point the original author wanted to make is that if you use foreach VAR (LIST) instead of stepping through with a counter variable, your Perl will be shorter and more natural.

Tossing the outer/inner labels in was a very poor choice, as it entirely obscures that point.

- Laufeyjarson says:
  
  July 29, 2012 at 1:14 pm
  
  This tells me that my central point was lost.
  
  The central point was, “Isn’t that easy?” makes me feel stupid, because it isn’t easy or clear, and it isn’t explained, so I’m just left feeling stupid.
  
  Why would anyone use a language where the core documentation says “Too bad you’re too stupid!” between the lines?
  
  Clearly, I digressed too far with the other issues in that part of the documentation.
  
- Christopher Cashell says:
  
  July 29, 2012 at 9:26 pm
  
  I would only partially agree. I think the author is making two points in this example:
  
  1. Perl’s “foreach” is simpler (cleaner) and faster than the traditional “for” loop that is typically used in other languages. It’s simpler because it will automatically iterate through each element in the list without the need for you to manually handle the array index variable and its initialization, check, and update. It’s faster because in Perl, array indexing notation is slower than the variable aliasing used with “foreach”.
  
  2. When you have nested loops, a language like C has very real limitations on how you can control those loops. You are limited to controlling the loop you are in, and even in that you are more limited than in Perl. With Perl, you can use loop labels and be very explicit about what your control action is doing. In the Perl code, you can see at a glance that if the condition is met, the “next” is taking you to the next iteration of the OUTER loop. In the C code, you have to walk back through the code to figure out where the last applies and where that statement leaves the processing.
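  
  The aliasing mentioned in point 1 can be illustrated with a minimal sketch. The @prices and @indexed arrays are made-up sample data, not taken from perlsyn; the point is only that the foreach loop variable refers to the array element itself, so no index bookkeeping is needed to modify it.

  ```perl
  use strict;
  use warnings;

  # Hypothetical sample data for illustration only.
  my @prices = (10, 20, 30);

  # $price is an alias for each element of @prices in turn,
  # so assigning to it changes the array itself.
  for my $price (@prices) {
      $price *= 2;
  }

  # The index-based equivalent has to manage $i by hand.
  my @indexed = (10, 20, 30);
  for (my $i = 0; $i < @indexed; $i++) {
      $indexed[$i] *= 2;
  }

  print "@prices\n";    # prints "20 40 60"
  print "@indexed\n";   # prints "20 40 60"
  ```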
  
  I have some familiarity with C and with Perl, so that might be why I found this documentation snippet fairly understandable, and saw it as fairly easy and self-explanatory. However, I can see how someone with less experience in either might have issues with it.
  
  I think the documentation would probably be more clear and readable if the two points mentioned above were split out and shown separately. A “here’s why Perl’s ‘foreach’ style looping is generally better than traditional ‘for’ style looping, and in the next example, here’s how Perl’s loop labels can make nested loops more flexible and the loop control more understandable”.
  
Anonymous says:

July 28, 2012 at 11:21 pm

Where’s your bug report and documentation patch?

- Laufeyjarson says:
  
  July 29, 2012 at 1:15 pm
  
  Since I don’t know what it’s trying to say, it’s difficult to write a patch, isn’t it?
  
Steve Throckmorton says:

July 29, 2012 at 6:56 am

I have no actual knowledge of who wrote these bits of code and commentary, but this being the internet, that won’t stop me from guessing. Reading the tea leaves, I would say that they most likely sprang from the mind of Tom Christiansen. It’s certain that they are included with minor changes in the 4th edition of the Camel book with no attribution, giving the strong impression that one or more of the team of Christiansen, foy, Orwant, & Wall is responsible for writing them. That makes the recommended code pretty much the definition of idiomatic Perl. (Though you may care to know that in the book the example is followed by this Perlish comment: “But write it however you like. TMTOWTDI.”)

Of course, appeals to authority prove nothing, but they may serve to open a mind to new possibilities if the authority is respected. Have you read the Camel book? It’s a pretty respectable piece of work.

###

To explain why the “comfortable with the idiom” code is preferred, it may help to start with a plain language explanation of the job at hand. Something like this: Compare every element of @ary1 (in array order) to every element of @ary2 (also in array order). For each comparison, if the @ary1 element is less than or equal to the @ary2 element, add the value of the @ary2 element to the @ary1 element. Otherwise, stop processing that @ary1 element immediately and begin processing the subsequent @ary1 element.

Now compare each of the Perl versions to the plain language description of the task.

The C-like version reads approximately as follows. For every integer $i from 0 up to (but not including) the length of @ary1 do this: for every integer $j from 0 up to (but not including) the length of @ary2 do this: compare the element of @ary1 at position $i to the element of @ary2 at position $j. If the @ary1 element is the greater, then processing of @ary2 is complete; otherwise, increment the @ary1 element by the value of the @ary2 element; then increment $j, check to make sure it’s still in range, and if it is, use it to get at the next @ary2 element for comparison. When the processing of @ary2 is complete (either because we found an element of @ary1 that was greater than an element of @ary2, or because we ran out of @ary2 elements), increment $i, check to make sure it’s still in range, and if it is, use it to get at the next element of @ary1 for processing.

The idiomatic Perl reads something like the following. Your OUTER loop says “for every element $wid in @ary1 do your INNER loop.” Your INNER loop says: for every element $jet in @ary2, compare the current $wid with the current $jet, and if $wid is greater, then jump out of INNER and begin the next iteration of OUTER. Otherwise, increment element $wid by the value of $jet.
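
One way to check that the two readings describe the same job is to run both versions on the same input. This is only a sketch: the sample values and the array names @c_style and @idiomatic are assumptions made for illustration, while the loop bodies follow the two perlsyn snippets quoted in the post.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from perlsyn.
my @ary2 = (3, 5, 20, 40);

# C-like version: index counters, and a bare `last` that
# only leaves the inner loop.
my @c_style = (1, 4, 10);
for (my $i = 0; $i < @c_style; $i++) {
    for (my $j = 0; $j < @ary2; $j++) {
        last if $c_style[$i] > $ary2[$j];
        $c_style[$i] += $ary2[$j];
    }
}

# foreach version: $wid aliases each element, and `next OUTER`
# names the loop it moves on to.
my @idiomatic = (1, 4, 10);
OUTER: for my $wid (@idiomatic) {
    for my $jet (@ary2) {
        next OUTER if $wid > $jet;
        $wid += $jet;
    }
}

print "C-like:    @c_style\n";     # prints "C-like:    69 4 10"
print "idiomatic: @idiomatic\n";   # prints "idiomatic: 69 4 10"
```

With this input the first element absorbs every value of @ary2 (1, then 4, 9, 29, 69), while 4 and 10 are already greater than 3 and are left untouched. Both versions agree here only because nothing follows the inner loop in the C-like version.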

###

When perlsyn talks about noisy code, it’s talking about the cognitive load of interpreting all the stuff about $i++ and $j<@ary2 in terms of the task at hand. The "foreach" version eliminates all that and expresses the routine more succinctly than plain language can (as one would hope a language for programming computers would do).

I suppose that at first reading the LABELs and "next" can be off-putting, but they're really very expressive and useful. The "next" operator is not a "goto" for at least two reasons. First, because it reads very much like English: "if this is true, then jump to the next iteration of OUTER." Second, because it says to the reader that this isn't a jump to some random point in the program–it just jumps to the next iteration of the named loop construct.
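
The practical difference between a bare "last" and "next OUTER" shows up once code is added after the inner loop, which is the situation perlsyn's "safer" remark describes. A minimal sketch, with made-up data and the hypothetical arrays @after_last and @after_next recording which elements reach that trailing code:

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration only.
my @ary1 = (1, 4);
my @ary2 = (3, 5);

# With a bare `last`, control lands just after the inner loop,
# so the trailing code runs for every element of @ary1.
my @after_last;
for my $wid (@ary1) {
    for my $jet (@ary2) {
        last if $wid > $jet;
    }
    push @after_last, $wid;
}

# With `next OUTER`, the trailing code is skipped whenever the
# condition fires; it runs only if the inner loop finishes.
my @after_next;
OUTER: for my $wid (@ary1) {
    for my $jet (@ary2) {
        next OUTER if $wid > $jet;
    }
    push @after_next, $wid;
}

print "after last: @after_last\n";   # prints "after last: 1 4"
print "after next: @after_next\n";   # prints "after next: 1"
```

The label can only name an enclosing loop, which is what keeps this from being a jump to an arbitrary point in the program.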

I've tried to be of use here, and I'm sorry if this is no help. If you still dislike both of the demonstrated ways to solve this problem, perhaps you could put together a solution you do find readable and post that.

Steve T.
