PBP: 036 Escaped Characters

The PBP suggests using named escape characters instead of hardcoding ASCII values.  It has some reasonable examples, such as this:


$escape_seq = "\127\006\030Z"; # DEL-ACK-CAN-Z

$escape_seq = "\x7f\x06\x22Z"; # Same, but hex

This example proves their point, because it’s wrong.  It’s really easy to misread those numbers, or to be sure you know that DEL is 127 (it is, but in decimal, not in octal!).
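To see exactly where the two versions go astray, here’s a minimal sketch that dumps the code points of each string.  The helper name code_points is mine, not the book’s.  The intended DEL-ACK-CAN-Z sequence would be 7F 06 18 5A.

```perl
use strict;
use warnings;

# The two strings as printed in the example above.
my $octal = "\127\006\030Z";
my $hex   = "\x7f\x06\x22Z";

# Hypothetical helper: format each character's code point as two hex digits.
sub code_points {
    my ($string) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $string;
}

print code_points($octal), "\n";    # 57 06 18 5A
print code_points($hex),   "\n";    # 7F 06 22 5A
```

So each version gets exactly one character wrong: the octal one starts with W (57) instead of DEL, and the hex one has a double quote (22) where CAN (18) should be.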

And why do that, when you can use words?


use charnames qw( :full );

$escape_seq = "\N{DELETE}\N{ACKNOWLEDGE}\N{CANCEL}Z";

Self-documenting, and way better, right?

Actually, I’m of mixed opinion.  Yes, it gets rid of some magic numbers, but it replaces them with magic words.  I’d never have thought “ACKNOWLEDGE” for ACK, even if that’s what it’s been called – it’s just ACK.  And I spelled CANCEL wrong three different times trying to write it out.

So the replacement suggestion has similar pitfalls, and a learning curve.  No engineer I know will know what \N does; they’ll have to look it up.  When the “use charnames” is a hundred lines up at the top of the module, will you know where to look?  Maaaaybe.  And likely your favorite search engine will sort you out pretty quickly.  But you still had to go look.

I’m not strongly against this, I just don’t think it’s that big an improvement.
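To be fair to the named version, it does at least produce the characters it claims to.  A quick check, as a sketch: the $expected string is my own hand-written hex for DEL, ACK, CAN, and Z, not something from the book.

```perl
use strict;
use warnings;
use charnames qw( :full );

# The named version from the book.
my $escape_seq = "\N{DELETE}\N{ACKNOWLEDGE}\N{CANCEL}Z";

# Assumption: the intended code points, written out by hand (DEL, ACK, CAN, Z).
my $expected = "\x7F\x06\x18Z";

print $escape_seq eq $expected
    ? "named escapes match\n"
    : "named escapes differ\n";
```

If you misspell one of the names, Perl complains about an unknown charname instead of silently giving you the wrong byte, which is more than the numeric versions can say.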

I’m also glad to say that this is probably needed a lot less often these days.  Today, if you’re accessing other services and devices, it’s probably via some HTTP-based connection, and we have libraries for that.  In 2005, when the book was printed, HTTP was already fairly common, but direct access via serial and telnet and all sorts of other wacky protocols was more common than it is now.  YAY!

 

3 Responses to “PBP: 036 Escaped Characters”

  1. Andrew says:

    Of course if you use constants (of the variable-constant kind, not the subroutine kind since those don’t interpolate) you do have the chance to write "${DEL}${ACK}${CAN}" which is pretty readable :)

  2. use English has similar problems. Sounds nice in theory, lots of smart people recommend it, and yet I’ve never bothered with it.

    However, you’re going to see \N a lot more now that we’re all living in unicode.

    use 5.10.0;
    binmode STDOUT, ':encoding(utf-8)';
    use charnames ':full';
    say "\N{BLACK CHESS ROOK}\N{MULTIPLICATION X}\N{WHITE CHESS QUEEN}";

  3. Max Lybbert says:

    I know the \N{…} syntax, but largely because of PBP and the fact that I enjoy using one-liners to get Unicode code points that I know the name of. Especially if I want to find out if a font supports that code point. e.g., perl -Mv5.18 -e 'say "\N{BLACK SMILING FACE}\N{WHITE SMILING FACE}\N{PILE OF POO}\N{LATIN SMALL LETTER THORN}\N{SUPERSCRIPT TWO}"'.
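For reference, the variable-style constants suggested in the first comment might look something like this.  It’s only a sketch: the names come from that comment, but the values and the use of plain lexicals are my assumptions, and a module such as Readonly from CPAN could make them genuinely read-only.

```perl
use strict;
use warnings;

# Assumption: plain lexicals stand in for "variable-style" constants.
# Nothing here stops later code from reassigning them.
my $DEL = "\x7F";    # DELETE
my $ACK = "\x06";    # ACKNOWLEDGE
my $CAN = "\x18";    # CANCEL

# Scalars interpolate in double-quoted strings; constant subs would not.
my $escape_seq = "${DEL}${ACK}${CAN}Z";

# Show the resulting code points as two-digit hex.
print join(' ', map { sprintf('%02X', ord($_)) } split //, $escape_seq), "\n";
# 7F 06 18 5A
```

The magic numbers still exist, but each one is written once, next to its name, instead of being buried in the middle of a string.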
