PBP: 057 Match Variables

Lots of people (not just the Best Practices) tell you never to use the regex match variables.  Listen to them!  The regex match variables $PREMATCH ($`), $MATCH ($&), and $POSTMATCH ($') capture the stuff before the match, the matched stuff, and the stuff after the match.

If they’re used anywhere in the program, the interpreter has a problem: since they’re globals, it can’t tell where they’ll be read, so it begins to update them for every regex match.  Perl programs do a LOT of regex matching, and setting three additional scalars each time will slow your program down.

Don’t do it!

When you use the English module, don’t let it do it either:

<code lang="perl">

use English qw/  -no_match_vars /;

</code>

If you happen to need one of these things, you can use your own capturing parentheses in that regular expression to capture it yourself, without mucking up the speed of the entire program run.  The PBP and many articles online will explain how.
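As a rough sketch of the do-it-yourself approach, the snippet below wraps the interesting part of the pattern in three capture groups so that the text before, inside, and after the match lands in ordinary lexical variables.  The sample string, the pattern, and the variable names are all made up for illustration; adjust them to whatever you are actually matching.

```perl
use strict;
use warnings;

# Hypothetical sample text, invented for this illustration.
my $text = 'The quick brown fox';

# Three explicit capture groups stand in for $`, $&, and $':
#   (.*?)    everything before the match (non-greedy, so the
#            leftmost match wins)
#   (quick)  the part we actually care about
#   (.*)     everything after the match
# In list context a successful match returns the captured groups;
# a failed match returns an empty list.
my ( $before, $matched, $after )
    = $text =~ m/\A (.*?) (quick) (.*) \z/xms;

if ( defined $matched ) {
    print "before:  [$before]\n";     # before:  [The ]
    print "matched: [$matched]\n";    # matched: [quick]
    print "after:   [$after]\n";      # after:   [ brown fox]
}
else {
    print "no match\n";
}
```

Only this one regex pays for the extra captures; nothing here touches the global match variables, so the rest of the program is unaffected.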

(Footnote: The docs for the English module talk about the speed penalty being fixed in 5.20.  Or 5.16, depending on where you look.  I suggest you add the -no_match_vars anyway, because it’s explicit, and lots of people use older perls.)

2 Responses to “PBP: 057 Match Variables”

  1. Like most things in perl, it’s a little more complicated than “just don’t do it”. What if your code is a one-off that does something simple and then the process goes away? It doesn’t make any difference if you slow down all your regexps if you’re only doing one regexp match anyway.

    • Laufeyjarson says:

      I don’t agree with this, although I see the point you’re trying to make.

      My big worry with using these is that, even if I know what I’m doing, someone else probably doesn’t, because they’ve never seen them used before. My one-liner gets turned into an alias, then a script, and then a module, and suddenly, everything is slow. Or it gets pasted into a forum or blog post where the Internet finds it and never lets it die, with poor confused newbies copying and pasting that bad habit off forever. See also: Matt’s Script Archive (http://en.wikipedia.org/wiki/Matt's_Script_Archive).

      My other worry is that this gets me into the habit of doing something dangerous. I then might think, “Eh, it’s okay here too, who cares about performance for this?” and poison a bunch of other people’s work. Bad habits are too easy to start. If I need one of these constructs, I can just as easily build the capture explicitly myself.

      It’s a strange form of action at a distance, and I don’t think it should ever be used. If it were up to me, I’d deprecate it and remove it. This is a wart that should go away. And if we break AWK scripts, maybe it’s time they get updated anyway.
