in reply to perl performance vs egrep

Your regex can be optimized a little. This:
/^(CP|KL|KM|ME|PA|PM|SL|SZ|WX|YZ)XX1/
should be faster than the one you provided.

holli, regexed monk

Re^2: perl performance vs egrep
by Qiang (Friar) on Jan 24, 2005 at 04:05 UTC
    Based on my copy of "Programming Perl", 2nd edition, p. 538, things like

    if /one-hump/ || /two/; is likely to be faster than: if /one-hump|two/;
    so you may be able to speed it up with that sort of regex.

      Sure, that is fairly common advice, but it doesn't apply here. Consider the regex holli posted:

      /^(CP|KL|KM|ME|PA|PM|SL|SZ|WX|YZ)XX1/

      This allows Perl's regex engine to exploit a number of optimizations that aren't available in your variant. First, the regex contains a common constant string, 'XX1', so the engine knows it can't match unless it finds that string first (and once found, it knows where to start looking to see whether there is a match). Second, the regex is anchored at the start, which means the search is constrained to one spot: it won't traverse the alternatives more than once per line.
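      To make that concrete, here is a minimal sketch of the anchored pattern run over a few lines. The sample lines are made up (the original data isn't shown in this thread), and I've swapped the capturing group for a non-capturing (?:...) since nothing uses the capture; treat it as an illustration, not a measurement.

```perl
use strict;
use warnings;

# Hypothetical sample lines; the real input file is not shown in the thread.
my @lines = (
    'CPXX1 first record',
    'ZZXX1 wrong prefix',
    'xxCPXX1 prefix not at start',
    'YZXX1',
    'KLXX2 wrong suffix',
);

# Anchored alternation followed by the fixed string 'XX1'.
# (?:...) is an assumption on my part: same match, no capture.
my $re = qr/^(?:CP|KL|KM|ME|PA|PM|SL|SZ|WX|YZ)XX1/;

# Keep only the lines that start with one of the prefixes plus 'XX1'.
my @matched = grep { $_ =~ $re } @lines;

print "$_\n" for @matched;
```

      If you want to see what the optimizer found for a given pattern, running the script with perl -Mre=debug should dump the compiled regex along with any anchored or floating substrings it detected.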

      The trick you showed does let the constant-string optimization kick in for each pattern, but it is still probably a touch slower than using two index calls.
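      For reference, the two-index version might look something like the sketch below, next to the two-regex form from the book. The sub names and test strings are just placeholders I picked for illustration; the only point is that both forms give the same answer for plain substrings.

```perl
use strict;
use warnings;

# Substring test using two index calls; index returns -1 when not found.
sub has_either_index {
    my ($s) = @_;
    return index($s, 'one-hump') >= 0 || index($s, 'two') >= 0;
}

# The same test written as two separate regexes, as in the quoted advice.
sub has_either_regex {
    my ($s) = @_;
    return $s =~ /one-hump/ || $s =~ /two/;
}

# Hypothetical inputs: show that both forms agree.
for my $s ('a one-hump camel', 'two humps', 'no humps') {
    printf "%-18s index=%d regex=%d\n", $s,
        has_either_index($s) ? 1 : 0,
        has_either_regex($s) ? 1 : 0;
}
```

      Whether index actually wins on your data is something to measure rather than assume; the core Benchmark module is the usual way to compare the two.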

      ---
      demerphq