kprice:

It depends on whether you need to find every match or are happy with finding any one match. With alternation, the regex engine takes the first alternative that matches at a given position and never tries the remaining alternatives there. If you're satisfied with that, then by all means use it. If you need to find *each* match, then alternation could easily become troublesome:

Roboticus@Roboticus-PC ~
$ cat regex_alt.pl
use strict;
use warnings;

my $text=<<EOT;
Now is the time for all good men to come to the aid of their party.
EOT

print "ORDER MATTERS FOR COLLISIONS:\n";
while ($text=~/(the|their)/g) {
    print "\tfound $1 at $-[0], $+[0].\n";
}
print "VS:\n";
while ($text=~/(their|the)/g) {
    print "\tfound $1 at $-[0], $+[0].\n";
}

Roboticus@Roboticus-PC ~
$ perl regex_alt.pl
ORDER MATTERS FOR COLLISIONS:
        found the at 7, 10.
        found the at 44, 47.
        found the at 55, 58.
VS:
        found the at 7, 10.
        found the at 44, 47.
        found their at 55, 60.

Here you can see that if you're not careful when building your regex from a collection, you can have problems. If you put 'the' before 'their' in your alternation, you'll never match 'their'. In an automated system, you can't simply sort by string length and put the longer strings ahead of the shorter ones. For example, if one of your regex strings was "t[ho]e", then placing it before "the" would still prevent you from ever matching the second one.
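To make the "t[ho]e" case concrete, here is a rough sketch rather than a recommended implementation. The pattern list and the helper names alternation_hits and per_pattern_hits are made up for illustration. It builds a longest-first alternation, counts which alternative actually wins, and compares that with running each pattern in its own loop:

```perl
use strict;
use warnings;

# Count which alternative wins when the patterns are joined with '|'.
# Each pattern gets its own capture group so we can tell them apart.
# (Assumes the patterns themselves contain no capture groups.)
sub alternation_hits {
    my ($text, @pats) = @_;
    my $re = join '|', map { "($_)" } @pats;
    my %count;
    while ($text =~ /$re/g) {
        for my $i (0 .. $#pats) {
            # $-[n] is defined only for the group that took part in the match
            if (defined $-[$i + 1]) { $count{ $pats[$i] }++; last; }
        }
    }
    return \%count;
}

# Count matches for each pattern separately, one pass per pattern.
sub per_pattern_hits {
    my ($text, @pats) = @_;
    my %count;
    for my $pat (@pats) {
        $count{$pat} = () = $text =~ /$pat/g;   # list-context match count
    }
    return \%count;
}

my $text = "Now is the time for all good men to come to the aid of their party.";

# Hypothetical collection; 't[ho]e' is the troublemaker mentioned above.
my @patterns = ('the', 't[ho]e', 'their');

# Naive "longest source string first" ordering: t[ho]e, their, the
my @by_length = sort { length($b) <=> length($a) } @patterns;

my $alt_hits  = alternation_hits($text, @by_length);
my $loop_hits = per_pattern_hits($text, @patterns);

for my $pat (@patterns) {
    printf "%-8s alternation: %d  separate loops: %d\n",
        $pat, $alt_hits->{$pat} // 0, $loop_hits->{$pat};
}
```

With this text, the length-sorted alternation credits all three hits to "t[ho]e" and none to "the" or "their", while the separate loops report 3, 3 and 1. The looping version costs one pass per pattern, so whether that trade is worth it depends on how many patterns you have and how much you care which one matched.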

Now all of this may or may not matter depending on your requirements. (After all, if "the" can match either expression, is it significant which one matched?) My point is simply that you'll need to think about things before simply building an alternation...

...roboticus


In reply to Re: Alternation vs. looping for multiple searches. by roboticus
in thread Alternation vs. looping for multiple searches. by kprice++
