kulls has asked for the wisdom of the Perl Monks concerning the following question:

Greetings,
I tried using regular expressions for pattern matching. Can anyone please help me optimize the code?
Here is my code and its relevant output:
use strict;
use warnings;

my @str=qw(ab_xcc3 ab_xcc3_000 ok_and rajak_acc_idx rajak_acs_idx BMW_acc BMW_asc vc_ba vc_ba_002 dc_sts1 dc_sts6_005);
for(@str) {
    my ($temp)=$_=~/^(\w{2})\_/xi;
    if($temp) {
        ($temp) =$_=~/^(\w{2}\_[a-z0-9]+)/xi;
        print $temp."\n";
    }
    else {
        ($temp)=$_=~/^([a-z0-9]+)\_/xi;
        print $temp,"\n";
    }
}
__OUTPUT___
ab_xcc3
ab_xcc3
ok_and
rajak
BMW
vc_ba
vc_ba
dc_sts1
dc_sts6

Thank you in advance.
- kulls

Replies are listed 'Best First'.
Re: Regular Expression - pattern matching
by McDarren (Abbot) on Feb 22, 2006 at 08:47 UTC
    kulls,

    You've provided some sample data - good!
    You've provided the code that you've written so far - excellent!
    But - have you actually told us what the code is supposed to do?
    Some pattern matching obviously, but what are the rules?

    Once again, you're asking people to second-guess what it is that you are trying to do. If you expect sensible and helpful answers to your questions, you need to be more explicit.

    For example, you might have said: "If the string starts with two word characters, I want to extract the first 13 characters, otherwise I want to extract the first 27 characters"

    (That obviously isn't what you are trying to do with your code, but hopefully you get the idea)

    Cheers,
    Darren :)

      Hi Darren,

      I need to extract the preferred patterns from the @str array. Since the patterns differ and I don't have a general rule, I guess the following assumptions can yield good results.


      My assumptions are,

      #. Check the number of characters before the first "_" (underscore).
      #. If there are only two characters, then take the string up to the second "_" (underscore); otherwise, if there are more than 2 characters, it must be a targeted pattern.
      #. I guess I can't use the "\w" (word) pattern, because it will take the whole string as a word.

      Thank you in advance.
      -kulls
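The rules listed above can be sketched without a regular expression at all, using index and substr. This is only a minimal sketch under some assumptions that the post does not state: the helper name extract_prefix is made up for illustration, a prefix of any length other than two is treated like the "more than 2 characters" case, no check is made on which characters appear, and a string with no underscore (or nothing before it) yields no result.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the extracted part of the string, or
# nothing when there is no underscore or nothing before it (assumed policy).
sub extract_prefix {
    my ($s) = @_;
    my $first = index($s, '_');
    return if $first <= 0;

    # Prefix is not exactly two characters: keep only the prefix.
    return substr($s, 0, $first) if $first != 2;

    # Two-character prefix: keep everything up to the second underscore,
    # or the whole string when there is no second underscore.
    my $second = index($s, '_', $first + 1);
    return $second < 0 ? $s : substr($s, 0, $second);
}

for my $s (qw(ab_xcc3 ab_xcc3_000 rajak_acc_idx BMW_acc vc_ba_002)) {
    my $p = extract_prefix($s);
    print defined($p) ? $p : '(no match)', "\n";
}
```

For these five inputs the loop prints ab_xcc3, ab_xcc3, rajak, BMW and vc_ba, one per line.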
        #. Check the number of characters before the first "_" (underscore).
        #. If there are only two characters, then take the string up to the second "_" (underscore)
        Okay, well because you are looking for anything that is not an underscore, then a negated character class is probably the way to go. Something like this:
        if (/^([^_]{2}_[^_]+)_/) { print $1; }
        otherwise, if there are more than 2 characters, it must be a targeted pattern
        Sorry, but I don't get that. What is a "targeted pattern"?

        I guess I can't use the "\w" (word) pattern, because it will take the whole string as a word.
        Yes and no. It's okay, because you are limiting it to only the first two characters with {2}. But it's probably not okay because \w will also match an underscore. So \w{2} would match something like "_a", which I'm pretty sure you don't want.
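
The difference between \w and a negated character class is easy to check directly. The following is a small sketch; the input "_a_b" is made up here to show the case where the two disagree, and it is not part of the original sample data.

```perl
use strict;
use warnings;

# "ab_xcc3_000" comes from the sample data; "_a_b" is an invented input
# whose first two characters include an underscore.
for my $s (qw(ab_xcc3_000 _a_b)) {
    my $w   = $s =~ /^\w{2}_/   ? 'match' : 'no match';
    my $neg = $s =~ /^[^_]{2}_/ ? 'match' : 'no match';
    print "$s: \\w{2}_ => $w, [^_]{2}_ => $neg\n";
}
```

Both patterns match "ab_xcc3_000", but only the \w version matches "_a_b", because \w accepts the leading underscore while [^_] rejects it.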

        Cheers,
        Darren :)

Re: Regular Expression - pattern matching
by borisz (Canon) on Feb 22, 2006 at 08:47 UTC
    # untested
    for(@str) { /^(?:(\w{2}_[a-z0-9]+)|([a-z0-9]+)_)/i and print $1 || $2, "\n"; }
    or
    for(@str) { /^(\w{2}_[a-z\d]+|[a-z\d]+(?=_))/i and print $1, $/; }
    Boris
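
Since the snippets above are marked untested, one way to gain some confidence is to compare the single-regex version with the original two-step code on the sample data. This is a rough sketch: the sub names two_step and one_step are made up for illustration, and agreement on these eleven strings says nothing about other inputs (for example, "ab_" would give different results).

```perl
use strict;
use warnings;

my @str = qw(ab_xcc3 ab_xcc3_000 ok_and rajak_acc_idx rajak_acs_idx
             BMW_acc BMW_asc vc_ba vc_ba_002 dc_sts1 dc_sts6_005);

# The logic from the original post, returning the capture instead of printing.
sub two_step {
    my ($s) = @_;
    my ($t) = $s =~ /^(\w{2})_/i;
    if ($t) { ($t) = $s =~ /^(\w{2}_[a-z0-9]+)/i }
    else    { ($t) = $s =~ /^([a-z0-9]+)_/i }
    return $t;
}

# The single-regex alternative with a lookahead for the underscore.
sub one_step {
    my ($s) = @_;
    my ($t) = $s =~ /^(\w{2}_[a-z\d]+|[a-z\d]+(?=_))/i;
    return $t;
}

for my $s (@str) {
    my ($a, $b) = (two_step($s), one_step($s));
    printf "%-14s %-8s %-8s %s\n", $s, $a, $b, $a eq $b ? 'same' : 'DIFFERENT';
}
```

On the sample data every line ends in "same".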
Re: Regular Expression - pattern matching
by pKai (Priest) on Feb 22, 2006 at 08:33 UTC

    You get warnings for (non-matching) input like

    ab_$ $abc

    Is that acceptable?
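
If the warnings are not acceptable, one option is to test the result with defined before printing. The sketch below wraps the original two-step logic in a hypothetical helper named extract; reporting the failure with a "no match" line is an assumption, and skipping such input silently would work just as well.

```perl
use strict;
use warnings;

# Same two-step logic as the original post; returns undef when nothing matches.
sub extract {
    my ($s) = @_;
    my ($temp) = $s =~ /^(\w{2})_/i;
    if ($temp) { ($temp) = $s =~ /^(\w{2}_[a-z0-9]+)/i }
    else       { ($temp) = $s =~ /^([a-z0-9]+)_/i }
    return $temp;
}

# Single quotes keep Perl from interpolating the "$" in the test strings.
for my $s ('ab_xcc3_000', 'ab_$', '$abc') {
    my $got = extract($s);
    if (defined $got) { print "$got\n" }
    else              { print "no match for '$s'\n" }
}
```

With the check in place, 'ab_$' and '$abc' produce a "no match" line instead of a "Use of uninitialized value" warning.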