cbtshare has asked for the wisdom of the Perl Monks concerning the following question:

Hello all, I have a simple issue. I am learning regex in Perl from http://perldoc.perl.org/perlre.html but I can't seem to work out how to match the string 'need'. Can you help, please? Thanks.
#!/usr/bin/perl
use strict;
use warnings;

my $row = 'We_need_feed';
#my ($last) = $row =~ /[^_]$/g; #match we
#my ($last) = $row =~ /^\w?[^_]/g;#match feed
my ($last) = $row =~ /[need]+$/g ;
print "$last\n";

Replies are listed 'Best First'.
Re: perl regex
by GrandFather (Saint) on Nov 13, 2016 at 04:23 UTC

    You need to read the regex documentation. perlretut is a good place to start.

    Your immediate problem is using [...] instead of (...) for capturing. The square bracket version matches any one of the characters in the set of characters given.
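
    A minimal sketch of that difference, reusing the $row string from the question. The variable names $class and $group are only illustrative and do not come from the original post.

```perl
use strict;
use warnings;

my $row = 'We_need_feed';

# [need]+ is a character class: one or more of 'n', 'e' or 'd', in any order.
# With /g and no capture group, list context returns the matched text.
my ($class) = $row =~ /[need]+$/g;

# (need) is a capture group: the literal sequence n-e-e-d, returned as $1.
my ($group) = $row =~ /(need)/;

print "class: $class\n";   # class: eed
print "group: $group\n";   # group: need
```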

    Premature optimization is the root of all job security
Re: perl regex
by BrowserUk (Patriarch) on Nov 13, 2016 at 04:22 UTC

    You haven't asked a question, nor explained what isn't working, so this is a guess.

    my $row = 'We_need_feed';

    #my ($last) = $row =~ /[^_]$/g; #match we
    # [^_] matches everything except underscore, but only one character;
    # Plus you've anchored it to the last character before the end of the string,
    # so it will always and only match the last character of the string;
    # unless it is an underscore, in which case it fails to match at all.

    #my ($last) = $row =~ /^\w?[^_]/g;#match feed
    # ^\w? matches a single word character (letter, digit or underscore) at the start of the string,
    # if the first character is a word character, otherwise it matches nothing.
    # [^_] matches a single non-underscore as above.
    # Ie. You matched two characters "We"

    my ($last) = $row =~ /[need]+$/g ;
    # [need]+ matches one or more characters, so long as they are either 'd' or 'e' or 'n'; otherwise nothing
    # but the $ means only at the end of the line.
    # so this matched the 3-char string 'eed'.

    • To match the first word, delimited by underscore (_), you could use /^[^_]+/.
    • To match the last word delimited by underscore, you could use /[^_]+$/.
    • To match the word 'need', somewhere in the line you could use /need/.
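
    The three patterns above can be tried out with a short sketch like the following. The capturing parentheses and the variable names are additions for illustration: in list context a pattern with no capture groups and no /g returns just 1 on success, so the parentheses are what hand back the matched text.

```perl
use strict;
use warnings;

my $row = 'We_need_feed';

my ($first_word) = $row =~ /^([^_]+)/;       # leading run of non-underscores
my ($last_word)  = $row =~ /([^_]+)$/;       # trailing run of non-underscores
my $has_need     = $row =~ /need/ ? 1 : 0;   # plain yes/no test for 'need'

print "first=$first_word last=$last_word has_need=$has_need\n";
# prints: first=We last=feed has_need=1
```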

    You'll get better answers if you tell us what you need to know in words as well as code. Relying on us to read between the lines of your non-functional code and comments to extract the meaning and question is unlikely to get the best answers. Indeed, if I wasn't bored out of my mind waiting for a process that has been running for 50+ hours to finish, I wouldn't have bothered answering your lazy post at all.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Thank you very much, but "To match the word 'need', somewhere in the line you could use /need/" would just return 1 or 0, so /(need)/ would match the word need. Thank you for your explanation :)
        so /(need)/ would match the word need.

        ... and return the matched word 'need'; but yes.

        However, what is the purpose of capturing the matched word when you already know what it is?

        Ie. If your intent is to do:

        my( $foundWord ) = $string =~ /(need)/;

        That is equivalent to doing:

        my $foundWord = $string =~ /need/ ? 'need' : undef;

        The point being that capturing is expensive and the need to capture a known constant string is strongly indicative of flawed logic.
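
        A small sketch comparing the two forms on a matching and a non-matching string. The second test string and the @results array are made up for this illustration, and the // operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

my @results;
for my $string ('We_need_feed', 'We_want_feed') {
    # Capturing a constant: the capture can only ever hold 'need'.
    my ($captured) = $string =~ /(need)/;

    # Plain boolean match, supplying the known word ourselves.
    my $tested = $string =~ /need/ ? 'need' : undef;

    push @results, [ $string, $captured, $tested ];
    printf "%-12s captured=%s tested=%s\n",
        $string, $captured // 'undef', $tested // 'undef';
}
```

        Both variables end up holding 'need' for the first string and undef for the second.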


Re: perl regex
by BillKSmith (Monsignor) on Nov 13, 2016 at 15:16 UTC

    I can offer one additional comment. The /g is unnecessary for your example. (The anchors ^ and $ each refer to a single place in the string.) It may be needed if your real data is a multi-line string, but then you would have other issues to consider.
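
    A quick way to check this for a single-line string, as a sketch only. The pattern is the "last word" pattern with a capture group added, and the array names are illustrative.

```perl
use strict;
use warnings;

my $row = 'We_need_feed';

# The pattern is anchored at the end of the string, so there is at most
# one match and /g has nothing extra to find.
my @with_g    = $row =~ /([^_]+)$/g;
my @without_g = $row =~ /([^_]+)$/;

print "with /g:    @with_g\n";      # with /g:    feed
print "without /g: @without_g\n";   # without /g: feed
```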

    For your example, you can use:

    use strict;
    use warnings;

    my $row = 'We_need_feed';
    my($first_word, $last_word) = $row =~ /^(\w+)_.*_(\w+)$/;
    print $first_word, ' ', $last_word;
    Bill
Re: perl regex
by AnomalousMonk (Archbishop) on Nov 13, 2016 at 16:35 UTC

    All the regex operators you're using in the OPed examples are supported by Perl version 5.6 and before, so you could also have used YAPE::Regex::Explain (which only supports regexes through Perl version 5.6) to give you some insight:

    c:\@Work\Perl>perl -wMstrict -le
    "use YAPE::Regex::Explain;
     ;;
     print YAPE::Regex::Explain->new(qr/[need]+$/)->explain;
    "
    The regular expression:

    (?-imsx:[need]+$)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      [need]+                  any character of: 'n', 'e', 'e', 'd' (1 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      $                        before an optional \n, and the end of the
                               string
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------
    (Note, however, that this explanation does not make clear the point already made by BrowserUk++ above: the [need] character set only encompasses the three characters n, e and d, and has nothing to do with the sequence 'need'.)


    Give a man a fish:  <%-{-{-{-<

Re: perl regex
by Marshall (Canon) on Nov 14, 2016 at 02:45 UTC
    Sometimes using a split can work out better than a full blown regex. Consider this:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $row = 'We_need_feed';
    my ($first, $middle, $last) = split /_/, $row;
    print "first=$first, middle=$middle, last=$last\n";

    #Update: just demo of indices:
    ($first, $middle, $last) = (split (/_/,$row))[0,1,-1];
    print "middle=$middle\n";
    __END__
    Prints:
    first=We, middle=need, last=feed
    middle=need