calypso has asked for the wisdom of the Perl Monks concerning the following question:

I have images named 012346.jpg, 012346-all.jpg, 012346-EXP.jpg and more with the same first six digits. Right now I can search and get 012346.jpg with this ($imgNumber =~ /^(\d)$/), but how do I get the rest with regular expressions? Thanks.

Replies are listed 'Best First'.
Re: regex help
by si_lence (Deacon) on Nov 16, 2004 at 15:38 UTC
    Your code will match only files named with one single digit (e.g. "4" or "7")
    because you match the beginning of the string with ^, then one digit with \d
    and then the end of a string with $.
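
    A quick way to see this, as a minimal sketch (the sample strings are made up for illustration and do not come from the original post):

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only to illustrate the anchors.
my @names = ('4', '7', '42', '012346', '012346.jpg');

# ^ anchors the start, \d takes exactly one digit, $ anchors the end.
my @matched = grep { /^(\d)$/ } @names;

# Only the single-digit strings get through the filter.
print "matched: @matched\n";
```

    Anything longer than one character fails, because nothing in the pattern allows more text between ^ and $.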


    If you just want the number, then use
    ($imgNumber =~ /^(\d+)/);
    If you want the whole name of the file without the extension, use
    ($imgNumber =~ /^(\d+.+)\.jpg/);
    If you want something else, be more precise in your question ;-)
    si_lence
Re: regex help
by eieio (Pilgrim) on Nov 16, 2004 at 15:33 UTC
    If you simply want a regular expression that matches six digits and other optional text, the following should work:

    $imgNumber =~ /^(\d{6})(.*)\.jpg$/

    This saves the six digits in $1 and the optional text in $2 (excluding the extension).
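
    As a rough sketch of how the two captures come out, here is that pattern run over the three names from the question (the loop and the @parsed array are illustrative additions, not part of the reply):

```perl
use strict;
use warnings;

# File names taken from the question; the loop itself is illustrative.
my @files = ('012346.jpg', '012346-all.jpg', '012346-EXP.jpg');
my @parsed;

for my $imgNumber (@files) {
    if ($imgNumber =~ /^(\d{6})(.*)\.jpg$/) {
        # $1 is the six digits, $2 is whatever sits between them and ".jpg"
        push @parsed, [$1, $2];
        print "digits=$1 rest='$2'\n";
    }
}
```

    For the plain 012346.jpg the second capture is simply the empty string.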

Re: regex help
by borisz (Canon) on Nov 16, 2004 at 15:33 UTC
    What rest? Like this?
    my ( $num, $rest ) = $img_name =~ /^(\d{6})(.*\.jpg)$/i;
    Boris
Re: regex help
by TedPride (Priest) on Nov 16, 2004 at 16:11 UTC
    I think what you want is something like the following:
    my $str = '012346.jpg 012346-all.jpg 012346-EXP.jpg';
    while ($str =~ /(\d+)(.*?\.jpg)/gi) {
        print "$1$2\n";
    }
    Where $1 is the number, $2 the rest.
Re: regex help
by ww (Archbishop) on Nov 16, 2004 at 16:46 UTC

    Don't see how your regex could do what you say. Your "^(\d)$" is looking for a SINGLE digit between start of record/string, "^," and the end of same, "$."

    What follows sort of does what you're talking about, although in the regex you'll need to use "(012346" rather than "(\d{6}" (or some variant of "any digit except 5, 7, 8 or 9", or a similarly clunky approach) to deal with the exact string.

    And is this homework? If so, it's a good idea to mention ('fess up) that fact, so monks can teach without actually DOING the homework.

    #!C:/perl/bin -w
    use strict;
    use vars qw ($input @input @img $img);

    @input = <DATA>;
    foreach $input (@input) {
        if ( $input =~
                       # Caret in next line: start at beginning of string/record/whatever
            /^(\d{6}   # Start capture, with any digit, [0-9], exactly six times.
            (?:-       # optionally (trailing "?") in a NON-capture-group: a dash followed by a
            \w{3})?    # word char, exactly 3 times; end grouping parens
            \.jpg$     # literal period followed by "jpg" ending the string
            )          # end capture
            /ix ) {    # case insensitive (to catch -all and -EXP); extended form
            push(@img, $input);
        }
    }
    print "\n\t+++++\n";
    print @img;
    print "\n\tdone\n";
    exit;
    __DATA__
    012346.jpg
    012346-all.jpg
    012346-EXP.jpg
    012346-exp.jpg
    012346-ALL.jpg
    012345.jpg
    12345.jpg
    01234.jpeg
    0123345+ALL.jpg
    0123345-exp.jpg
    0123345-ALL.jpg
    not_jpg.last

    Note that the last 6 items in DATA do NOT match the regex.
      I sure am a Perl novice. It's not homework, it's a little project script I am working on. Sorry, I mistyped that regex: "$imgNumber =~ /^(\d{6})/", not "^(\d)$", is what gave me 012346.jpg. I took care of the .jpg by just printing "$imgNumber.jpg" each time, and it worked. I was just confused as to how to catch the "-ALL", "-EXP" or "-SAMPLE" part of it with regular expressions. And if possible, I sure would like some teaching on regex. I googled some tutorials but was still quite confused. Thanks for the help.
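
      One possible way to catch the optional suffix, as a sketch only: the first three file names come from the question, while "012346-SAMPLE.jpg", "012347-all.jpg", $wanted and %suffix_for are names assumed here for illustration.

```perl
use strict;
use warnings;

# The first three names come from the question; the last two are made up.
my @files  = qw(012346.jpg 012346-all.jpg 012346-EXP.jpg 012346-SAMPLE.jpg 012347-all.jpg);
my $wanted = '012346';    # hypothetical: the image number being searched for
my %suffix_for;

for my $file (@files) {
    # \Q...\E quotes the number literally; (?:-(\w+))? makes "-suffix" optional
    # and captures only the word after the dash.
    if ($file =~ /^\Q$wanted\E(?:-(\w+))?\.jpg$/i) {
        $suffix_for{$file} = defined $1 ? $1 : '';
    }
}

for my $file (sort keys %suffix_for) {
    print "$file => '$suffix_for{$file}'\n";
}
```

      Using \w+ rather than \w{3} lets a longer suffix such as "-SAMPLE" through; when there is no suffix at all, the capture is undefined, hence the defined check.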
        Calypso -- I'm not more'n a step or two ahead of you so forgive any of this which you know and practice already:
        • PREVIEW, PREVIEW, PREVIEW!
        • CUT'n'PASTE FROM TESTED CODE (it doesn't have to work if it's an illustration of a failure, but it does have to be -- verbatim -- what you intended to ask about).
        • Specifically re regexen, since you've reached the point you describe, your next step (IMO) should be to read up on /x, extended format. It really is worth the effort because it makes the regex so readable. Think you'll find good info in perlretut, for which, search or supersearch here. Also will try (when next at home computer) to find some links to some of the better on-line tutorials I stumbled upon.
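
        To give a feel for what /x buys you, here is a small sketch that writes one pattern twice, compactly and in extended form, and checks that both agree. The pattern and the sample names are assumptions picked for illustration. One caveat worth knowing: variables are still interpolated inside /x comments, and the delimiter still ends the pattern there, so the comments below avoid both.

```perl
use strict;
use warnings;

# The same pattern written twice: once compactly, once with /x.
my $compact  = qr/^(\d{6})(?:-(\w+))?\.jpg$/i;
my $extended = qr/
    ^ (\d{6})        # exactly six digits, first capture group
    (?: - (\w+) )?   # optional dash plus word characters, second capture group
    \. jpg $         # literal ".jpg" at the end of the string
/ix;

# Two names from the thread plus one deliberate non-match.
my @names = qw(012346.jpg 012346-EXP.jpg 12345.jpg);
my $disagreements = 0;
for my $name (@names) {
    my $c = $name =~ $compact  ? 1 : 0;
    my $e = $name =~ $extended ? 1 : 0;
    $disagreements++ if $c != $e;
    print "$name: compact=$c extended=$e\n";
}
```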

        And BTW, I have been chastised for using telegraphese (which bears some resemblance to SMS texting as practiced by the teen crowd; which association may or may not have been what most annoyed the critics). Nonetheless, they offered other observations which were thoughtful and useful. I, for one, don't wish to impede my communication with them by using forms of which they disapprove, nor by being ambiguous/unclear or inaccurate.

        So I offer you a related (that is, I hope you think it's related) thought: Format your post in a manner that approximates standard syntax and grammar. Structure your notes with paragraphs and typographic conventions that make your message easy to read and follow.... and, above all, make sure you've thought thru, clearly and carefully, just what it is you wish to say.

        A few of us are so elderly or dinosaur-ish as to believe it rude to utter notes shot through with typos....