blackadder has asked for the wisdom of the Perl Monks concerning the following question:

Hi

Sorry, this is rather simple, but regexes are my weakest point.

I have lines like this:
CFD123_TRE_MXD TRD981_GHD_MXW . . .
And I only want to capture the numbers in the string, i.e.
123 981
I tried this code:
($var) =~ /^[^a-z]+([0-9]+)/; print "$1\n";
By which I mean: scan from left to right, from the beginning of the string, ignore multiple alphabetical chars, grab multiple digits, and save them to memory. However, it doesn't seem to work.

Can someone please enlighten me on how I can do this?

Thanks

********* UPDATE ******************

OK, I did this:
$var =~ s/([0-9]+)//g;
It returned the right values, but I am still wondering if there is a better way of doing this, one that does not destroy the original string.
Blackadder

Replies are listed 'Best First'.
Re: Help with regEX
by japhy (Canon) on Sep 14, 2005 at 13:20 UTC
    Your original regex's problem is that you have [^a-z], which matches any character EXCEPT lowercase letters. Anyway, taking your working solution: sure, you can do it without destroying the original string. ($number) = $string =~ /(\d+)/; presto.
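    A minimal sketch of that list-context match, assuming the sample line from the question is held in a variable named $string. The @numbers line with /g is an addition for illustration, since the question asks for both numbers:

```perl
use strict;
use warnings;

# Sample line taken from the question (the trailing ". . ." is left out).
my $string = 'CFD123_TRE_MXD TRD981_GHD_MXW';

# A match in list context returns the captured text; $string is not modified.
my ($number) = $string =~ /(\d+)/;

# With /g in list context, every run of digits is returned.
my @numbers = $string =~ /(\d+)/g;

print "$number\n";     # prints 123
print "@numbers\n";    # prints 123 981
```

    Because nothing is substituted, $string still holds the full line afterwards.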

    Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
    How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart
Re: Help with regEX
by davidrw (Prior) on Sep 14, 2005 at 13:24 UTC
    as a rule, please always clarify what "doesn't seem to work" means -- e.g. what did you expect to get? what did you actually get?

    That aside, it looks like it's just a matter of case: a-z covers only lowercase letters, but your data has uppercase letters. To fix this, either use [a-zA-Z] or add the /i modifier.

    One other issue: you have [^a-z]. The caret inside the brackets negates the character class, so it means NOT a lowercase letter (but you're right that the leading ^ in /^ matches the beginning of the string).

    So altogether, try /^[A-Z]+([0-9]+)/i. You might also consider /^\w+(\d+)/, but note that \w accepts digits and underscores as well as letters, so the greedy \w+ leaves only the last digit of the first word for the capture. See perlre for \w and \d.
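    To compare the patterns mentioned above side by side, here is a small sketch run against the sample line from the question. The variable names are made up for illustration:

```perl
use strict;
use warnings;

# Sample line taken from the question.
my $string = 'CFD123_TRE_MXD TRD981_GHD_MXW';

# Lowercase class plus /i: the letters match regardless of case.
my ($with_flag) = $string =~ /^[a-z]+([0-9]+)/i;

# Explicit upper- and lowercase ranges, no modifier needed.
my ($with_class) = $string =~ /^[a-zA-Z]+(\d+)/;

# \w also matches digits and underscores, so the greedy \w+ takes
# "CFD12" and leaves only one digit for the capture group.
my ($with_word) = $string =~ /^\w+(\d+)/;

print "$with_flag\n";     # prints 123
print "$with_class\n";    # prints 123
print "$with_word\n";     # prints 3
```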
Re: Help with regEX
by reneeb (Chaplain) on Sep 14, 2005 at 13:18 UTC
    my ($number) = $string =~ /^\w{3}(\d+)/; print $number,"\n";
Re: Help with regEX
by Codon (Friar) on Sep 14, 2005 at 16:57 UTC
    Get rid of that substitution:
    my ($digits) = $var =~ /([0-9]+)/;
    And then you can use the "digit" metacharacter \d:
    my ($digits) = $var =~ /(\d+)/;
    For your learning pleasure: the reason your original regex failed is that you were matching all characters that were not a lowercase letter, in a greedy fashion. That sucked up the entire string with your first atom, and the engine then backtracked only as far as it had to, so at most the last digit was left for the capture. Might I suggest Mastering Regular Expressions for much more information.
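    The greedy behaviour can be observed directly. This is only a sketch, assuming the sample line from the question is in $var; the variable names $greedy and $captured are illustrative:

```perl
use strict;
use warnings;

# Sample line taken from the question.
my $var = 'CFD123_TRE_MXD TRD981_GHD_MXW';

# On its own, the negated class happily matches the whole line,
# because the line contains no lowercase letters at all.
my ($greedy) = $var =~ /^([^a-z]+)/;

# The original pattern: [^a-z]+ grabs everything, then backtracks
# just far enough for [0-9]+ to match a single digit.
my ($captured) = $var =~ /^[^a-z]+([0-9]+)/;

print "greedy part: $greedy\n";    # prints the whole line
print "captured: $captured\n";     # prints 1
```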

    Ivan Heffner
    Sr. Software Engineer, DAS Lead
    WhitePages.com, Inc.