perlfever has asked for the wisdom of the Perl Monks concerning the following question:

I am working on a script where I need to strip everything out of an HTML file except a bunch of 10-digit numbers that are in it.

I tried this RE but I am not having any luck getting it to work.

$string !~ s/\d{10}//g;


This RE will grab the numbers. Why can't I just reverse it?

$string =~ s/\d{10}//g;


Thanks.

Replies are listed 'Best First'.
Re: Regular Expression Fun
by zengargoyle (Deacon) on Feb 26, 2002 at 20:00 UTC

    Try:

    push @ten_digit_numbers, $1 while ($string =~ /(\d{10})/g);
    print "found: @ten_digit_numbers\n";
      The "push ... while" is not needed:
      my $string  = "this 1234567890 is a 1029384756 test";
      my @results = $string =~ /\d{10}/g;
      print "@results\n";
      In other words, instead of trying to remove stuff, flip it around and try to get stuff. If all you want is the 10-digit numbers, zengargoyle's answer is the appropriate one.
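
      One caveat with the plain /\d{10}/ pattern: it will also match the first ten digits of a longer run of digits. Here is a minimal sketch, assuming you only want runs of exactly ten digits. The sample string is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input: one 11-digit run and one 10-digit run.
my $string = "this 12345678901 is a 1029384756 test";

# Plain \d{10} also matches the first ten digits of a longer run.
my @loose = $string =~ /\d{10}/g;

# The lookarounds require that no digit touches the match on either side,
# so only runs of exactly ten digits are returned.
my @exact = $string =~ /(?<!\d)\d{10}(?!\d)/g;

print "loose: @loose\n";
print "exact: @exact\n";
```

      Whether you need the stricter pattern depends on your data. If the HTML never contains longer digit runs, the plain version is fine.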

      ------
      We are the carpenters and bricklayers of the Information Age.

      Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

Re: Regular Expression Fun
by dvergin (Monsignor) on Feb 26, 2002 at 20:09 UTC
    You tried $string !~ s/\d{10}//g; and asked, "why can't I just reverse it?" What you wrote did reverse something, but, as you say, not what you intended.

    What you wrote means, "do s/\d{10}//g and then return success if it failed." Although that sounds a little funny and is not what you intended, it could actually be useful. Perhaps this will help you conceptualize what is happening:

    if ($string !~ s/\d{10}//g) {
        print "String unchanged: couldn't find 10 digits.\n";
        # do stuff to deal with regex failure
    }
    else {
        print "All cases of 10 digits have been deleted.\n";
    }
    Of course if we really wanted that effect, we would probably want to avoid the easy-to-miss "!" and say either of the following:
    if (not $string =~ s/\d{10}//g) {
    unless ($string =~ s/\d{10}//g) {
    To achieve what you wanted, check out the other responses which have offered good solutions.
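
    To see the return values for yourself, here is a small sketch. The helper name strip_ten_digits and the sample strings are made up for illustration. The only thing it relies on is that s///g returns the number of substitutions made, or false if there were none, and that "!~" simply negates that result.

```perl
use strict;
use warnings;

# Hypothetical helper: returns what s///g reported and what is left of the string.
sub strip_ten_digits {
    my ($string) = @_;                       # work on a copy of the argument
    my $count = ($string =~ s/\d{10}//g);    # substitutions made, or false if none
    return ($count || 0, $string);
}

for my $input ("id 1029384756 here", "no numbers here") {
    my ($count, $left) = strip_ten_digits($input);
    my $negated = !$count ? "true" : "false";   # what "!~" would have returned
    print "count=$count  !~ gives $negated  remaining='$left'\n";
}
```

    Either way the substitution still runs and still deletes the numbers. The "!" only changes the truth value you get back.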

    ------------------------------------------------------------
    "Perl is a mess and that's good because the
    problem space is also a mess." - Larry Wall

Re: Regular Expression Fun
by ehdonhon (Curate) on Feb 26, 2002 at 19:59 UTC
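    $string =~ s/(?<!\d)\d{1,9}(?!\d)//g;   # drop digit runs shorter than 10 (lookarounds keep 10-digit runs intact)
    $string =~ s/\d{11,}//g;                # drop digit runs longer than 10
    $string =~ s/\D//g;                     # drop everything that is not a digit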

    Note: If your $string starts with three 10-digit numbers, you will end up with one 30-digit number when you are done. That is probably not a big deal, since you know the exact size of each number.
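
    If the numbers do end up run together, they are easy to pull apart again, because each one is exactly ten digits long. A rough sketch, with a made-up sample string standing in for the digits-only result:

```perl
use strict;
use warnings;

# Hypothetical digits-only result: three 10-digit numbers run together.
my $digits = "102938475656473829101234567890";

# Each match consumes exactly ten digits, so a list-context /g match
# returns the string cut into consecutive 10-digit chunks.
my @numbers = $digits =~ /\d{10}/g;

print "$_\n" for @numbers;
```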