in reply to Regex: plucking numbers from a large string

You could always use the funky regex eval features

my @exts;
$largeStr =~ /Tel: 06(\d+)(?{push @exts, $1})/g;

Or the ever handy \G zero-width assertion

my @exts;
push @exts, $1 while $largeStr =~ /\GTel: 06(\d+)/g;
# or better yet
my %exts;
$exts{$1}++ while $largeStr =~ /\GTel: 06(\d+)/g;
Then getting the frequency is just a matter of looping through the keys of %exts
print qq[found "$_" $exts{$_} times],$/ for sort keys %exts;
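
For anyone who wants to run the counting version end to end, here is a minimal sketch. The sample string and the count_exts helper are made up for illustration; the records are deliberately placed back to back, since \G only lets a match start where the previous one ended.

```perl
use strict;
use warnings;

# Hypothetical input: records sit back to back, so each match starts
# exactly where the previous one ended, which is what \G requires.
my $largeStr = "Tel: 06123Tel: 06456Tel: 06123";

# Count how often each extension occurs (illustrative helper).
sub count_exts {
    my ($str) = @_;
    my %exts;
    $exts{$1}++ while $str =~ /\GTel: 06(\d+)/g;
    return %exts;
}

my %exts = count_exts($largeStr);
print qq[found "$_" $exts{$_} times\n] for sort keys %exts;
# found "123" 2 times
# found "456" 1 times
```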

HTH

_________
broquaint

update: removed first suggestion as it doesn't seem to work as I expected :-/

Re: Re: Regex: plucking numbers from a large string
by Juerd (Abbot) on May 01, 2002 at 19:13 UTC

    Or the ever handy \G zero-width assertion

    Which is great if your data is "Tel: 061Tel: 062Tel: 063". With anything else in between, you'd have to use something to match that stuff, and there's probably a better solution than using .*.
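
    A small sketch of the difference, assuming made-up sample data (both strings and both helper subs are hypothetical). With \G each match has to begin where the previous one stopped, so the anchored loop gives up at the first gap:

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data, not taken from the original question.
    my $adjacent  = "Tel: 061Tel: 062Tel: 063";
    my $separated = "Tel: 061 (home) Tel: 062 (work) Tel: 063";

    # \G version: every match must start where the last one ended.
    sub anchored {
        my ($str) = @_;
        my @found;
        push @found, $1 while $str =~ /\GTel: 06(\d+)/g;
        return @found;
    }

    # Plain /g version: the engine may skip ahead to the next match.
    sub unanchored {
        my ($str) = @_;
        my @found;
        push @found, $1 while $str =~ /Tel: 06(\d+)/g;
        return @found;
    }

    print "anchored, adjacent:    @{[ anchored($adjacent) ]}\n";     # 1 2 3
    print "anchored, separated:   @{[ anchored($separated) ]}\n";    # 1
    print "unanchored, separated: @{[ unanchored($separated) ]}\n";  # 1 2 3
    ```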

    You can use m//g in list context, and get a list of matches (or a list of captures if you use them):

    my @extensions = $large_string =~ /Tel: 06(\d+)/g;
    my %extension;
    $extension{$_}++ for @extensions;
    If you don't need the list, you can of course use the match itself as for's expression.
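
    Roughly like this, as a sketch with invented sample data (the string and the count_extensions helper are assumptions for illustration). The match runs in list context, so for iterates over the captures and each one arrives in $_:

    ```perl
    use strict;
    use warnings;

    # Hypothetical input with other text between the records.
    my $large_string = "Tel: 06111, fax: none, Tel: 06222, Tel: 06111";

    # Illustrative helper: count extensions without an intermediate array.
    sub count_extensions {
        my ($str) = @_;
        my %extension;
        # m//g in list context returns all captures; for aliases each to $_.
        $extension{$_}++ for $str =~ /Tel: 06(\d+)/g;
        return %extension;
    }

    my %extension = count_extensions($large_string);
    print "$_: $extension{$_}\n" for sort keys %extension;
    # 111: 2
    # 222: 1
    ```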

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

      If you don't need the list, you can of course use the match itself as for's expression.

      I thought so, too :(

      $ perl -e'$x="1 2 3 4 5 5 5 5 5 5"; $counts[$1]++ for $x=~/(\d)/g; print "$_ $counts[$_]\n" foreach (0..$#counts)'
      0
      1
      2
      3
      4
      5 10
      How's that for a weird problem?

      Even stranger, if you s/for/while/:

      $ perl -e'$x="1 2 3 4 5 5 5 5 5 5"; $counts[$1]++ while $x=~/(\d)/g; print "$_ $counts[$_]\n" foreach (0..$#counts)'
      0
      1 1
      2 1
      3 1
      4 1
      5 6
      This is with 5.6.1.

      Ignore me; it makes sense that $1 would be the last value with a for loop. $_ works fine.

      $ perl -e'$x="1 2 3 4 5 5 5 5 5 5"; $counts[$_]++ for $x=~/(\d)/g; print "$_ $counts[$_]\n" foreach (0..$#counts)'
      0
      1 1
      2 1
      3 1
      4 1
      5 6
      With a while, you have to use $1; that's what confused me.
      $ perl -e'$x="1 2 3 4 5 5 5 5 5 5"; $counts[$1]++ while $x=~/(\d)/g; print "$_ $counts[$_]\n" foreach (0..$#counts)'
      0
      1 1
      2 1
      3 1
      4 1
      5 6
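
      Spelled out as a script, in case the one-liners are hard to read. This is only a sketch on a shortened, made-up input. It records what $_ and $1 hold on each pass, which shows that for finishes all the matching before its body ever runs, while the while loop matches once per pass:

      ```perl
      use strict;
      use warnings;

      my $x = "1 2 3 5 5";   # hypothetical shortened input

      # for: the match runs once, in list context, and returns every
      # capture before the loop body runs, so $1 is stuck on the last
      # capture ("5") while $_ walks through the list.
      my @for_pairs;
      push @for_pairs, "$_/$1" for $x =~ /(\d)/g;

      # while: the match runs in scalar context, once per iteration,
      # so $1 is refreshed on every pass.
      my @while_vals;
      push @while_vals, $1 while $x =~ /(\d)/g;

      print "for:   @for_pairs\n";    # for:   1/5 2/5 3/5 5/5 5/5
      print "while: @while_vals\n";   # while: 1 2 3 5 5
      ```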

      --
      Mike