Yeah, from your example it is clear that something is wrong with the -keep option. I am going to ditch this module altogether.

Re^3: Using Regexp::Common
by AnomalousMonk (Archbishop) on Sep 19, 2015 at 16:13 UTC
    I wanted to extract all the numbers from a string that may be separated by any delimiter.

    I don't understand what "separation by any delimiter" means.

    I am going to ditch [Regexp::Common] altogether.

    I think that would be rash. Regexp::Common and Regexp::Common::number, the extension I think you need, are designed to do many things and are correspondingly complicated, but will, I think, repay the effort invested in understanding them. (Update: And I think the -keep option just needs more study.) I'm still not sure exactly what you require, but here's a sample of code that may be near the ballpark.

    File:

    use 5.010;  # need perl 5.10+ regex enhancements -- (?|alts)

    use warnings;
    use strict;

    use Regexp::Common qw(number);

    my $str = '10,101,110.11010110,123,101.010E-01';
    # offsets: 0123456789012345678901234567890123456789
    #                    1         2         3

    my $bin_int  = qr{ $RE{num}{int} {-keep}{-base=>2} }xms;
    my $bin_real = qr{ $RE{num}{real}{-keep}{-base=>2} }xms;

    my $binary = qr{ (?| $bin_int | $bin_real) }xms;

    MATCH:
    while ($str =~ m{ \b $binary \b }xmsg) {
        my $entire      = $1;
        my $fraction    = $6;
        my $exponential = $8;

        my $expon = defined $exponential && length $exponential;
        my $real  = ! $expon && defined $fraction && length $fraction;

        next MATCH unless length $entire;

        printf "matched '%s' at offset %d; is %s \n",
            $entire, $-[1],
            $expon ? 'exponential'
          : $real  ? 'real'
          :          'integer'  # default
          ;
        }

    Output:
    c:\@Work\Perl\monks\justrajdeep>perl extract_binary_nums_1.pl
    matched '10' at offset 0; is integer
    matched '101' at offset 3; is integer
    matched '110.11010110' at offset 7; is real
    matched '101.010E-01' at offset 24; is exponential
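
    In case the (?|alts) part looks like magic: it is the "branch reset" construct, and it is why $1 holds the whole number no matter which alternative matched. Here is a minimal sketch of the effect in core Perl only. The $word and $number patterns are made up for illustration and have nothing to do with Regexp::Common.

    ```perl
    use 5.010;   # branch reset (?|...) needs Perl 5.10+
    use strict;
    use warnings;

    # Hypothetical hand-written patterns, each with one capture group.
    my $word   = qr{ ([a-z]+) }xms;
    my $number = qr{ (\d+)    }xms;

    # Plain alternation: groups are numbered left to right across the
    # alternatives, so a number lands in $2 and $1 stays undefined.
    my @plain = '42' =~ m{ \A (?: $word | $number ) \z }xms;

    # Branch reset: numbering restarts in each alternative, so whichever
    # branch matches fills $1.
    my @reset = '42' =~ m{ \A (?| $word | $number ) \z }xms;

    say 'plain: ', join ',', map { $_ // 'undef' } @plain;
    say 'reset: ', join ',', map { $_ // 'undef' } @reset;
    ```

    With plain alternation you would have to check several capture variables to find the matched text; with branch reset the loop can rely on $1 for both the integer and the real pattern.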


    Give a man a fish:  <%-{-{-{-<

Re^3: Using Regexp::Common
by AnomalousMonk (Archbishop) on Sep 19, 2015 at 22:36 UTC

    Ok, the penny finally dropped for the  -sep=>','  -group=>3 stuff. Is this more like what you're after? (This still needs Perl version 5.10+.)

    c:\@Work\Perl\monks\justrajdeep>perl -wMstrict -MRegexp::Common=number -le
    "my $str = '100,11,111,111,10,101,110.11010110,10,101,010.0E-01,123,111';
     ;;
     my $bin_int  = qr{ $RE{num}{int} {-keep}{-sep=>','}{-group=>3}{-base=>2} }xms;
     my $bin_real = qr{ $RE{num}{real}{-keep}{-sep=>','}{-group=>3}{-base=>2} }xms;
     ;;
     my $binary = qr{ (?| $bin_int | $bin_real) }xms;
     ;;
     while ($str =~ m{ \b $binary \b }xmsg) {
     ;;
         my $entire = $1;
         my $fraction = $6;
         my $exponential = $8;
         my ($start, $end) = ($-[1], $+[1]);
     ;;
         my $type = (defined $exponential && length $exponential) ? 'exponential'
                  : (defined $fraction && length $fraction)       ? 'real'
                  :                                                 'integer'
                  ;
     ;;
         print qq{matched $type};
         my $ruler = (' ' x $start) . '^' . ('-' x ($end - $start - 2)) . '^';
         print qq{'$str'};
         print qq{ $ruler \n};
     ;;
         }
    "
    matched integer
    '100,11,111,111,10,101,110.11010110,10,101,010.0E-01,123,111'
     ^-^

    matched integer
    '100,11,111,111,10,101,110.11010110,10,101,010.0E-01,123,111'
         ^--------^

    matched real
    '100,11,111,111,10,101,110.11010110,10,101,010.0E-01,123,111'
                    ^-----------------^

    matched exponential
    '100,11,111,111,10,101,110.11010110,10,101,010.0E-01,123,111'
                                        ^--------------^

    matched integer
    '100,11,111,111,10,101,110.11010110,10,101,010.0E-01,123,111'
                                                             ^-^

    (The  'exponential' 'real' 'integer' type classification may be a bit wobbly. Perhaps an exercise for the reader?)
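
    For anyone who wants a head start on that exercise, here is one possible sketch. It does not use Regexp::Common at all: the $binary pattern below is a hand-written stand-in with named captures, and the classify() helper and its rules (for instance, treating a decimal-digit exponent as valid) are my own assumptions, not anything the module defines.

    ```perl
    use 5.010;   # named captures and // need Perl 5.10+
    use strict;
    use warnings;

    # Hypothetical stand-in for the Regexp::Common patterns: binary digits,
    # an optional binary fraction, and an optional decimal exponent.
    my $binary = qr{
        (?<int> [01]+ )
        (?: [.]  (?<frac> [01]* )     )?
        (?: [Ee] (?<exp>  [-+]? \d+ ) )?
    }xms;

    # Classify a single token; returns undef if it is not a binary number.
    sub classify {
        my ($text) = @_;
        return unless $text =~ m{ \A $binary \z }xms;
        return 'exponential' if defined $+{exp};
        return 'real'        if defined $+{frac};
        return 'integer';
    }

    say "$_ is ", classify($_) // 'not a binary number'
        for qw(101 110.11010110 101.010E-01 10E3 123);
    ```

    Named captures make the tests read as "was there an exponent?" and "was there a fraction?" rather than relying on positions such as $6 and $8, which only make sense for one particular pattern.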


    Give a man a fish:  <%-{-{-{-<

      WOW this is a great help. I can take it from here and get it building. Thanks a ton :).