Monkomatic has asked for the wisdom of the Perl Monks concerning the following question:

Oddly, I cannot find a simple answer to this rather important question anywhere on the web.

I am trying to extract the IP address and port from multiple occurrences in a file.

if ($_ =~ /[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5}:/) { print MYFILE2 "$_ \n"; }

This finds the pattern and prints the whole line if the line contains it.

BUT... how do I match it and assign the IP address to a variable, even if the string contains multiple addresses?

I had thought I could split the file into words and then compare each word, but there has to be a better way.

Thanks in advance

monkomatic

Thank you for your quick reply, moritz, but what am I comparing, and where do I assign the data?

if ($_ =~ /[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5}:/) { print MYFILE2 "$_ \n"; }

becomes

while <$_> { while ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5}/g) { @iparray[$iparraycounter] = $1; $iparraycounter++; }}

Does @iparray now contain just the IP addresses? Here:
$_ = the entire page with addresses
@iparray = list of addresses found

THANK YOU AnomalousMonk (P.s. I like the name :)

Your example is easy to follow and almost understood ... :)

I am a bit confused as to where to insert the regex, however.

P.S. Yup, I am aware it will match 999. Those will get kicked out later, though, when I verify that the site is up.

I did try adding the following code but got an error:

my $str = "br>94.198.240.132:60988 asdfasdf 174.142.24.201:3128 asdfas +dfasdf";
if (my @matches = $str =~ m{ ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0- +9]{1,3}:[0-9]{1,5}) }xmsg) {
    print qq{matched @matches};
}
Invalid [] range "0-" in regex; marked by <-- HERE in m/ ([0-9]{1,3}\. +[0-9]{1,3}\.[0-9]{1,3}\.[0-<-- HERE +9]{1,3}:[0-9]{1,5}) / at C:\CC\BUY\ptest.pl line 43.

I tried removing the enclosing ():

$str =~ m{ [0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0- +9]{1,3}:[0-9]{1,5} }xmsg)
= same error

I tried adding / /:

$str =~ m{ /[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0- +9]{1,3}:[0-9]{1,5}/ }xmsg)
= same error

Sorry, I will add a new message next time.

Replies are listed 'Best First'.
Re: Extracting IP address from large text file.
by moritz (Cardinal) on Oct 19, 2010 at 21:25 UTC
    while (/(yourregexhere)/g) { print $1; }

    See also: perlretut

    Perl 6 - links to (nearly) everything that is Perl 6.
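A minimal sketch of how that loop might be filled in with the OP's pattern, to show where the match happens and where the data is assigned. The sample text, the in-memory filehandle and the array name @found are assumptions made for illustration; a real script would open the actual file instead.

```perl
use strict;
use warnings;

# Sample text standing in for the real file contents (hypothetical data).
my $text = "br>94.198.240.132:60988 asdfasdf\n174.142.24.201:3128 asdfasdfasdf\n";

# Read from an in-memory filehandle so the sketch runs without a file on disk.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @found;
while (my $line = <$fh>) {
    # In scalar context, /g resumes where the previous match on $line ended,
    # so the inner loop visits every address on the line, not just the first.
    while ($line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5})/g) {
        push @found, $1;    # $1 holds the text captured by the parentheses
    }
}
close $fh;

print "$_\n" for @found;
```

Using push avoids keeping a separate counter for the array index.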
Re: Extracting IP address from large text file.
by AnomalousMonk (Archbishop) on Oct 19, 2010 at 21:48 UTC

    Also:

    >perl -wMstrict - "my $str = 'here a matchme there a matchmoe too'; ;; if (my @matches = $str =~ m{ (matchmo?e) }xmsg) { print qq{matched @matches}; } "
    matched matchme matchmoe

    BTW: I assume you're aware that the regex in the OP matches things like '999.999.999...'.
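If out-of-range values such as '999.999.999.999' should be rejected at extraction time rather than later, one possible approach is to filter the matches afterwards. This is only a sketch: the helper name is_plausible and the accepted port range of 1 to 65535 are assumptions, not something stated in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if every octet is 0..255 and the port is 1..65535.
sub is_plausible {
    my ($candidate) = @_;
    my ($ip, $port) = split /:/, $candidate;
    my @octets = split /\./, $ip;
    return 0 if @octets != 4;
    return 0 if grep { $_ > 255 } @octets;
    return $port >= 1 && $port <= 65535 ? 1 : 0;
}

my $str = '999.999.999.999:80 and 94.198.240.132:60988 and 10.0.0.1:99999';

# List context with /g returns every captured address at once.
my @matches = $str =~ m{ (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}) }xmsg;

# Keep only the candidates whose numbers are in range.
my @valid = grep { is_plausible($_) } @matches;

print "kept: @valid\n";
```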

Re: Extracting IP address from large text file.
by AnomalousMonk (Archbishop) on Oct 19, 2010 at 22:15 UTC
    IS THE BELOW CORRECT?
    my $str = 'BIG BIG DATA FILE'; if (my @matches = $str =~ m{ ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0- +9]{1,3}:[0-9]{1,5}) }xmsg) { print qq{matched @matches}; }

    It will extract all occurrences of what you define as an 'ip address' to the array.

    BTW again: It is better to reply to a post immediately after (and 'below') the post rather than as an addendum to the OP: makes the conversation a lot easier to follow.

      The little red plus sign (note that it is, indeed, red in the reply above) at the beginning of +9]{1,3}: is a line-wrap flag and is not intended to be included in the regex. The proper way to write this piece of the regex would be [0-9]{1,3} or better yet \d{1,3}
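To see the effect of the stray wrap marker in isolation, the two versions of that piece can be compiled at run time inside eval. This is a small sketch; the variable names and the %compiles hash are made up for illustration.

```perl
use strict;
use warnings;

# The piece as pasted, with the wrap marker still inside the character class.
my $pasted = '[0- +9]{1,3}';
# The same piece with the marker removed, using the \d shorthand.
my $fixed  = '\d{1,3}';

my %compiles;
for my $pattern ($pasted, $fixed) {
    # Compile at run time inside eval so a bad pattern is reported, not fatal.
    my $re = eval { qr/$pattern/ };
    $compiles{$pattern} = defined $re ? 1 : 0;
    if (defined $re) {
        print "compiles: $pattern\n";
    }
    else {
        print "fails:    $pattern\n  $@";
    }
}
```

The pasted version fails because "0- " is read as a range from '0' down to a space, which is not a valid range. The /x modifier does not help here, since it does not ignore whitespace inside a bracketed character class.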

      Also: Please, Please, Puh-leeeeze use code tags. Please see Markup in the Monastery.

        Sorry, I did miss the +9, but I am still getting an error.

        # ATTEMPT 1 (works but will need to split file into words and test each word)
        my $str = "br>94.198.240.132:60988 asdfasdf 174.142.24.201:3128 asdfasdfasdf";
        ($p1, $p2, $p3 , $p4 , $p5) = ($str =~ /([0-9]{1,3}).([0-9]{1,3}).([0-9]{1,3}).([0-9]{1,3}):([0-9]{1,5})/g);
        print "$p1 $p2 $p3 $p4 $p5 \n";

        # ATTEMPT 2 (still getting an error message)
        my $str = "br>94.198.240.132:60988 asdfasdf 174.142.24.201:3128 asdfasdfasdf";
        if (my @matches = $str =~ m{ ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5}) }xmsg) {
            print qq{matched @matches};

        Missing right curly or square bracket at C:\CC\BUY\ptest2.pl line 16, at end of line
        syntax error at C:\CC\BUY\ptest2.pl line 16, at EOF
        Execution of C:\CC\BUY\ptest2.pl aborted due to compilation errors.

        Thank you again for all your help AnomalousMonk and Moritz

      I did try adding the following code but got an error:
      my $str = "br>94.198.240.132:60988 asdfasdf 174.142.24.201:3128 asdfas + +dfasdf"; if (my @matches = $str =~ m{ ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0- ++9]{1,3}:[0-9]{1,5}) }xmsg) { print qq{matched @matches}; }

      Invalid [] range "0- " in regex; marked by <-- HERE in m/ (0-9{1,3}\.0-9{1,3}\.0-9{1,3}\.0- <-- HERE +9{1,3}:0-9{1,5}) / at C:\CC\BUY\ptest.pl line 43.

      I Tried removing the enclosing () $str =~ m{ 0-9{1,3}\.0-9{1,3}\.0-9{1,3}\.0- +9{1,3}:0-9{1,5} }xmsg) = same error

      I Tried adding / / $str =~ m{ /0-9{1,3}\.0-9{1,3}\.0-9{1,3}\.0- +9{1,3}:0-9{1,5}/ }xmsg) = same error

      I also tried the below method with limited success.

      my $str = "br>94.198.240.132:60988 asdfasdf 174.142.24.201:3128 asdfasdfasdf";
      ($p1, $p2, $p3 , $p4 , $p5) = ($str =~ /([0-9]{1,3}).([0-9]{1,3}).([0-9]{1,3}).([0-9]{1,3}):([0-9]{1,5})/g);
      print "$p1 $p2 $p3 $p4 $p5 \n";

      output: 94 198 240 132 60988

      But /g does not pick up the second address this way without some kind of looping construct.. Sigh... well there is the split the entire file into words and do it for each word method available to me now at least :)
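For what it is worth, a list-context match with /g and several capture groups does return the captures from every match, as one flat list; assigning that list to five scalars simply discards everything after the first five values. A hedged sketch of one way to regroup the list, five fields at a time (the names @fields and @records are assumptions for illustration):

```perl
use strict;
use warnings;

my $str = 'br>94.198.240.132:60988 asdfasdf 174.142.24.201:3128 asdfasdfasdf';

# Five capture groups per match; with /g the list holds 5 values per address found.
my @fields = $str =~ /(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}):(\d{1,5})/g;

my @records;
# splice removes and returns the next five values until @fields is empty.
while (my @group = splice(@fields, 0, 5)) {
    push @records, { ip => join('.', @group[0 .. 3]), port => $group[4] };
}

printf "%s port %s\n", $_->{ip}, $_->{port} for @records;
```

This keeps the single regex and avoids splitting the file into words first.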

        ...well there is the split the entire file into words and do it for each word method available to me now at least :)

        I think that approach might be best for all concerned.

Re: Extracting IP address from large text file.
by Anonymous Monk on Oct 20, 2010 at 02:03 UTC

    Have a look at the various items available under Regexp::Common on CPAN . . .

      my $str = 'br>94.198.240.132:60988 asdfasdf 174.142.24.201:3128 asdfasdfasdf';
      if (my @matches = $str =~ m{ ([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5}) }xmsg) {
          print qq{matched @matches};
      }

      Finally got this to work... It was bugging me :)

      Thanks, everyone, for the help. Sorry for my stupidity in missing the embedded code.