
You'd be much more likely to read the file reliably if you took advantage of its format and keywords.

If you look for the lines storing IP addresses (the ones beginning with "IP address"), you are guaranteed to get all of the IP addresses. On the other hand, if you scan for anything that matches IPv4 syntax, you run two risks. (1) You are just crossing your fingers that only the lines that actually hold IP addresses contain values that look like IP addresses. (2) Your IP address pattern assumes IPv4, but can you really be sure you won't have a few entries in IPv6 format?

I'm guessing from your sample that your format rules look something like this:

Records are delimited by blank (all whitespace) lines
The first run of non-whitespace on the first line of a record is its id
The remaining lines consist of attribute-value pairs of the form attributeName: value

To pick out IP addresses based on attribute names, you would use a very simple state machine that determines the current record based on the value of $sId and the current attribute based on the value of $sAttribute, like this:

use strict;
use warnings;

my $sId='';
my $sAttribute='';
my $sValue='';

while (my $sLine = <DATA>) {
  chomp $sLine;

  #figure out the id of the current record
  if ($sLine !~ /^\s*$/) {
    # $sId is '' when we read the first line of a record
    # (compare against '' so that an id of "0" is not mistaken for "no id")
    if ($sId eq '') {
      # first line of record contains id, e.g. vfiler0, vfiler1, etc
      # id is first run of non-white characters - store it in $sId
      ($sId) = ($sLine =~ /^\s*(\S*)/);
    }

    #skip all vfiler0 record lines
    next if ($sId eq "vfiler0");

    #get IP addresses for other records if attribute is "IP address"
    #attribute name goes from first non-white character to first ':'
    ($sAttribute, $sValue) = ($sLine =~ /^\s*(\S[^:]*):\s*(\S*)/);

    if ($sAttribute && ($sAttribute eq 'IP address') && $sValue) {
      print "$sValue\n";
    }

  } elsif ($sLine =~ /^\s*$/) {
    # records divided by entirely blank lines
    # when not in record set id to ''
    $sId='';
  }
  #print STDERR "line ($sId): <$sLine>\n";
}
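
If you would rather collect the addresses per record than print them as you go, the same format rules can be sketched as a small function. This is only a sketch under my guessed rules above: the sample data is made up to fit those rules, and the names ips_by_record, $sample and %skip are my own inventions, not anything from your script.

```perl
use strict;
use warnings;

# Hypothetical sample data, shaped to fit the guessed format rules
# (blank-line delimited records, id first, then "name: value" lines).
my $sample = <<'END';
vfiler0                          running
   ipspace: default-ipspace
   IP address: 10.0.0.1 [e0a]

vfiler1                          running
   ipspace: default-ipspace
   IP address: 10.0.0.2 [e0b]
   IP address: 10.0.0.3 [e0c]
   Path: /vol/vol1 [/etc]
END

# Returns a hash ref mapping record id => array ref of its IP addresses.
# Any ids passed as keys of %skip (with a true value) are ignored.
sub ips_by_record {
    my ($text, %skip) = @_;
    my %ips;

    # records are delimited by blank (all whitespace) lines
    for my $record (split /\n(?:[^\S\n]*\n)+/, $text) {
        my @lines = grep { /\S/ } split /\n/, $record;
        next unless @lines;

        # id is the first run of non-white characters on the first line
        my ($id) = (shift @lines) =~ /^\s*(\S+)/;
        next if $skip{$id};

        for my $line (@lines) {
            # attribute name goes from first non-white character to first ':'
            my ($name, $value) = $line =~ /^\s*(\S[^:]*):\s*(\S*)/ or next;
            push @{ $ips{$id} }, $value
                if $name eq 'IP address' && length $value;
        }
    }
    return \%ips;
}

my $ips = ips_by_record($sample, vfiler0 => 1);
for my $id (sort keys %$ips) {
    print "$id: $_\n" for @{ $ips->{$id} };
}
```

Because the value is taken by attribute name rather than by shape, an IPv6 value on an "IP address" line would be picked up the same way as an IPv4 one.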

In reply to Re: Extracing (and excluding) IP's from a file by ELISHEVA
in thread Extracing (and excluding) IP's from a file by sporesbash
