sporesbash has asked for the wisdom of the Perl Monks concerning the following question:

I have a file containing the information below. I need to extract only the IP addresses from this file and write them to another file. I also need to ALWAYS exclude the IP information that comes under "vfiler0".
===Base File===
vfiler0 running
ipspace: default-ipspace
IP address: 10.42.156.16 [e0a]
IP address: 10.42.156.19 [bae_test]
IP address: 10.42.156.18 [e5a]
Path: / [/etc]
UUID: 00000000-0000-0000-0000-000000000000

vfiler1 running
ipspace: default-ipspace
IP address: 123.123.123.123 [unconfigured]
Path: /vol/palermo [/etc]
UUID: 4fa0d3d8-ecd6-11df-af94-00a09808463c

vfilerbae running
ipspace: default-ipspace
IP address: 123.124.123.123 [unconfigured]
Path: /vol/bae_testing [/etc]
UUID: 605f7aa8-ecd6-11df-af94-00a09808463c

===Desired Output File===
123.123.123.123
123.124.123.123
I am starting with a snippet of my script like this, but I'm running into problems, and it doesn't account for the vfiler0 removal.
#! /usr/software/bin/perl
open (FaHout, ">> /tmp/vflist2.out");
open (FaH, "/tmp/vflist1.out");
while ( <FaH> ) {
    chomp;
    if (m/{[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}}/) {
        $ip=$1;
        print FaHout "$ip\n";
    }
}
close FaH;

Replies are listed 'Best First'.
Re: Extracing (and excluding) IP's from a file
by samarzone (Pilgrim) on Nov 09, 2010 at 13:52 UTC

    You have written curly brackets instead of round brackets to capture the match into "$1".

    To exclude the "vfiler0" block, you'll have to identify where a new block begins. It may be a blank line.

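    ...
    my $vfiler = 0;    # true while we are inside the vfiler0 block
    while(<FaH>) {
        chomp;
        if(/^\s*$/) { $vfiler = 0; }      # a blank line ends the current block
        if(/vfiler0/) { $vfiler = 1; }    # the vfiler0 block starts here
        if (!$vfiler && m/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/) {
            $ip=$1;
            print FaHout "$ip\n";
        }
    }
    ...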
Re: Extracing (and excluding) IP's from a file
by SuicideJunkie (Vicar) on Nov 09, 2010 at 14:05 UTC

    You want to use round brackets to capture to $1, not curly brackets. Be sure to copy and paste code instead of retyping in order to avoid typos.
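
To illustrate the point, here is a minimal sketch using a sample line from the question. The helper name `first_ip` is made up for this example. Inside a regex, curly brackets act as quantifiers or literal braces (depending on context and Perl version) and never set $1; only round brackets capture.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first dotted quad found in a line, or undef.
sub first_ip {
    my ($line) = @_;
    # The round brackets around the whole pattern capture the match into $1.
    return $line =~ /([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/ ? $1 : undef;
}

my $line = 'IP address: 123.123.123.123 [unconfigured]';
print first_ip($line), "\n";    # prints 123.123.123.123
```

Note that this pattern only checks the shape of the address; it does not verify that each number is in the range 0 to 255.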

    You also need to tell your script what you mean by vfiler0 removal. It sounds like you meant to parse the file as records, not line by line.

    Think about how you decide what to keep... in slow motion.

    1. How far do you get before you're sure you want to ignore stuff?
      • What made you decide to start ignoring things? Write a regex to match that.
    2. Where do you go next from there?
      • What rule do you use to stop ignoring stuff?
        • There should be a unique way to identify when a new record is starting.
        • Write a regex or string comparison to match that.
        • If it crosses multiple lines, you will want a variable to track what state your match has reached so far.
    3. Make an ignore/keep flag to set or clear when you decide to process or ignore the lines as above.
    4. For each new line, check the flag before looking for IP addresses to print.

    Learning to simplify and break down your own thought processes into an algorithm that the computer can follow is a very important skill.
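
The steps above could be sketched roughly as follows. This is only one possible reading of them, not a definitive implementation: it assumes that records are separated by blank lines and that the record id is the first word on the first line of a record. The function name `ips_outside_record` and the shortened sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the IP addresses from every record except
# the one whose id is $skip_id. Assumes blank lines separate records.
sub ips_outside_record {
    my ($skip_id, @lines) = @_;
    my $ignore    = 0;    # step 3: the ignore/keep flag
    my $in_record = 0;    # have we already seen the first line of this record?
    my @ips;
    for my $line (@lines) {
        if ($line =~ /^\s*$/) {             # blank line: the record has ended
            ($ignore, $in_record) = (0, 0);
            next;
        }
        if (!$in_record) {                  # first line of a new record
            $in_record = 1;
            $ignore = $line =~ /^\s*\Q$skip_id\E(?!\S)/ ? 1 : 0;
        }
        next if $ignore;                    # step 4: check the flag first
        push @ips, $1 if $line =~ /^\s*IP address:\s*(\S+)/;
    }
    return @ips;
}

my @sample = (
    'vfiler0 running',
    'IP address: 10.42.156.16 [e0a]',
    '',
    'vfiler1 running',
    'IP address: 123.123.123.123 [unconfigured]',
);
print "$_\n" for ips_outside_record('vfiler0', @sample);    # prints 123.123.123.123
```

Resetting the flag on a blank line is what keeps the vfiler0 exclusion from leaking into the records that follow it.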

Re: Extracing (and excluding) IP's from a file
by ELISHEVA (Prior) on Nov 09, 2010 at 19:19 UTC

    You'd be much more likely to read the file reliably if you take advantage of the format and keywords.

    If you look for the lines storing IP addresses (the ones beginning with "IP address"), you are guaranteed to get all of the IP addresses. On the other hand, if you scan for whatever matches IPv4 syntax, you run two risks: (1) you are just crossing your fingers that only the lines that actually hold IP addresses contain values that look like IP addresses; (2) your IP address pattern assumes IPv4, but can you really be sure you won't have a few entries in IPv6 format?
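
A small sketch of both risks. The second and third input lines are hypothetical; nothing like them appears in the posted sample. They are only there to show how the two approaches can diverge.

```perl
use strict;
use warnings;

my @lines = (
    'IP address: 123.123.123.123 [unconfigured]',
    'IP address: fe80::1 [e0a]',              # hypothetical IPv6 entry
    'Path: /vol/backup-10.0.0.1 [/etc]',      # hypothetical path containing a dotted quad
);

# Pattern-based: take anything that looks like a dotted quad, on any line.
my @by_pattern = map { /([0-9]{1,3}(?:\.[0-9]{1,3}){3})/ ? $1 : () } @lines;

# Keyword-based: take the value of every "IP address" attribute.
my @by_keyword = map { /^\s*IP address:\s*(\S+)/ ? $1 : () } @lines;

print "pattern: @by_pattern\n";    # pattern: 123.123.123.123 10.0.0.1
print "keyword: @by_keyword\n";    # keyword: 123.123.123.123 fe80::1
```

The pattern-based scan picks up a value from the Path line and misses the IPv6 entry, while the keyword-based scan returns exactly the values of the "IP address" lines.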

    I'm guessing from your sample that your format rules look something like this:

    • Records are delimited by blank (all whitespace) lines
    • The first run of non-whitespace on the first line of a record is its id
    • The remaining lines consist of attribute value pairs: attributeName: value

    To pick out IP addresses based on attribute names, you would use a very simple state machine that determines the current record based on the value of $sId and the current attribute based on the value of $sAttribute, like this:

    use strict;
    use warnings;

    my $sId='';
    my $sAttribute='';
    my $sValue='';

    while (my $sLine = <DATA>) {
        chomp $sLine;

        #figure out the id of the current record
        if ($sLine !~ /^\s*$/) {

            # $sId is '' when we read the first line
            if (!$sId) {
                # first line of record contains id, e.g. vfiler0, vfiler1, etc
                # id is first run of non-white characters - store it in $sId
                ($sId) = ($sLine =~ /^\s*(\S*)/);
            }

            #skip all vfiler0 record lines
            next if ($sId eq "vfiler0");

            #get IP addresses for other records if attribute is "IP address"
            #attribute name goes from first non-white character to first ':'
            ($sAttribute, $sValue) = ($sLine =~ /^\s*(\S[^:]*):\s*(\S*)/);
            if ($sAttribute && ($sAttribute eq 'IP address') && $sValue) {
                print "$sValue\n";
            }

        } elsif ($sLine =~ /^\s*$/) {
            # records divided by entirely blank lines
            # when not in record set id to ''
            $sId='';
        }
        #print STDERR "line ($sId): <$sLine>\n";
    }
Re: Extracing (and excluding) IP's from a file
by usr345 (Sexton) on Nov 09, 2010 at 14:16 UTC
    If the order of categories is preserved for all the files, you can wait until 'vfiler1' is encountered and then start collecting the IPs:
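    open (FaHout, ">> /tmp/vflist2.out");    # output handle, as in the original script
    open (FaH, "/tmp/vflist1.out");
    my $flag = 0;
    while ( <FaH> ) {
        chomp;
        if($flag == 0) {
            # skip everything before the first 'vfiler1' line
            if($_ =~ /vfiler1/) { $flag = 1; }
            if($flag == 0) { next; }
        }
        # round brackets, so that the match is captured into $1
        if (m/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/) {
            $ip=$1;
            print FaHout "$ip\n";
        }
    }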
Re: Extracing (and excluding) IP's from a file
by poolpi (Hermit) on Nov 09, 2010 at 14:57 UTC
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Regexp::Common qw /net/;

    $/ = "";

    while (<DATA>){
        next if /\A vfiler0 /x;
        /$RE{net}{IPv4}{-keep}/ and print $1, "\n";
    }

    __DATA__
    vfiler0 running
    ipspace: default-ipspace
    IP address: 10.42.156.16 [e0a]
    IP address: 10.42.156.19 [bae_test]
    IP address: 10.42.156.18 [e5a]
    Path: / [/etc]
    UUID: 00000000-0000-0000-0000-000000000000

    vfiler1 running
    ipspace: default-ipspace
    IP address: 123.123.123.123 [unconfigured]
    Path: /vol/palermo [/etc]
    UUID: 4fa0d3d8-ecd6-11df-af94-00a09808463c

    vfilerbae running
    ipspace: default-ipspace
    IP address: 123.124.123.123 [unconfigured]
    Path: /vol/bae_testing [/etc]
    UUID: 605f7aa8-ecd6-11df-af94-00a09808463c

    Output:
    123.123.123.123
    123.124.123.123


    hth,
    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
Re: Extracing (and excluding) IP's from a file
by sporesbash (Initiate) on Nov 09, 2010 at 14:08 UTC
    This works for the extract but still ends up including the IP addresses from vfiler0.
    open (OUTDATA, ">> $vfile2");
    open (GETDATA, "$vfile");
    my $vfiler0 = 0;
    my $vfiler = 0;
    while(<GETDATA>) {
        chomp;
        if(/^\s*$/) { $vfiler = 0; }
        if(/vfiler0/) { $vfiler = 1; }
        if (!$vfiler && m/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/) {
            my $ip=$1;
            print OUTDATA "$ip\n";
        }
    }
    Current Output:
    10.42.156.16
    10.42.156.19
    10.42.156.18
    123.123.123.123
    123.124.123.123
    123.123.123.123
    123.124.123.123
    Thanks!
      It was a small syntax error; I got it to work. Thanks much!
Re: Extracing (and excluding) IP's from a file
by mjscott2702 (Pilgrim) on Nov 09, 2010 at 15:39 UTC
    my $seen;
    while(<>) {
        $seen ||= /vfiler1/;    # becomes true at the "vfiler1" line and stays true
        next unless $seen;
        print +(split)[2], "\n" if /^\s*IP address:/;
    }

    which would consume lines until the "vfiler1" line, then split lines beginning with "IP address" on whitespace and print the third field, which is the address you are looking for.

    Not tested! :)

Re: Extracing (and excluding) IP's from a file
by suhailck (Friar) on Nov 10, 2010 at 02:40 UTC
    perl -le 'while(<>) { $flag=1 if /vfiler1/;print $1 if $flag and m/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/}' infile