dbrock has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to find out whether I can "chomp" (or use another command on) multiple lines once I find whatever I'm searching for. I open an ASCII text file and read it into an array, then read each element of the array in a for loop:
for($i=0;$i<@txtfile;$i++) {
What I have run into is that the text file I'm reading has many lines that start with the same label. Currently I match each line with a regex and use chomp to set the variables I need:
##### Get Keyword Variable #####
if ($txtfile[$i] =~ /\bKeyword:/) {
    chomp ($keyword = $txtfile[$i]);
    $keyword =~ s/Keyword: //;
    #print "$keyword\n";
}
Is there a way to chomp/capture multiple lines while I'm already in the for loop stepping through each line? As an example, I would like to capture the following data from the excerpt below (each "Include:" separately). Text file excerpt:
Client/HW/OS/Pri: natsciexevs19 PC WindowsNET 0 0 0 0 ?
Include: Microsoft Information Store:\SG1
Include: NEW_STREAM
Include: Microsoft Information Store:\SG2
Include: NEW_STREAM
Include: Microsoft Information Store:\SG3
Exclude: (none defined)
I would like to capture each of the "Include:" lines and set them to $include1, $include2, $include3, etc. The problem I keep running into is that I overwrite my variable with the second instance of "Include".

------------------------------update------------------------------------------

This is what I have written thus far:
#!c:\perl\bin\perl.exe
#
#
$policypath = "c:\\temp\\policy\\"; #host properties file
$error = "c:\\temp\\error.txt"; #error output file
$csv = "c:\\temp\\host.csv"; #csv output file
open (CSV, ">>$csv") or die "no file as such $csv $!";
open (ERROR, ">>$error") or die "no file as such $error $!";
#print " ***$csv is currently being - opened***\n\n";
#print " ***$policypath is being opened***\n\n";
#
$polname = "Policy Name";
$poltype = "Policy Type";
$active = "Policy Active";
$compress = "Client Compress";
$nfs = "Follow NFS Mnts";
$cross = "Cross Mnt Points";
$collect = "Collect TIR info";
$block = "Block Incremental";
$mult = "Mult. Data Stream";
$snapshot = "Perform Snapshot Backup";
$offhost = "Perform Offhost Backup";
$backupcopy = "Backup Copy";
$datamover = "Use Data Mover";
$datamovertype = "Data Mover Type";
$alternateclient = "Use Alternate Client";
$alternateclientname = "Alternate Client Name";
$virtualmachine = "Use Virtual Machine";
$instantrecovery = "Enable Instant Recovery";
$policypriority = "Policy Priority";
$disasterrecovery = "Disaster Recovery";
$bmr = "Collect BMR Info";
$keyword = "Keyword";
$dataclass = "Data Classification";
$storagelifecycle = "Residence is Storage Lifecycle Policy";
$clientencrypt = "Client Encrypt";
$checkpoint = "Checkpoint";
$checkpointinterval = "Checkpoint Interval";
$residence = "Residence";
$volumepool = "Volume Pool";
$servergroup = "Server Group";
$granularrestore = "Granular Restore Info";
$generation = "Generation";
#print header row in .csv
# have not completed all of the variables yet following line not yet complete
#print (CSV "$polname,$poltype,$active,$compress,$nfs,$cross,$collect,$block,$mult,$snapshot,$offhost,$backupcopy,$datamover,$datamovertype,$alternateclient,$alternateclientname,$virtualmachine,$instantrecovery,$policypriority,$disasterrecovery,$bmr,$keyword,$dataclass,$storagelifecycle,$clientencrypt,$checkpoint,$checkpointinterval,$residence,$volumepool,$servergroup,$granularrestore,$generation\n"); # print to (xls - .csv file)
#
opendir (Directory, $policypath) or die "Cannot Open Directory at $policypath $!";
@contents = grep !/^\.\.?$/, readdir (Directory);
closedir(Directory);
foreach(@contents){
    $policyfile = "$policypath"."$_";
    #
    open(FILE, "$policyfile"); # open logfile for reading
    @txtfile = <FILE>;
    close(FILE);
    #
    # print " ***$policyfile is currently being - processed***\n\n";
    #
    for($i=0;$i<@txtfile;$i++) {
        ##### Get Policy Name Variable #####
        if ($txtfile[$i] =~ /\bPolicy Name:/) {
            chomp ($polname = $txtfile[$i]);
            $polname =~ s/Policy Name: //;
            print "$polname\n";
        }
        ##### Get Policy Type Variable #####
        if ($txtfile[$i] =~ /\bPolicy Type:/) {
            chomp ($poltype = $txtfile[$i]);
            $poltype =~ s/Policy Type: //;
            print "$poltype\n";
        }
        ##### Get Policy Active Variable #####
        if ($txtfile[$i] =~ /\bActive:/) {
            chomp ($active = $txtfile[$i]);
            $active =~ s/Active: //;
            print "$active\n";
        }
        ##### Get Client Compress Variable #####
        if ($txtfile[$i] =~ /\bClient Compress:/) {
            chomp ($compress = $txtfile[$i]);
            $compress =~ s/Client Compress: //;
            print "$compress\n";
        }
        ##### Get Follow NFS Mnts Variable #####
        if ($txtfile[$i] =~ /\bFollow NFS Mnts:/) {
            chomp ($nfs = $txtfile[$i]);
            $nfs =~ s/Follow NFS Mnts: //;
            print "$nfs\n";
        }
        ##### Get Cross Mnt Points Variable #####
        if ($txtfile[$i] =~ /\bCross Mnt Points:/) {
            chomp ($cross = $txtfile[$i]);
            $cross =~ s/Cross Mnt Points: //;
            print "$cross\n";
        }
        ##### Get Collect TIR info Variable #####
        if ($txtfile[$i] =~ /\bCollect TIR info:/) {
            chomp ($collect = $txtfile[$i]);
            $collect =~ s/Collect TIR info: //;
            print "$collect\n";
        }
        ##### Get Block Incremental Variable #####
        if ($txtfile[$i] =~ /\bBlock Incremental:/) {
            chomp ($block = $txtfile[$i]);
            $block =~ s/Block Incremental: //;
            print "$block\n";
        }
        ##### Get Mult. Data Stream Variable #####
        if ($txtfile[$i] =~ /\bMult. Data Stream:/) {
            chomp ($mult = $txtfile[$i]);
            $mult =~ s/Mult. Data Stream: //;
            print "$mult\n";
        }
        ##### Get perform Snapshot Backup Variable #####
        if ($txtfile[$i] =~ /\bPerform Snapshot Backup:/) {
            chomp ($snapshot = $txtfile[$i]);
            $snapshot =~ s/Perform Snapshot Backup: //;
            print "$snapshot\n";
        }
        ##### Get Perform Offhost Backup Variable #####
        if ($txtfile[$i] =~ /\bPerform Offhost Backup:/) {
            chomp ($offhost = $txtfile[$i]);
            $offhost =~ s/Perform Offhost Backup: //;
            print "$offhost\n";
        }
        ##### Get Backup Copy Variable #####
        if ($txtfile[$i] =~ /\bBackup Copy:/) {
            chomp ($backupcopy = $txtfile[$i]);
            $backupcopy =~ s/Backup Copy: //;
            print "$backupcopy\n";
        }
        ##### Get Use Data Mover Variable #####
        if ($txtfile[$i] =~ /\bUse Data Mover:/) {
            chomp ($datamover = $txtfile[$i]);
            $datamover =~ s/Use Data Mover: //;
            print "$datamover\n";
        }
        ##### Get Data Mover Type Variable #####
        if ($txtfile[$i] =~ /\bData Mover Type:/) {
            chomp ($datamovertype = $txtfile[$i]);
            $datamovertype =~ s/Data Mover Type: //;
            print "$datamovertype\n";
        }
        ##### Get Use Alternate Client Variable #####
        if ($txtfile[$i] =~ /\bUse Alternate Client:/) {
            chomp ($alternateclient = $txtfile[$i]);
            $alternateclient =~ s/Use Alternate Client: //;
            print "$alternateclient\n";
        }
        ##### Get Alternate Client Name Variable #####
        if ($txtfile[$i] =~ /\bAlternate Client Name:/) {
            chomp ($alternateclientname = $txtfile[$i]);
            $alternateclientname =~ s/Alternate Client Name: //;
            print "$alternateclientname\n";
        }
        ##### Get Use Virtual Machine Variable #####
        if ($txtfile[$i] =~ /\bUse Virtual Machine:/) {
            chomp ($virtualmachine = $txtfile[$i]);
            $virtualmachine =~ s/Use Virtual Machine: //;
            print "$virtualmachine\n";
        }
        ##### Get Enable Instant Recovery Variable #####
        if ($txtfile[$i] =~ /\bEnable Instant Recovery:/) {
            chomp ($instantrecovery = $txtfile[$i]);
            $instantrecovery =~ s/Enable Instant Recovery: //;
            print "$instantrecovery\n";
        }
        ##### Get Policy Priority Variable #####
        if ($txtfile[$i] =~ /\bPolicy Priority:/) {
            chomp ($policypriority = $txtfile[$i]);
            $policypriority =~ s/Policy Priority: //;
            print "$policypriority\n";
        }
        ##### Get Disaster Recovery Variable #####
        if ($txtfile[$i] =~ /\bDisaster Recovery:/) {
            chomp ($disasterrecovery = $txtfile[$i]);
            $disasterrecovery =~ s/Disaster Recovery: //;
            print "$disasterrecovery\n";
        }
        ##### Get Keyword Variable #####
        if ($txtfile[$i] =~ /\bKeyword:/) {
            chomp ($keyword = $txtfile[$i]);
            $keyword =~ s/Keyword: //;
            print "$keyword\n";
        }
        ##### Get Data Classification Variable #####
        if ($txtfile[$i] =~ /\bData Classification:/) {
            chomp ($dataclass = $txtfile[$i]);
            $dataclass =~ s/Data Classification: //;
            print "$dataclass\n";
        }
        ##### Get Residence is Storage Lifecycle Policy Variable #####
        if ($txtfile[$i] =~ /\bResidence is Storage Lifecycle Policy:/) {
            chomp ($storagelifecycle = $txtfile[$i]);
            $storagelifecycle =~ s/Residence is Storage Lifecycle Policy: //;
            print "$storagelifecycle\n";
        }
        ##### Get Client Encrypt Variable #####
        if ($txtfile[$i] =~ /\bClient Encrypt:/) {
            chomp ($clientencrypt = $txtfile[$i]);
            $clientencrypt =~ s/Client Encrypt: //;
            print "$clientencrypt\n";
        }
        ##### Get Checkpoint Variable #####
        if ($txtfile[$i] =~ /\bCheckpoint:/) {
            chomp ($checkpoint = $txtfile[$i]);
            $checkpoint =~ s/Checkpoint: //;
            print "$checkpoint\n";
        }
        ##### Get Checkpoint Interval Variable #####
        if ($txtfile[$i] =~ /\b Interval:/) {
            chomp ($checkpointinterval = $txtfile[$i]);
            $checkpointinterval =~ s/ Interval: //;
            print "$checkpointinterval\n";
        }
        ##### Get Residence Variable #####
        if ($txtfile[$i] =~ /\bResidence:/) {
            chomp ($residence = $txtfile[$i]);
            $residence =~ s/Residence: //;
            print "$residence\n";
        }
        ##### Get Volume Pool Variable #####
        if ($txtfile[$i] =~ /\bVolume Pool:/) {
            chomp ($volumepool = $txtfile[$i]);
            $volumepool =~ s/Volume Pool: //;
            print "$volumepool\n";
        }
        ##### Get Server Group Variable #####
        if ($txtfile[$i] =~ /\bServer Group:/) {
            chomp ($servergroup = $txtfile[$i]);
            $servergroup =~ s/Server Group: //;
            print "$servergroup\n";
        }
        ##### Get Granular Restore Info Variable #####
        if ($txtfile[$i] =~ /\bGranular Restore Info:/) {
            chomp ($granularrestore = $txtfile[$i]);
            $granularrestore =~ s/Granular Restore Info: //;
            print "$granularrestore\n";
        }
        ### have not finished script yet at this point ###
        ### the lines of the data file start to have duplicates ###
        ### this is where my question begins about multiple lines ###
    } #for loop
    #
    # at this point I will print each set of variables to the CSV file ###
    # inside of the for loop
    #
}#foreach loop
#
#
close (CSV);
close (ERROR);
###EOF###
An example of the file/array I am reading through is listed below. My specific question concerns the lines near the bottom of the example that repeat, i.e. Include, Schedule, Sunday, Monday, Tuesday, etc. I'm not quite sure how to capture multiple identical lines and have the variables I'm setting increment, i.e. $include1, $include2, etc.
Policy Name: EXCHANGE_19
Options: 0x0
template: FALSE
c_unused1: ?
Names: (none)
Policy Type: MS-Exchange-Server (16)
Active: yes
Effective date: 03/23/2004 16:14:45
Mult. Data Stream: yes
Perform Snapshot Backup: no
Snapshot Method: (none)
Snapshot Method Arguments: (none)
Perform Offhost Backup: no
Backup Copy: 0
Use Data Mover: no
Data Mover Type: 2
Use Alternate Client: no
Alternate Client Name: (none)
Use Virtual Machine: no
Enable Instant Recovery: no
Policy Priority: 500
Max Jobs/Policy: Unlimited
Disaster Recovery: 0
Collect BMR Info: no
Keyword: (none specified)
Data Classification: -
Residence is Storage Lifecycle Policy: no
Client Encrypt: no
Checkpoint: no
Residence: MA_06_HCART2_NATB
Volume Pool: MA_06_100000_HCART2
Server Group: *ANY*
Granular Restore Info: no
Generation: 12
Client/HW/OS/Pri: natsciexevs19 PC WindowsNET 0 0 0 0 ?
Include: Microsoft Information Store:\SG1
Include: NEW_STREAM
Include: Microsoft Information Store:\SG2
Include: NEW_STREAM
Include: Microsoft Information Store:\SG3
Exclude: (none defined)
Schedule: Exchangefull
Type: FULL SExchange (0)
Frequency: 1 day(s) (86400 seconds)
EXCLUDE DATE 0 - 11/25/2004
Maximum MPX: 4
Synthetic: 0
PFI Recovery: 0
Retention Level: 1 (2 weeks)
u-wind/o/d: 0 0
Incr Type: DELTA (0)
Alt Read Host: (none defined)
Max Frag Size: 0 MB
Number Copies: 1
Fail on Error: 0
Residence: (specific storage unit not required)
Volume Pool: (same as policy volume pool)
Server Group: (same as specified for policy)
Residence is Storage Lifecycle Policy: 0
Daily Windows:
Day        Open       Close      W-Open     W-Close
Sunday     017:00:00  021:10:00  017:00:00  021:10:00
Monday     017:00:00  021:00:00  041:00:00  045:00:00
Tuesday    017:00:00  021:00:00  065:00:00  069:00:00
Wednesday  017:00:00  021:00:00  089:00:00  093:00:00
Thursday   017:00:00  021:00:00  113:00:00  117:00:00
Friday     017:00:00  021:00:00  137:00:00  141:00:00
Saturday   017:00:00  021:00:00  161:00:00  165:00:00
Schedule: UsrBackup
Type: UBAK Exchange (2)
Frequency: 0 day(s) (0 seconds)
Maximum MPX: 1
Synthetic: 0
PFI Recovery: 0
Retention Level: 0 (1 week)
u-wind/o/d: 0 0
Incr Type: DELTA (0)
Alt Read Host: (none defined)
Max Frag Size: 0 MB
Number Copies: 1
Fail on Error: 0
Residence: (specific storage unit not required)
Volume Pool: (same as policy volume pool)
Server Group: (same as specified for policy)
Residence is Storage Lifecycle Policy: 0
Daily Windows:
Day        Open       Close      W-Open     W-Close
Sunday     000:00:00  168:00:00  000:00:00  168:00:00
Monday     000:00:00  000:00:00
Tuesday    000:00:00  000:00:00
Wednesday  000:00:00  000:00:00
Thursday   000:00:00  000:00:00
Friday     000:00:00  000:00:00
Saturday   000:00:00  000:00:00
any assistance would be helpful... thank you...

Re: capture multiple lines
by toolic (Bishop) on Apr 21, 2009 at 16:35 UTC
    You could just chomp each line as you read it in:
    my @txtfile;
    while (<$fh>) {
        chomp;
        push @txtfile, $_;
    }

    Alternatively, after you have read the file into your array, chomp the whole array:

    chomp @txtfile;
Re: capture multiple lines
by bichonfrise74 (Vicar) on Apr 21, 2009 at 17:32 UTC
    Another possible solution...
    #!/usr/bin/perl
    use strict;

    my @array;
    while( <DATA> ) {
        # keep every line that starts with "Include:"
        push( @array, $1 ) if ( /^(Include:\s.*)/ );
    }
    # $1 does not include the trailing newline, so print one per entry
    print "$_\n" for @array;

    __DATA__
    Client/HW/OS/Pri: natsciexevs19 PC WindowsNET 0 0 0 0 ?
    Include: Microsoft Information Store:\SG1
    Include: NEW_STREAM
    Include: Microsoft Information Store:\SG2
    Include: NEW_STREAM
    Include: Microsoft Information Store:\SG3
    Exclude: (none defined)
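The same idea can be applied inside the original index-based for loop. The following is only a minimal sketch: the @txtfile contents are hypothetical sample data standing in for the policy file, and @includes is an assumed name for an array that takes the place of $include1, $include2, and so on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the policy file read into @txtfile.
my @txtfile = (
    "Client/HW/OS/Pri: natsciexevs19 PC WindowsNET 0 0 0 0 ?\n",
    "Include: Microsoft Information Store:\\SG1\n",
    "Include: NEW_STREAM\n",
    "Include: Microsoft Information Store:\\SG2\n",
    "Include: NEW_STREAM\n",
    "Include: Microsoft Information Store:\\SG3\n",
    "Exclude: (none defined)\n",
);

my @includes;    # assumed name; replaces $include1, $include2, ...

for (my $i = 0; $i < @txtfile; $i++) {
    # Capture the text after "Include:" up to, but not including, the newline.
    if ($txtfile[$i] =~ /^\s*Include:\s*(.*)$/) {
        push @includes, $1;    # append instead of overwriting one scalar
    }
}

# $includes[0] plays the role of $include1, $includes[1] of $include2, etc.
for my $n (0 .. $#includes) {
    printf "include%d = %s\n", $n + 1, $includes[$n];
}
```

Because each match is pushed onto the array, a later "Include:" line no longer overwrites an earlier one, and the capture group leaves out the newline, so no separate chomp is needed.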
Re: capture multiple lines
by jck000 (Novice) on Apr 21, 2009 at 16:50 UTC
    Use grep to get all matches into a new array:
    @matches = grep { /keyword/ } @file_records;
    foreach (@matches) {
        push(@chomped, chomp($_));
    }

    Jack
      foreach (@matches) { push(@chomped, chomp($_)); }

      Unfortunately, that's not going to do what you want. chomp returns the number of characters removed from all of its arguments, in this case, if on *nix, 1 each time for a newline. Thus, your @chomped will just contain a series of '1's. The following shows what happens with your method and also a couple of other ways to do it. (Note the -l command-line switch to append a newline after each print operation and the $count variable in the third snippet to show what chomp returns.)

      $ cat alpha
      abc
      def
      ghi
      $ perl -le '
      -> @arr = <>;
      -> push @chomped, chomp( $_ ) for @arr;
      -> print for @chomped;' alpha
      1
      1
      1
      $ perl -le '
      -> @arr = <>;
      -> @chomped = map { chomp; $_ } @arr;
      -> print for @chomped;' alpha
      abc
      def
      ghi
      $ perl -le '
      -> @arr = <>;
      -> $count = chomp @arr;
      -> print $count;
      -> print for @arr;' alpha
      3
      abc
      def
      ghi
      $
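One more variant, sketched here under the assumption that the goal is the grep approach with the newlines removed: chomp accepts a list assignment as its argument, so the matches can be selected and chomped in a single statement. The contents of @file_records below are hypothetical sample data.

```perl
use strict;
use warnings;

# Hypothetical records, each still carrying its trailing newline.
my @file_records = ("keyword one\n", "other\n", "keyword two\n");

# grep selects the matching records; the assignment copies them into @matches,
# and chomp then strips the newline from each copy. chomp's return value
# (the number of characters removed) is ignored here.
chomp(my @matches = grep { /keyword/ } @file_records);

print "$_\n" for @matches;
```

Since chomp works on the copies in @matches, the original @file_records keeps its newlines.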

      I hope this is of interest.

      Cheers,

      JohnGG