svnipp has asked for the wisdom of the Perl Monks concerning the following question:

I need to learn how to parse a portion of a text file. The text file holds the LVM filesystem information for every filesystem on a particular server. The format is as follows, and a single file could contain over 100 sets of similar data.
<---snipp--->
--- Logical volumes ---
LV Name                     /dev/vg00/lvol3
VG Name                     /dev/vg00
LV Permission               read/write
LV Status                   available/syncd
Mirror copies               1
Consistency Recovery        MWC
Schedule                    parallel
LV Size (Mbytes)            300
Current LE                  75
Allocated PE                150
Stripes                     0
Stripe Size (Kbytes)        0
Bad block                   off
Allocation                  strict/contiguous
IO Timeout (Seconds)        default

--- Distribution of logical volume ---
PV Name            LE on PV  PE on PV
/dev/dsk/c3t6d0    75        75
/dev/dsk/c38t6d0   75        75

--- Logical extents ---
<---snip--->
The "LV Name" line is the one I need to match, and I need to parse the data through the "Logical extents" line. I am trying to figure out the best way to accomplish this. Thanks in advance for any ideas.

Scott

Re: Getting only part of a text file...
by GrandFather (Saint) on Jun 08, 2006 at 20:49 UTC

    The scalar context range operator (flip flop) can be powerful fu:

    use strict;
    use warnings;

    my $match = 'lvol[35]';

    while (<DATA>) {
        next if ! (my $it = /^LV Name/ .. /^LV Name/ && ! /$match/);
        if ($it =~ /E0/) {
            print "---\n";
            next;
        }
        print "$it: $_";
    }

    __DATA__
    --- Logical volumes ---
    LV Name /dev/vg00/lvol3
    VG Name /dev/vg00
    Various stuff omitted
    LV Name /dev/vg00/lvol4
    VG Name /dev/vg00
    Various stuff omitted
    LV Name /dev/vg00/lvol5
    VG Name /dev/vg00
    Various stuff omitted
    LV Name /dev/vg00/lvol6
    VG Name /dev/vg00
    Various stuff omitted

    Prints:

    1: LV Name /dev/vg00/lvol3
    2: VG Name /dev/vg00
    3: Various stuff omitted
    ---
    1: LV Name /dev/vg00/lvol5
    2: VG Name /dev/vg00
    3: Various stuff omitted
    ---

    If you are unfamiliar with it, take a look at Flipin good, or a total flop?.
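
    For anyone puzzled by the /E0/ test above: in scalar context the range operator returns the empty string while it is "off", a sequence number while it is "on", and that number with "E0" appended on the line that ends the range. Here is a minimal sketch of just that behaviour. The helper name flip_flop_tags and the BEGIN/END markers are made up for illustration and are not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: return the flip-flop value seen for each input line.
# Lines outside the range are tagged 'outside' (the operator returns '').
sub flip_flop_tags {
    my @lines = @_;
    my @tags;
    for (@lines) {
        # Scalar assignment puts ".." in scalar context, i.e. flip-flop mode.
        my $seq = /^BEGIN/ .. /^END/;
        push @tags, $seq eq '' ? 'outside' : $seq;
    }
    return @tags;
}

# The range opens on "BEGIN" and closes on "END".
my @tags = flip_flop_tags('x', 'BEGIN', 'a', 'END', 'y');
print join(' ', @tags), "\n";
```

    The tags come out as outside, 1, 2, 3E0, outside. One thing to keep in mind: the operator remembers its state between calls, so if the input stops in the middle of a range, the next call starts out "on".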


    DWIM is Perl's answer to Gödel
Re: Getting only part of a text file...
by traveler (Parson) on Jun 08, 2006 at 20:33 UTC
    It is probably easier to work with the information directly. If you are using LVM1 (and I think that output above is LVM2) you could use Linux::LVM. If you are indeed using LVM2, you could update the module so everyone could use it.

      Hi everybody, I'm just a Perl newbie who entered through these monastery gates not so many weeks ago... I was wondering if I could add some extra functionality and fixes to the module Linux::LVM instead of forking it, as many people do with many modules.

      I believe I'm able to work on the module itself with the creator's permission, or to do it in some other way.

      What shall I do? Any comments, help and guidelines would be much appreciated, fellow brothers. In the meantime, I'll try to contact the module creator.

      Thank you!

      (let my faith in Perl Best Practices, Perl::Critic and nice coding guidelines enlighten my path...)

Re: Getting only part of a text file...
by crashtest (Curate) on Jun 08, 2006 at 20:52 UTC

    [Update: GrandFather beat me to it.]

    This might be a good place to make use of the flip-flop operator. It can help you avoid the ugly $start / $end / $started variables to keep track of where you are:

    use strict;
    use warnings;

    while (<DATA>) {
        # Returns true after matching "LV Name",
        # and returns false after
        # matching "--- Logical extents ---".
        if (/^LV Name/ .. /--- Logical extents ---/) {
            print "$_";
        }
    }

    __DATA__
    Line above
    --- Logical volumes ---
    LV Name /dev/vg00/lvol3
    VG Name /dev/vg00
    LV Permission read/write
    ...
    --- Logical extents ---
    Line below
    Output:
    LV Name /dev/vg00/lvol3
    VG Name /dev/vg00
    LV Permission read/write
    ...
    --- Logical extents ---
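
    Since the file may hold 100 or more of these sets, it may be handier to collect each block separately rather than print the lines as they go by. A rough sketch of one way to do that with the same flip-flop, using the "E0" suffix to spot the last line of a block. The helper name lv_blocks and the sample text are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: read a filehandle and return a reference to a list of
# blocks, each block being an array ref of the lines from "LV Name" through
# "--- Logical extents ---".
sub lv_blocks {
    my ($fh) = @_;
    my (@blocks, @current);
    while (my $line = <$fh>) {
        my $seq = ($line =~ /^LV Name/) .. ($line =~ /^--- Logical extents ---/);
        next unless $seq;                 # outside any block
        push @current, $line;
        if ($seq =~ /E0$/) {              # last line of the current block
            push @blocks, [@current];
            @current = ();
        }
    }
    return \@blocks;
}

# Made-up sample, read through an in-memory filehandle.
my $sample = join '', map { "$_\n" }
    'Line above',
    '--- Logical volumes ---',
    'LV Name /dev/vg00/lvol3',
    'VG Name /dev/vg00',
    '--- Logical extents ---',
    'Line between',
    '--- Logical volumes ---',
    'LV Name /dev/vg00/lvol4',
    'VG Name /dev/vg00',
    '--- Logical extents ---';

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $blocks = lv_blocks($fh);
close $fh;
printf "%d blocks, first starts with: %s", scalar @$blocks, $blocks->[0][0];
```

    A block that is cut off before its "--- Logical extents ---" line is silently dropped here; whether that is acceptable depends on your data.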

Re: Getting only part of a text file...
by dsheroh (Monsignor) on Jun 08, 2006 at 20:09 UTC
    Unless you're dealing with either fixed-size records or a pre-indexed text file, you don't really have much choice other than to scan through it ignoring the parts you're not interested in.
    my $started = 0;
    open(LVMINFO, '<', $filename) or die "Cannot open $filename: $!";
    while (<LVMINFO>) {
        $started = 1 if /^LV Name/;
        next unless $started;
        last if /--- Logical extents ---/;
        chomp;
        # Process
        # Process
        # Process
    }
    close(LVMINFO);
Re: Getting only part of a text file...
by thundergnat (Deacon) on Jun 08, 2006 at 20:59 UTC

    Alternately, you can play with the input record separator to get the chunks you need.

    use strict;
    use warnings;
    use Data::Dumper;

    my @filesys;

    {
        local $/ = '--- Logical extents ---';
        while ( my $record = <DATA> ) {
            chomp $record;
            next unless $record =~ /--- Logical volumes ---/;
            $record =~ s/^.*--- Logical volumes ---\n//s;
            my ($beginning, $end) = split /--- Distribution of logical volume ---/, $record;
            # Names and values are separated by runs of two or more spaces
            my %params = split / {2,}|\n/, $beginning;
            my @lines = split /\n/, $end;
            for my $line (@lines) {
                next if $line =~ /PV Name/;
                next if $line !~ /\S/;
                my ($name, $le, $pe) = split ' ', $line;
                push @{$params{PV}}, {'Name', $name, 'LE on PV', $le, 'PE on PV', $pe};
            }
            push @filesys, \%params;
        }
    }

    print Dumper \@filesys;

    __DATA__
    <---snipp--->
    --- Logical volumes ---
    LV Name                     /dev/vg00/lvol3
    VG Name                     /dev/vg00
    LV Permission               read/write
    LV Status                   available/syncd
    Mirror copies               1
    Consistency Recovery        MWC
    Schedule                    parallel
    LV Size (Mbytes)            300
    Current LE                  75
    Allocated PE                150
    Stripes                     0
    Stripe Size (Kbytes)        0
    Bad block                   off
    Allocation                  strict/contiguous
    IO Timeout (Seconds)        default
    --- Distribution of logical volume ---
    PV Name            LE on PV  PE on PV
    /dev/dsk/c3t6d0    75        75
    /dev/dsk/c38t6d0   75        75
    --- Logical extents ---
    Some other bogus crap that needs to be ignored.
    --- Logical volumes ---
    LV Name                     /dev/vg00/lvol33
    VG Name                     /dev/vg003
    LV Permission               read/write
    LV Status                   available/syncd
    Mirror copies               1
    Consistency Recovery        MWC
    Schedule                    parallel
    LV Size (Mbytes)            500
    Current LE                  70
    Allocated PE                150
    Stripes                     0
    Stripe Size (Kbytes)        0
    Bad block                   off
    Allocation                  strict/contiguous
    IO Timeout (Seconds)        default
    --- Distribution of logical volume ---
    PV Name            LE on PV  PE on PV
    /dev/dsk/c3t6d03   75        75
    /dev/dsk/c38t6d03  75        75
    --- Logical extents ---
    <---snip--->
Re: Getting only part of a text file...
by rodion (Chaplain) on Jun 08, 2006 at 20:44 UTC
    Here's my best guess at what you're looking for (tested). I left the "Distribution of ..." section out for brevity. It would be just another "if ($section eq ...)" clause pushing data onto an array ref in the %data hash. If more sections are needed, consider generalizing the section parsing. Hope this helps.
    sub Get_Sect {
        my $targetLV = shift;
        my $section = '';
        my ($name, $value);
        my %data;
        open my $IN, '<', 'testpars.dat' or die "Cannot open testpars.dat: $!";
        LINE:
        while (<$IN>) {
            chomp;
            if (/^\s*$/) {
                # A blank line ends the block; return it if it is the requested LV
                return \%data
                    if defined $data{'LV_Name'} && $data{'LV_Name'} eq $targetLV;
                $section = '';
                next LINE;
            }
            if (/^--- Logical volumes ---/) {
                $section = 'vol';
                next LINE;
            }
            if ($section eq 'vol') {
                ($name, $value) = unpack('a28 a*', $_);
                $name =~ s/(\s+\(.+\))?\s+$//;    # remove paren & spaces at end
                $name =~ s/\s/_/g;
                $data{$name} = $value;
            }
        }
        return;    # target LV not found
    }
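
    The extra "if ($section eq ...)" clause mentioned above might look roughly like the sketch below. This is not the code from the reply: it works on a string holding a single block, it splits names from values on runs of two or more spaces rather than using unpack (an assumption about the layout), and the names parse_lv_block, PV, LE and PE are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one "Logical volumes" block, including the
# "Distribution of logical volume" table, into a hash reference.
sub parse_lv_block {
    my ($text) = @_;
    my %data;
    my $section = '';
    for my $line (split /\n/, $text) {
        if ($line =~ /^\s*--- Logical volumes ---/) { $section = 'vol'; next; }
        if ($line =~ /^\s*--- Distribution of logical volume ---/) { $section = 'dist'; next; }
        if ($line =~ /^\s*--- /) { $section = ''; next; }    # any other section is ignored
        next unless $line =~ /\S/;                           # skip blank lines

        if ($section eq 'vol') {
            # Assumption: name and value are separated by two or more spaces.
            my ($name, $value) = split /\s{2,}/, $line, 2;
            next unless defined $value;
            $name =~ s/\s+\(.+\)$//;    # drop a trailing "(units)" part
            $name =~ s/\s/_/g;
            $data{$name} = $value;
        }
        elsif ($section eq 'dist') {
            next if $line =~ /^\s*PV Name/;    # column header
            my ($pv, $le, $pe) = split ' ', $line;
            push @{ $data{PV} }, { Name => $pv, LE => $le, PE => $pe };
        }
    }
    return \%data;
}

# Made-up sample block in the layout from the question.
my $block = join "\n",
    '--- Logical volumes ---',
    'LV Name                     /dev/vg00/lvol3',
    'LV Size (Mbytes)            300',
    '',
    '--- Distribution of logical volume ---',
    'PV Name            LE on PV  PE on PV',
    '/dev/dsk/c3t6d0    75        75',
    '/dev/dsk/c38t6d0   75        75',
    '',
    '--- Logical extents ---';

my $lv = parse_lv_block($block);
print "$lv->{LV_Name} sits on ", scalar @{ $lv->{PV} }, " physical volumes\n";
```

    Keeping the physical volumes as a list of small hashes under one key means the rest of the hash stays a flat name/value map, which is about as far as I would generalize before reaching for a real parser.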
Re: Getting only part of a text file...
by moot (Chaplain) on Jun 08, 2006 at 22:06 UTC
    Since no one else has mentioned it, you might also want to look at Parse::RecDescent: write a parser that responds to the lines in which you are interested. It might not be suitable for your application, since it can be somewhat slow, but if you like writing BNF it might be a good solution.