exsnafu has asked for the wisdom of the Perl Monks concerning the following question:

I have a bunch of files that contain paragraphs of text, and I want to print two pieces from each paragraph as a single line. Seems easy enough, but I think I'm missing something fundamental. Sample file contents:

Pseudo name=hdiskpower97
Symmetrix ID=000187751303
Logical device ID=0380
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   0 fscsi0                   hdisk100  FA 10cA   active  alive      0      0
   2 fscsi2                   hdisk254  FA  7cA   active  alive      0      0

Pseudo name=hdiskpower90
Symmetrix ID=000187751303
Logical device ID=0381
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   2 fscsi2                   hdisk247  FA  7cA   active  alive      0      0
   0 fscsi0                   hdisk93   FA 10cA   active  alive      0      0

now, the code I've written thus far:

#!/usr/bin/perl -w
use strict;
use warnings;

my @filelist = <usc*.devs>;
foreach my $file (@filelist) {
    open(FH,$file) or die "cant open $file: $!";
    while(<FH>) {
        if(/^.*Pseudo.*?\=(.*?$)/ .. /^.*Logical.*?\=(.{4})/sg) {
            print "$1,$2\n";
        }
    }
    close(FH);
}

OK, so I'd expect it to fill $1 and $2 with my groupings and print just those two fields, but $2 is uninitialized, and if I just print $1 it seems to go back and forth between my groupings. Clearly I'm missing something basic about how this works. Any advice?

Re: help printing items across multiple lines
by mr_mischief (Monsignor) on Jun 04, 2008 at 17:47 UTC
    What you have there isn't a regex with two captures. It's two regexes with a flip-flop operator between them. I think you want to replace .. with || and just print $1. There is no $2 in your code.
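
    To illustrate the point above, here is a minimal sketch of what the range (flip-flop) operator does in scalar context. The @lines array and @in_range are made up for the illustration, and the regexes are simplified from the original post. The operator becomes true on a line matching the left regex and stays true through the line matching the right regex; the two regexes are separate matches, so their captures never combine into $1 and $2.

```perl
use strict;
use warnings;

# A few lines from the sample data, held in an array for illustration.
my @lines = (
    'Pseudo name=hdiskpower97',
    'Symmetrix ID=000187751303',
    'Logical device ID=0380',
    'state=alive; policy=SymmOpt; priority=0; queued-IOs=0',
);

my @in_range;
for (@lines) {
    # True from the line matching /^Pseudo/ through the line matching
    # /^Logical/, inclusive; false again on the lines after that.
    if (/^Pseudo/ .. /^Logical/) {
        push @in_range, $_;
    }
}

# Prints the first three lines; the "state=" line falls outside the range.
print "$_\n" for @in_range;
```

    This is why printing $1 inside the range appeared to alternate: $1 holds whatever the most recent successful match captured, which is a different regex on different lines.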
      Aha, yes. My understanding of the flip-flop operator was flawed, but now the behavior makes sense to me. However, since I'm now just grabbing $1 twice across two regexes, is there any way to print both captures on one line, with one regex? I could put two if statements in there with one print per regex, but that seems clunky. My end goal is a file in which each line shows the Pseudo name and Logical device ID fields as a pair, one line per paragraph entry.
Re: help printing items across multiple lines
by toolic (Bishop) on Jun 04, 2008 at 20:54 UTC
    Here is a solution that does not use the Range Operator:

    use strict;
    use warnings;

    my $name;
    while (<DATA>) {
        if (/^.*Pseudo.*?\=(.*?$)/) { $name = $1 }
        if (/^.*Logical.*?\=(.{4})/sg) { print "$name,$1\n" }
    }
    __DATA__
    Pseudo name=hdiskpower97
    Symmetrix ID=000187751303
    Logical device ID=0380
    state=alive; policy=SymmOpt; priority=0; queued-IOs=0
    ==============================================================================
    ---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
    ###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
    ==============================================================================
       0 fscsi0                   hdisk100  FA 10cA   active  alive      0      0
       2 fscsi2                   hdisk254  FA  7cA   active  alive      0      0

    Pseudo name=hdiskpower90
    Symmetrix ID=000187751303
    Logical device ID=0381
    state=alive; policy=SymmOpt; priority=0; queued-IOs=0
    ==============================================================================
    ---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
    ###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
    ==============================================================================
       2 fscsi2                   hdisk247  FA  7cA   active  alive      0      0
       0 fscsi0                   hdisk93   FA 10cA   active  alive      0      0

    prints:

    hdiskpower97,0380
    hdiskpower90,0381
Re: help printing items across multiple lines
by GrandFather (Saint) on Jun 04, 2008 at 21:14 UTC

    Sometimes a useful trick is to change the input line separator ($/ - perlvar). Consider:

    use strict;
    use warnings;

    local $/ = 'Pseudo name=';
    /^(\w*).*?Logical device ID=(\w*)/s and print "$1, $2\n" while <DATA>;
    __DATA__
    Pseudo name=hdiskpower97
    Symmetrix ID=000187751303
    ...

    Given all the sample data, this prints:

    hdiskpower97, 0380
    hdiskpower90, 0381
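
    The same record-separator idea can be sketched with a lexical filehandle, which is closer to how the original script reads its files. This is only an illustration under some assumptions: the data is read from an in-memory string ($sample, an abbreviated copy of the sample with the table rows left out) rather than from the usc*.devs files, and the names $fh, $record and @pairs are made up here.

```perl
use strict;
use warnings;

# Abbreviated copy of the sample data (table rows omitted for brevity).
my $sample = <<'END';
Pseudo name=hdiskpower97
Symmetrix ID=000187751303
Logical device ID=0380
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
Pseudo name=hdiskpower90
Symmetrix ID=000187751303
Logical device ID=0381
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
END

# Read from the string as if it were a file.
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";

my @pairs;
{
    # Each read now returns everything up to and including 'Pseudo name=',
    # so a record starts with the name and contains the rest of its paragraph.
    local $/ = 'Pseudo name=';
    while (my $record = <$fh>) {
        chomp $record;    # chomp strips the current value of $/
        next unless $record =~ /^(\w+).*?Logical device ID=(\w+)/s;
        push @pairs, "$1,$2";
    }
}
close $fh;

print "$_\n" for @pairs;
```

    Keeping the local $/ inside a bare block restores the default line separator afterwards, which matters if the script reads other files line by line later on.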

    Perl is environmentally friendly - it saves trees
Re: help printing items across multiple lines - one regex, one hash, two keys
by Narveson (Chaplain) on Jun 04, 2008 at 22:14 UTC

    Let Pseudo and Logical be the two keys of a hash that you are seeking to populate as you read through each paragraph.

    Write a regular expression with two captures, the first of which is either Pseudo or Logical and the second of which is what comes after the equal sign.

    As soon as your hash has both the desired keys, you are ready to print its values and start over.

    my @KEYS = qw( Pseudo Logical );
    my $key_count = scalar @KEYS;
    my $key_alternation = join q{|}, @KEYS;
    my $key_value_pattern = qr/
        ($key_alternation)  # capture a key
        [^=]*               # skip what is not an equal sign
        =                   # match the equal sign
        (.*\S)              # capture the value
    /x;

    PARAGRAPH:
    while (not my %value_for) {
        LINE:
        while (<DATA>) {
            my ($key, $value) = /$key_value_pattern/;
            next LINE if !$key;
            $value_for{$key} = $value;
            if (scalar keys %value_for >= $key_count) {
                print join (q{, }, @value_for{@KEYS}), "\n";
                next PARAGRAPH;
            }
        }
        # eof
        last PARAGRAPH;
    }
    __DATA__

    With the given data, this prints:

    hdiskpower97, 0380
    hdiskpower90, 0381
      Thanks, all, for the responses. Some great stuff here, and I like the hash approach, as I think I can now use it for the second half of my script, which will be a third match to check and print only those keys that show up in another file. Again, much appreciated!
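
      One possible shape for that second half, offered only as a rough sketch: load the names from the other file into a hash and keep just the pairs whose Pseudo name is present. Everything here is assumed for illustration: the other file's format (one pseudo name per line), the in-memory stand-in $wanted_text, the name hdiskpower12, and the variables %is_wanted, @pairs and @kept.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the "other file": one pseudo name per line.
my $wanted_text = "hdiskpower90\nhdiskpower12\n";
open my $wanted_fh, '<', \$wanted_text
    or die "cannot open in-memory file: $!";

# Build a lookup hash so each membership test is a single hash access.
my %is_wanted;
while (my $name = <$wanted_fh>) {
    chomp $name;
    $is_wanted{$name} = 1;
}
close $wanted_fh;

# Name/ID pairs as any of the approaches in this thread would produce them.
my @pairs = ( [ 'hdiskpower97', '0380' ], [ 'hdiskpower90', '0381' ] );

# Keep only the pairs whose pseudo name appears in the lookup hash.
my @kept = grep { $is_wanted{ $_->[0] } } @pairs;

print join(', ', @$_), "\n" for @kept;
```

      With these made-up inputs only the hdiskpower90 pair survives the filter.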