babelfish has asked for the wisdom of the Perl Monks concerning the following question:
Dear fellow monks,
I have a problem collecting data from a record-oriented stream.
In particular, I need to collect the strings in each paragraph that appear on different lines and match a common regex. My current approach does not work properly: only the first occurrence of the regex is found, and the others are skipped.
The data stream I want to process looks like this:
    ### HEADING OF RECORD 1 ####
    Logical device ID=08E1
    LINE_THAT_DOES_NOT_BOTHER_ME
    ANOTHER_LINE_THAT_DOES_NOT_BOTHER_ME
    29 8/0/2/1/0.18.152.0.0.6.1 c29t6d1 FA 5eA
    30 8/0/3/1/0.17.152.0.0.6.1 c30t6d1 FA 12e
    31 8/0/8/1/0.17.150.0.0.6.1 c31t6d1 FA 10eA
    32 8/0/9/1/0.18.150.0.0.6.1 c32t6d1 FA 11eA

    ### HEADING OF RECORD 2 ####
    Logical device ID=08E2
    LINE_THAT_DOES_NOT_BOTHER_ME
    ANOTHER_LINE_THAT_DOES_NOT_BOTHER_ME
    29 8/0/2/1/0.18.152.0.0.4.1 c29t4d1 FA 5eA
    30 8/0/3/1/0.17.152.0.0.4.1 c30t4d1 FA 12eA
    31 8/0/8/1/0.17.150.0.0.4.1 c31t4d1 FA 10eA
    32 8/0/9/1/0.18.150.0.0.4.1 c32t4d1 FA 11eA

    ### HEADING OF RECORD 3 ####
    (...)
The task is as follows:
Create a hash of arrays from that stream, where the "Logical device ID" values are the keys and the cXtYdZ strings of each record are collected in an array as the corresponding value:
    %hash = (
        '08E1' => ['c29t6d1','c30t6d1','c31t6d1','c32t6d1'],
        '08E2' => ['c29t4d1','c30t4d1','c31t4d1','c32t4d1'],
        (...)
    )
I am using this code to process the stream:
    use strict;
    use warnings;
    use Data::Dumper;

    my %hash;
    open ( FH, "powermt display dev=all|"); # data stream comes from here
    $/ = '';
    while (<FH>) {
        my ($id) = ( $_ =~ /Logical device ID=(\w+)/ );
        push (@{$hash{$id}}, $1) if /(c\d+t\d+d\d+)/;
    }
    print Dumper (\%hash);
But with this code, I get a HoA containing only the first occurrence of the regex within each paragraph, like this:
    $VAR1 = {
              '08E1' => [
                          'c29t6d1'
                        ],
              '08E2' => [
                          'c29t4d1'
                        ],
              (...)
So far, the record processing itself seems to work OK, but I am missing something in the while loop when trying to catch all cXtYdZ strings. I should also mention that the number of lines containing such a string varies: there might be just one, but there could also be two, three, four, five, or more.
The problem seems to be that I need to execute the push statement as often as the regex pattern appears within each loop iteration.
Can somebody enlighten me and perhaps help improve my loop-control skills?
TIA!
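One possible way to run the push once per match, offered as a minimal sketch rather than a definitive fix: a match with the /g modifier in list context returns every captured string, so a single push can take all of them. The sketch assumes that records are separated by blank lines (which paragraph mode needs). To stay self-contained, it reads a shortened copy of the sample data through an in-memory filehandle instead of the powermt pipe. The names $sample, $fh and $record are illustrative only.

```perl
use strict;
use warnings;
use Data::Dumper;

# Shortened sample data from the question (assumption: records are
# separated by blank lines, as paragraph mode requires).
my $sample = <<'END';
### HEADING OF RECORD 1 ####
Logical device ID=08E1
LINE_THAT_DOES_NOT_BOTHER_ME
29 8/0/2/1/0.18.152.0.0.6.1 c29t6d1 FA 5eA
30 8/0/3/1/0.17.152.0.0.6.1 c30t6d1 FA 12e

### HEADING OF RECORD 2 ####
Logical device ID=08E2
LINE_THAT_DOES_NOT_BOTHER_ME
29 8/0/2/1/0.18.152.0.0.4.1 c29t4d1 FA 5eA
END

my %hash;

# In-memory filehandle standing in for the "powermt display dev=all|" pipe.
open(my $fh, '<', \$sample) or die "Cannot open sample data: $!";

{
    local $/ = '';    # paragraph mode: each read returns one whole record
    while (my $record = <$fh>) {
        # Skip paragraphs that carry no device ID.
        my ($id) = $record =~ /Logical device ID=(\w+)/
            or next;

        # /g in list context yields all captured strings of the record,
        # so one push stores every cXtYdZ name for this ID.
        push @{ $hash{$id} }, $record =~ /(c\d+t\d+d\d+)/g;
    }
}
close $fh;

print Dumper(\%hash);
```

The same effect can be had in scalar context with `while ($record =~ /(c\d+t\d+d\d+)/g) { push @{ $hash{$id} }, $1 }`, which is closer to the idea of executing the push once per occurrence. The original `if /.../` form tests the pattern only once per paragraph, which is why only the first match ends up in the array.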
Replies are listed 'Best First'.
Re: How can i catch strings matching a regex across multiple lines?
by aaron_baugher (Curate) on Jun 30, 2012 at 23:38 UTC
Re: How can i catch strings matching a regex across multiple lines?
by Anonymous Monk on Jun 30, 2012 at 21:38 UTC
by CountZero (Bishop) on Jul 01, 2012 at 11:15 UTC
by NetWallah (Canon) on Jun 30, 2012 at 23:14 UTC
by aaron_baugher (Curate) on Jun 30, 2012 at 23:42 UTC
Re: How can i catch strings matching a regex across multiple lines?
by 2teez (Vicar) on Jul 01, 2012 at 11:25 UTC
by ww (Archbishop) on Jul 01, 2012 at 13:20 UTC
by babelfish (Initiate) on Jul 01, 2012 at 20:11 UTC