Dear fellow monks,

I have a problem collecting data from a record-oriented stream.

In particular, I need to collect strings that appear on different lines within each paragraph and match a common regex. My current approach does not work properly: only the first occurrence of the regex is found, and the others are skipped or ignored.

The data stream I want to process looks like this:

### HEADING OF RECORD 1 ####
Logical device ID=08E1
LINE_THAT_DOES_NOT_BOTHER_ME
ANOTHER_LINE_THAT_DOES_NOT_BOTHER_ME
29 8/0/2/1/0.18.152.0.0.6.1  c29t6d1   FA  5eA
30 8/0/3/1/0.17.152.0.0.6.1  c30t6d1   FA 12e
31 8/0/8/1/0.17.150.0.0.6.1  c31t6d1   FA 10eA
32 8/0/9/1/0.18.150.0.0.6.1  c32t6d1   FA 11eA

### HEADING OF RECORD 2 ####
Logical device ID=08E2
LINE_THAT_DOES_NOT_BOTHER_ME
ANOTHER_LINE_THAT_DOES_NOT_BOTHER_ME
29 8/0/2/1/0.18.152.0.0.4.1  c29t4d1   FA  5eA
30 8/0/3/1/0.17.152.0.0.4.1  c30t4d1   FA 12eA
31 8/0/8/1/0.17.150.0.0.4.1  c31t4d1   FA 10eA
32 8/0/9/1/0.18.150.0.0.4.1  c32t4d1   FA 11eA

### HEADING OF RECORD 3 ####
(...)

The task is as follows:

Create a hash of arrays from that stream, where the "Logical device ID" numbers are the keys and the cXtYdZ strings are collected in arrays as the respective values:

%hash = (
'08E1' => ['c29t6d1','c30t6d1','c31t6d1','c32t6d1'],
'08E2' => ['c29t4d1','c30t4d1','c31t4d1','c32t4d1'],
(...)
)

I am using this code for processing the stuff:

use strict;
use warnings;
use Data::Dumper;

my %hash;
open ( FH, "powermt display dev=all|");# data stream comes from here

$/ = '';
while (<FH>) {
    my ($id) = ( $_ =~ /Logical device ID=(\w+)/ );
    push (@{$hash{$id}}, $1) if /(c\d+t\d+d\d+)/;
}

print Dumper (\%hash);

But with this code, I get a HoA containing only the first occurrence of the regex within each paragraph, like this:

$VAR1 = {
          '08E1' => [
                      'c29t6d1'
                    ],
          '08E2' => [
                      'c29t4d1'
                    ],
(...)

So far, the record processing itself seems to work OK, but I am missing something in the while loop when trying to catch all cXtYdZ strings. I should also mention that the number of lines containing that string varies: there might be just one line, but there could also be 2, 3, 4, 5 or more.

The problem seems to be that I need to execute the push statement once for every match of the regex pattern within each record.
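
A minimal sketch of that "one push per match" idea, assuming a global match (`/g`) in list context is an acceptable way to get every occurrence in a record at once. The helper name `collect_devices` and the in-memory sample are made up for illustration; the real input would still come from the `powermt` pipe.

```perl
use strict;
use warnings;

# Shortened sample standing in for the real "powermt display dev=all" output.
my $sample = <<'END';
### HEADING OF RECORD 1 ####
Logical device ID=08E1
LINE_THAT_DOES_NOT_BOTHER_ME
29 8/0/2/1/0.18.152.0.0.6.1  c29t6d1   FA  5eA
30 8/0/3/1/0.17.152.0.0.6.1  c30t6d1   FA 12eA

### HEADING OF RECORD 2 ####
Logical device ID=08E2
29 8/0/2/1/0.18.152.0.0.4.1  c29t4d1   FA  5eA
END

# collect_devices is a hypothetical helper, not part of the original script.
sub collect_devices {
    my ($fh) = @_;
    local $/ = '';    # paragraph mode: one record per read
    my %hash;
    while ( my $record = <$fh> ) {
        # Skip any paragraph that carries no device ID.
        my ($id) = $record =~ /Logical device ID=(\w+)/
            or next;
        # With /g in list context the match returns every capture,
        # so a single push appends all cXtYdZ strings of the record.
        push @{ $hash{$id} }, $record =~ /\b(c\d+t\d+d\d+)\b/g;
    }
    return \%hash;
}

open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my $devices = collect_devices($fh);
close $fh;

print "$_ => @{ $devices->{$_} }\n" for sort keys %$devices;
```

An explicit `while ( $record =~ /(c\d+t\d+d\d+)/g ) { push ... }` loop would be the other obvious shape, if the push really has to happen once per match.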

Can somebody enlighten me, and perhaps help me improve my loop-control skills?

TIA!

In reply to How can i catch strings matching a regex across multiple lines? by babelfish
