I'd suggest a slightly different approach. Its advantage is that the input can be read line by line, so there's no need to hold all of it in memory at once (which is nice if you have a lot of data).
#!perl
use strict;

my %data;
my ($key, $data);
while (<DATA>) {
    chomp($_);
    if (/^(\d+):\s*(.+)$/) {
        # a line starting with "number:" begins a new record,
        # so store the previous record first
        $data{$key} = $data if defined $key;
        $key = $1;
        $data = $2;
    } else {
        # any other line continues the current record
        $data .= " $_";
    }
}
# store the last record
$data{$key} = $data if defined $key;

foreach my $key (sort {$a <=> $b} keys %data) {
    print "$key: '$data{$key}'\n";
}
__DATA__
3: Tag <test> found 1
Tag <test> found 2
5: Tag <test> found 3
7: Tag <test> found 4
14: Tag <test> found 5
16: Tag <test> found 6
18: Tag <test> found 7
21: Tag <test> found 8
25: Tag <test> found 9
27: Tag <test> found 10
29: Tag <test> found 11
32: Tag <test> found 12
34: Tag <test> found 13
49: Tag <test> found 14
80: Tag <test> found 15
98: Tag <test> found 16
Tag <test> found 17
[Struck out: Essentially, this is a finite state machine with two states, new-line and continue-line, represented by the if and the else part, with the variable $key playing the role of state variable.]
Essentially, this is a finite state machine with three states: initial, new-line and continue-line. The last two are represented by the if and the else branches, and the variable $key plays the role of state variable, distinguishing the initial state (undef) from the other two.
(I modified the data slightly to be able to check that the data actually ends up with the right key in the hash.)
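If the records don't need to be sorted at the end, the same state machine can hand each record over as soon as it is complete, so that not even the hash has to be kept around. What follows is only a minimal sketch of that idea, not part of the code above: the sub name each_record and its callback argument are my own invention for illustration, and the in-memory filehandle merely stands in for a real file.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper (not from the code above): reads $fh line by line
# and calls $emit->($key, $text) once for every completed record.
sub each_record {
    my ($fh, $emit) = @_;
    my ($key, $text);    # $key undefined means we are in the initial state
    while (my $line = <$fh>) {
        chomp $line;
        if (my ($new_key, $new_text) = $line =~ /^(\d+):\s*(.+)$/) {
            # new-line state: the previous record, if any, is now complete
            $emit->($key, $text) if defined $key;
            ($key, $text) = ($new_key, $new_text);
        }
        elsif (defined $key) {
            # continue-line state: append to the record in progress
            $text .= " $line";
        }
        # lines seen before the first key (initial state) are skipped
    }
    # flush the last record at end of input
    $emit->($key, $text) if defined $key;
    return;
}

# Demo on an in-memory filehandle; a handle on a real file works the same way.
my $input = "3: Tag <test> found 1\nTag <test> found 2\n5: Tag <test> found 3\n";
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";

my @seen;    # collected only so the result can be inspected afterwards
each_record($fh, sub {
    my ($key, $text) = @_;
    push @seen, "$key: '$text'";
    print "$key: '$text'\n";
});
close $fh;
```

The trade-off is that the records arrive in input order rather than sorted by key, so this only helps when the order of the input is good enough.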
Hope this helps, -gjb-
Update: this explanation is more precise than the version I struck out.
In reply to "Re: Matching over multiple lines in a scalar" by gjb, in thread "Matching over multiple lines in a scalar" by Rich36