Please use <code> tags around your data and code so that
they are properly formatted. Here's the sample data that you posted
(I removed the empty lines to save space):
Header Line One
***-*** 0 0 ***-MBO 0 0
2TO-T/V 0 0 2TO-T/O 0 0
POC-CNU 1285 0 POC-A/M 0 15567
Header Line Two
***-*** 0 0 ***-MBO 0 0
2TO-T/V 0 0 2TO-T/O 0 0
POC-CNU 1285 0 POC-A/M 0 15567
Here's sample code (untested) that does what you described: it stores each
line in a hash, one hash per header line, using the first 7 characters
as the key, and it checks for duplicates:
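use strict;
use warnings;

my $data = {};   # $data->{header line}->{7-char key} = full line
my $file;        # Header line of the section currently being read
while (<>) {
    chomp;
    # Skip blank lines
    next if /^\s*$/;
    if (/^Header/) {
        $data->{$_} = {};   # Create a new first-level hash.
        $file = $_;
        next;
    }
    # m{...} delimiters are used because the character class contains
    # a literal "/", which would otherwise end a /.../ pattern early.
    if (m{^([a-zA-Z0-9*/]{3}-\S{3})\s+}) {
        my $key = $1;
        if (defined $file) {
            # Check for duplicates.
            if (exists($data->{$file}->{$key})) {
                warn "Duplicate key $key in $file: $_\n";
                next;
            }
            $data->{$file}->{$key} = $_;
        }
        else {
            warn "Line found before a header line: $_\n";
        }
    } else {
        # Reject improper lines
        warn "Badly formatted line found, ignoring: $_\n";
    }
}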
This stores the data in a structure like this:
$data->{"Header Line One"}->
          {"***-***"} -> "***-*** 0 0 ..etc"
          {"2TO-T/V"} -> "..."
          ...
     ->{"Header Line Two"}->
          ....
This may not be precisely what you want, but it should give
you an idea of one way of doing it.
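Once the data is loaded, getting it back out is just a matter of walking the two hash levels. Here is a minimal sketch of that. The small hard-coded $data and the @report array are my own stand-ins for illustration, not something from your program, and splitting each stored line on whitespace is only an assumption about how you might want to use the columns:

```perl
use strict;
use warnings;

# Hypothetical sample mirroring the structure described above.
my $data = {
    'Header Line One' => {
        '***-***' => '***-*** 0 0 ***-MBO 0 0',
        'POC-CNU' => 'POC-CNU 1285 0 POC-A/M 0 15567',
    },
};

# Build one summary line per stored record.
my @report;
for my $header (sort keys %$data) {
    for my $key (sort keys %{ $data->{$header} }) {
        # Split the stored line on whitespace to get its columns.
        my @fields = split ' ', $data->{$header}{$key};
        push @report, "$header / $key: " . scalar(@fields) . " fields";
    }
}
print "$_\n" for @report;
```

If you need the individual numbers rather than the whole line, you could store the result of the split (as an array reference) instead of $_ when filling the hash.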
--ZZamboni