Re: Generating sequence nos. for data

It looks like most of the earlier replies have missed your point that each line of input contains multiple "tag value" pairs. So you need a loop over the pairs within each input line:

my %sequences;  # this will be a hash of arrays

while (<DATA>) {
    my @fields = split;  # whitespace-separated: tag, value, tag, value, ...
    for ( my $i=0; $i<@fields; $i+=2 ) {
        push @{$sequences{$fields[$i]}}, $fields[$i+1];
    }
}

for my $tag ( sort keys %sequences ) {
    my $i = 0;
    for my $val ( @{$sequences{$tag}} ) {
        printf( "%s [ %d ] : %s\n", $tag, ++$i, $val );
    }
}

__DATA__
A 100 B 200 C 400 A 150 C 250 D 550 B 350
A 200 B 300 C 500 A 600 B 700 C 800 D 900

Now, you didn't actually say how you want to handle the cases where the same tag occurs on multiple lines of input (like in the sample data provided here): should the index numbers reset to 1 for each line, or should they increment continuously over the entire input stream (as done here)?
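In case the per-line reset is what you're after, here is a rough sketch of that variant. The sub name number_line is made up for this example (it isn't from your post or mine above), and I'm assuming you'd want the output in input order rather than sorted by tag:

```perl
use strict;
use warnings;

# Hypothetical helper: number the values of each tag within ONE line,
# so the counters start over at 1 for every new line of input.
sub number_line {
    my ($line) = @_;
    my %count;                         # per-line counters, gone when we return
    my @fields = split ' ', $line;
    my @out;
    # Stop before the last index so a dangling tag with no value is skipped.
    for ( my $i = 0; $i < $#fields; $i += 2 ) {
        my ( $tag, $val ) = @fields[ $i, $i + 1 ];
        push @out, sprintf( "%s [ %d ] : %s", $tag, ++$count{$tag}, $val );
    }
    return @out;
}

print "$_\n" for map { number_line($_) } (
    'A 100 B 200 C 400 A 150 C 250 D 550 B 350',
    'A 200 B 300 C 500 A 600 B 700 C 800 D 900',
);
```

The only real difference from the continuous version is where the counter lives: declare it inside the per-line scope and it resets, declare it outside the read loop and it keeps counting across the whole stream.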

Some of the stuff you said in a sub-reply to merlyn seemed not to make sense:

A $$ indicates the start of the record. The record is split on the tag. A hash is populated with key being the tag and value being the data. Some fields within a record repeat in a well defined manner.

If I understand that, splitting the record into a hash would be completely wrong: it only lets you keep one value for each distinct tag on a line, no matter how many times that tag appears.
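To make that concrete, here's a small sketch comparing the two approaches on the first line of the sample data. The variable names %flat and %hoa are just mine for the illustration, and I'm guessing that a plain tag-to-value hash is roughly what your current code builds:

```perl
use strict;
use warnings;

my @fields = split ' ', 'A 100 B 200 C 400 A 150 C 250 D 550 B 350';

# Plain hash: assigning the flat list means a repeated tag
# overwrites whatever value was stored for it earlier.
my %flat = @fields;

# Hash of arrays: every value is kept, in input order.
my %hoa;
while ( my ( $tag, $val ) = splice @fields, 0, 2 ) {
    push @{ $hoa{$tag} }, $val;
}

print "flat: A => $flat{A}\n";         # only the last value for A survives
print "hoa:  A => @{ $hoa{A} }\n";     # both values for A are still there
```

With the plain hash, the 100 for tag A is silently gone by the time you look; with the hash of arrays, both 100 and 150 are available for numbering.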

I hope the snippet provided here makes sense -- note that it keeps all the values for each tag in an array that is stored as the value of the hash element for that tag. Then, we just use the array index (plus 1) to provide the sequence numbers that you want for each of the values.
