It appears to me that the data you have should be bucketized based on the text before the '_' character. If this is correct, perhaps these pointers will help:
- You can use my ( $bucket, $rest ) = split('_', $line, 2) to chop up each line; the limit of 2 means only the first '_' acts as the separator
- You can store each line in its bucket using a hash of arrays: push @{ $buckets{$bucket} ||= [] }, $line (the ||= [] is optional, since Perl autovivifies the array reference)
- You can then find every bucket you have: keys %buckets
- You can also get the items in each bucket: @items = @{ $buckets{$bucket} }
- You can also count the number of items in a bucket, either from the copy: $item_count = scalar( @items ), or directly from the hash: $item_count = scalar( @{ $buckets{$bucket} } )
- You can join all of your items together: $string = join(" ", @items)
Given the above and your skeleton code, you should be able to piece them together to accomplish your goals.
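As a rough sketch of how those pieces could fit together: the sample lines in @lines and the output format are made up for illustration, since I don't know what your real input looks like, so treat this as a starting point rather than a finished solution.

```perl
use strict;
use warnings;

# Hypothetical sample input; replace with lines read from your own file.
my @lines = qw(fruit_apple fruit_pear veg_carrot fruit_plum veg_leek);

my %buckets;    # hash of arrays: bucket name => [ lines in that bucket ]
for my $line (@lines) {
    # Split on the first '_' only: bucket name, then everything after it.
    my ( $bucket, $rest ) = split( '_', $line, 2 );
    next unless defined $rest;    # assumption: skip lines without a '_'
    push @{ $buckets{$bucket} }, $line;
}

# Walk the buckets in sorted order and report on each one.
for my $bucket ( sort keys %buckets ) {
    my @items      = @{ $buckets{$bucket} };
    my $item_count = scalar(@items);
    my $string     = join( " ", @items );
    print "$bucket ($item_count): $string\n";
}
```

If you only want the part after the '_' in each bucket, push $rest instead of $line.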