If there are only two "interesting" columns, ignore the rest. Consider:
use strict; use warnings; my $data = <<DATA; CLS_S3_Contig100_st,CLS_S3_Contig100,53,10,0.3717 CLS_S3_Contig100_at,CLS_S3_Contig100,55,11,0.4321 CLS_S3_Contig100_st,CLS_S3_Contig100,57,10,0.3223 CLS_S3_Contig100_at,CLS_S3_Contig100,59,11,0.4055 CLS_S3_Contig100_st,CLS_S3_Contig100,61,11,0.4511 CLS_S3_Contig100_at,CLS_S3_Contig100,63,11,0.474 CLS_S3_Contig10031_st,CLS_S3_Contig10031,53,12,0.5548 CLS_S3_Contig10031_st,CLS_S3_Contig10031,57,10,0.4871 CLS_S3_Contig10031_st,CLS_S3_Contig10031,61,12,0.547 CLSS3627.b1_F19.ab1,CLS_S3_Contig10031,62,11,0.5129 CLSS3627.b1_F19.ab1,CLS_S3_Contig10031,64,11,0.5789 DATA my %origins; my $numColumns; open my $inFile, '<', \$data; while (<$inFile>) { chomp; next unless length; my @columns = split ','; $numColumns ||= @columns; # Assume first row has correct column co +unt $origins{$columns[1]}[$columns[2] - 1] = \@columns; } close $inFile; for my $oKey (sort keys %origins) { my $origin = $origins{$oKey}; for my $pip (0 .. $#$origin) { my $row = $origin->[$pip]; if (defined $row) { # pip exists in original file print join (",", @$row, '1'), "\n"; } else { # pip doesn't exist in original file print ",$oKey,", $pip + 1, ',' x ($numColumns - 2), "0\n"; } } }
Prints (with large middle portion skipped):
,CLS_S3_Contig100,1,,,0 ,CLS_S3_Contig100,2,,,0 ,CLS_S3_Contig100,3,,,0 ,CLS_S3_Contig100,4,,,0 ,CLS_S3_Contig100,5,,,0 ... CLS_S3_Contig10031_st,CLS_S3_Contig10031,57,10,0.4871,1 ,CLS_S3_Contig10031,58,,,0 ,CLS_S3_Contig10031,59,,,0 ,CLS_S3_Contig10031,60,,,0 CLS_S3_Contig10031_st,CLS_S3_Contig10031,61,12,0.547,1 CLSS3627.b1_F19.ab1,CLS_S3_Contig10031,62,11,0.5129,1 ,CLS_S3_Contig10031,63,,,0 CLSS3627.b1_F19.ab1,CLS_S3_Contig10031,64,11,0.5789,1
which demonstrates what I understand you to want.
The key points are using a HoAoA where the hash is keyed by the origin and the array is indexed by PIP (- 1). Note that Perl generates the missing array elements, but sets them to undef so you can test for defined to see if you encountered the PIP in the original file.
In reply to Re^3: Complex Data Structure
by GrandFather
in thread Complex Data Structure
by sesemin
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |