Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks
I have the following script, which reads a tab-delimited file, where the $snp variable is the ID of each line.
What I am trying to do (unsuccessfully so far) is to increment the $counter variable only when $snp changes; otherwise it should stay as it was, i.e. all lines with the same $snp should have the same $counter in front.
Can you fix my error please?
$counter=0;
%hash=();
while(<>)
{
    $line=$_;
    chomp $line;
    @split_line=split(/\t/, $line);
    $snp=$split_line[0];
    $x11=$split_line[1];
    $x12=$split_line[2];
    $x22=$split_line[3];
    $gen11=$split_line[4];
    $gen12=$split_line[5];
    $gen22=$split_line[6];
    $sd11=$split_line[7];
    $sd12=$split_line[8];
    $sd22=$split_line[9];
    $study=$split_line[10];
    $snp1=$split_line[11];
    $x33=$split_line[12];
    $x34=$split_line[13];
    $x44=$split_line[14];
    $gen33=$split_line[15];
    $gen34=$split_line[16];
    $gen44=$split_line[17];
    $sd33=$split_line[18];
    $sd34=$split_line[19];
    $sd44=$split_line[20];
    $study1=$split_line[21];
    $a1=$split_line[22];
    $a2=$split_line[23];

    if($x11 eq 'NA') {$x11='.';}
    if($x12 eq 'NA') {$x12='.';}
    if($x22 eq 'NA') {$x22='.';}
    if($gen11 eq 'NA') {$gen11='.';}
    if($gen12 eq 'NA') {$gen12='.';}
    if($gen22 eq 'NA') {$gen22='.';}
    if($sd11 eq 'NA') {$sd11='.';}
    if($sd12 eq 'NA') {$sd12='.';}
    if($sd22 eq 'NA') {$sd22='.';}
    if($x33 eq 'NA') {$x33='.';}
    if($x34 eq 'NA') {$x34='.';}
    if($x44 eq 'NA') {$x44='.';}
    if($gen33 eq 'NA') {$gen33='.';}
    if($gen34 eq 'NA') {$gen34='.';}
    if($gen44 eq 'NA') {$gen44='.';}
    if($sd33 eq 'NA') {$sd33='.';}
    if($sd34 eq 'NA') {$sd34='.';}
    if($sd44 eq 'NA') {$sd44='.';}

    $rest=$x11."\t".$x12."\t".$x22."\t".
          $gen11."\t".$gen12."\t".$gen22."\t".
          $sd11."\t".$sd12."\t".$sd22."\t".
          $study."\t".$snp1."\t".
          $x33."\t".$x34."\t".$x44."\t".
          $gen33."\t".$gen34."\t".$gen44."\t".
          $sd33."\t".$sd34."\t".$sd44."\t".
          $study1."\t".$a1."\t".$a2."\n";

    if(not exists($hash{$snp}))
    {
        $counter++;
        print $counter."\t".$snp."\t".$rest;
    }
    else
    {
        print $counter."\t".$snp."\t".$rest;
    }
}

Re: only increment counter if the ID has not been seen before
by toolic (Bishop) on Mar 12, 2015 at 01:15 UTC
    You need to populate the hash inside the loop somehow. Here is one way:
    use warnings;
    use strict;

    my $counter = 0;
    my %hash;
    while (<DATA>) {
        my @splits = split;
        my $snp = $splits[0];
        $counter++ if not exists $hash{$snp};
        print "$counter $snp\n";
        $hash{$snp}++;
    }

    __DATA__
    foo abc
    foo cde
    bar xyz

    Prints:

    1 foo
    1 foo
    2 bar
      The OP wanted the same number in front of each occurrence of the same ID, not a running count of the distinct IDs seen so far. This can be achieved with a hash that stores a unique number per ID rather than a repetition count.
      use warnings;
      use strict;

      my $counter = 0;
      my %hash;
      while (<DATA>) {
          my @splits = split;
          my $snp = $splits[0];
          $hash{$snp} ||= ++$counter;
          print "$hash{$snp} $snp\n";
      }

      __DATA__
      foo abc
      foo cde
      bar xyz
      foo hij
      This should output:
      1 foo
      1 foo
      2 bar
      1 foo
      ... rather than:
      1 foo
      1 foo
      2 bar
      2 foo
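
      For the tab-delimited layout in the original question, the same `||=` idea might look roughly like the sketch below. The helper name number_by_id and the short sample rows are made up for illustration and are not from the thread. The sketch assumes that 'NA' should become '.' only in columns 1-9 and 12-20, which is what the original script appears to do, and it prints however many columns each row has rather than exactly 24.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): prefix each line with the
# number its ID (column 0) was given when first seen, and turn 'NA'
# into '.' in the numeric columns.
sub number_by_id {
    my @lines = @_;
    my %id_for;          # ID => number assigned on first sight
    my $counter = 0;
    my @out;

    for my $line (@lines) {
        chomp $line;
        next unless length $line;            # skip empty lines

        # A limit of -1 keeps trailing empty fields
        my @f   = split /\t/, $line, -1;
        my $snp = $f[0];

        # Assign a new number only the first time this ID appears
        $id_for{$snp} ||= ++$counter;

        # Assumption: only columns 1-9 and 12-20 hold NA-able values
        for my $i (1 .. 9, 12 .. 20) {
            $f[$i] = '.' if defined $f[$i] && $f[$i] eq 'NA';
        }

        push @out, join("\t", $id_for{$snp}, @f);
    }
    return @out;
}

# Made-up sample rows, shortened to three columns for readability
my @sample = ("foo\tNA\t3\n", "foo\t1\tNA\n", "bar\t2\t2\n", "foo\tNA\tNA\n");
print "$_\n" for number_by_id(@sample);
```

      To run it over a real file, the last two lines could be replaced with: print "$_\n" for number_by_id(<>);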
      Ah, I just needed this thing then: $hash{$snp}++;
      Damn! Thank you very much!