
Wow, thanks a lot. I shortened it significantly:

sub readdt2 {
    my $ifn = shift;
    open(my $IFH, "<$ifn") or die "cannot open file $ifn\n";
    my $line;
    my @nt = ("A","C","G","T");
    my %ret;
    my @tmp;
    for my $j (@nt) {
        $ret{$j} = [];
        $line = <$IFH>;
        chomp($line);
        @tmp = split(/\s+/, $line);
        for (my $i = 0; $i <= $#tmp; $i++) {
            $ret{$j}[$i] = $tmp[$i] + 0.0001;
        }
    }
    close($IFH);
    return(\%ret);
}

But when I dumped the result with Data::Dumper, the numbers didn't add up to 1.00: "A" added up to 1.11, "T" to 1.04, "C" to 0.93, and "G" to 0.92. Does anything stick out to you?

Re^3: trouble understanding boss's code
by SuicideJunkie (Vicar) on Jul 20, 2011 at 18:51 UTC

    According to the sample data you posted:

    0.95 0.02 0.07 0.07 #A
    0.03 0.01 0.06 0.83 #C
    0.01 0.02 0.80 0.09 #G
    0.01 0.95 0.07 0.01 #T

    Each column adds up to 1.00, but each row adds up to an arbitrary value. Since you're putting 0.95, 0.02, 0.07 and 0.07 into the A array, it makes sense that the A array adds up to 1.11 :)
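To make the row-versus-column point concrete, here is a minimal sketch that sums the sample data both ways. The hash layout mirrors what readdt2 returns (minus the 0.0001 offset); the helper names row_sum and column_sum are made up for this illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Sample matrix from the thread: one row per nucleotide, one column per position.
my %matrix = (
    A => [0.95, 0.02, 0.07, 0.07],
    C => [0.03, 0.01, 0.06, 0.83],
    G => [0.01, 0.02, 0.80, 0.09],
    T => [0.01, 0.95, 0.07, 0.01],
);

# Hypothetical helper: sum every value stored for one nucleotide (a row).
sub row_sum {
    my ($m, $nt) = @_;
    my $sum = 0;
    $sum += $_ for @{ $m->{$nt} };
    return $sum;
}

# Hypothetical helper: sum one position across all nucleotides (a column).
sub column_sum {
    my ($m, $i) = @_;
    my $sum = 0;
    $sum += $m->{$_}[$i] for sort keys %$m;
    return $sum;
}

printf "row %s: %.2f\n", $_, row_sum(\%matrix, $_) for qw(A C G T);
printf "column %d: %.2f\n", $_, column_sum(\%matrix, $_) for 0 .. 3;
```

The rows come out as 1.11, 0.93, 0.92 and 1.04, matching the totals reported above, while every column comes out as 1.00.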

    PS: Why are you adding 0.0001 to your data? Printing the output with appropriate printfs should round the values off nicely and hide the artifacts from the floating point math in the CPU.
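As a rough sketch of the printf suggestion, assuming two decimal places are enough for display: the stored value keeps its full precision, and only the printed form is rounded. The variable names here are illustrative.

```perl
use strict;
use warnings;

# The "A" row from the sample data.
my @row = (0.95, 0.02, 0.07, 0.07);

my $sum = 0;
$sum += $_ for @row;

# Full precision: this is where floating-point noise can become visible.
printf "%.17g\n", $sum;

# Rounded for display only; $sum itself is unchanged.
printf "%.2f\n", $sum;

# sprintf returns the same formatted text as a string instead of printing it.
my $rounded = sprintf "%.2f", $sum;
```

Formatting at output time keeps the data untouched, which is usually preferable to nudging the values themselves.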

      Oh okay, I see what you mean, thanks. The 0.0001 is supposed to represent the noise factor, I think.
        Actually no, it's not the noise factor. The 0.0001 is there to avoid problems with log, which I use in my next subroutine.
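For context on the log problem: Perl's built-in log dies with a fatal error when given 0, so any zero probability in the matrix would abort the next subroutine. The sketch below shows that failure and the small-offset workaround. The name safe_log is hypothetical, and adding the offset at log time rather than at read time (as readdt2 does) is an assumption made to keep the example short.

```perl
use strict;
use warnings;

my $pseudocount = 0.0001;   # same constant that readdt2 adds to every value

# log(0) is a fatal error in Perl, so trap it with eval to show the message.
my $zero = 0;
my $ok = eval { my $x = log($zero); 1 };
print "log(0) failed: $@" unless $ok;

# Hypothetical helper: offset the probability so the argument is never zero.
sub safe_log {
    my ($p) = @_;
    return log($p + $pseudocount);
}

printf "%.4f\n", safe_log(0);      # natural log of 0.0001
printf "%.4f\n", safe_log(0.95);
```

One side effect worth knowing about: adding 0.0001 to all four entries of a column raises that column's total from 1.00 to 1.0004, so the values are no longer exact probabilities unless they are renormalized.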