in reply to Code optimization help and troubleshooting

Dear AnomalousMonk and Athanasius

Thank you for your suggestions and comments. I have incorporated them, and they really shortened the code.

Could I get some help with debugging? Somehow, after characters 1-10 of the sequence, the 11th character is getting truncated. I checked the substr code and couldn't figure out why this is happening.

I am also trying to color the last 10 characters (91-100) in blue while reading the sequence in the for loop.

Again, thank you for your help.

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
use Data::Dumper;

my $sequence="GGCGCAACGCTGAGCAGCTGGCGCGTCCCGCGCGGCCCCAGTTCTGCGCAGCTTCCCGAGGCTCCGCACCAGCCGCGCTTCTGTCCGCCTGCAGGGCATT";

############ Make hash of rejected length ###############
my $input = "fragments.txt";
my %split;

open (my $infile, "<", $input) or die "Cannot open file '$input' for reading: $!";
build_hash($_) while <$infile>;
close ($infile) or die "Cannot close file '$input': $!";
build_hash(length($sequence) + 1);

sub build_hash{
    use feature 'state';
    state $old = 0;
    state $cnt = 1;
    my ($line) = @_;
    my $gap = $line - $old;
    my $start = $old + 1;
    my $end = ($gap > 2) ? $line - 1 : $start;
    $split{$cnt++} = [$start .. $end] if $gap >= 2;
    $old = $line;
}

############# Write HTML file #############################
#my $header;
open (AA, ">fragments.html") or die "Cannot open file:$!";
print AA <<"EndOfHTML";
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title> Colored Gene Walk</title>
<meta http-equiv="content-type" content="text/html;charset=utf-8"/>
</head>
<body>
<pre class="monofont">
<a style="font-size: 12pt">
EndOfHTML
#print AA "$header\n";
my $red = '<span style="color:red">';
my $span = '</span>';
my $brk = qq{<br />\n};

for (my $pos=1;$pos<=length($sequence);$pos++){
    my $FLAG=0;my @temp=();
    foreach my $k(sort {$a <=> $b} keys %split){
        my @pos=@{$split{$k}};
        my $p1 =$pos[0];
        my $p2 =$pos[$#pos];
        if($pos == $p1){
            push (@temp, $red);
            for(my $p=$p1;$p<=$p2;$p++){
                my $s=substr($sequence,$p-1,1);
                push (@temp,$s);
                if($p==int($p/50)*50){
                    push (@temp, $brk);
                }
                $FLAG=1;
                $pos++;
            }
            push (@temp, $span);
        }
    }
    if($FLAG ==0){
        my $s=substr($sequence,$pos-1,1);
        print AA "$s";
    }
    else{
        printf AA join("",@temp);
    }
    if($pos==int($pos/50)*50){
        printf AA "$brk";
    }
}

print AA <<"EndOfHTML";
</a>
</pre>
</body>
</html>
EndOfHTML
close (AA);

Re^2: Code optimization help and troubleshooting
by Athanasius (Archbishop) on Oct 08, 2014 at 03:31 UTC
    Could I get some help in the debugging, somehow, in the sequence after 1-10 character the 11th character is getting truncated.

    The relevant part of the code has the following structure:

    for (my $pos = 1; $pos <= length($sequence); $pos++) {
        ...
        foreach my $k (...) {
            ...
            if ($pos == $p1) {
                ...
                for (my $p = ...) {
                    ...
                    $pos++;
                }
            }
        }
        ...
    }

    Incrementing the same variable (viz. $pos) in two places like this is asking for trouble. And, sure enough, the inner for loop performs one increment too many, producing a typical off-by-one error: when it finishes, $pos already equals $p2 + 1, and the outer loop's own $pos++ then skips that position. You can fix this by decrementing $pos immediately after the inner loop has completed:

    if ($pos == $p1) {
        $FLAG = 1;
        push @temp, $red;
        for my $p ($p1 .. $p2) {
            my $s = substr($sequence, $p - 1, 1);
            push @temp, $s;
            push @temp, $brk if $p == int($p / 50) * 50;
            $pos++;
        }
        $pos--;    # <-- ADD THIS
        push @temp, $span;
    }
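To see the effect in isolation, here is a minimal sketch that reproduces the double increment on a 15-character string with a single range covering positions 1-10. The sub name walk and its $fix switch are hypothetical, introduced only for this illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): walks $seq with a
# C-style loop and copies the characters of one 1-based range [$p1, $p2]
# through an inner loop that also advances $pos, as in the code above.
sub walk {
    my ($seq, $p1, $p2, $fix) = @_;
    my $out = '';
    for (my $pos = 1; $pos <= length $seq; $pos++) {
        if ($pos == $p1) {
            for my $p ($p1 .. $p2) {
                $out .= substr($seq, $p - 1, 1);
                $pos++;
            }
            # The inner loop leaves $pos at $p2 + 1; step back so the
            # outer $pos++ lands on $p2 + 1 rather than $p2 + 2.
            $pos-- if $fix;
        }
        else {
            $out .= substr($seq, $pos - 1, 1);
        }
    }
    return $out;
}

print walk('ABCDEFGHIJKLMNO', 1, 10, 0), "\n";   # ABCDEFGHIJLMNO  (K is skipped)
print walk('ABCDEFGHIJKLMNO', 1, 10, 1), "\n";   # ABCDEFGHIJKLMNO
```

Without the decrement, the 11th character (K) never reaches the output, which matches the symptom described in the question.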

    Note that it is not necessary to set $FLAG each time through the loop.
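An alternative that avoids both the flag and the second increment is to decide the color of every position first and then walk the sequence once with a single index. The following is only a sketch under assumptions: the sub name colorize, the [start, end, color] range format, and the placeholder sequence are hypothetical, not taken from the original script. It also shows one way the last 10 characters (91-100) could be colored blue.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns $seq as HTML,
# wrapping each 1-based [start, end, color] range in a <span> and adding a
# line break after every $width characters.
sub colorize {
    my ($seq, $width, @ranges) = @_;

    # Record one color per position; uncolored positions stay undef.
    my @color;
    for my $r (@ranges) {
        my ($start, $end, $name) = @$r;
        $color[$_] = $name for $start .. $end;
    }

    my $html    = '';
    my $current = '';                       # color of the currently open span
    for my $pos (1 .. length $seq) {
        my $want = $color[$pos] // '';
        if ($want ne $current) {
            $html .= '</span>' if $current ne '';
            $html .= qq{<span style="color:$want">} if $want ne '';
            $current = $want;
        }
        $html .= substr($seq, $pos - 1, 1);
        $html .= "<br />\n" if $pos % $width == 0;
    }
    $html .= '</span>' if $current ne '';
    return $html;
}

my $sequence = 'ACGT' x 25;                 # placeholder 100-base sequence
print colorize($sequence, 50, [1, 10, 'red'], [91, 100, 'blue']), "\n";
```

Because $pos is advanced in exactly one place, no position can be skipped. With the original data, the red ranges could presumably be built from the first and last element of each array stored in %split.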

    Hope that helps,

    Athanasius <°(((>< contra mundum
    Iustus alius egestas vitae, eros Piratica,