Hi PerlMonks,

I am a beginner in Perl programming. I want to compute a few values for each of three substrings within a string. The string is $DNA1 = "GGCT CTGCGCGGNN". I first remove the Ns and the whitespace. The three substrings are non-overlapping and adjacent to each other, 4 bases each: GGCT, CTGC and GCGG. I have written some code, but when I run it the script runs continuously in the cmd window and never finishes. The output text file on my desktop shows correct results for the 1st substring (GGCT), but no results for substring 2 (CTGC) or substring 3 (GCGG). How can I correct the while loop on line 10, while ( my $fm = substr( $DNA1, 0, 4 ) ) {, so that I get correct results for all 3 substrings? I have given the code, the correct results that I get for the 1st substring, and my expected results for all the substrings. Can any monk help me correct the mistake in the code?

My code is:

#!/usr/bin/perl
use strict;
use warnings;

my $DNA1 = "GGCT CTGCGCGGNN";

# Total base count
my $total1 = 12;

# Remove N from sequence
$DNA1 =~ s/N//ig;

# Remove whitespace Line 5
$DNA1 =~ s/\s//g;

# In a loop, find every 4-base substring & then find its
# GC%, GC-skew & Purine Loading Index (PLI):
my $fm = 1.010;    # Line 9
do {
    while ( my $fm = substr( $DNA1, 0, 4 ) ) {
        my $A = 0;
        my $T = 0;
        my $G = 0;
        my $C = 0;
        while ( $fm =~ /A/ig ) { $A++ }
        while ( $fm =~ /T/ig ) { $T++ }
        while ( $fm =~ /G/ig ) { $G++ }
        while ( $fm =~ /C/ig ) { $C++ }
        my $tot1 = $A + $T + $G + $C;
        my $gc1  = $G - $C;
        my $gc2  = $G + $C;    # Line 16
        my $cent = 100;
        my $gccon2 = $gc2 / $tot1;
        my $gccon3 = $cent * $gccon2;
        my $gccon4 = sprintf( "%.2f", $gccon3 );
        my $gcskew = $gc1 / $gc2;
        my $GCSkew = sprintf( "%.4f", $gcskew );

        # To find Purine Loading Index (PLI):
        my $four     = 4;
        my $at1      = $A - $T;
        my $x1       = ( $gc1 + $at1 ) / $tot1;
        my $thousand = 1000;
        my $pli      = $thousand * $x1;
        my $PLI      = sprintf( "%.0f", $pli );

        # No. of sliding Windows:
        my $numberwin = $total1 / $four;
        my $NoWindows = sprintf( "%.0f", $numberwin );
        print " Purine Loading Index of each 1Kb Window=$PLI bases/4-base.\n";

        my $output = "GC-SkewResult .txt";
        unless ( open( RESULT, ">my $output" ) ) {
            print "Cannot open file\"my $output\".\n\n";
            exit;
        }
        print RESULT "\n RESULTS for substrings:\n GC-Skew values of substrings:\n $GCSkew\n\n Percent GC Content of substrings:\n $gccon4\n\n";
        close(RESULT);
    }
} until ( $fm =~ /^\s*$/ );
exit;
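A note on why the script above never finishes: substr( $DNA1, 0, 4 ) always returns the same first four bases, which is a true value, so the while condition never becomes false. The inner my $fm also hides the outer $fm, so the until test only ever sees 1.010. The following is a minimal sketch of one way to step through the adjacent, non-overlapping windows by moving the offset forward. The helper name windows and the $size parameter are assumptions for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): return the adjacent,
# non-overlapping windows of $size bases found in $seq.
# A trailing fragment shorter than $size is ignored.
sub windows {
    my ( $seq, $size ) = @_;
    my @win;
    for ( my $pos = 0 ; $pos + $size <= length $seq ; $pos += $size ) {
        push @win, substr( $seq, $pos, $size );    # offset moves each pass
    }
    return @win;
}

my $DNA1 = "GGCT CTGCGCGGNN";
$DNA1 =~ s/N//ig;    # remove N
$DNA1 =~ s/\s//g;    # remove whitespace

print join( " ", windows( $DNA1, 4 ) ), "\n";    # prints "GGCT CTGC GCGG"
```

Because the loop is bounded by the length of the sequence, it stops after the last full window and no do-until wrapper is needed.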

I get the correct results for the 1st substring, i.e.:

RESULTS for substrings:
GC-Skew values of substrings:
0.3333

Percent GC Content of substrings:
75.00

My Expected Results are:

RESULTS for substrings:
GC-Skew values of substrings:
0.3333 1.0000 0.5000

Percent GC Content of substrings:
75.00 50.00 100.00
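To get all values on one line per heading, the per-window results can be collected in arrays and printed once after the loop. Opening the file with ">" inside the loop would overwrite it on every pass. This is only a sketch under assumptions: the helper name gc_stats is hypothetical, the formulas are the ones from the script in this post, and a window with no G or C is reported with a skew of 0 to avoid dividing by zero. Note that with these formulas the window CTGC (one G, two C) comes out as -0.3333 and 75.00, which differs from the 1.0000 and 50.00 listed in the expected results, so those two values may be worth rechecking.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): GC skew and percent GC
# for one window, using the same formulas as the posted code.
sub gc_stats {
    my ($win) = @_;
    my $G   = ( $win =~ tr/Gg// );          # tr counts without changing $win
    my $C   = ( $win =~ tr/Cc// );
    my $tot = ( $win =~ tr/ACGTacgt// );    # A + T + G + C
    my $gc  = $G + $C;
    my $skew = $gc  ? ( $G - $C ) / $gc : 0;    # assumption: 0 when no G or C
    my $pct  = $tot ? 100 * $gc / $tot  : 0;
    return ( sprintf( "%.4f", $skew ), sprintf( "%.2f", $pct ) );
}

my $DNA1 = "GGCTCTGCGCGG";    # the posted sequence after removing N and whitespace
my ( @skew, @pct );
for ( my $pos = 0 ; $pos + 4 <= length $DNA1 ; $pos += 4 ) {
    my ( $s, $p ) = gc_stats( substr( $DNA1, $pos, 4 ) );
    push @skew, $s;
    push @pct,  $p;
}

print "GC-Skew values of substrings:\n@skew\n\n";    # 0.3333 -0.3333 0.5000
print "Percent GC Content of substrings:\n@pct\n";   # 75.00 75.00 100.00
```

The same two print statements could write to a filehandle opened once before the loop instead of to the screen.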

In reply to How can I get the correct results for substrings 2 and 3 in a do-until loop? by supriyoch_2008
