Greetings Monks, I'm a newbie to the monastery with what on the surface appears to be a very easy question, but it has had me flummoxed for over a day, so I decided to come to the light side to seek wisdom! I am trying to parse the results from a program that generates an output file like the one below.
    Results:1582 1640 6 9.8 6 90 0 69 55 16 13 13 1.68 GACAAT GACAATGACAAT +GACAATGACAATGACAGAGACAGTAACAATAACAATAACAATAACAA
    Results:5184 5214 6 5.2 6 96 0 55 16 0 45 38 1.47 TGGTGA TGGTGATGGTGA +TGGTGATGGTGATGTTGAT
The problem is that the code I have written to pick out the lines beginning with "R" and place the results into arrays seems to skip 1, 2, or 3 Results lines at a time, depending on how it feels! Therefore, out of 137 Results lines, it only ever picks out 69 or 74. Below is a section of the code, taken from a larger program that I wrote to do the job, hence the commented-out sections.
    $TRID=0;
    $SEQID=0;
    #$PID=0;
    $i=0;
    #$line=<TR_INFILE>;
    chomp $line;
    while ($line =<TR_INFILE>) {
        if ($line =~/^R.*/) {
            $line=~s/^Results://g;
            #print "making TR arrays\n";
            print OUTFILE3 "$line";
            $trstart[$i] = (split(/\s*/,$line))[0];
            $trend[$i] = (split(/\s*/,$line))[1];
            $period[$i] = (split(/\s*/,$line))[2];
            $copy[$i] = (split(/\s*/,$line))[3];
            $consize[$i] = (split(/\s*/,$line))[4];
            $matches[$i] = (split(/\s*/,$line))[5];
            $indels[$i] = (split(/\s*/,$line))[6];
            $score[$i] = (split(/\s*/,$line))[7];
            $numa[$i] = (split(/\s*/,$line))[8];
            $numc[$i] = (split(/\s*/,$line))[9];
            $numg[$i] = (split(/\s*/,$line))[10];
            $numt[$i] = (split(/\s*/,$line))[11];
            $entropy[$i] = (split(/\s*/,$line))[12];
            #$TR_consensus[$i]= (split(/\s*/,$line))[13];
            #$TR_sequence[$i]= (split(/\s*/,$line))[14];
            $TRID++;
        }
        # elsif ($line =~/^P.*/){
        #     print "Making Parameter arrays\n";
        #     $line =~s/\s/\./g;
        #     $line =~s/^Parameters:\.//g;
        #     $trparameters[$i] = ($line)[0];
        #     $PID++;
        # }
        elsif ($line =~ /^S.*/) {
            # print "Making seqeunce arrays \n";
            $line =~s/^Sequence:\s*//;
            $TR_Accession[$i] = ($line)[0];
            $SEQID++;
        }
        else {
        }
        $i++;
        $line=<TR_INFILE>;
        chomp $line;
    }
    close TR_INFILE;
I will be grateful for all advice! I am sure it has something to do with the RegEx. Apologies for the bad layout. Thank you in advance, PC.

In reply to RegEx misbehaving? by pdotcdot
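
For what it's worth, the regex itself looks innocent. Two other things in the posted loop seem more likely culprits. First, the `while` condition already reads a line, and the `$line=<TR_INFILE>;` at the bottom of the loop reads another one that is overwritten before it is ever tested, which would fit roughly half of the lines going missing. Second, `split(/\s*/, ...)` can match the empty string, so it breaks the line into single characters rather than columns. The following is only a minimal sketch of one way to restructure it, not a drop-in replacement: `parse_results`, the hash keys, and the in-memory filehandle are names invented here for illustration, and it assumes the fields are separated by whitespace as in the sample.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original program): reads every line
# from the given filehandle and returns one hash reference per "Results:" line.
sub parse_results {
    my ($fh) = @_;
    # Column names are assumptions taken from the array names in the post.
    my @names = qw(trstart trend period copy consize matches indels
                   score numa numc numg numt entropy consensus sequence);
    my @records;
    # The while condition is the ONLY place a line is read, so nothing is
    # consumed and discarded at the bottom of the loop.
    while (my $line = <$fh>) {
        chomp $line;
        # Strip the prefix; skip any line that does not carry it.
        next unless $line =~ s/^Results:\s*//;
        # Split once on runs of whitespace. /\s*/ can match the empty
        # string and would split the line into single characters.
        my @fields = split /\s+/, $line;
        my %rec;
        @rec{@names} = @fields;    # hash slice: one key per column
        push @records, \%rec;
    }
    return @records;
}

# Demo on the two sample lines from the post (the wrapped part of the final
# sequence column is left out), read via an in-memory filehandle.
my $sample = <<'END';
Results:1582 1640 6 9.8 6 90 0 69 55 16 13 13 1.68 GACAAT GACAATGACAAT
Results:5184 5214 6 5.2 6 96 0 55 16 0 45 38 1.47 TGGTGA TGGTGATGGTGA
END
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @records = parse_results($fh);
close $fh;
printf "%d records; first starts at %s, entropy %s\n",
    scalar @records, $records[0]{trstart}, $records[0]{entropy};
```

Keeping one record per Results line also means the counter no longer advances on Sequence lines, so there are no empty slots in the arrays. If the parallel arrays are needed elsewhere in the larger program, they can still be filled from `@fields` inside the same loop.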
