I guess what's causing your problem is that some lines end with only a carriage return (in your example, the lines with the sequence data).

So all those lines, up to the next proper newline (i.e. the one used on your platform), are read in as one line. When you print that string, the carriage returns move the terminal cursor back to the beginning of the line, so most of the data is overprinted. The "weird" result you're seeing is what remains.

The following snippet demonstrates the effect (note the "\r" at the end of the substrings):

    my $newline = "TTTATGCACTCATGTTTAGACATATTTCCTACACCCATATTTGAAGACCA\r"
                . "... other lines (overprinted) ...\r"
                . "TAGATTCTGTAAACTGTGTCCATTTCTGTGCCTATTTACTTGGTATTTGT\r"
                . "TAACTCTTAGTACACATAAGTTTACGTACC\r";

    print "\$newline is: $newline\n";   # (this is your debug print)

In other words, after cut-n-pasting from one platform to another, you have to make sure that the resulting lines all conform to the same newline style (preferably the one used by your OS :)
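
If you'd rather do the cleanup in Perl itself, something along these lines might work. It is only a minimal sketch: the helper name normalize_newlines and the sample data are made up for illustration, and it assumes the whole input fits in memory and that "\n" is the newline you want.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): turn CRLF and
# lone CR line endings into "\n".
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;   # a CR, optionally followed by LF, becomes one LF
    return $text;
}

# Made-up sample with mixed line endings: lone CR, CRLF and LF.
my $raw   = "ATGC\rTTAG\r\nCCGA\n";
my @lines = split /\n/, normalize_newlines($raw);

print scalar(@lines), " lines: @lines\n";   # prints "3 lines: ATGC TTAG CCGA"
```

The substitution is used here instead of a plain character translation because a CR+LF pair should collapse into one newline; translating every "\r" to "\n" on its own would turn each CR+LF into two newlines, i.e. add empty lines.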

Several tools can help here, for example dos2unix, unix2dos, ..., or Perl's tr/// or s/// (e.g. tr/\r/\n/). Some editors can do auto-conversions, too. But you have to fix things before you process the data with your program. Good luck.


In reply to Re: Regexp and newlines, what am I missing? by almut
in thread Regexp and newlines, what am I missing? by mdunnbass
