This looks like FASTA format. I'd say you're on the right track by changing $/; however, it's better to localise the change in an anonymous block and let Perl return it to its previous value (rather than hard-coding your guess of what it was). I typically use: local $/ = "\n>".
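
To illustrate the scoping point, here's a minimal sketch. The two-record sample string and the variable names are made up for the example, and it reads from an in-memory filehandle rather than a real file. Inside the block, $/ is "\n>"; once the block ends, Perl restores the previous value by itself.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical two-record sample, read through an in-memory filehandle.
my $fasta = ">id1\nAAA\nCCC\n>id2\nGGG\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory handle: $!";

my @records;
{
    local $/ = "\n>";        # record separator changed for this block only
    while (my $rec = <$fh>) {
        chomp $rec;          # chomp strips the current $/, i.e. "\n>"
        push @records, $rec;
    }
}
# Out here, $/ is back to whatever it was before the block (normally "\n").

print scalar(@records), " records read\n";
print "separator restored\n" if $/ eq "\n";
```

Note that only the first record keeps its leading '>' (the others lose it to the separator), and only the last record keeps a trailing newline. The code further down has to allow for both.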

If you have Perl v5.14 or higher, you can code it like this:

    #!/usr/bin/env perl
    use 5.014;
    use strict;
    use warnings;

    {
        local $/ = "\n>";
        while (<DATA>) {
            chomp;
            my ($top, $seq) = split /\n/, $_, 2;
            print '>' unless $. == 1;
            say $top;
            say $seq =~ y/\n//dr;
        }
    }

    __DATA__
    > gi|11SB_CUCMA Train|1 21
    MARSSLFTFLCLAVFINGCLSQIEQQSPWEFQGS
    EVWQQHRYQSPRACRLENLRAQDPVRLLLPGFSNAPKLIFV
    AQGFGIRGIAIPGCAETYQT
    SSSSSSSSSSSSSSSSSSSSS....................
    ...........................
    .................
    ..........
    > gi|1A43_HUMAN Train|1 24
    MAVMAPRTLVLLLSGALALTQTWAGSHSMRYFYTSVSRPG
    RGEPRFIAVGYVDDTQFVRFDSDAASQRMEPRAPWIEQEGPEYWSQTDRANLGTLRGYYNQSEDGSHTIQR
    MYGCDVGPDGRFLRGYQQDAYDGKDYIALNEDLRSWTAADMAAQITQRKWETAHEAE
    SSSSSSSSSSSSSSSSSSSSSSSS.........................................................................................
    .............
    ..........................................

Output (long lines are wrapped for display; a leading '+' marks the continuation of the previous line):

    > gi|11SB_CUCMA Train|1 21
    MARSSLFTFLCLAVFINGCLSQIEQQSPWEFQGSEVWQQHRYQSPRACRLENLRAQDPVRLLLPGFSNAP
    +KLIFVAQGFGIRGIAIPGCAETYQTSSSSSSSSSSSSSSSSSSSSS.......................
    +...................................................
    > gi|1A43_HUMAN Train|1 24
    MAVMAPRTLVLLLSGALALTQTWAGSHSMRYFYTSVSRPGRGEPRFIAVGYVDDTQFVRFDSDAASQRME
    +PRAPWIEQEGPEYWSQTDRANLGTLRGYYNQSEDGSHTIQRMYGCDVGPDGRFLRGYQQDAYDGKDYIA
    +LNEDLRSWTAADMAAQITQRKWETAHEAESSSSSSSSSSSSSSSSSSSSSSSS................
    +.....................................................................
    +...........................................................

If you only have v5.10 or v5.12, the 'r' modifier for transliteration is unavailable (see Non-destructive substitution in perl5140delta: Regular Expressions) and you'll need an extra line of code.

    #!/usr/bin/env perl
    use 5.010;
    use strict;
    use warnings;

    {
        local $/ = "\n>";
        while (<DATA>) {
            chomp;
            my ($top, $seq) = split /\n/, $_, 2;
            print '>' unless $. == 1;
            say $top;
            $seq =~ y/\n//d;
            say $seq;
        }
    }
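
The reason for the extra line: without 'r', transliteration modifies its target in place and returns a count, not the modified string, so you can't pass its result straight to say. Here's a small side-by-side sketch (the sample string and variable names are made up; it needs v5.14+ to run, because it shows the 'r' form as well):

```perl
#!/usr/bin/env perl
use 5.014;
use strict;
use warnings;

my $seq = "MARS\nSLFT\n";    # hypothetical two-line sequence

# With /r (v5.14+): returns a modified copy; $seq itself is left unchanged.
my $joined = $seq =~ y/\n//dr;

# Without /r: modifies the target in place and returns the number of
# characters deleted, so work on a copy if the original is still needed.
my $copy    = $seq;
my $deleted = ($copy =~ y/\n//d);

say $joined;     # MARSSLFT
say $deleted;    # 2
say $copy;       # MARSSLFT
```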

If you're working with older Perl versions, 'say' is unavailable (see perl5100delta: say()) and you'll need to hard-code line endings.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    {
        local $/ = "\n>";
        while (<DATA>) {
            chomp;
            my ($top, $seq) = split /\n/, $_, 2;
            print '>' unless $. == 1;
            print "$top\n";
            $seq =~ y/\n//d;
            print "$seq\n";
        }
    }

All of these versions of the code produce the same output (given the same __DATA__).

— Ken


