in reply to Print in loop or store and print once done

From the description of your data, it sounds like you may be dealing with FASTA format. Even if you're not, the following technique (with a little modification) may do what you want.

#!/usr/bin/env perl

use strict;
use warnings;

while (<DATA>) {
    if (/^>/) {
        print "\n" unless $. == 1;
        print;
    }
    else {
        chomp;
        print;
    }
}

print "\n";

__DATA__
>chunk1
c1_line1
c1_line2
c1_line3
>chunk2
c2_line1
c2_line2
c2_line3

Output:

>chunk1
c1_line1c1_line2c1_line3
>chunk2
c2_line1c2_line2c2_line3

Use Benchmark to compare this with any other solutions and determine which runs fastest.
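
As a rough sketch of how such a comparison might be set up (the sample data, the sub names line_by_line and split_and_join, and the second approach itself are my own assumptions, not anything from your code), both candidates build a string rather than print, so that terminal I/O doesn't dominate the timings:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data: 1,000 records, each a header line plus three body lines.
my $data = join '', map { ">chunk$_\nline1\nline2\nline3\n" } 1 .. 1000;

# Line-by-line approach, as in the script above, but appending to a string.
sub line_by_line {
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    my $out = '';
    while (<$fh>) {
        if (/^>/) {
            $out .= "\n" unless $. == 1;
            $out .= $_;
        }
        else {
            chomp;
            $out .= $_;
        }
    }
    close $fh;
    return $out . "\n";
}

# Alternative (an assumption for illustration): split into records, then join body lines.
sub split_and_join {
    my $out = '';
    for my $record (split /^(?=>)/m, $data) {
        my ($header, @body) = split /\n/, $record;
        $out .= "$header\n" . join('', @body) . "\n";
    }
    return $out;
}

# Only compare speed once both candidates are known to produce identical output.
die "Candidates disagree\n" unless line_by_line() eq split_and_join();

# A negative count means "run each candidate for at least that many CPU seconds".
cmpthese(-1, {
    line_by_line   => \&line_by_line,
    split_and_join => \&split_and_join,
});
```

cmpthese prints a table of rates and relative percentages. The numbers will depend on your data and machine, so it's worth running this against something close to your real input.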

For the split and join (or .=) operations you mention, I'd have to guess you're using some form of 'local $/ = ...' (see perlvar: Variables related to filehandles); I'd need more information to comment further on that. Of course, there's no reason why you can't compare those options with any others.
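
Purely as a guess at what such an approach might look like (your actual code wasn't shown, so the separator "\n>", the sub name join_records and the sample input are all assumptions), here's a minimal sketch that reads one whole record per iteration and then uses split and join on it:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical input, held in a string so the example is self-contained.
my $input = ">chunk1\nc1_line1\nc1_line2\n>chunk2\nc2_line1\nc2_line2\n";

sub join_records {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

    # Each read now ends at the next "\n>" (or at end of file) instead of at "\n".
    local $/ = "\n>";

    my $out = '';
    while (my $record = <$fh>) {
        chomp $record;          # removes a trailing "\n>" separator, if present
        $record =~ s/^>//;      # only the first record still has its leading ">"
        my ($header, @body) = split /\n/, $record;
        $out .= ">$header\n" . join('', @body) . "\n";
    }
    close $fh;
    return $out;
}

print join_records($input);
```

Note that chomp removes whatever $/ currently holds, not just a newline, which is why the separator disappears from every record except the last.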

Two comments regarding stripping embedded newlines from a string using "s/\n//g/":

  1. That's incorrect syntax. There's no slash after the modifier (i.e. it's just s/\n//g). See perlop: Regexp Quote-Like Operators.
  2. Transliteration (e.g. y/\n//d) is probably faster. See perlop: Quote-Like Operators. [Note: y/// and tr/// are synonymous.]
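
To illustrate the two points above, here's a small sketch (the sample string and variable names are made up) showing that the corrected substitution and the transliteration produce the same result:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $original = "line1\nline2\nline3\n";

# Substitution: the regex engine finds and removes every newline.
(my $via_subst = $original) =~ s/\n//g;

# Transliteration: with /d, characters listed on the left that have
# no counterpart on the right are deleted.
(my $via_trans = $original) =~ tr/\n//d;

print "s///:  $via_subst\n";
print "tr///: $via_trans\n";
print "Both give the same result\n" if $via_subst eq $via_trans;
```

Whether tr/// is measurably faster for your data is something Benchmark can tell you; it does no pattern matching, so it usually has less work to do.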

If my guess at your requirements is wrong, please supply more information to help us help you. The guidelines in "How do I post a question effectively?" should point you in the right direction with respect to this.

-- Ken

Re^2: Print in loop or store and print once done
by Anonymous Monk on Feb 08, 2014 at 14:01 UTC

    Hi Ken,

    Thanks for your comments, really helpful. I've compared different methods, but the difference is negligible in my case. It seems the bottleneck was in a different place.

    Great to know about transliteration; it seems like a faster, though simpler, alternative to substitution.