Hi all,

I have a quick question about how Perl behaves when reading from STDIN, specifically in one particular piece of code.

Here is the code:

cat input_file.fq | perl -ne '$s=<>;<>;<>;chomp($s);print length($s)."\n";' > output.txt

input_file.fq is a FASTQ file, a standard format for storing biological sequence data.

e.g.

@HWI-EAS283_0004_FC:1:1:1321:1118#0/1
TTGCTCAGCAGGTTCAACTGCAGGTTGCCCAGGACTTTAC
+HWI-EAS283_0004_FC:1:1:1321:1118#0/1
gg/fgag_ffgcfgeffafSKd\\adfRffff]fa[fffaf
@HWI-EAS283_0004_FC:1:1:1399:1117#0/1
CTTGACGATTCCCCGCAGGCTGTTCCCGCGGGCCGCAATG
+HWI-EAS283_0004_FC:1:1:1399:1117#0/1

Each line beginning with '@' is the ID for the next three lines. The second line is the sequence, a string of letters, typically A, T, C and G. The line beginning with '+' just repeats the ID, and the fourth line is the last relevant line of the segment. The pattern then repeats for the next 4-line segment.

Basically, the above code prints the length of the sequence (ATCG) line for every segment, which is great, but I don't understand the behaviour of the $s=<>;<>;<>; part of the code.

Could anyone explain what it's doing, and how it knows to look only at the correct line (line 2, 6, 10, 14, 18, etc.)? I've played around with this on different file formats and can't figure it out.
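
In case it helps, here is my best guess at the one-liner written out longhand. It is only a sketch of my current understanding: I'm assuming that -n wraps the code in a while (<>) { ... } loop, so the '@' line has already been read into $_ before $s=<> runs. The sub name sequence_lengths and the in-memory filehandle are my own additions so the example runs without input_file.fq; the data is the sample from above, including the missing last line.

```perl
use strict;
use warnings;

# Sample data copied from the example above. The second segment has no
# fourth line in the post, so it has none here either.
my $fastq = <<'END_FASTQ';
@HWI-EAS283_0004_FC:1:1:1321:1118#0/1
TTGCTCAGCAGGTTCAACTGCAGGTTGCCCAGGACTTTAC
+HWI-EAS283_0004_FC:1:1:1321:1118#0/1
gg/fgag_ffgcfgeffafSKd\\adfRffff]fa[fffaf
@HWI-EAS283_0004_FC:1:1:1399:1117#0/1
CTTGACGATTCCCCGCAGGCTGTTCCCGCGGGCCGCAATG
+HWI-EAS283_0004_FC:1:1:1399:1117#0/1
END_FASTQ

# Hypothetical helper (my own name, not from the one-liner): returns the
# length of the sequence line of each 4-line segment read from $fh.
sub sequence_lengths {
    my ($fh) = @_;
    my @lengths;

    # Assumption: this read plays the role of the implicit loop that -n adds.
    # It consumes line 1 of the segment (the '@' ID line).
    while (defined(my $id = <$fh>)) {
        my $s = <$fh>;            # line 2: the sequence   ($s=<>)
        last unless defined $s;   # stop if the input ends mid-segment
        <$fh>;                    # line 3: '+' line, read and discarded (<>)
        <$fh>;                    # line 4: read and discarded           (<>)
        chomp($s);
        push @lengths, length($s);
    }
    return @lengths;
}

# Read the sample through an in-memory filehandle instead of STDIN.
open my $fh, '<', \$fastq or die "Cannot open in-memory file: $!";
print "$_\n" for sequence_lengths($fh);
close $fh;
```

If that reading is right, each pass through the loop uses up exactly four lines, which would explain why $s always ends up holding line 2, 6, 10 and so on. For the sample above it prints 40 twice.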

Any advice would be greatly appreciated.

In reply to command line perl reading from STDIN by perlhappy
