Based on your further description, I would say that you should probably forget about ForkManager and multiple threads.
If multiple threads are all trying to write to the same output file, then you will need to worry about locking it so they don't corrupt it. The locking overhead will kill performance, and even if you solve that somehow, random differences in how long each thread takes to run mean that the order of the lines in the output file will be partly randomised, which you probably don't want.
Instead I suggest you go for a single-threaded solution that reads the input line by line and only keeps one group of four lines in memory at any one time. That way everything should be simple and reasonably quick.
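Something along these lines would do it (an untested sketch that reads from STDIN and just echoes each group back out as a placeholder for whatever processing you actually need):

<code>
#!/usr/bin/perl
use strict;
use warnings;

# Read the input four lines at a time, holding only the current group in memory.
while ( !eof(STDIN) ) {
    my @group;
    for ( 1 .. 4 ) {
        my $line = <STDIN>;
        last unless defined $line;
        push @group, $line;
    }

    # ... do whatever you need to do with this group here ...
    print @group;    # placeholder: just echo the group to STDOUT
}
</code>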
I suggest that you add some regular expressions or other tests when you read lines, so that if an extra newline creeps in somehow the script can re-sync with the line groups rather than break.
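For example, assuming the first line of each group is recognisable with a regex (I have used /^\@/ here; adjust it to whatever your data actually looks like), you could skip anything unexpected until the next group header turns up:

<code>
# One way to re-sync, assuming each group starts with a line matching /^\@/.
while ( defined( my $line = <STDIN> ) ) {
    next if $line =~ /^\s*$/;        # ignore stray blank lines
    unless ( $line =~ /^\@/ ) {      # not a group header: we are out of sync
        warn "Skipping unexpected line $.: $line";
        next;                        # keep reading until a header turns up
    }

    my @group = ($line);
    for ( 1 .. 3 ) {                 # read the remaining three lines of the group
        my $next = <STDIN>;
        last unless defined $next;
        push @group, $next;
    }

    # ... process @group ...
}
</code>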
If you are really desperate for maximum performance then you could investigate chopping your raw file up into chunks and having a separate script process each one. If you do that, the code you have written to find where a group of four lines starts will come in handy.
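A rough, untested sketch of what each chunk-processing script might look like, again assuming a group header matches /^\@/ and that you pass the file name and a byte range on the command line:

<code>
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sketch: process one byte range of the big file, so several
# copies of this script can each work on their own chunk.
my ( $file, $start, $end ) = @ARGV;
die "usage: $0 file start_byte end_byte\n" unless defined $end;

open my $fh, '<', $file or die "Cannot open $file: $!";
seek $fh, $start, 0 or die "Cannot seek to $start in $file: $!";
<$fh> if $start > 0;    # discard the partial line we probably landed in

while ( defined( my $line = <$fh> ) ) {
    next unless $line =~ /^\@/;    # scan forward to the start of a group
    my @group = ($line);
    for ( 1 .. 3 ) {
        my $next = <$fh>;
        last unless defined $next;
        push @group, $next;
    }

    # ... process @group ...

    last if tell($fh) >= $end;     # stop once we have passed our chunk
}
close $fh;
</code>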
In reply to Re^3: buffering from a large file by chrestomanci, in thread buffering from a large file by cedance