I am currently rewriting/moosing a very old perl script (did I really write this horrid code?) that glues together a numerical weather prediction system. (BTW, perl rocks for this application!)
One of the tasks here is to use wget to download a 0.5 GB file. Another is to compress/uncompress 49 files, each on the order of 300 MB. This is currently implemented as system calls out to wget/gzip/gunzip. The forecast model itself (FORTRAN, C, C++) is run as multiple parallel processes on several machines using MPI. The file handling, however, is NOT parallelized: a single machine is responsible for that task.
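(For context, and paraphrasing rather than quoting the actual production code, the current approach amounts to something like this, with a made-up URL and file glob:)

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical URL and file names, purely for illustration.
    my $url = 'http://example.com/gfs_input.grib2.gz';

    # Grab the 0.5 GB input file via an external wget call.
    system('wget', '-q', '-O', 'gfs_input.grib2.gz', $url) == 0
        or die "wget failed: $?";

    # Then gunzip each ~300 MB forecast file, one at a time.
    for my $f (glob 'fcst_*.gz') {
        system('gunzip', '-f', $f) == 0
            or die "gunzip failed on $f: $?";
    }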
This was all conceived and constructed in an era (2004) when hardware was much less muscular. These days, my master node is an 8-core Mac Pro with 64 GB of RAM and 2 TB of SSD. During the file getting/manipulation phases of the master process, this is all the machine is doing. I suspect that some latent compute capability could be used to speed up the file manipulation.
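For concreteness, the sort of thing I have in mind is running several of the decompressions at once instead of serially. A minimal sketch, assuming Parallel::ForkManager is available and reusing the made-up file names from above:

    use strict;
    use warnings;
    use Parallel::ForkManager;

    # Hypothetical file names; 8 workers to match the core count.
    my @gz = glob 'fcst_*.gz';
    my $pm = Parallel::ForkManager->new(8);

    for my $f (@gz) {
        $pm->start and next;            # parent: spawn a child, move on
        system('gunzip', '-f', $f) == 0
            or warn "gunzip failed on $f: $?";
        $pm->finish;                    # child exits here
    }
    $pm->wait_all_children;

Whether that actually helps, or whether the SSD just becomes the bottleneck, is one of the things I don't know.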
Speed is everything for this application, and a few minutes saved is worth a lot. Should I manipulate the files within Perl (perhaps avoiding things like unnecessary I/O buffering) rather than shelling out with system calls? (Obviously network speed remains a wild card here.)
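One of the options I'm weighing is doing it all in-process, e.g. LWP::UserAgent for the fetch and IO::Uncompress::Gunzip for the decompression. A minimal sketch (URL and file names are made up, and whether this can actually beat wget and the gzip binaries is exactly my question):

    use strict;
    use warnings;
    use LWP::UserAgent;
    use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

    # Hypothetical URL and file names, purely for illustration.
    my $url = 'http://example.com/gfs_input.grib2.gz';

    # Stream the download straight to disk so 0.5 GB never sits in memory.
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get($url, ':content_file' => 'gfs_input.grib2.gz');
    die 'download failed: ', $res->status_line unless $res->is_success;

    # Decompress in-process instead of shelling out to gunzip.
    gunzip 'gfs_input.grib2.gz' => 'gfs_input.grib2'
        or die "gunzip failed: $GunzipError";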
I have researched this a bit and already have some (possibly erroneous) thoughts, but thought I would toss the global concept out there to my perlish betters. This may save me some spurious bunny trails. Not that I don't like bunnies…
—The difficulty lies, not in thinking the new ideas, but in escaping from the old ones.