Dear Monks, I am trying to load several big files (between 5MB and 100MB each) into memory, reading each file in its entirety. I want to design my code so it doesn't waste memory. After loading the files, I do some work that generates more data, then unload the files I no longer need to free up memory, and then read some more files into memory. Finally, I write the data I generated to a file. What's the best way to do this? Does this look good?

my $Data1 = ReadFile($filename1);
my $Data2 = ReadFile($filename2);
my $Data3 = ReadFile($filename3);
my $OutputData = '';

$OutputData .= ......

undef $Data2;
my $Data4 = ReadFile($filename4);

...

CreateFile($output_filename, \$OutputData); # Pass by ref to prevent double copy
exit;

As I understand it, undef will release the string from memory. But will it cause large memory blocks to be moved around, since the freed block sits in the middle? How does Perl deal with memory fragmentation? What happens when there's a hole in memory and I create a new string and slurp a file into it using sysread()? I imagine that as the string grows, it outgrows that hole and has to be moved to another location. Right? I'm trying to understand what's going on in the background so I can design my code to be efficient.
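One way to check the first assumption is to look at the scalar's allocated buffer before and after undef. This is only a sketch: it uses the core B module to read the buffer length Perl records for the scalar, the 1 MB temp file is made up for the test, and `buffer_len` is a hypothetical helper name. It shows what Perl itself considers allocated, not whether the operating system gets the memory back, and it says nothing about fragmentation.

```perl
use strict;
use warnings;
use B ();
use File::Temp qw(tempfile);

# Hypothetical helper: allocated buffer size (in bytes) that Perl
# records for the scalar behind the given reference.
sub buffer_len {
    my ($ref) = @_;
    return B::svref_2object($ref)->LEN;
}

# Made-up 1 MB input file so the example is self-contained.
my $size = 1_000_000;
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} 'x' x $size;
close $tmp_fh;

open(my $in, '<:raw', $tmp_name) or die "open $tmp_name: $!";
my $data = '';
my $got = sysread($in, $data, $size);   # one read with the full length
defined $got or die "sysread: $!";
close $in;

my $len_loaded = buffer_len(\$data);    # buffer size while the data is held
undef $data;
my $len_freed  = buffer_len(\$data);    # buffer size after undef

print "loaded: $len_loaded bytes, after undef: $len_freed bytes\n";
```

Because sysread() is given the full length up front, the buffer is sized once rather than grown piece by piece. Whether that holds for other ways of filling the string is something I would measure rather than assume.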

- I don't want to end up with duplicate copies of the same data in memory.
- I don't want to hog memory by holding on to data after I'm done with it.
- I don't want to waste resources unnecessarily.
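A minimal sketch of one layout that would meet these three goals, assuming each input can be processed on its own (which may not be true in my case, since I need several files loaded at once). The helpers `slurp` and `summarize` and the temp files are hypothetical stand-ins: each file lives in a lexical that goes out of scope at the end of its loop iteration, the big string is passed by reference, and output is written as it is produced instead of being collected in one large $OutputData string. Freed memory goes back to Perl's allocator; whether it returns to the operating system depends on the platform.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole file in binary mode.
sub slurp {
    my ($file) = @_;
    open(my $fh, '<:raw', $file) or die "open $file: $!";
    local $/;                       # no record separator: read everything
    my $data = <$fh>;
    close $fh;
    return defined $data ? $data : '';
}

# Hypothetical processing step. It takes a reference so that
# unpacking @_ does not copy the big string.
sub summarize {
    my ($data_ref) = @_;
    return length($$data_ref) . " bytes\n";
}

# Made-up input files so the example is self-contained.
my @inputs;
for my $content ('alpha' x 10, 'beta' x 20, 'gamma' x 30) {
    my ($fh, $name) = tempfile(UNLINK => 1);
    binmode $fh;
    print {$fh} $content;
    close $fh;
    push @inputs, $name;
}

my ($out_fh, $out_name) = tempfile(UNLINK => 1);
binmode $out_fh;

for my $file (@inputs) {
    my $data = slurp($file);              # exists only for this iteration
    print {$out_fh} summarize(\$data);    # write as we go
}                                         # $data goes out of scope here
close $out_fh or die "close $out_name: $!";

my $report = slurp($out_name);
print $report;
```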

I would appreciate any helpful advice!!

The rest of the code (which is irrelevant):

# Usage: STRING = ReadFile(FILENAME, START, LENGTH) - Reads an entire file or part of a file in binary mode. Returns the file contents as a string. An optional second argument will move the file pointer before reading, and an optional third argument limits the number of bytes to read.
sub ReadFile {my$F=defined$_[0]?$_[0]:'';$F=~tr#><*%$?\r\n\"\0|##d;-e$F||return'';-f$F||return'';my$S=-s$F;$S||return'';my$L=defined$_[2]?$_[2]:$S;$L>0||return'';local*H;sysopen(H,$F,0)||return'';binmode H;my$P=defined$_[1]?$_[1]:0;$P>=0 or $P=0;$P<$S||return'';$P<1||sysseek(H,$P,0);my$D='';sysread(H,$D,$L);close H;return$D;}
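For reference, a small usage sketch of the optional START and LENGTH arguments. `read_range` is a hypothetical, spelled-out stand-in for ReadFile (same idea, without the filename sanitizing), and the temp file and its contents are made up for the example.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical stand-in for ReadFile: returns up to $len bytes starting
# at byte offset $pos, or '' on any failure.
sub read_range {
    my ($file, $pos, $len) = @_;
    return '' unless defined $file && -f $file;
    my $size = -s $file or return '';
    $pos = 0 unless defined $pos && $pos >= 0;
    return '' if $pos >= $size;
    $len = $size - $pos unless defined $len;
    return '' if $len <= 0;

    sysopen(my $fh, $file, O_RDONLY) or return '';
    binmode $fh;
    sysseek($fh, $pos, SEEK_SET) if $pos > 0;
    my $data = '';
    sysread($fh, $data, $len);    # a single read; may return fewer bytes than asked
    close $fh;
    return $data;
}

# Made-up sample file so the example is self-contained.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "0123456789ABCDEF";
close $tmp_fh;

my $whole  = read_range($tmp_name);          # entire file
my $tail   = read_range($tmp_name, 10);      # from byte 10 to the end
my $middle = read_range($tmp_name, 4, 3);    # 3 bytes starting at byte 4

print "$whole\n$tail\n$middle\n";
```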

In reply to Memory efficient design by harangzsolt33
