PerlMonks  

Re: Parse string greater than 2GB

by rjt (Curate)
on Jun 30, 2013 at 01:57 UTC ( [id://1041537]=note )


in reply to Parse string greater than 2GB

foreach my $l (split("\n", $map)) { print $l; }

It seems to me your code is doing nothing more than printing out the input with newlines removed. You can achieve the same result by removing all newlines with:

$map =~ y/\n//d;

For me this took a few seconds on a 2GiB + 16 byte string (whereas creating the same string with the repetition operator took more than twice as long, and that was without any IO).
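A minimal sketch of deleting newlines with the transliteration operator, run on a tiny stand-in string rather than a 2GiB one. The sample data and the `$removed` variable are assumptions for illustration only.

```perl
use strict;
use warnings;

# Small stand-in for the multi-gigabyte $map from the question.
my $map = "abc\ndef\n\nghi\n";

# With /d, tr (also spelled y) deletes every matched character that has no
# counterpart in the empty replacement list. It edits $map in place and
# returns the number of characters it matched.
my $removed = ($map =~ tr/\n//d);

print "$map\n";
print "removed $removed newlines\n";
```

Note that the /r modifier (Perl 5.14 and later) returns a modified copy instead of editing in place, which is worth avoiding when the string is already close to the memory limit.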

Your approach with split runs out of memory on my 4GiB VM, because split generates a new list with new strings, more than doubling the memory requirement (depending on density of newlines). I strongly suspect, however, that even if it worked, the split would be much slower.
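If the pieces between newlines really do need to be handled one at a time, one possible alternative to split (not something from the original code) is to walk the string with index and substr, so only one piece exists at a time. The helper name `for_each_line` and the sample data are hypothetical, and this is only a sketch that I have not measured against a 2GiB string.

```perl
use strict;
use warnings;

# Hypothetical helper: calls $cb once per newline-separated piece of the
# string referenced by $str_ref, without building a list of all pieces.
sub for_each_line {
    my ($str_ref, $cb) = @_;
    my $len = length $$str_ref;
    my $pos = 0;
    while ($pos < $len) {
        my $nl = index($$str_ref, "\n", $pos);
        $nl = $len if $nl < 0;    # final piece has no trailing newline
        $cb->(substr($$str_ref, $pos, $nl - $pos));
        $pos = $nl + 1;           # continue just past the newline
    }
    return;
}

# Small stand-in for $map; the callback appends so the result can be checked,
# where the original loop would simply print each piece.
my $map = "abc\ndef\n\nghi";
my $out = '';
for_each_line(\$map, sub { $out .= $_[0] });
print "$out\n";
```

Passing a reference avoids copying the big string into the sub, and each substr result can be freed before the next one is made.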

I also wonder if this may be an XY Problem: You say the read cannot be changed, and try as I might, I can't imagine why you'd want to read a huge binary file and print out everything but the newlines. If my advice doesn't hit the mark, can you give us a few more details on what it is you're doing?

open (INFILE, "$FILE") || die "Not able to open the file: $FILE \n";

Be careful with open. If you ever intend $FILE to be user-specified (and even if you don't), I'd recommend using the 3-argument open:

open INFILE, '<', $FILE or die "Not able...";

See Two-arg open() considered dangerous.

I'd also use a lexical filehandle (open my $infile, ...) instead of INFILE.
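Putting both suggestions together, a rough sketch might look like the following. The temporary file exists only so the example runs on its own, and slurping into `$map` is an assumption about how the original code fills that variable.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small throwaway file so the sketch is self-contained; in the
# question, $FILE would already name the large input file.
my ($tmp_fh, $FILE) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "line one\nline two\n";
close $tmp_fh or die "Cannot close $FILE: $!";

# Three-argument open: the mode is separate from the name, so characters
# such as '|' or '>' in $FILE are never interpreted as part of the mode.
open my $infile, '<', $FILE
    or die "Not able to open the file: $FILE: $!";
binmode $infile;    # the question describes the input as binary

# Read the whole file into one scalar (assumed to match the original read).
my $map = do { local $/; <$infile> };
close $infile;

print length($map), " bytes read\n";
```

The lexical handle `$infile` is scoped like any other `my` variable, so it cannot collide with a bareword INFILE used elsewhere in the program.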
