rootcho has asked for the wisdom of the Perl Monks concerning the following question:

Hi, how do you parse very large text files?
Currently I'm trying to do this for such a file.
If I use:

    open my $fh, ...;
    while (<$fh>) { ... }

this seems to try to read the whole file, because the script got killed on my 2GB system.
Using Tie::File seems a little better, but it still eats all the RAM, even if I specify a 2MB cache, disable deferred writes, and set read-only mode!

Should I write my own file reader that reads up to "\n" character by character and then discards it, or is there already a module that does that?
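
(For concreteness, a minimal sketch of the Tie::File setup being described; the filename and exact option spellings are guesses:)

    use strict;
    use warnings;
    use Fcntl 'O_RDONLY';
    use Tie::File;

    tie my @lines, 'Tie::File', 'big_file.txt',
        mode      => O_RDONLY,     # read-only, as described
        memory    => 2_000_000,    # 2MB cache limit
        autodefer => 0             # no automatic write-deferring
        or die "Cannot tie file: $!";

    for my $line (@lines) {        # records are fetched one at a time
        # process $line
    }

    untie @lines;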

Replies are listed 'Best First'.
Re: Parsing very big files GB
by FunkyMonk (Bishop) on Aug 30, 2007 at 23:08 UTC
    There's something else causing your problem, but you haven't shown us what it is. while (<$fh>) will read the file line by line, and not all at once.

    Are you putting your read lines into an array or a hash?
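
    A minimal sketch of a read loop that stays flat on memory, assuming the file really is newline-delimited (the filename is made up):

        use strict;
        use warnings;

        open my $fh, '<', 'big_file.txt' or die "Cannot open file: $!";
        while ( my $line = <$fh> ) {
            chomp $line;
            # work on $line here; as long as nothing is pushed onto an
            # array or stored in a hash that outlives the loop, memory
            # use stays roughly constant no matter how big the file is
        }
        close $fh;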

      That was what I thought too! To be sure I'm not leaking memory, for all hashes, arrays and objects I do "undef $var" after they are no longer needed.

        Show us more of the code. Sounds like some restructuring is in order. In particular, anything that you "undef" ought to be inside the while loop so that it is cleaned up when it goes out of scope at the end of the loop. Only variables that should retain content after the loop should be declared outside the loop.

        If you are not using strictures already I strongly recommend that you add use strict; use warnings; to your code.
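
        For instance, a sketch of that structure (the filename and the tab-separated layout are made up):

            use strict;
            use warnings;

            my $total = 0;                      # kept after the loop, so declared outside

            open my $fh, '<', 'big_file.txt' or die "Cannot open file: $!";
            while ( my $line = <$fh> ) {
                chomp $line;
                my @fields = split /\t/, $line; # lexical to one iteration: freed
                $total += $fields[0] || 0;      # automatically, no undef needed
            }
            close $fh;

            print "Total: $total\n";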


        DWIM is Perl's answer to Gödel
Re: Parsing very big files GB
by GrandFather (Saint) on Aug 30, 2007 at 23:12 UTC

    How long are the lines in the file? Unless they are extremely long what you have shown should be fine. Are you sure you are not leaking memory in the loop or accumulating data in the loop? As a sanity check you might like to try:

    open my $fh, ...;
    while (<$fh>) { }
    close $fh;

    and check that that runs correctly. Assuming it does, start adding back the contents of the while loop and see where the problem happens. If it doesn't run correctly (terminate cleanly) you need to rethink what constitutes a line and possibly come back for more advice.
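
    For example, something along these lines (filename made up) runs the empty loop and also reports whether the "lines" look sane:

        use strict;
        use warnings;

        open my $fh, '<', 'big_file.txt' or die "Cannot open file: $!";
        my ( $count, $longest ) = ( 0, 0 );
        while ( my $line = <$fh> ) {
            $count++;
            $longest = length $line if length $line > $longest;
        }
        close $fh;

        print "$count lines, longest is $longest bytes\n";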


    DWIM is Perl's answer to Gödel

      ooo - long lines or maybe a weird record separator
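
      If it is the record separator, $/ can be set to whatever actually ends a record; a sketch, with the separator chosen purely as an example:

          use strict;
          use warnings;

          open my $fh, '<', 'big_file.txt' or die "Cannot open file: $!";
          {
              local $/ = "\r";             # e.g. a file using bare carriage returns
              while ( my $record = <$fh> ) {
                  chomp $record;           # chomp strips whatever $/ is set to
                  # process $record
              }
          }
          close $fh;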


      I humbly seek wisdom.
Re: Parsing very big files GB
by f00li5h (Chaplain) on Aug 31, 2007 at 01:34 UTC

    My guess is that you're doing something like

    sub is_useful;

    open my $fh, '<', 'purr' or die "Sad kitty: $!";

    my @useful_things = ();
    while (<$fh>) {
        push @useful_things, $_ if is_useful($_);
    }

    So all that data is ending up stashed in memory, even though you're reading the file a line at a time.
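
    If each useful line can be handled the moment it is read, nothing needs to pile up; handle_it below is a placeholder for whatever that processing is:

        sub is_useful;
        sub handle_it;    # e.g. print a summary line, bump a counter, write elsewhere

        open my $fh, '<', 'purr' or die "Sad kitty: $!";
        while (<$fh>) {
            handle_it($_) if is_useful($_);   # acted on and discarded, so memory stays flat
        }
        close $fh;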

    @_=qw; ask f00li5h to appear and remain for a moment of pretend better than a lifetime;;s;;@_[map hex,split'',B204316D8C2A4516DE];;y/05/os/&print;

Re: Parsing very big files GB
by cengineer (Pilgrim) on Aug 31, 2007 at 13:23 UTC
    Check out Tie::File. From the description:
    The file is not loaded into memory, so this will work even for gigantic files.
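
    A minimal sketch of that on-demand behaviour (the filename and line index are just examples):

        use strict;
        use warnings;
        use Tie::File;

        tie my @lines, 'Tie::File', 'big_file.txt'
            or die "Cannot tie file: $!";

        print $lines[999_999], "\n";   # fetch a single line without slurping the file

        untie @lines;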