thezip has asked for the wisdom of the Perl Monks concerning the following question:
I am having difficulty parsing a huge data file. Since it is a huge file, I can only read line-by-line (with the exception of the small buffer I'm using).
The rule I am trying to implement has these preconditions (please refer to the @sourcedata array below): IFF these conditions are met, then the newlines between the "1" and "_____ 2" lines are removed. In any other case, everything is printed unchanged (including the buffer).
For some reason, I cannot execute the loop where I am buffering the intermediate newlines.
Also, please note that for ease of discussion I have described this question in terms of arrays rather than file I/O; the file handling is not germane to the solution I am seeking.
Here's some sample data:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # NOTE: The @sourcedata array is the representation of
    # the data as if it were read from a file by:
    #
    #   open(FH, $sourcefilename) || die ...
    #   my @sourcedata = <FH>;
    #   close FH;
    #
    # Since the source file is huge, I need to process the
    # file line by line

    # NOTE: I updated this array to reflect an array of lines
    my @sourcedata = (
      "\n",
      "1\n",
      "\n",
      "\n",
      "b\n",
      "\n",
      "1\n",
      "\n",
      "\n",
      "\n",
      "\n",
      "_____ 2\n",
      "\n",
      "\n",
      "\n"
    );

    # The desired result of processing a small data sample:
    my @desiredoutput = qq(

    1


    b

    1
    _____ 2    # NOTE: preceding newlines have been collapsed



    );
Here's my source code:

    my @buffer = ();
    my $length = scalar @sourcedata;

    for (my $I = 0; $I < $length; $I++) {
      my $line = $sourcedata[$I];
      if ($line =~ /^1$/) {
        push(@buffer, $line);
        $I++;
        $line = $sourcedata[$I];

        # Here's the loop I can't seem to execute:
        while ($line =~ /^\n$/ && $I != $length) {
          print "Buffering...\n";
          push(@buffer, $line);
          $I++;
          last if $I == $length;
          $line = $sourcedata[$I];
        }

        if ($line =~ /_____ 2/) {
          # print only the first and last items in the buffer,
          # effectively removing the empty lines
          print shift(@buffer), pop(@buffer);
          print $line;
        }
        else {
          print join(@buffer);
        }
      }
      else {
        print $line;
      }
      @buffer = ();
    }
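For comparison, here is one possible way to express the rule as a single pass over a filehandle. It is only a sketch, not the solution from any of the replies. It assumes the rule means: a "1" line, followed by any number of empty lines, followed by a line containing "_____ 2", and that all of the buffered empty lines are dropped in that case. The helper name `collapse_blank_runs` and the in-memory filehandle are made up for illustration. Two things in the posted code that this sketch handles differently: `join(@buffer)` treats the first buffer element as the separator (`join('', @buffer)` is probably what was meant), and the `else` branch never prints the non-blank line that ended the run.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copies lines from $in to $out, dropping the empty
# lines that sit between a "1" line and a following "_____ 2" line.
sub collapse_blank_runs {
    my ($in, $out) = @_;
    my @buffer;    # a pending "1" line plus the empty lines seen after it

    while (my $line = <$in>) {
        if (@buffer) {
            if ($line =~ /^$/) {          # empty line: keep buffering
                push @buffer, $line;
                next;
            }
            if ($line =~ /_____ 2/) {     # rule matched: drop the empty lines
                print {$out} $buffer[0], $line;
                @buffer = ();
                next;
            }
            print {$out} @buffer;         # rule not matched: flush unchanged
            @buffer = ();
        }
        if ($line =~ /^1$/) {
            push @buffer, $line;          # possible start of a run
        }
        else {
            print {$out} $line;
        }
    }
    print {$out} @buffer;                 # flush anything pending at EOF
}

# Demo on the sample data, using an in-memory filehandle in place of the
# real (huge) file. This stand-in is an assumption for illustration only.
my $sample = join '',
    "\n", "1\n", "\n", "\n", "b\n", "\n", "1\n",
    "\n", "\n", "\n", "\n", "_____ 2\n", "\n", "\n", "\n";
open my $in, '<', \$sample or die "Cannot open in-memory input: $!";
collapse_blank_runs($in, \*STDOUT);
close $in;
```

Because only the current "1" line and the empty lines after it are ever held in memory, the same loop should work on a real file opened with `open my $in, '<', $sourcefilename`, without the index arithmetic of the array version.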
Replies are listed 'Best First'.
Re: Implementing a parsing rule in a huge data file
by BrowserUk (Patriarch) on Dec 08, 2006 at 01:48 UTC
by thezip (Vicar) on Dec 08, 2006 at 16:39 UTC

Re: Implementing a parsing rule in a huge data file
by ikegami (Patriarch) on Dec 08, 2006 at 00:41 UTC

Re: Implementing a parsing rule in a huge data file
by derby (Abbot) on Dec 08, 2006 at 00:39 UTC

Re: Implementing a parsing rule in a huge data file
by andyford (Curate) on Dec 08, 2006 at 00:06 UTC