in reply to Re: Running out of memory...
in thread Running out of memory...

I've changed my code to the following:
#!/usr/local/bin/perl -w
use strict;

open IN,  '< :raw', $ARGV[ 0 ] or die "$ARGV[ 0 ] : $!";
open OUT, '> :raw', $ARGV[ 1 ] or die "$ARGV[ 1 ] : $!";

my $a = '<!-- rsecftr.htm - Course Sections Table Footer -->';
my $b = '<!-- rsechdr.htm - Course Sections and Course Section Search Table Header -->';

my $buffer;
## Prime the second half of the two-chunk window.
sysread IN, $buffer, 5800, 5800;

do {
    ## Move the second half of the buffer to the front...
    $buffer = substr( $buffer, 5800 );

    ## ...and overwrite it with a new chunk.
    sysread IN, $buffer, 5800, length( $buffer );

    ## Apply the regex.
    $buffer =~ s|$b(.*?)$a||g;
    print $buffer;

    ## Write out the first half of the buffer.
    syswrite OUT, $buffer, 5800;
} until eof IN;

close IN;
close OUT;
auburn_courses.txt contains a load of HTML files concatenated one after another. I'd like to remove the bits between the footer of one section that I want to see and the header of the next section that I'd like to see; they're delimited by the $a and $b lines.
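For readers landing here later, the core of the job is deleting everything from one delimiter through the next, non-greedily, with the delimiters matched literally. A minimal sketch of that idea in Python (the names and sample text below are invented for illustration, not the poster's actual data):

```python
import re

# Hypothetical stand-ins for the $a/$b comment delimiters in the post.
FTR = "<!-- footer -->"
HDR = "<!-- header -->"

def strip_between(text, start, end):
    """Delete every span running from `start` through `end`, inclusive and
    non-greedy: the same job as s|$b(.*?)$a||g in the Perl above."""
    # re.escape makes the delimiters match literally even if they contain
    # regex metacharacters; DOTALL lets .*? cross newlines.
    pattern = re.escape(start) + r".*?" + re.escape(end)
    return re.sub(pattern, "", text, flags=re.DOTALL)

sample = "keep1 <!-- footer --> junk <!-- header --> keep2"
print(strip_between(sample, FTR, HDR))  # prints "keep1  keep2"
```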

Update! All fixed, ignore me

Re^3: Running out of memory...
by BrowserUk (Patriarch) on Jan 29, 2005 at 21:11 UTC

    I realise you have fixed your problem, but I have to say that, without seeing your actual data, 5800 seems a very strange choice of buffer size.


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
      I don't exactly remember why I chose that now, but it had something to do with the average size of the HTML files I was slurping. That number is roughly twice the average, so that the data I need is guaranteed to sit somewhere in the middle of the window. In any case, I'm glad to say that the scraper portion of my project is complete and the data is happily in a database, serving out useful information to people who didn't have it as easily before :)
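The invariant behind that "twice the size" choice is that the two-half window must be at least as long as the longest span ever deleted; a longer span can straddle the flushed half and survive. A sketch of that windowed-delete idea, in Python for brevity (the function name, delimiters, and sizes are invented for illustration):

```python
import re

def windowed_delete(chunks, pattern, window):
    """Slide a window over a chunked stream, deleting every match of
    `pattern` while only holding about `window` trailing characters in
    memory.  Assumes no single match is longer than `window`; a longer
    match could straddle the flushed half and survive, which is the same
    limit the 5800-byte halves impose in the Perl above."""
    buf = ""
    out = []
    for chunk in chunks:
        buf += chunk
        buf = re.sub(pattern, "", buf, flags=re.DOTALL)
        if len(buf) > window:
            out.append(buf[:-window])  # flush all but the trailing window
            buf = buf[-window:]        # keep the tail for boundary matches
    out.append(buf)
    return "".join(out)

# An X...Y span split across two chunk boundaries is still removed whole:
print(windowed_delete(["aaaa", "Xjjj", "jjjY", "bbbb"], r"X.*?Y", 8))  # prints "aaaabbbb"
```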