in reply to Re^2: Parsing 12GB Entourage database in pieces...
in thread Parsing 12GB Entourage database in pieces...
I could still be wrong, but I have a hard time believing that all that is really necessary. I wrote but didn't test this:
    my $msg_marker = "\0\0MSrc";
    my $tiny_read  = 16 + length( $msg_marker );

    # eof() with parens, because a bare eof before anything has been
    # read from <> already reports end of file
    while ( ! eof() ) {
        $/ = \$tiny_read;    # fixed-size reads of $tiny_read bytes
        $_ = <>;

        # marker in here?
        while ( -1 == index( $_, $msg_marker ) and ! eof() ) {
            # chop the beginning if still no marker
            $_ = substr $_, $tiny_read if length( $_ ) > $tiny_read;
            $_ .= <>;
        }

        $_ .= <>;            # make sure to get those 16 bytes
        $/ = "\0\0";
        $_ .= <>;            # read to the end of the message
        message_in_here( $_ );
    }
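As an aside on the mechanics: the loop relies on two modes of `$/`. Set to a reference to an integer, it makes `<>` return fixed-size records; set to a string, it makes `<>` read up to and including that separator. Here is a small sketch of both, run against a made-up in-memory string rather than a real Entourage file (the sample data and variable names are mine, purely for illustration):

```perl
use strict;
use warnings;

# Made-up sample: filler bytes, a marker-like string, a payload ending in
# two NUL bytes, then some trailing bytes.
my $data = "0123456789" . "\0\0MSrc" . "payload" . "\0\0" . "tail";

open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
binmode $fh;

my ( $fixed, $rest );
{
    local $/ = \4;        # record mode: each read returns at most 4 bytes
    $fixed = <$fh>;
}
{
    local $/ = "\0\0";    # separator mode: read through the next "\0\0"
    $rest = <$fh>;
}
close $fh;

printf "fixed-size read returned %d bytes\n", length $fixed;
printf "separator read returned %d bytes\n",  length $rest;
```

Note that the separator read stops at the first `"\0\0"` it meets, which in this sample is the start of the marker itself, not the end of the payload.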
The downside of that loop is that I'm reading 12GB in 22-byte increments (except while inside a message). That might be too slow. On the other hand, it's short and fairly comprehensible (especially if you give names to the things I didn't).
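If the 22-byte reads do turn out to be too slow, one possible alternative is to read much larger blocks and carry the last few bytes over between reads, so a marker that straddles a block boundary is still found. This is a different technique from the loop above, offered only as an untuned sketch: `find_marker_offsets` and the block size are hypothetical names and values of mine, and it only locates markers without extracting the messages.

```perl
use strict;
use warnings;

# Hypothetical helper: return the byte offset of every $marker in $fh,
# reading $block_size bytes at a time.
sub find_marker_offsets {
    my ( $fh, $marker, $block_size ) = @_;
    die "marker must not be empty" unless length $marker;

    my $keep   = length($marker) - 1;    # tail carried over between blocks
    my $buffer = '';
    my $base   = 0;                      # file offset of $buffer's first byte
    my @offsets;

    while ( read $fh, my $block, $block_size ) {
        $buffer .= $block;

        # Record every marker that is completely inside the buffer.
        my $pos = 0;
        while ( ( my $hit = index $buffer, $marker, $pos ) >= 0 ) {
            push @offsets, $base + $hit;
            $pos = $hit + 1;
        }

        # Keep only the last $keep bytes, in case a marker straddles
        # this block and the next one.
        if ( length($buffer) > $keep ) {
            $base  += length($buffer) - $keep;
            $buffer = $keep ? substr( $buffer, -$keep ) : '';
        }
    }
    return @offsets;
}

# Usage on made-up in-memory data, with a deliberately tiny block size
# so that the second marker straddles a block boundary.
my $data = ( "x" x 10 ) . "\0\0MSrc" . ( "y" x 20 ) . "\0\0MSrc" . "z";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
binmode $fh;
my @offsets = find_marker_offsets( $fh, "\0\0MSrc", 8 );
close $fh;
print "marker offsets: @offsets\n";
```

On a real file the block size would be something like a megabyte rather than 8; the tiny value here only exercises the boundary handling.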
Re^4: Parsing 12GB Entourage database in pieces...
by ikegami (Patriarch) on Aug 29, 2008 at 00:15 UTC