Simplified, short version:
How do I multithread read access to a single file? (using fork)
Long version: 8)
Perl 5.005, Sun Solaris and Linux (RH7)
I have a requirement to parse and load a flat file into an RDBMS. Perl will need to scrub the data, and the files can be large (a couple million records). I thought, heck, let's multithread this thing! How hard could it be? Here's some sample code that I thought would work... (copied from memory and commented)
--- Code snippet ---
#!/usr/bin/perl

package SMF::Threader;    # Used for IPC between processes.

# Create a SMF::Threader object for memory sharing
sub new {
    my($class)=shift;
    my $self;
    open($self->{filehandle},"test.dat") || die $!;
    bless $self, $class;
}

# Every process will have to wait her turn to get a record
sub lock {
    my($self, $pid)=@_;
    push($self->{waits},$pid);
    until (${$self->{waits}}[0] == $pid) {
        ;    #waiting for my turn
    };
    1;
}

# Release the next process
sub unlock {
    my $self=shift;
    shift (@{$self->{waits}});
    1;
}

# Get a record from the filehandle
sub fetch {
    my($self, $pid)=@_;
    $self->lock($pid);
    # Can anyone tell me how to combine these next 2 lines?
    # <$self->{filehandle}> is a syntax problem
    my $fh=$self->{filehandle};
    my $row=<$fh>;
    $self->unlock;
    return $row or undef;
}
1;

package main;
use POSIX;

my $new=new SMF::Threader;
for (1..2) {    # Fork 2 processes
    unless (fork) {
        open(OUT, ">".$$.".out");
        while(my $record=$new->fetch($$) ) {
            # Record format is "0000000000abcdefg..xyz"
            my($num,$alpha)=unpack("a10 a26",$record);
            print $record unless length($alpha) == 26;
        }
        close OUT;
        exit;
    }
    sleep 1;    # I don't think this is necessary because of my locking method,
                # But.. Just in case.
}

my $child;
do {
    $child = waitpid(-1,POSIX::WNOHANG);    # Is WNOHANG not exported??
} until $child == -1;
exit;
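On the question in the fetch comment: `<$self->{filehandle}>` is not treated as a read, because the angle operator only reads from a bare filehandle or a simple scalar variable; anything more complex is parsed as a glob pattern. A minimal sketch of one way to drop the temporary variable is the `readline` builtin, which accepts an expression. The helper name `fetch_row` and the sample data are hypothetical, and the in-memory filehandle used for the demonstration assumes Perl 5.8 or later, not the 5.005 mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: read one record from a handle stored in a hash slot.
sub fetch_row {
    my ($self) = @_;
    # readline() takes any expression that yields a filehandle,
    # so no temporary $fh variable is needed.
    my $row = readline($self->{filehandle});
    return $row;    # undef at end of file
}

# Demonstration with an in-memory file (assumption: Perl 5.8+).
my $data = "0000000001abc\n0000000002def\n";
open(my $fh, '<', \$data) or die "open failed: $!";
my $obj = { filehandle => $fh };

while (defined(my $row = fetch_row($obj))) {
    print $row;
}
close $fh;
```

Testing with `defined` in the loop keeps a record such as "0" from ending the loop early.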
My assumption was that if I build $new (SMF::Threader) in the parent and use it in each child, it would create a memory segment shareable between the processes. Is that true?
The problem is that the processes don't always get a complete record (record separator is newline). What am I overlooking? Am I going to have to use a semaphore to keep track of the locks? I think I will still build that into SMF::Threader (named something different), as I might have a reason to reuse it for database read access (MUCH LATER) 8). Any problems you see with that? (CORBA? Definitely overkill, I think.)
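The shared-memory assumption can be checked with a small experiment. After fork, each process works on its own copy of the parent's variables, so a change made in the child is not visible in the parent. The same would apply to the waits list in the object above. What forked processes do share is the underlying file descriptor and its read offset, while each keeps a private read buffer, which is a likely reason for records being split between children. The sketch below is only an illustration under those assumptions; the name `fork_copy_demo` and the counter value are hypothetical.

```perl
use strict;
use warnings;

# Returns (value the child saw after its update, value the parent still sees).
sub fork_copy_demo {
    my %state = (counter => 0);
    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: change its private copy and report it over the pipe.
        close $reader;
        $state{counter} = 42;
        print {$writer} "$state{counter}\n";
        close $writer;
        exit 0;
    }

    # Parent: read the child's report, then look at its own copy.
    close $writer;
    my $child_value = <$reader>;
    chomp $child_value;
    close $reader;
    waitpid($pid, 0);

    return ($child_value, $state{counter});
}

my ($child_saw, $parent_sees) = fork_copy_demo();
print "child saw $child_saw, parent still sees $parent_sees\n";
# prints "child saw 42, parent still sees 0"
```

Coordination between the processes therefore needs something the kernel shares, such as a pipe, a file lock, or a semaphore, rather than a Perl data structure built before the fork.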
All help will be greatly appreciated!
Shawn M Ferris
Oracle DBA - Time Warner Telecom
In reply to Threading read access to a filedescriptor by smferris