LostShootingStar has asked for the wisdom of the Perl Monks concerning the following question:
Once all this is established, I start iterating over the input file (which can be millions and millions of lines) and send each filename over the ssh connection to the remote script, then wait for a response. When the response arrives, I send the next line. The semi-full code looks like this:

    sub remote_script {
        my ($mode) = @_;
        if ($mode eq "test") {
            # Perl source that runs on the remote node (no single quotes inside).
            return '
                $|=1;
                print "READY\n";
                while (<STDIN>) {
                    chomp;
                    # List context: a scalar-context glob keeps its iterator
                    # state between calls and does not restart for each new pattern.
                    my ($found) = glob($_);
                    $found = "" unless defined $found;
                    print "$found\n";
                }
            ';
        }
    }
The problem is that this gets really slow with a large input file, taking up to 45 minutes to search for 1,000,000 lines. I'd like to see this improve, even a little bit. If anyone has any advice, please share. Thank you!

    use threads;
    use Tie::File;

    # READ, WRITE and SENT (presumably array-index constants) and open_handle()
    # belong to the full script and are not shown here.

    sub process3 {
        my ($start, $end, $node) = @_;
        my %workers;
        my ($cur, $line, $pos);
        my $done = 0;
        my ($rnode, $obj);
        my @file;
        tie @file, 'Tie::File', "inputfile" or die "couldnt tie";

        $workers{$node} = open_handle($node, "glob");   # this opens the ssh connection.
        my $to_node   = $workers{$node}[WRITE];
        my $from_node = $workers{$node}[READ];

        # Send the first line to prime the request/response loop.
        $workers{$node}[SENT] = $start;
        $line = $file[$workers{$node}[SENT]];
        print $to_node "$line\n";   # was "$cur\n", but $cur is never assigned
        $workers{$node}[SENT]++;

        # Lockstep: read one response, then send the next line.
        while (1) {
            my $res;
            $res = $from_node->getline();
            chomp($res);
            ($obj, $rnode) = split(',', $res);
            print "$obj\n" if $res;
            last if ($workers{$node}[SENT] > $end);
            $line = $file[$workers{$node}[SENT]];
            print $to_node "$line\n" unless ($workers{$node}[SENT] > $end);
            $workers{$node}[SENT]++;
        }
    }

    # i know i can put this in a loop, but i decided to leave it for clarity.
    my $thr1  = threads->new(\&process3, 0, $endline, "c001n05");
    my $thr2  = threads->new(\&process3, 0, $endline, "c001n06");
    my $thr3  = threads->new(\&process3, 0, $endline, "c001n07");
    my $thr4  = threads->new(\&process3, 0, $endline, "c001n08");
    my $thr5  = threads->new(\&process3, 0, $endline, "c001n09");
    my $thr6  = threads->new(\&process3, 0, $endline, "c001n10");
    my $thr7  = threads->new(\&process3, 0, $endline, "c001n11");
    my $thr8  = threads->new(\&process3, 0, $endline, "c001n12");
    my $thr9  = threads->new(\&process3, 0, $endline, "c001n13");
    my $thr10 = threads->new(\&process3, 0, $endline, "c001n14");
    my $thr11 = threads->new(\&process3, 0, $endline, "c001n15");
    my $thr12 = threads->new(\&process3, 0, $endline, "c001n16");

    $thr1->join();
    $thr2->join();
    $thr3->join();
    $thr4->join();
    $thr5->join();
    $thr6->join();
    $thr7->join();
    $thr8->join();
    $thr9->join();
    $thr10->join();
    $thr11->join();
    $thr12->join();
Re: Searching a distributed filesystem
by BrowserUk (Patriarch) on Apr 16, 2007 at 04:53 UTC

Re: Searching a distributed filesystem
by GrandFather (Saint) on Apr 16, 2007 at 03:14 UTC

Re: Searching a distributed filesystem
by varian (Chaplain) on Apr 16, 2007 at 06:06 UTC

Re: Searching a distributed filesystem
by graff (Chancellor) on Apr 16, 2007 at 04:05 UTC
by LostShootingStar (Novice) on Apr 16, 2007 at 04:12 UTC
by Anonymous Monk on Apr 16, 2007 at 14:29 UTC

Re: Searching a distributed filesystem
by shmem (Chancellor) on Apr 16, 2007 at 07:30 UTC |