This is perl 5, version 20, subversion 1 (v5.20.1) built for MSWin32-x86-multi-thread-64int
(with 1 registered patch, see perl -V for more detail)
Copyright 1987-2014, Larry Wall
Binary build 2000 [298557] provided by ActiveState http://www.ActiveState.com
Built Oct 15 2014 22:10:49
You do realize that require is a run-time operation, and that it modifies the global symbol table (GST)? Did you realize that Text::CSV_XS REQUIRES Exporter? As I understand it, modifying the GST in threads is a very BAD thing.
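To make the "require is a run-time operation" point concrete, here is a minimal sketch. It uses the core module File::Basename as a stand-in for Text::CSV_XS (my assumption, so it runs without extra installs), and has_sub is a hypothetical helper name, not part of any module.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Hypothetical helper: does package $pkg already have a sub called $name
# in its symbol table?
sub has_sub {
    my ($pkg, $name) = @_;
    no strict 'refs';
    return defined &{"${pkg}::${name}"} ? 1 : 0;
}

# Checked before the require statement has executed.
my $before = has_sub('File::Basename', 'basename');

require File::Basename;   # the module is loaded when this statement runs

# Checked again after the require: the sub now exists in the symbol table.
my $after = has_sub('File::Basename', 'basename');

print "before require: $before\n";
print "after require:  $after\n";
print "recorded in %INC: ", (exists $INC{'File/Basename.pm'} ? 1 : 0), "\n";
```

In a script that has not already pulled in File::Basename some other way, the first check should report that the sub is missing and the second that it exists. A "use" would have done the same work at compile time, before any thread was created.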
That said, I was able to modify your code so it doesn't fail. Although this version stops at 1000, I ran it to 10,000 without a failure. Adding "use strict; use warnings;" made me notice that your $csv in $outth/$inth was not declared either, which is again a modification of the GST.
#!/usr/bin/perl
use strict; use warnings;
# if these are commented out it will fail
use Carp qw( croak );
use IO::Handle;
#g1 set ... uncommenting these makes it run in 70% of the time (oneway=0)
# but it doesn't fail if they are left commented out
#use IO::File;
#use DynaLoader ();
#use Exporter;
use threads;
use Thread::Queue;
my $joinset=0; # =1 still fails
my $oneway=0;  # =1 runs in 35% of time (g1 commented)
require Text::CSV_XS if ($oneway);
#require Text::CSV_XS;
my $input = 'input.csv';
my $output = 'output.csv';
my @cols = qw(a b c d);
my $q;
my $count = 0;
my $bdt=time;
while(1) {
    $q = Thread::Queue->new;
    my $inth = threads->create(\&inputthread);
    my $outth = threads->create(\&outputthread);
    if ($joinset){
        $inth->join();
        $q->enqueue(undef);
    }
    # join the output thread first; the input thread is joined afterwards
    $outth->join() or die("thread crashed?\n");
    $inth->join() unless ($joinset);
    $count++;
    print "$count\n";
    last if ($count>=1000);
}
print 'end:'.(time-$bdt)."\n";
sub inputthread {
    require Text::CSV_XS unless ($oneway);
    # require Text::CSV_XS;
    my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\n", sep_char => ",", quote_char => '"', escape_char => '"', empty_is_undef => 0, decode_utf8 => 0 });
    $csv->column_names(@cols);
    my $fh;
    open($fh,'<',$input) or die("Cannot open $input: $!\n");
    binmode($fh);
    my @rows = ();
    while(1) {
        my $data;
        if($csv) {
            $data = $csv->getline_hr($fh);
            unless($data) {
                last if($csv->eof);
                die("Failed to parse CSV line: ".$csv->error_diag."\n");
            }
        }
        else {
            die("Data parsed another way\n");
        }
        my @arr = values(%$data);
        push(@rows,\@arr);
    }
    close($fh);
    $q->enqueue(\@rows);
    $q->enqueue(undef) unless ($joinset);
}
sub outputthread {
    my $fh;
    open($fh,'>',$output) or die("Cannot open $output: $!\n");
    binmode($fh);
    my $csv;
    require Text::CSV_XS unless ($oneway);
    # require Text::CSV_XS;
    $csv = Text::CSV_XS->new ({ binary => 1, eol => "\n", sep_char => ',', quote_char => '"', escape_char => '"' });
    while(1) {
        my $rows = $q->dequeue;
        last unless(defined $rows);
        foreach my $data(@$rows) {
            if($csv) {
                $csv->print($fh,$data);
            }
            else {
                die("Data output another way\n");
            }
        }
    }
    close($fh);
    return 1;
}
I added the g1 uses when joinset was "essentially 1" (well, that kind of code was missing at the time) and it got farther but still failed. So I thought about what happens at join. My theory is that this is when "garbage collection" takes place on the now-unreferenced variables in $inth. I suspect it was destroying things that $outth would still reference, maybe because they existed in the GST when some use under "require Text::CSV_XS" took place in $outth. By deferring the $inth join until after the $outth join, those items were still "valid?" in the GST.
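Stripped down, the join ordering I ended up with looks roughly like the sketch below. This is only an illustration under my own assumptions: List::Util stands in for Text::CSV_XS and is loaded with "use" before any thread exists, and run_once is a hypothetical name. It does not prove the garbage-collection theory; it just shows the consumer being joined before the producer.

```perl
#!/usr/bin/perl
use strict; use warnings;
use threads;
use Thread::Queue;

# Stand-in for Text::CSV_XS (assumption): loaded at compile time in the
# main thread, so no thread has to require it on its own.
use List::Util qw(sum);

# Hypothetical helper: one producer/consumer round over the given queue.
sub run_once {
    my ($q) = @_;

    my $producer = threads->create(sub {
        $q->enqueue([1, 2, 3]);   # one batch of "rows"
        $q->enqueue(undef);       # sentinel: no more data
        return 1;
    });

    my $consumer = threads->create(sub {
        my $total = 0;
        while (defined(my $rows = $q->dequeue)) {
            $total += sum(@$rows);
        }
        return $total;
    });

    # Join the consumer first and the producer afterwards, mirroring the
    # order of $outth->join and $inth->join in the modified script.
    my $total = $consumer->join();
    $producer->join();
    return $total;
}

print run_once(Thread::Queue->new), "\n";
```

Swapping the two join calls gives the ordering of the joinset=1 case, where the producer is joined first.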
I found this very interesting and decided to look further. You may want to play with these mods to get a better understanding. Commenting out "use Carp" and "use IO::Handle" produces interesting errors.
Funny, my PerlMonks quote here was "Clear questions and runnable code get the best and fastest answer". I started just to see if I could even run your code...
Edit: the global $csv was in $inth.