enchanter has asked for the wisdom of the Perl Monks concerning the following question:

Win32, CGI, IE6. (Currently I can't move from win to *nix, so I need just win32-ActiveState-Perl help.)
1) First of all, I would like to ask for some ready win32-Perl-code I looking for.
2) If there isn't 1), then I asking to offer me a proper, good, detailed, effective, bugless and CPU- and memory-cheap algorithm.
3) If neither 1) nor 2), then help me, please, to correct my code (see below).

I trying to write something like a simplified visual model of multistream file downloading client (without a function of splitting a single large file on chunks) with limitation of max number of simultaneously downloading files and joining a next waiting file from queue to currently downloading group of files just after any file has been arrived.
In my case it's several (3) child processes working in parallel. Grand total of files: 9 ('a'. 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'). The limit on simultaneously working processes is just the number of processes working in parallel (3). The rest remain in queue. Each child just typing a column of a letter via certain pause and different number of times in the LOOP:

1-st child: 'a' via 3 seconds 8 times;
2-nd child: 'b' via 7 seconds 2 times;
3-rd child: 'c' via 4 seconds 5 times.

I was forced to resort to writing the CGI script working in browser, because it seems unable to demonstrate visually the idea of the parallel work of processes in a shell.
Typing of letters within every column by each child simbolizes the process of the file's arriving.
Each column simbolizes a file's body, each pause between letters in a single column is something like delay (response) from a server and also it may be ISP connection speed, and different number of times of typing within each column simbolizes different length each of files.
It's most likely, all the files won't be able to arrive at the same time. Although it's unable in the real Inet to calculate exactly what file will be arrived most early and what - latest, but I have the model with known parameters (see above). Therefore we can calculate here the order of file's arriving:

1-st child: 'a': 3 * 8 = 24;
2-nd child: 'b': 7 * 2 = 14;
3-rd child: 'c': 4 * 5 = 20.

(Of course, for approaching the model to a real situation we should use rand() values instead of exact.)

Now we aware:

most first file was arrived with 2-nd child ('b'-letter);
second file was arrived with 3-rd child ('c'-letter);
last file was arrived with 1-st child ('a'-letter).

Therefore, the 'd'-letter should be taken from queue (array) just by 2-nd child and start typing in the same column just after all the 'b'-letters will be printed.
The 'e'-letter should be taken by 3-rd child (unless 2-nd child has time for printing all the 'd'-letters and start to pick out from array and type 'e'-letter earlier of 3-rd child).
The 'f'-letter should be taken by 1-st child (unless 2-nd and/or 3-rd children has time to do it earlier).
Anyway, every time just any letter is over printing by a child, a nearest accessable letter should be taken from head of array by current child (and also parameters of a letter: its new pause and number of times) untill letters still there are in array.

Also, I use 'redo' built-in Perl function for synchronization of the starts all the threads. But I want to know: is it cheaper for CPU and memory to use 'redo' for synchronization or it will be better to use semaphores? Or, maybe, something esle (like IPC signals between parent process and each spawn child or cond_wait, cond_timedwait, cond_signal, cond_broadcast from threads::shared module)?

Note: every time once any column of a letter is over printing, I don't exit_from/kill current thread with following spawning a new one! No. I just keep the existing child; on the last loop of WHILE/FOR's LOOP I do conditional check: if LOOP's variable has approached to the right LOOP's boundary, I move it some more right, and the child with even its LOOP keep alive! I guess it more cheap than exit from current child and spawning a new one.

In my case, I have to share an array of hashes among all the threads. It's probably, I have to dereference and share each refrence explicitly, and then lock and unlock each one as well. But I don't know how to. Once I add ': shared' to my @params, my script becomes unworkable (excepting the HTML-table dumping).

------------------------------------------------------------------------
Although this CGI-script is workable, it has runtime 'bugs' (no sync of starts all the threads, appearing the message: "A thread exited while 3 threads were running.", perceptible slowing-down of working speed at using lock() for shared arrays or absence any affect at all), incorrect results (number of printings of the same kind of letter and its program-calculated value may be different in about 1-2% of runnings, you can be able to see this divergence for sure, if uncomment #@rv, #@pauses and put comment at the beginning of first strings of current @rv, @pauses below; you'll see that 'a' prints for 5 times (it's incorrectly!), and calculated value indicates here: "for current letter 'a' repetitions 3 via pause 0.325543212890625 sec" (it's correctly!)) and also the script has some illiteracy (single $thr->join for several threads, for instance). DO NOT waste your time on learning this code, just run it and look at the idea. I posting it just for you can be able to visualize working idea I meaning. Sorry for this ugly code:

----------------------------------------------------------
(1) Source Code:
----------------------------------------------------------
#!C:\Perl\bin\perl -w $|++; print "Content-type: text/html\n\n"; # use strict; use warnings; use Time::HiRes; use threads; use threads::shared; my ($doc_top, $doc_middle, $doc_bottom); $doc_top = "<html>\n"; $doc_top .= "<head>\n"; $doc_top .= "<script>\n"; $doc_top .= "function set(id,text) {\n"; $doc_top .= " document.getElementById(id).innerText = text\n"; $doc_top .= "}\n"; $doc_top .= "</script>\n\n"; $doc_top .= "</head>\n"; $doc_top .= "<body>\n\n"; $doc_top .= "<table cellSpacing=\"0\" cellPadding=\"0\" width=\"10%\" +border=\"0\">\n"; $doc_top .= "\t<tr>\n"; $doc_top .= "\t\t<td>\n"; $doc_top .= "\t\t\t<table width=\"100%\">\n"; #$doc_top .= "\t\t\t\t<tr>\n"; $doc_middle =''; #$doc_bottom .= "\t\t\t\t</tr>\n"; $doc_bottom .= "\t\t\t</table>\n"; $doc_bottom .= "\t\t</td>\n"; $doc_bottom .= "\t</tr>\n"; $doc_bottom .= "</table>\n\n"; #$doc_bottom .= "</body>\n"; #$doc_bottom .= "</html>\n"; for ($j = 0; $j <= 22; $j++) { $doc_middle .= "\t\t\t\t<tr>\n"; for ($i = 0; $i <= 2; $i++) { $doc_middle .= "\t\t\t\t\t<td id='cell$i$j' bgColor=\"#eeeeee\" al +ign=\"center\">&nbsp</td>\n"; } $doc_middle .= "\t\t\t\t</tr>\n"; } print $doc_top.$doc_middle.$doc_bottom; our (@rv, @pauses, @letters) : shared; #@rv[0,1,2, 3,4,5, 6, 7, 8] = (2, 5, 5,<br> @rv[0,1,2, 3,4,5, 6, 7, 8] = (int(rand(4))+2, int(rand(4))+2, int( +rand(4))+2, int(rand(4))+2, int(rand(4))+2, int( +rand(4))+2, int(rand(4))+2, int(rand(4))+2, int( +rand(4))+2); #@pauses[0,1,2, 3,4,5, 6, 7, 8] = (0.325543212890625, 0.3333251953 +125, 0.345562744140625,<br> @pauses[0,1,2, 3,4,5, 6, 7, 8] = ((rand(1))+.3, (rand(1))+.3, (ran +d(1))+.3, (rand(1))+.3, (rand(1))+.3, (ran +d(1))+.3, (rand(1))+.3, (rand(1))+.3, (ran +d(1))+.3); @letters[0,1,2, 3,4,5, 6,7,8] = ('a','b','c', 'd','e','f', 'g' +,'h','i'); my $cur_right_value; my $cur_pause; my $cur_letter; for($f = 0; $f <= 2; $f++) { $cur_right_value = @rv[$f];# print $cur_right_value; print "<br>\n"; $cur_pause = @pauses[$f]; # print $cur_pause; print "<br>\n"; $cur_letter = @letters[$f];# print $cur_letter; print "<br>\n"; print "for current letter <b>$cur_letter</b> repetitions <b>".($cur_ +right_value+1)."</b> via pause <b>$cur_pause</b> sec<br>\n"; $thr = threads->new(sub { my $cur_var = 0; for($cur_var = $cur_var; $cur_var <= $cur_right_value; $cur_var++) { print "<script>set('cell$f$cur_var', '$cur_letter')</script>\n"; if(Time::HiRes::sleep($cur_pause)) { lock(@rv); lock(@pauses); lock(@letters); } if($cur_var == $cur_right_value) { # print "<b>\$f = $f</b><br>\n"; $cur_right_value = @rv[0]+$cur_var; $cur_pause = @pauses[0]; $cur_letter = @letters[0]; shift @rv; # print @rv; print "<br>\n"; shift @pauses; # print @pauses; print "<br>\n"; shift @letters;# print @letters; print "<br>\n"; } } }); } shift @rv; shift @rv; shift @rv; # print @rv; print + "<br>\n"; shift @pauses; shift @pauses; shift @pauses; # print @pauses; print + "<br>\n"; shift @letters; shift @letters; shift @letters;# print @letters; print + "<br>\n"; $thr->join; print "@letters<br>\n"; print "<br>\n"; print "</body>\n"; print "</html>\n";
------------------------------------------------------------------------



There is also some more literate CGI-script. It even can sync threads starts. But I can't complete it so it woluld be work accordingly to idea of first script, but without those 'bugs', errors, incorrect results etc and pretty literately. I tried to complete it independent, but I couldn't. I would be glad if you'll be able to help me to gain this idea on first script basis, because it's still very hard to me to do it alone.


----------------------------------------------------------
(2) Source Code:
----------------------------------------------------------
#!C:\Perl\bin\perl -w print "Content-type: text/html\n\n"; use warnings; use Time::HiRes; use threads; use threads::shared; $doc_top = "<html>\n"; $doc_top .= "<head>\n"; $doc_top .= "<script>\n"; $doc_top .= "function set(id,text) {\n"; $doc_top .= " document.getElementById(id).innerText = text\n"; $doc_top .= "}\n"; $doc_top .= "</script>\n\n"; $doc_top .= "</head>\n"; $doc_top .= "<body>\n\n"; $doc_top .= "<table style=\"font: 8pt Verdana, Arial, Helvetica, Sans- +serif; line-height:8pt;\" cellSpacing=\"1\" cellPadding=\"2\" width=\ +"21%\" border=\"1\">\n"; $doc_top .= "\t<tr>\n"; $doc_top .= "\t\t<td>\n"; $doc_middle =''; $doc_bottom .= "\t\t</td>\n"; $doc_bottom .= "\t</tr>\n"; $doc_bottom .= "</table>\n\n"; for ($j = 0; $j <= 32; $j++) { $doc_middle .= "\t\t\t<tr>\n"; for ($i = 0; $i <= 2; $i++) { $doc_middle .= "\t\t\t\t<td width=\"10%\" id='cell$i$j' bgColor=\" +#eeeeee\" align=\"center\">&nbsp</td>\n"; } $doc_middle .= "\t\t\t</tr>\n"; } print $doc_top.$doc_middle.$doc_bottom; print "<font style=\"font: 8pt Verdana, Arial, Helvetica, Sans-serif; +line-height:8pt;\">\n"; $|++; my $threads = 3; my $count : shared = $threads; my @threads; my @params = ( {id => '0', up => 8, to => 0.5, char => 'a'}, {id => '1', up => 5, to => 0.2, char => 'b'}, {id => '2', up => 11, to => 0.3, char => 'c'}, {id => '3', up => 6, to => 1.1, char => 'd'}, {id => '4', up => 2, to => 0.5, char => 'e'}, {id => '5', up => 9, to => 0.6, char => 'f'}, {id => '6', up => 3, to => 0.7, char => 'g'}, {id => '7', up => 5, to => 0.8, char => 'h'}, {id => '8', up => 6, to => 0.9, char => 'i'} ); sub fun { my $arg = shift; $count -= 1; my $cur_var = 0; for($cur_var = $cur_var; $cur_var <= $arg->{up}; $cur_var++) { redo if $count; print "<script>set('cell$arg->{id}$cur_var', '$arg->{char}')</scri +pt>\n"; Time::HiRes::sleep($arg->{to}); } } foreach(1..$threads) { push @threads, threads->new(\&fun, shift @params); } foreach(1..$threads) { my $thid = shift @threads; $thid->join; } print "</body>\n"; print "</html>\n";
------------------------------------------------------------------------
Thanx.

Edit by castaway - added readmore tags

Replies are listed 'Best First'.
Re: how to share array of hashes among several threads?
by mugwumpjism (Hermit) on May 24, 2005 at 01:02 UTC

    Did you see Thread::Queue ?

    If you want a fast, CPU-efficient, bug-free way to do this, another common way is to simply use a database.

    Fortunately there are many lightweight and even Pure Perl database packages out there which present themselves in a number of useful ways to make this a relatively easy undertaking. That way, all the "hard" problems of consistency are generally solved for you.

    Have a play with a module like DB_File or SDBM_File to get a feel for how you can use this, then try it in concert with MLDBM. As they don't provide a queue for you, you'll have to make a "lock" key that you acquire before updates and release afterwards to prevent concurrent access from wedging your database.

    $h=$ENV{HOME};my@q=split/\n\n/,`cat $h/.quotes`;$s="$h/." ."signature";$t=`cat $s`;print$t,"\n",$q[rand($#q)],"\n";
Re: how to share array of hashes among several threads?
by BrowserUk (Patriarch) on May 24, 2005 at 02:19 UTC

    If the body of your post related to it's title and if your code compiled with strict and warnings and if I didn't have to fire up a webserver to see the output that could just as easily be displayed on the console, and if it was at all clear what it is that you are trying to achieve, I'd be only too willing to help with this.

    As it is, even having re-written your (original) code so that it compiles clean, is readable, and writes simple text to the console rather than unintelligable html/javascript, and having corrected a few of the more obvious errors like creating multiple threads and assigning them all to the same variable, I'm still at a complete losss as to what you are trying to achieve.

    So, try posting a short, simple textual, description of what you are hoping to achieve (Not how you think it should be achieved and (no more code!)) and I'll have another go.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
Re: how to share array of hashes among several threads?
by kwaping (Priest) on May 24, 2005 at 02:31 UTC
Re: how to share array of hashes among several threads?
by zentara (Cardinal) on May 24, 2005 at 10:05 UTC
    From the title of this node, a quick look at your code, and my experience with shared vars on linux:

    When you share hashes, only the top level keys get shared, unless you declare each hash element as shared. I have doubts whether a hash, in an shared AoH will actually get shared, and will just fail without error.

    As an experiment, try making a minimal example with shared AoH, and see if you can access them from all threads. You probably can't. Then try to (laboriously) individually predeclare all hash elements as shared, and it will probably work. For instance:

    my %shash; #share(%shash); #will work only for first level keys my %hash; my %workers; my $numworkers = 3; foreach my $dthread(1..$numworkers){ share ($shash{$dthread}{'go'}); share ($shash{$dthread}{'progress'}); share ($shash{$dthread}{'timekey'}); #actual instance of the thread share ($shash{$dthread}{'frame_open'}); #open or close the frame share ($shash{$dthread}{'handle'}); share ($shash{$dthread}{'data'}); share ($shash{$dthread}{'pid'}); share ($shash{$dthread}{'die'}); $shash{$dthread}{'go'} = 0; $shash{$dthread}{'progress'} = 0; $shash{$dthread}{'timekey'} = 0; $shash{$dthread}{'frame_open'} = 0; $shash{$dthread}{'handle'} = 0; $shash{$dthread}{'data'} = $data; $shash{$dthread}{'pid'} = -1; $shash{$dthread}{'die'} = 0; $hash{$dthread}{'thread'} = threads->new(\&work,$dthread); }

    I'm not really a human, but I play one on earth. flash japh

      You don't need to individually shared the hash elements. You only need to ensure that the hashes are shared before you assign anything to them.

      Here's a quick demo:

      #! perl -slw use strict; use threads; use threads::shared; sub test { my $href = shift; for( 1 .. 10000 ) { for my $ref ( values %$href ) { for my $value ( values %{ $ref } ) { $value++; } } } } my %hash : shared = map{ my %hash2 : shared = map{ $_ => 0 } 'a' .. 'd'; $_ => \%hash2; } 'A' .. 'D'; threads->create( \&test, \%hash )->detach;; for( 1 .. 5 ) { lock %hash; print "$_ => [ @{[ %{ $hash{ $_ } } ]} ]" for keys %hash; sleep 1; } __END__ P:\test>thr-hash.pl A => [ c 0 a 0 b 0 d 0 ] B => [ c 0 a 0 b 0 d 0 ] C => [ c 0 a 0 b 0 d 0 ] D => [ c 0 a 0 b 0 d 0 ] A => [ c 3447 a 3447 b 3447 d 3447 ] B => [ c 3447 a 3447 b 3447 d 3447 ] C => [ c 3446 a 3446 b 3446 d 3446 ] D => [ c 3446 a 3446 b 3446 d 3446 ] A => [ c 6903 a 6903 b 6903 d 6903 ] B => [ c 6902 a 6902 b 6902 d 6902 ] C => [ c 6902 a 6902 b 6902 d 6902 ] D => [ c 6902 a 6902 b 6902 d 6902 ] A => [ c 10000 a 10000 b 10000 d 10000 ] B => [ c 10000 a 10000 b 10000 d 10000 ] C => [ c 10000 a 10000 b 10000 d 10000 ] D => [ c 10000 a 10000 b 10000 d 10000 ] A => [ c 10000 a 10000 b 10000 d 10000 ] B => [ c 10000 a 10000 b 10000 d 10000 ] C => [ c 10000 a 10000 b 10000 d 10000 ] D => [ c 10000 a 10000 b 10000 d 10000 ]

      Far from ideal I agree, but there are other parts that are much more broken than this.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.