libvenus has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

Which of the following is the better strategy for the application I am building? It will fire multiple queries at an App prod/test server (the App server can spawn multiple children to work in parallel) and compare the results (which can contain 10K records each for prod and test). The comparison is complex and will take time:

A single process with multiple threads, using a threaded queue containing the work items (the queries).

A single parent process that spawns multiple children to work on the queries in parallel, using Parallel::ForkManager.

Considering the application I am designing, which is the better approach?
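For concreteness, the threaded-queue option might be sketched like this. This is a minimal outline only; the number of workers, the `query$_` job names, and the worker body are placeholders, not the real query or comparison code:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $work      = Thread::Queue->new;   # work items go in here
my $results   = Thread::Queue->new;   # workers report back here
my $n_workers = 4;

sub worker {
    # Each worker drains the shared queue until it sees the undef terminator.
    while (defined(my $query = $work->dequeue)) {
        my $output = "result of $query";   # placeholder for the real query work
        $results->enqueue("$query => $output");
    }
}

my @pool = map { threads->create(\&worker) } 1 .. $n_workers;

$work->enqueue("query$_") for 1 .. 10;   # placeholder job names
$work->enqueue(undef) for @pool;         # one terminator per worker

$_->join for @pool;

# Drain the results queue in the main thread.
my @out;
$results->enqueue(undef);
while (defined(my $r = $results->dequeue)) {
    push @out, $r;
}
print scalar(@out), " results collected\n";
```

The undef terminator per worker is the usual way to shut a Thread::Queue pool down cleanly.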

Thanks

Re: best strategy
by tilly (Archbishop) on Aug 25, 2008 at 06:38 UTC
    There are critical questions you have not answered. What operating system are you running on? (Any form of *nix would make fork work well. Windows does an emulation which could make that solution significantly worse.) What are you expecting to be your performance bottleneck? (CPU? Disk? Network delays?) What kind of hardware are you working on? (Number of CPUs? Number of disks?) Are there significant initialization costs? (eg Database connections cannot be preserved across a fork, and are expensive to create.) How much data needs to be passed around? Is there any possibility of moving this to a cluster?

    For an extreme example, if you're using Windows and are expecting to bottleneck on local CPU on a 1-CPU machine, you absolutely should make this job a single process, that is single-threaded.

    Suppose that you're bottlenecked on network time delays and there is an Oracle database connection needed per worker. Then you really want several persistent workers. Single process, multiple threads would beat constant forking.
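    The point about persistent workers can be illustrated with a sketch. The "connection" below is a stand-in for an expensive setup step such as a database connection (no real DBI code here); the key property is that it is paid once per thread, not once per job:

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my $connects : shared = 0;   # how many times the expensive setup ran
my $handled  : shared = 0;   # how many jobs were processed in total
my $queue     = Thread::Queue->new;
my $n_workers = 3;

sub worker {
    # Expensive per-worker setup (stands in for e.g. an Oracle connection):
    # it runs once per thread, regardless of how many jobs the thread handles.
    { lock($connects); $connects++; }
    my $handle = "conn-" . threads->tid;
    while (defined(my $job = $queue->dequeue)) {
        # ... reuse $handle to run $job here ...
        { lock($handled); $handled++; }
    }
}

my @pool = map { threads->create(\&worker) } 1 .. $n_workers;
$queue->enqueue($_) for 1 .. 30;
$queue->enqueue(undef) for @pool;   # one terminator per worker
$_->join for @pool;

print "$handled jobs over $connects connection(s)\n";
```

    With constant forking, the setup cost would instead be paid thirty times.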

    Suppose that you're bottlenecked on disk seek time, you're on a Unix system, and there are no startup costs. Then I would recommend the fork approach.
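    A minimal Parallel::ForkManager sketch of that approach (the job list and the work done in each child are placeholders):

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(4);   # at most 4 children at once

my $finished = 0;
$pm->run_on_finish(sub { $finished++ });  # runs in the parent per reaped child

for my $job (1 .. 10) {                   # placeholder job list
    $pm->start and next;   # parent: move on to the next job
    # Child: this is where one query would be fired and its output
    # written to a file of its own (independent files need no locking).
    $pm->finish(0);        # child exits here
}
$pm->wait_all_children;
print "$finished jobs done\n";
```

    The cap passed to new() is what keeps the number of concurrent children matched to the number of disks or CPUs.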

    Suppose that you're bottlenecked on CPU and there is a possibility of throwing multiple machines at the problem. Then I'd recommend neither of your approaches. Instead I'd look for a way to farm out jobs to multiple processes on multiple machines. One approach is to use a standard clustering solution. A very cheesy approach that I must admit to having used in the past is to make the job run in a webserver, and then use a load balancer to distribute requests. (Hey, I had the webservers already set up and sitting there mostly idle...) Another interesting approach is to have a database table of open jobs, and then have workers on multiple machines query it. (I set up a batch processing system on this principle and it worked well. It was suggested to me by a former boss who had set up a swaps trading system on the same principle, with some of the "workers" for some types of jobs really being people.)
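    The jobs-table idea can be sketched with DBI. The schema, the in-memory SQLite database, and the claim_job helper below are illustrative assumptions, not a description of the system mentioned above; the key trick is the guarded UPDATE, whose WHERE clause stops two workers from claiming the same row:

```perl
use strict;
use warnings;
use DBI;

# In-memory SQLite stands in for the shared server-side database.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1, PrintError => 0 });
$dbh->do(q{
    CREATE TABLE jobs (
        id      INTEGER PRIMARY KEY,
        payload TEXT,
        status  TEXT DEFAULT 'open',
        worker  TEXT
    )
});
$dbh->do("INSERT INTO jobs (payload) VALUES (?)", undef, "query $_")
    for 1 .. 5;

# A worker claims one open job; the status check in the UPDATE's WHERE
# clause means a job already grabbed by another worker updates 0 rows.
sub claim_job {
    my ($dbh, $worker_id) = @_;
    my ($id) = $dbh->selectrow_array(
        "SELECT id FROM jobs WHERE status = 'open' LIMIT 1");
    return unless defined $id;
    my $n = $dbh->do(
        "UPDATE jobs SET status = 'taken', worker = ?
         WHERE id = ? AND status = 'open'",
        undef, $worker_id, $id);
    return $n == 1 ? $id : ();
}

while (my $id = claim_job($dbh, "worker-$$")) {
    # ... do the real work for job $id here ...
    $dbh->do("UPDATE jobs SET status = 'done' WHERE id = ?", undef, $id);
}
my ($done) = $dbh->selectrow_array(
    "SELECT COUNT(*) FROM jobs WHERE status = 'done'");
print "$done job(s) done\n";
```

    Workers on any machine that can reach the database can run this same loop.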

    Every one of these solutions, and more, has been used successfully. Every one has advantages and cases where it is best. Anyone who gives you an absolute answer, saying that one of them is always the right way to go, doesn't know what they are talking about.

    I didn't really answer your question. But hopefully I gave you enough to think about that you can have a better chance of coming up with the right solution for your situation. Oh, and I gave you a few more options to consider. :-)

    Update: I messed up one of my examples. If you're bottlenecked on network round trips then a single machine should be able to run enough copies to move the bottleneck to the server on the other end. In which case there is no need to complicate things with the cluster. But if CPU is your problem then you would want to split up work onto multiple machines.

      What operating system are you running on ?

      A Unix flavour.

      What are you expecting to be your performance bottlenecks?

      Processing speed and memory utilization.

      What kind of hardware are you working on?

      Minimum CPUs available: 4; maximum: 12.

      Are there significant initialization costs, and how much data needs to be passed around?

      I have to read many queries from very big files, around 500 in number. The output of the queries can also be bulky, and then I need to compare the outputs. Maximizing speed with minimum memory overhead is what I am trying to achieve.

      Is there any possibility of moving this to a cluster?

      Not sure right now...

      Well, I have received some valuable advice from various monks in the thread "Problem in Inter process Communication", though I still cannot decide.

Re: best strategy
by BrowserUk (Patriarch) on Aug 25, 2008 at 09:55 UTC

    One cannot help but think that at this point your best strategy would be to actually start writing some code.

    In the time since you first asked this question, you could have written prototypes of both a forking and a threaded solution and now be in a position to do some empirical tests to determine which works best on your particular setup.

    With a little care, the subroutines for issuing the queries and comparing the results should be reusable by both prototypes without change. You have already written code for the threading infrastructure. Knocking up a forking equivalent using Parallel::ForkManager should be relatively simple. Once you have both, you will be in a position to make some real progress on deciding which is going to work best in your environment, as well as deciding whether moving to a Perl solution is really going to produce any benefit over your existing C++ solution.
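    For example, keeping the query and comparison logic in plain subroutines with no knowledge of threads or forks makes them callable from either prototype unchanged. Everything below — the package name, the stubbed run_query — is hypothetical illustration:

```perl
use strict;
use warnings;

package QueryCompare;

# Shared subs: no threads, no forks, so either kind of worker can use them.
sub run_query {
    my ($port) = @_;
    # Stub: the real version would shell out to the client binary.
    return [ map { "row $_ from port $port" } 1 .. 3 ];
}

sub compare_results {
    my ($prod, $test) = @_;
    my @diffs;
    my $max = @$prod > @$test ? @$prod : @$test;
    for my $i (0 .. $max - 1) {
        no warnings 'uninitialized';   # rows may be missing on one side
        push @diffs, $i if $prod->[$i] ne $test->[$i];
    }
    return \@diffs;   # indices of rows that differ
}

package main;

# A thread worker or a forked child would run exactly this:
my $prod  = QueryCompare::run_query(22600);
my $test  = QueryCompare::run_query(22610);
my $diffs = QueryCompare::compare_results($prod, $test);
print scalar(@$diffs), " differing row(s)\n";
```

    Only the dispatch layer (thread pool versus fork loop) then differs between the two prototypes.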

    On the basis of the sparse information you've provided, spread across your three threads on this subject, my gut feeling is that a threaded solution will be the most flexible and efficient. As your comparisons seem to consume the bulk of the time, using a reusable pool of workers will have less startup overhead and cause the least memory thrashing. It will also require the least infrastructural overhead to control the asynchronicity.

    But given the lack of information about the performance of your hardware setup (network bandwidth and latency), the spread of inherent hardware parallelism available (from 4 to 12 CPUs), and the fact that the only measure of where the existing system's bottlenecks lie is the hearsay that "the comparison is where most of the time is spent", attempting to draw any conclusions is only ever going to be speculation.

    The only ways you are going to come up with any definitive answers are to:

    1. perform some deep analysis of the existing system and attempt to extrapolate that to your two alternative implementations;
    2. knock up some prototypes and perform some measurements.

    And the latter approach will be quicker to do, require less in-depth knowledge, and provide the most accurate results.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Well, I have done exactly that, though I am unable to use Parallel::ForkManager as it is not available in my environment.
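      For what it's worth, if Parallel::ForkManager really cannot be installed, the same throttling can be done by hand with fork and waitpid. This is a generic sketch on a Unix system; the job list and the per-child work are placeholders:

```perl
use strict;
use warnings;

my $max_children = 4;
my %kids;                 # pid => job currently running

for my $job (1 .. 10) {   # placeholder job list
    # Throttle: block until a slot frees up.
    while (keys %kids >= $max_children) {
        my $pid = waitpid(-1, 0);
        delete $kids{$pid};
    }
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: one unit of work (placeholder), then exit.
        exit 0;
    }
    $kids{$pid} = $job;   # parent records the running child
}

# Reap any stragglers.
while ((my $pid = waitpid(-1, 0)) > 0) {
    delete $kids{$pid};
}
print "all children reaped\n";
```

      This is essentially what Parallel::ForkManager's start/finish pair does for you internally.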

      Using threads

use strict;
use warnings;
use Benchmark;
use threads;

#open(PROD,"/ms/user/j/juyva/dev/files_xls_tmpl_cfg_nonscripts/prod.txt") || die " $! ";
#my @allProd = <PROD>;
#close PROD;
#open(TEST,"/ms/user/j/juyva/dev/files_xls_tmpl_cfg_nonscripts/test.txt") || die " $! ";
#my @allTest = <TEST>;
#close TEST;

my @allPort = qw(22600 22610);
my %hashOp;

sub boss {
    for (my $i = 0; $i < @allPort; $i++) {
        my $thr = threads->new(\&worker, $allPort[$i]);
    }
    foreach my $thr (threads->list) {
        # Don't join the main thread or ourselves
        if ($thr->tid && !threads::equal($thr, threads->self)) {
            $thr->join;
        }
    }
}

sub worker {
    my $port = shift;
    my $timeTakenDm = timeit(1, sub {
        system(" /ms/dist/pcs/bin/client hqsas501 $port 200 -f /ms/user/j/juyva/dev/files_xls_tmpl_cfg_nonscripts/sql1.clmod.NEW.txt > $port.txt ");
    });
    print "Dm took:", timestr($timeTakenDm), "\n";
    if ($? == -1) {
        print "failed to execute: $!\n";
    }
    elsif ($? & 127) {
        printf "child died with signal %d, %s coredump\n",
            ($? & 127), ($? & 128) ? 'with' : 'without';
    }
    else {
        printf "child exited with value %d\n", $? >> 8;
    }
}

my $obj = timeit(1, sub {
    my $thrboss = threads->new(\&boss);
    $thrboss->join;

    my (@allProd, @allTest);
    foreach my $port (@allPort) {
        open(HAN, "$port.txt") || die " $!";
        my @temp = <HAN>;
        $hashOp{$port} = \@temp;
        #print $hashOp{$port};
        close HAN;
    }

    my $timeSort = timeit(1, sub {
        @allProd = sort @{$hashOp{"22600"}};
        @allTest = sort @{$hashOp{"22610"}};
    });
    print "sort took:", timestr($timeSort), "\n";
    #print @allProd;
    #print @allTest;

    unless (@allProd == @allTest) {
        print " unequal number of rows returned\n";
        my $whichhasmoreelements = @allProd > @allTest ? 'allProd' : 'allTest';
        if ($whichhasmoreelements =~ /Prod/) {
            print " the number of lines does not match; prod has more rows, they are:\n";
            my @tempallProd = @allProd;
            my @diffProdTest = splice(@tempallProd, (@allTest - 1), (@allProd - @allTest));
            print @diffProdTest;
            print " do you want to continue: enter y/n ";
            my $choice = <STDIN>;
            exit if ($choice =~ /^n$/i);
        }
        else {
            print " the number of lines does not match; test has more rows, they are:\n";
            my @tempallTest = @allTest;
            my @diffProdTest = splice(@tempallTest, (@allProd - 1), (@allTest - @allProd));
            print @diffProdTest;
            print " do you want to continue: enter y/n ";
            my $choice = <STDIN>;
            print " $choice ";
            exit if ($choice =~ /^n$/i);
        }
    }

    for (my $i = 0; $i < (@allProd > @allTest ? @allProd : @allTest); $i++) {
        unless ($allProd[$i] eq $allTest[$i]) {
            my @defaultProd = split /\|/, $allProd[$i];
            my @defaultTest = split /\|/, $allTest[$i];
            unless (@defaultProd == @defaultTest) {
                my $whichhasmoreelements = @defaultProd > @defaultTest ? 'defaultProd' : 'defaultTest';
                if ($whichhasmoreelements =~ /Prod/) {
                    print " the number of columns does not match; prod has more, they are:\n";
                    my @tempallProd = @defaultProd;
                    my @diffProdTest = splice(@tempallProd, (@defaultTest - 1), (@defaultProd - @defaultTest));
                    print @diffProdTest;
                    print " do you want to continue: enter y/n ";
                    my $choice = <STDIN>;
                    exit if ($choice =~ /^n$/i);
                }
                else {
                    print " the number of columns does not match; test has more, they are:\n";
                    my @tempallTest = @defaultTest;
                    my @diffProdTest = splice(@tempallTest, (@defaultProd - 1), (@defaultTest - @defaultProd));
                    print @diffProdTest;
                    print " do you want to continue: enter y/n ";
                    my $choice = <STDIN>;
                    print " $choice ";
                    exit if ($choice =~ /^n$/i);
                }
            }
            for (my $a = 0; $a < (@defaultProd > @defaultTest ? @defaultProd : @defaultTest); $a++) {
                unless ($defaultProd[$a] eq $defaultTest[$a]) {
                    print " Column $a differs::";
                    print " PROD value $defaultProd[$a] : TEST value $defaultTest[$a] \n";
                }
            }
        }
    }
});
print "code took:", timestr($obj), "\n";

      Using multiple processes

use strict;
use warnings;
use Benchmark;

my @allPort = qw(22600 22610);
my %hashOp;

sub spawnChild {
    for (0 .. $#allPort) {
        my $cpid = fork();
        die unless defined $cpid;
        if (!$cpid) {
            # This is the child
            #my $wait = int rand 4;
            #sleep $wait;
            #print "Child $$ exiting after $wait seconds\n";
            print "$_\n";
            my $port = $allPort[$_];
            my $timeTakenDm = timeit(1, sub {
                system(" /ms/dist/pcs/bin/client hqsas501 $port 200 -f /ms/user/j/juyva/dev/files_xls_tmpl_cfg_nonscripts/sql1.clmod.NEW.txt > $port.txt ");
            });
            if ($? == -1) {
                print "failed to execute: $!\n";
            }
            elsif ($? & 127) {
                printf "child died with signal %d, %s coredump\n",
                    ($? & 127), ($? & 128) ? 'with' : 'without';
            }
            else {
                printf "child exited with value %d\n", $? >> 8;
            }
            print "Dm took $cpid:", timestr($timeTakenDm), "\n";
            exit;
        }
    }
}

# Just parent code, after this
my $obj = timeit(1, sub {
    &spawnChild;

    # Wait for all children before reading their output files.
    while ((my $cpid = wait()) != -1) {
        print "Waited for child $cpid\n";
    }

    my (@allProd, @allTest);
    foreach my $port (@allPort) {
        open(HAN, "$port.txt") || die " $!";
        my @temp = <HAN>;
        $hashOp{$port} = \@temp;
        #print $hashOp{$port};
        close HAN;
    }

    my $timeSort = timeit(1, sub {
        @allProd = sort @{$hashOp{"22600"}};
        @allTest = sort @{$hashOp{"22610"}};
    });
    print "sort took:", timestr($timeSort), "\n";
    #print @allProd;
    #print @allTest;

    unless (@allProd == @allTest) {
        print " unequal number of rows returned\n";
        my $whichhasmoreelements = @allProd > @allTest ? 'allProd' : 'allTest';
        if ($whichhasmoreelements =~ /Prod/) {
            print " the number of lines does not match; prod has more rows, they are:\n";
            my @tempallProd = @allProd;
            my @diffProdTest = splice(@tempallProd, (@allTest - 1), (@allProd - @allTest));
            print @diffProdTest;
            print " do you want to continue: enter y/n ";
            my $choice = <STDIN>;
            exit if ($choice =~ /^n$/i);
        }
        else {
            print " the number of lines does not match; test has more rows, they are:\n";
            my @tempallTest = @allTest;
            my @diffProdTest = splice(@tempallTest, (@allProd - 1), (@allTest - @allProd));
            print @diffProdTest;
            print " do you want to continue: enter y/n ";
            my $choice = <STDIN>;
            print " $choice ";
            exit if ($choice =~ /^n$/i);
        }
    }

    for (my $i = 0; $i < (@allProd > @allTest ? @allProd : @allTest); $i++) {
        unless ($allProd[$i] eq $allTest[$i]) {
            my @defaultProd = split /\|/, $allProd[$i];
            my @defaultTest = split /\|/, $allTest[$i];
            unless (@defaultProd == @defaultTest) {
                my $whichhasmoreelements = @defaultProd > @defaultTest ? 'defaultProd' : 'defaultTest';
                if ($whichhasmoreelements =~ /Prod/) {
                    print " the number of columns does not match; prod has more, they are:\n";
                    my @tempallProd = @defaultProd;
                    my @diffProdTest = splice(@tempallProd, (@defaultTest - 1), (@defaultProd - @defaultTest));
                    print @diffProdTest;
                    print " do you want to continue: enter y/n ";
                    my $choice = <STDIN>;
                    exit if ($choice =~ /^n$/i);
                }
                else {
                    print " the number of columns does not match; test has more, they are:\n";
                    my @tempallTest = @defaultTest;
                    my @diffProdTest = splice(@tempallTest, (@defaultProd - 1), (@defaultTest - @defaultProd));
                    print @diffProdTest;
                    print " do you want to continue: enter y/n ";
                    my $choice = <STDIN>;
                    print " $choice ";
                    exit if ($choice =~ /^n$/i);
                }
            }
            for (my $a = 0; $a < (@defaultProd > @defaultTest ? @defaultProd : @defaultTest); $a++) {
                unless ($defaultProd[$a] eq $defaultTest[$a]) {
                    print " Column $a differs::";
                    print " PROD value $defaultProd[$a] : TEST value $defaultTest[$a] \n";
                }
            }
        }
    }
});
print "code took:", timestr($obj), "\n";
print "Parent Exiting\n";

        I hate to say this, but if, as you say, your job depends on your solution to this project, I seriously suggest that you seek help from a local mentor with access to the code, data and hardware.

        Everything about the code you've posted,

        • from the fact that you are timing sorts, and identical external commands that will take exactly the same time regardless of whether they are part of a forked or a threaded solution,
        • to the way you lay out your code,
        • to your use of C coding idioms rather than Perl idioms,

        suggests to me that you do not have the experience to tackle a project of this nature when your job is on the line as a result. I sincerely wish you the very best of luck, but you need more help than can reasonably be provided through a forum such as this.


Re: best strategy
by moritz (Cardinal) on Aug 25, 2008 at 07:37 UTC
    Please link to previous discussions on the same or similar subject. It's a question that's not easily answered, and providing only part of the information each time makes it harder to actually answer (for example, this time you didn't mention your operating system).