in reply to Using FastCGI
FastCGI can help a lot, but where it shines is when you have many requests over a short period of time (i.e., high traffic). Every time a CGI script (done the old-fashioned way) is invoked, the Perl interpreter fires up, loads all the modules, and runs your script. For a trivial script, that startup time can outweigh the actual work. FastCGI improves this situation dramatically (as do mod_perl and webserver API integration).
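To make that concrete, here's a minimal sketch of the FastCGI lifecycle (the expensive_setup() sub is hypothetical): everything above the accept loop runs once per process, and only the loop body runs per request.

    use strict;
    use warnings;
    use FCGI;

    # Module loading, compilation, and heavy setup happen once per
    # process under FastCGI, not once per request.
    my $data = expensive_setup();    # hypothetical; paid at startup only

    while ( FCGI::accept() >= 0 ) {  # blocks until the next request
        print "Content-Type: text/plain\r\n\r\n";
        print "Served using preloaded data: $data\n";
    }

    sub expensive_setup { return 'ready' }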
But there's only so much improvement you can get there. Next you have to start looking at the algorithms. Anywhere you find yourself writing nested loops, or making multiple sequential passes over the same data set, ask whether there's a better way to do it. Profiling is the first step toward improving code that's already written, but even before that first step comes planning and composing efficient code.
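For instance, here's an untested sketch of one common fix: trading a nested scan for a hash lookup, so O(n*m) work becomes two O(n)-ish passes.

    use strict;
    use warnings;

    my @verbforms = qw(move moves moved moving);
    my @words     = qw(he moved the pieces);

    # Build the lookup table once (O(m)) instead of rescanning
    # @verbforms for every word (O(n*m)).
    my %is_verbform = map { $_ => 1 } @verbforms;

    for my $word (@words) {    # a single pass over the data
        print "match: $word\n" if $is_verbform{$word};
    }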
I found a few areas where you could eliminate sequential loops, but I would need to know what goes on in the regexp-comparison loop to see whether there's room for further efficiency gains. And without profiling, it's very difficult to know where to focus attention.
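If you do profile, Devel::NYTProf (from CPAN) is a good choice; assuming it's installed, a typical run looks something like this (the script name is hypothetical):

    perl -d:NYTProf search.pl    # runs the script, writes nytprof.out
    nytprofhtml                  # renders nytprof.out as an HTML report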
    #!/usr/bin/perl -wT
    # What is the T for in -wT?
    use strict;
    use CGI qw(:standard);
    use FCGI;
    use File::Find;

    require '/Users/jon/Desktop/stanford-postagger-full-2011-04-20/verbTenseChanger.pl';

    my $search_key = "move";

    # --- Different forms of the Searchword --- #
    # I made a refinement here that eliminated a step: returning an
    # empty list from map's block drops failed conversions on the spot.
    my @verbforms = (
        $search_key,
        map { changeVerbForm( $search_key, 0, $_ ) || () } 1 .. 4,
    );

    my $category_id = 'subj';

    # --- Variables for required info from parser --- #
    my ( $chapternumber, $sentencenumber, $sentence,
         $grammar_relation, $argument1, $argument2 );

    my @all_matches;    ## RESULTS OF SEARCH

    my $dir = '/Users/jon/Desktop/stanford-postagger-full-2011-04-20/';
    opendir( my $dh, $dir ) or die $!;    # Use a lexical directory handle.

    # I made a change here where the first grep handles both tests.
    # That eliminates one loop. Note: -f needs the full path, since
    # readdir returns bare filenames.
    my @files = map  { "$dir/$_" }
                grep { /^parsed.*\.txt$/ && -f "$dir/$_" }
                readdir($dh);

    while ( FCGI::accept() >= 0 ) {
        # Why slurp/split when we can read by 'Parsing' record?
        # Setting $/ makes <$fh> return one 'Parsing'-delimited
        # chunk per read.
        local $/ = 'Parsing';

        print header();
        print start_html();

        foreach my $file (@files) {
            open my $parse_corpus_fh, '<', $file or die $!;

            # Eliminate slurp, join, split (3 implicit loops),
            # replaced by one 'while' loop.
            while ( my $sentblock = <$parse_corpus_fh> ) {
                chomp $sentblock;    # strips the trailing 'Parsing'
                if ( $sentblock =~ /file: \s(\S+)\.txt/ ) {
                    $chapternumber = $1;
                }
                foreach my $verbform (@verbforms) {
                    # blah blah
                    # I don't know what you put here.
                    # Here is an opportunity to print per verbform.
                }
                # You may have had stuff here too.
                # Here is an opportunity to print per record.
            }
            # Here is an opportunity to print per file.
        }
        # Here is an opportunity to print per FCGI iteration.
        print "</ol><br>";
        print end_html();
    }
This is untested since I don't know what to fill into the blanks, but it eliminates a few loops, both sequential and nested; see the comments for where they were refined. It's hard to know what impact it will have without knowing the size of the data sets, how many files are handled, what happens inside the regexp-matching loop, and so on. But it could be a start in the right direction.
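One refinement in the @verbforms construction deserves a word: returning an empty list from map's block drops that element entirely, so a single map both transforms and filters. A standalone example:

    use strict;
    use warnings;

    # Elements for which the block yields () simply vanish from the
    # result, so this map does the work of a map plus a grep.
    my @squares = map { defined $_ ? $_ * $_ : () } ( 2, undef, 3 );
    print "@squares\n";    # prints "4 9"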
By the way, the perl -T switch enables Taint Mode, which is described briefly in perlrun.
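In case Taint Mode is unfamiliar, here's a minimal hypothetical sketch of what it buys you: external input can't reach the filesystem or a shell until a regex capture you vouch for untaints it.

    #!/usr/bin/perl -T
    use strict;
    use warnings;

    # Under -T, data from outside the program (%ENV, CGI params, file
    # contents) is tainted; using it in open(), system(), etc. is fatal.
    my $input = defined $ENV{REMOTE_USER} ? $ENV{REMOTE_USER} : '';

    # Untainting: a successful capture through a regex you trust.
    my ($safe) = $input =~ /\A(\w+)\z/
        or die "Bad username";

    open my $log, '>>', "/tmp/$safe.log" or die $!;    # now permitted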
Dave
Replies are listed 'Best First'.

Re^2: Using FastCGI
  by jonc (Beadle) on Jun 15, 2011 at 04:40 UTC
  by davido (Cardinal) on Jun 15, 2011 at 04:53 UTC
  by jonc (Beadle) on Jun 15, 2011 at 05:19 UTC
  by davido (Cardinal) on Jun 15, 2011 at 05:30 UTC