xdbd063 has asked for the wisdom of the Perl Monks concerning the following question:

With the help of the Perl Monks' expertise, I've been able to finish a project I started back in February. In a nutshell, it changes the permissions on several hundred files in different directories, does a pattern search for two lines separated by text that is discarded, puts the two lines in a hash, then searches the files, one at a time in a loop, for a match to the Key. When a match is found, the Value is linked in, using s///.

I am fairly confident that it works; however, every time I try to test it, it runs for several hours and then I get the error message "Exceeded CPU Limit". I get no error messages from the Perl code; the CPU error is from the system.

I know I have asked a lot, but if someone could look at what I've done and see if there is a more efficient way to do any part of the project, I would appreciate it. It takes hours just to change the permissions on the files, and I don't know why.

use strict;
use warnings;
use diagnostics;
use File::Copy;
use FileHandle;
use IO::File;
#-----------------------------------------------------------------------
# Variables, Arrays and Hash declarations:
my ($manualdir_param) = @ARGV;
my $working_dir = $manualdir_param;
$working_dir =~ s/manualdir=//i;
my $data_area = "/tmp";
my $html_dir  = "$data_area/$working_dir";
my @FigureArray;
my @endingArray;
my $FigFile;
my @FileArray;
my @endArray;
my $DirFile;
my %images;
my $line;
my $img;
my $fig;
#-----------------------------------------------------------------------
# Loop to locate HTML files, change permissions, and make working temporary copies
opendir( HTMLSTORIES, "$html_dir" ) || die "HTML dirs do not exist: $1";
@FigureArray = grep { /^(09)(\w{1,5})(00)$/ } readdir(HTMLSTORIES);
foreach $FigFile (@FigureArray) {
    opendir( HTMSTORY, "$html_dir/$FigFile" ) || die "Files do not exist: $1";
    @FileArray = grep { /a.htm$/ } readdir(HTMSTORY);
    foreach $DirFile (@FileArray) {
#        copy ("$html_dir/$FigFile/$DirFile", "$html_dir/$FigFile/$DirFile.bak") or die "Can not make backup copy of file: $1";
        chmod 0600, "$html_dir/$FigFile/$DirFile";
        foreach $DirFile (@FileArray) {
            open( PATTERN, "<", "$html_dir/$FigFile/$DirFile" )
                or die "Unable to open pattern file: $1";
            while (<PATTERN>) {
                chomp;
                if ( $line = /(\<IMG\sSRC\=.*?\>)/ ... /(Figure\s*\d+)/ ) {
                    $img = $1;
                    $fig = $3;
                }
                if ( $img && $fig )    # if we find both IMG AND Figure lines, add them to images hash.
                {
                    $images{$fig} = $img;
                }
                foreach ( sort keys %images ) {
#                    print "$_: $images{$_}\n";
                }
                # find matching Figure lines in text file and do linking.
                while ( ( $fig, $img ) = each(%images) ) {
                    grep { $fig } "$html_dir/$FigFile/$DirFile";
                    s/$fig/"<a href = $img>$fig<\/a>"/g;
                    %images = ();
                }
            }
            close(PATTERN);
            closedir HTMSTORY;
            closedir HTMLSTORIES;
            # Loop to do cleanup
            opendir( ENDSTORIES, "$html_dir" ) || die "HTML dirs do not exist: $1";
            @endingArray = grep { /^(09)(\w{1,5})(00)$/ } readdir(ENDSTORIES);
            foreach $FigFile (@endingArray) {
                opendir( ENDSTORY, "$html_dir/$FigFile" ) || die "Files do not exist: $1";
                @endArray = grep { /a.htm.bak$/ } readdir(ENDSTORY);
                foreach $DirFile (@endArray) {
#                    unlink ("$html_dir/$FigFile/$DirFile.bak") or die "Can not remove backup copy of file: $1";
                    chmod 0644, "$html_dir/$FigFile/$DirFile";
                }
            }
            close ENDSTORY;
            close PATTERN;
            closedir HTMSTORY;
            closedir HTMLSTORIES;
        }
    }
}


Re: Exceeding CPU Limit
by jasonk (Parson) on Mar 24, 2006 at 15:35 UTC

    If you cleaned up your indenting a little to make this more readable, you would find some interesting things going on that you probably don't expect. For example, this chunk of code:

    while (( $fig, $img) = each(%images)) {
        grep{$fig} "$html_dir/$FigFile/$DirFile";
        s/$fig/"<a href = $img>$fig<\/a>"/g;
        %images = ();
    }
    is getting run once for every line in every file you analyze. I suspect you only wanted to run it once per file, rather than once per line, which is probably causing it to take a great deal more time than you would have expected.
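
    A rough, untested sketch of that restructuring, reusing the directory variables from your post (the .new output file and capturing only the SRC value are my assumptions, not necessarily what you want): build the hash while reading the file once, then do the linking in one extra pass per file rather than per line.

    # First pass: remember which image goes with which "Figure n" caption.
    open my $in, '<', "$html_dir/$FigFile/$DirFile"
        or die "Can't open $html_dir/$FigFile/$DirFile: $!";
    my ( %images, $img );
    while ( my $line = <$in> ) {
        $img = $1 if $line =~ /<IMG\s+SRC\s*=\s*"?([^">\s]+)"?/i;
        $images{$1} = $img if defined $img and $line =~ /(Figure\s*\d+)/;
    }
    close $in;

    # Second pass: add the links, once per file instead of once per input line.
    open $in,     '<', "$html_dir/$FigFile/$DirFile"     or die "Can't reopen: $!";
    open my $out, '>', "$html_dir/$FigFile/$DirFile.new" or die "Can't write: $!";
    while ( my $line = <$in> ) {
        for my $fig ( keys %images ) {
            $line =~ s/\Q$fig\E/<a href="$images{$fig}">$fig<\/a>/g;
        }
        print {$out} $line;
    }
    close $in;
    close $out;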


    We're not surrounded, we're in a target-rich environment!
Re: Exceeding CPU Limit
by lima1 (Curate) on Mar 24, 2006 at 15:35 UTC
    It is really hard to read your code: way too many global variables, duplicated code, bad formatting, and it doesn't even compile here.

    Can you refactor your code a little bit? Maybe this function helps you.

    use strict;
    use warnings;
    use English qw( -no_match_vars );
    use Data::Dumper;

    sub get_files_in_directory {
        my ( $dir, $filter ) = @_;
        opendir my $DIR, $dir or die "Can't open $dir: $OS_ERROR";
        my @files = grep { /$filter/ } readdir($DIR);
        closedir $DIR;
        return @files;
    }

    # print every file in /tmp starting with 'N'
    print Dumper( get_files_in_directory( '/tmp', qr/^N/ ) );
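
    As a sketch (assuming the same directory layout and filename patterns as in your script), the nested opendir/readdir/grep blocks could then shrink to something like:

    my @fig_dirs = get_files_in_directory( $html_dir, qr/^09\w{1,5}00$/ );
    for my $fig_dir (@fig_dirs) {
        my @html_files = get_files_in_directory( "$html_dir/$fig_dir", qr/a\.htm$/ );
        for my $file (@html_files) {
            chmod 0600, "$html_dir/$fig_dir/$file"
                or warn "chmod failed on $html_dir/$fig_dir/$file: $OS_ERROR";
            # ... process the file here ...
        }
    }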
    perltidy is a great tool if you are too lazy to format your code nicely.
      Or you can also use File::Find::Rule or similar:
      my @files = File::Find::Rule->file()->name( qr/^N/ )->maxdepth(1)->in( '/tmp/' );
      Note that this will return full paths.
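      As a sketch (the 0600 mode and $html_dir are taken from the original script, not part of File::Find::Rule itself), those full paths can be used directly:

      use File::Find::Rule;
      my @html_files = File::Find::Rule->file()->name( qr/a\.htm$/ )->in( $html_dir );
      chmod 0600, $_ for @html_files;    # full paths, so nothing needs to be prepended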
Re: Exceeding CPU Limit
by davido (Cardinal) on Mar 24, 2006 at 17:09 UTC

    If I'm not mistaken, CPU quotas are created by system administrators so that no single user account can consume more than a certain number of hours of CPU time. Usually the limits are set high enough that an interactive session will never bump into them, but low enough that a rogue script will be terminated before it consumes so much CPU time that all other users suffer.

    It appears that the script you're running is consuming more CPU time than your system administrator deems prudent for the smooth running of the system, and your process is being sent a termination signal along with a message explaining that you've exceeded your CPU limit (your quota).
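
    If you want to see what that limit actually is from inside Perl, and if the CPAN module BSD::Resource happens to be installed, a minimal sketch looks like this (it only reads the limit; whether you may raise it is up to your administrator):

    use BSD::Resource;

    # getrlimit returns the soft and hard limits, in seconds of CPU time.
    my ( $soft, $hard ) = getrlimit(RLIMIT_CPU);
    print $soft == RLIM_INFINITY
        ? "No CPU-time limit set\n"
        : "CPU-time limit: $soft seconds (hard limit: $hard)\n";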

    Your first course of action is to re-examine your algorithms to see if there is any way you can reduce nested loops and other time-expensive code.


    Dave

Re: Exceeding CPU Limit
by bowei_99 (Friar) on Mar 24, 2006 at 23:15 UTC
    Adding subroutines like the one lima1 suggests would help isolate the problem. You could test each subroutine independently on a relatively small subset of files, so if it's slow, you can at least isolate the slowness to a certain chunk of code. Right now, you have too much to look at at once. I try to keep my subroutines to a screenful (about 35 lines for me).

    Avoiding global variables, i.e. declaring a variable within a subroutine with my, also helps keep the problem isolated. You only pass in and out of the subroutine what you have to. For instance, you could move your while statement to a subroutine, like

    while (<PATTERN>) {
        # do stuff here
    }
    You could pass in the filehandle, saved as a lexical variable instead of the bareword PATTERN, and make it more readable. To do that, you would replace your current statement:

    open (PATTERN, "<", "$html_dir/$FigFile/$DirFile") or die "Unable to open pattern file: $1";
    with:
    open my $PATTERN, "<", "$html_dir/$FigFile/$DirFile" or die "Unable to open pattern file: $!";
    and later, pass the open filehandle into a subroutine, like:
    DoSomethingWithFile($PATTERN);
    where
    sub DoSomethingWithFile {
        my $FH = shift @_;
        while (<$FH>) {
            # do stuff here
        }
    }
    (You might notice that in my open statement above, I replaced $1 with $!, which I think is what you intended. $1 is the variable that holds the first capture from a regular expression, while $! holds the error from a failed system or library call. As it says in perlvar, a mnemonic for $! is 'What just went bang?'.)

    One thing that would really help with the formatting is lining up each opening curly brace with its matching closing brace (or indenting them consistently). For instance, the closing curly braces at the end of your code all sitting in the same horizontal position make it harder to read and to figure out the flow of the program.

    -- Burvil