
It shouldn't take 15 minutes to compare lists of filenames. What exactly are you trying to do?

Re^4: ignore list of files using readdir function
by kaka_2 (Sexton) on Jul 23, 2013 at 08:50 UTC

    You are right, it takes almost 15 minutes, which is too much.

    Below is the complete code.

        #!/usr/bin/perl
        use strict;
        use Math::BigFloat;

        Math::BigFloat->precision(0);

        sub GetINDirFiles {
            my ($path) = @_;
            opendir DIR, $path or die $!;
            my @files = grep {!/\_ACK_/} readdir DIR;
            closedir DIR;
            return(@files);
        }

        sub GetOUTDirFiles {
            my ($path) = @_;
            opendir DIR, $path or die $!;
            my @files = grep {/\_ACK.xml$/} readdir DIR;
            closedir DIR;
            return(@files);
        }

        # Main
        my $inpath        = "/AAA/BBB/CCC/IN";
        my $outpath       = "/AAA/BBB/CCC/OUT";
        my $outsuffix     = "_ACK.xml";
        my $insuffix      = ".xml";    # Added by me
        my $timethreshold = 900;       # set time threshold in seconds (900 seconds equal 15 minutes)
        my @delindex;

        my @infiles  = &GetINDirFiles($inpath);
        my @outfiles = &GetOUTDirFiles($outpath);

        my $index = 0;    # index used to get string position in array
        foreach my $infile (@infiles) {
            $infile =~ s/(.*)$insuffix/$1/g;    # remove suffix to do comparison # Added by me
            foreach my $outfile (@outfiles) {
                $outfile =~ s/(.*)$outsuffix/$1/g;    # remove suffix to do comparison
                if ($outfile eq $infile) {
                    push (@delindex, $index);    # get list of strings to be removed from array
                }
            }
            $index += 1;
        }
        delete @infiles[@delindex];    # remove strings

        my $currenttime = time;    # get current time from system (epoch time)
        foreach my $file (@infiles) {
            next unless (-f "$inpath/$file$insuffix");    # ignore directories # INSERT SUFFIX AGAIN ($insuffix)
            my $mtime = (stat "$inpath/$file$insuffix")[9];    # get mtime from file (epoch time) # INSERT SUFFIX AGAIN ($insuffix)
            my $diff = ($currenttime - $mtime);
            if ($diff > $timethreshold) {
                print "\n - file " . $file . $insuffix . " in " . $inpath
                    . " directory was created at more than "
                    . Math::BigFloat->new($diff / 60) . " minutes.";    # INSERT SUFFIX AGAIN ($insuffix)
                # PUT THE ACTION THAT YOU WANT DO HERE!!!
            }
        }

    I need to run it at a regular interval, such as every 5 or 15 minutes, from a tool I use. So it really does not make sense to check files that I have already checked. I would not mind if this completed in a minute or less, but 15 minutes is really too much.

    Kindly assist.

    -KAKA-

      Your code can be optimized a lot. Here is a proposal, though I cannot test it because I do not have your directories at hand. I added comments to explain what I am doing, so I hope it helps:

          my @infiles = &GetINDirFiles($inpath);

          # instead of having an array with the outfiles use a hash for faster lookup
          # also remove suffix at this stage already, no need to do it again and again in the loop
          # you need to escape your suffix variable in \Q...\E for special characters such as the dot
          # only remove the suffix at the end, no need for (.*)
          my %outfiles = map { s/\Q$outsuffix\E$//; $_ => 1 } &GetOUTDirFiles($outpath);

          my $index = 0;    # index used to get string position in array
          foreach my $infile (@infiles) {
              # see above re the replacement
              $infile =~ s/\Q$insuffix\E$//;    # remove suffix to do comparison # Added by me

              # instead of looping through the array of outfiles do a hash lookup
              push (@delindex, $index) if exists $outfiles{$infile};
              $index += 1;
          }

      UPDATE: Forget my code above. You can write this as:

          my %outfiles = map { /(.*)\Q$outsuffix\E$/; $1 => 1 } &GetOUTDirFiles($outpath);
          my @infiles  = grep { /(.*)\Q$insuffix\E$/; not exists $outfiles{$1} } &GetINDirFiles($inpath);
          print "@infiles\n";

      and it should be fast.
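To see the hash-lookup idea in isolation, here is a minimal sketch that works on in-memory lists instead of directories. The helper name `unacked_files` and the sample file names are made up for illustration; the suffixes are the ones from the original post. It does the same thing as the two-line version above: build the lookup table once, then do a single hash lookup per IN file instead of scanning every OUT file.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the IN names that have
# no matching ACK name in the OUT list.
sub unacked_files {
    my ($in, $out, $insuffix, $outsuffix) = @_;

    # Build the lookup table once: base name => 1 for every ACK file.
    my %acked;
    for my $name (@$out) {
        $acked{$1} = 1 if $name =~ /\A(.*)\Q$outsuffix\E\z/;
    }

    # One hash lookup per IN file; names without the IN suffix are skipped.
    return grep {
        /\A(.*)\Q$insuffix\E\z/ and not exists $acked{$1}
    } @$in;
}

# Sample data, assumed for the demonstration only.
my @pending = unacked_files(
    [ 'a01.xml', 'a02.xml', 'a03.xml' ],
    [ 'a01_ACK.xml', 'a03_ACK.xml' ],
    '.xml', '_ACK.xml',
);
print "@pending\n";    # prints "a02.xml"
```

With n IN files and m OUT files this does roughly n + m steps, whereas the nested loops in the original script do n * m regex substitutions, which is where the time goes once the directories get large.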

      UPDATE 2: Here is the full story.

          use strict;
          use warnings;

          sub GetINDirFiles {
              my ($path) = @_;
              opendir my $dir, $path or die $!;
              return grep {!/\_ACK_/} readdir $dir;
          }

          sub GetOUTDirFiles {
              my ($path) = @_;
              opendir my $dir, $path or die $!;
              return grep {/\_ACK.xml$/} readdir $dir;
          }

          # Main
          my $inpath        = "./IN";
          my $outpath       = "./OUT";
          my $outsuffix     = "_ACK.xml";
          my $insuffix      = ".xml";
          my $timethreshold = 900;    # set time threshold in seconds (900 seconds equal 15 minutes)

          my %outfiles = map { /(.*)\Q$outsuffix\E$/; $1 => 1 } &GetOUTDirFiles($outpath);

          # test the match itself so a failed match cannot reuse a stale $1
          my @infiles = grep { /(.*)\Q$insuffix\E$/ and not exists $outfiles{$1} } &GetINDirFiles($inpath);

          my $currenttime = time;    # get current time from system (epoch time)
          @infiles = grep {
              -f "$inpath/$_"
                  and ( $currenttime - (stat "$inpath/$_")[9] ) > $timethreshold
          } @infiles;

          # now you have all input files w/o corresponding output file that are older than 15 minutes
          for (@infiles) {
              print "File $_ in $inpath directory was created "
                  . ( $currenttime - (stat "$inpath/$_")[9] ) / 60.0
                  . " minutes ago.\n";
              # put your action here
          }

      Do you want to compare the names of files or the contents of files? Could you give an example of what data you're trying to compare, and what output you expect to get?

        My requirement is to check, whenever a new file arrives in the IN folder, whether the matching file with the _ACK.xml suffix is present in the OUT directory within a maximum delay of 15 minutes.

        For example, a01.xml arrives in the IN folder; the application processes it and writes the result to the OUT folder as a01_ACK.xml (maximum processing time is 15 minutes).

        Content is not important in this case. On Windows I can use WMI to check whether a new file is created in the IN directory (instance created) and then check for the same in the OUT folder, but on UNIX I cannot get such a trigger, so I had to choose the way of comparing file names. I am not much into UNIX, so I can't think of anything other than this.

        -KAKA-
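The requirement as described (an IN file such as a01.xml must have a matching a01_ACK.xml in OUT within 15 minutes) could be packaged as one routine along the following lines. This is only a sketch under assumptions: the helper names `list_files` and `overdue_files` are invented here, the suffixes are hard-coded from the example above, and the file's mtime is used as a stand-in for its arrival time. Each file is stat'ed only once, and the demonstration at the end builds throwaway directories with File::Temp so it can run anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: plain files in $dir whose names match $re.
sub list_files {
    my ($dir, $re) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = grep { $_ =~ $re && -f "$dir/$_" } readdir $dh;
    closedir $dh;
    return @names;
}

# Hypothetical helper: returns (file name => age in seconds) for every IN
# file older than $threshold that has no <base>_ACK.xml in $outpath.
sub overdue_files {
    my ($inpath, $outpath, $threshold, $now) = @_;
    $now = time unless defined $now;

    my %acked;
    for my $name (list_files($outpath, qr/_ACK\.xml\z/)) {
        $acked{$1} = 1 if $name =~ /\A(.*)_ACK\.xml\z/;
    }

    my %overdue;
    for my $file (list_files($inpath, qr/\.xml\z/)) {
        my ($base) = $file =~ /\A(.*)\.xml\z/;
        next if exists $acked{$base};
        my $mtime = (stat "$inpath/$file")[9];    # stat once per file
        next unless defined $mtime;               # file vanished meanwhile
        my $age = $now - $mtime;
        $overdue{$file} = $age if $age > $threshold;
    }
    return %overdue;
}

# Demonstration with temporary directories (assumed sample data).
my $in  = tempdir(CLEANUP => 1);
my $out = tempdir(CLEANUP => 1);
for my $path ("$in/a01.xml", "$in/a02.xml", "$in/a03.xml", "$out/a01_ACK.xml") {
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh;
}
my $now = time;
utime $now - 1200, $now - 1200, "$in/a01.xml", "$in/a02.xml";    # 20 minutes old

# a01 is old but acknowledged, a03 is new, so only a02.xml is overdue.
my %late = overdue_files($in, $out, 900, $now);
printf "%s: no ACK after %d minutes\n", $_, $late{$_} / 60 for sort keys %late;
```

For real use, the two temporary directories would be replaced by the actual IN and OUT paths, and the routine could be called from cron or the monitoring tool at whatever interval is needed.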