Re (tilly) 1: I have Wma file jammed in my regex

My guess is that the slowness comes from statting your wma files. Do you have more of those than of the other types? Are they grouped on the same server? In a particularly big directory? Is stat failing on a lot of them?

What follows is an untested but cleaned up version that works incrementally. Were I maintaining this long-term I might declare multiple passes through the log files (one for each type) to be a mistake and I would do one scan for all types at once. But if it is good enough, this is simple.

use strict;
#sifts all the server logs based on media type

my @servers=('XXXXX','YYYYYYY','ZZZZZZZ');
my $dir1='//workstation/share/directory/';
my @types= qw(mp3 avi mpg mpe wav mov rmj zip exe wma);

foreach my $type (@types){
  my $total=0;
  my $out = "$dir1/sifted/$type.txt";
  open (OUT, "> $out") or die "Cannot write to '$out':$!";
  foreach my $server (@servers){
    my $in = "$dir1/$server\.txt";
    unless(open IN,"< $in") {
      warn "  Cannot read from '$in': $!";
      next;
    }
    my $re = qr/\.$type\z/i; # I assume this is what you want?
    while (<IN>) {
      chomp;
      if (/$re/){
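        # Get filesize. stat returns an empty list on failure, so test
        # the raw size before dividing (undef/1024 would be a defined 0).
        my $size = (stat)[7];
        if (defined($size)) {
          my $kbytes = $size/1024;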
          $total += $kbytes;
          print OUT "$_\t$kbytes KB\n";
        }
        else {
          print OUT "$_\tNOT FOUND\n";
        }
      }
    }
  }
  my $mbytes = $total/1024;
  print OUT "\n\nTotal: $mbytes MB\n";
  print "Finished $type...\n";
}
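For what it is worth, here is a rough, untested-against-your-data sketch of the "one scan for all types at once" idea mentioned above. The names media_type, sift and $size_of are made up for illustration, and passing the size lookup in as a code ref is my own assumption so the logic can be tried without real files. In real use you would pass something like sub { (stat $_[0])[7] } and write each type's report to its own file.

```perl
use strict;
use warnings;

my @types = qw(mp3 avi mpg mpe wav mov rmj zip exe wma);

# One alternation covering every type, compiled once.
my $alt = join '|', map { quotemeta($_) } @types;
my $re  = qr/\.($alt)\z/i;

# Hypothetical helper: return the lower-cased type of a path,
# or undef if it does not end in one of the known extensions.
sub media_type {
  my ($path) = @_;
  return $path =~ $re ? lc $1 : undef;
}

# Hypothetical helper: make a single pass over the lines, keeping
# per-type totals (in KB) and per-type report lines.
sub sift {
  my ($lines, $size_of) = @_;
  my (%total, %report);
  for my $line (@$lines) {
    my $type = media_type($line);
    next unless defined $type;
    my $size = $size_of->($line);
    if (defined $size) {
      my $kbytes = $size / 1024;
      $total{$type} += $kbytes;
      push @{ $report{$type} }, "$line\t$kbytes KB";
    }
    else {
      push @{ $report{$type} }, "$line\tNOT FOUND";
    }
  }
  return (\%total, \%report);
}

# Made-up paths and sizes standing in for real log lines and stat.
my %fake = ('//srv/a/song.MP3' => 2048, '//srv/b/clip.wma' => 1024);
my ($total, $report) = sift(
  ['//srv/a/song.MP3', '//srv/b/clip.wma', '//srv/b/gone.wma', '//srv/c/notes.txt'],
  sub { $fake{ $_[0] } },
);
print "$_: $total->{$_} KB\n" for sort keys %$total;
```

The trade-off is that you need one hash entry (and eventually one output file) per type in play at once, in exchange for reading each server log only once.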

BTW some points.

You seem to have some misconceptions about what you are supposed to call close on, which have not been biting you because Perl has done a good job of figuring out when to call it itself.
You used an 8-space indent. I recommend less. In studies the most "aesthetically pleasing" indent was 6. However comprehension appears to be best in the 2-4 range. Consistency matters more here than what particular choice you make. I happen to use 2.
You obviously want failing to read a server file to be a graceful error. Even so you probably should be reporting it.
I used qr// for the RE. This avoids compiling it multiple times and is faster. Also, by saying that it can only match at the end of the string, the RE engine knows it can be smart and just jump to the end rather than scanning the whole string.
Working incrementally through log files is much more memory efficient than slurping them into memory.
There are 1024 (ie 2**10) bytes in a K, and 1024 K in a Meg.
Just adding strict on this caught several real mistakes. (Such as your writing to a different filename than would have been reported in your die.)
I prefer having explicit statements of when things were not found. That provides something you can grep for later.
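To make the qr// point concrete, here is a small sketch with made-up paths. The \Q...\E around $type is my own addition (it is harmless for plain extensions like these). Note that \z means the very end of the string, so a line that still has its newline attached will not match, which is why the chomp matters.

```perl
use strict;
use warnings;

my $type = 'wma';
my $re   = qr/\.\Q$type\E\z/i;   # compiled once, reused for every line

# Made-up sample lines, the last one deliberately not chomped.
my @lines = (
  '//srv/music/a.wma',
  '//srv/music/a.wma.bak',
  '//srv/music/B.WMA',
  "//srv/music/c.wma\n",
);

# Only paths that really end in .wma (any case) survive.
my @hits = grep { $_ =~ $re } @lines;
print "$_\n" for @hits;
```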

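On the close point, a minimal sketch of what I mean: close takes the filehandle you opened, not the filename. The lexical handle, the three-argument open and the temporary directory are just my choices for a self-contained example, not something taken from your script.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory so the example does not touch real data.
my $dir = tempdir(CLEANUP => 1);
my $out = "$dir/wma.txt";

open(my $fh, '>', $out) or die "Cannot write to '$out': $!";
print {$fh} "example line\tNOT FOUND\n";
close($fh) or die "Cannot close '$out': $!";   # the handle, not $out

# Read it back to confirm the data reached the file.
open($fh, '<', $out) or die "Cannot read from '$out': $!";
my @lines = <$fh>;
close($fh) or die "Cannot close '$out': $!";

print scalar(@lines), " line(s) written\n";
```

Checking the return value of close on a file you wrote to is worth the extra few characters, since that is often where a full disk or a dropped network share finally shows up.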
Replies are listed 'Best First'.
Ran out of OCD meds.. by PiEquals3 (Acolyte) on Jan 19, 2001 at 23:31 UTC
"There are 1024 (ie 2**10) bytes in a K, and 1024 K in a Meg."
Aren't there only 1000 K in a Megabyte? I thought the reason for making 1024B/K didn't apply in the K2M case..
-- `(nit) (nit) (nit) (nit) (nit) (nit) ^ | +--------------I pick this one!`
Re: Ran out of OCD meds.. by myocom (Deacon) on Jan 19, 2001 at 23:43 UTC
No, strictly speaking there are 1024 bytes per kilobyte, 1024 kilobytes per megabyte, 1024 megabytes per gigabyte, etc. The only people who regularly change these rules work in the marketing department of hard drive manufacturers.