Filter File Using HASH

pr19939 has asked for the wisdom of the Perl Monks concerning the following question:

# @fileArray, $count, $today and $yesterday are set earlier in the script
my $totlogfile     = "$today-TotalLogFile";
my $totlogfile1    = "$today-TotalLogFile1";
my $totlogfilebkup = "TotalLogFileBkup";

open(total, '>', $totlogfilebkup) || die("Could not open out file $totlogfilebkup: $!");
opendir(DIR, "logfiles") or die "Couldn't open logs: $!";
while ( defined( $filename = readdir(DIR) ) ) {
    # Only process files whose name contains yesterday's date
    next unless index($filename, $yesterday) > -1;

    $fileArray[$count] = $filename;
    $count = $count + 1;
    print "The log file name is $filename.\n";

    # readdir returns bare names, so the directory has to be prefixed
    open(logfile, '<', "logfiles/$filename") || die("Could not open file logfiles/$filename: $!");
    while ( $line = <logfile> ) {
        chomp($line);
        # Keep the line unless it is a request for a static asset or a monitoring servlet
        unless (   ( $line =~ /\.gif/i )
                || ( $line =~ /\.jpg/i )
                || ( $line =~ /\.jpeg/i )
                || ( $line =~ /\.js/i )
                || ( $line =~ /\.css/i )
                || ( $line =~ /tickerServlet/i )
                || ( $line =~ /nagios/i )
                || ( $line =~ /statusservlet/i ) )
        {
            print total "$line\n";
        }
    }
    close logfile;
}
closedir(DIR);
close total;

I have given only a part of the code, but I think this will do. Thanks

Re: Filter File Using HASH by cog (Parson) on Feb 01, 2005 at 12:06 UTC
The line you don't understand can be read as: "if there is a true value in the hash table %wanted for the key $server". That said, the while loop reads a line, extracts the server name from it and, if there is an entry for that server, prints the line.
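The loop cog describes might look roughly like the sketch below. The original code was removed from the question, so the input lines, the URL-matching regex and the `@kept` array are assumptions made for illustration; only `%wanted`, `$server` and the `members.aol.com` key come from the thread.

```perl
use strict;
use warnings;

# Lookup table of servers we care about; a true value marks a wanted key.
my %wanted = ( 'members.aol.com' => 1 );

# Illustrative input lines (not from the original post).
my @lines = (
    'http://members.aol.com/foo/index.html',
    'http://www.example.com/bar.html',
);

my @kept;
for my $line (@lines) {
    # Extract the server name; skip lines that do not look like a URL.
    next unless $line =~ m{^http://([^/]+)};
    my $server = $1;

    # "If there is a true value in %wanted for the key $server"
    push @kept, $line if $wanted{$server};
}

print "$_\n" for @kept;
```

Only the first line survives, because `www.example.com` has no entry in `%wanted`.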
Re: Filter File Using HASH by holli (Abbot) on Feb 01, 2005 at 12:08 UTC
A hash is a data structure, like an array. The difference is that the lookup is not done via a numerical index, but with a key. E.g., if you create a hash like this: `%hash = ( "keys" => "value" );` you will be able to refer to "value" by saying: `print $hash{"keys"};` In your case `print $wanted{"members.aol.com"};` will print "1". So if your regex matches, the matched part is used to make a lookup in your hash. If that returns true, the $line is printed. You should definitely read perldata.
Note: I am a bit surprised that you seem to understand a regex, but have problems with something very basic like hashes.
holli, regexed monk
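A small sketch of the lookups holli describes, using the two hashes from the reply. The variable names `$found` and `$missing` and the `example.org` key are made up for the example.

```perl
use strict;
use warnings;

# The hash from the reply: one key ("keys") mapped to one value ("value").
my %hash = ( "keys" => "value" );

my $found   = $hash{"keys"};     # lookup by key returns the stored value
my $missing = $hash{"value"};    # "value" is not a key, so this is undef

# The lookup table from the question, as far as the thread shows it.
my %wanted = ( "members.aol.com" => 1 );

print "keys => $found\n";
print "value is ", ( defined $missing ? "a key" : "not a key" ), "\n";
print "members.aol.com is wanted\n" if $wanted{"members.aol.com"};
print "example.org is not wanted\n" unless $wanted{"example.org"};
```

Looking up a key that does not exist simply yields undef, which is false, so a missing server is treated the same as an unwanted one.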
Re^2: Filter File Using HASH by pr19939 (Initiate) on Feb 01, 2005 at 13:02 UTC
Hi, thanks for your response. I will be precise about what I want to achieve. I have 10 log files of 35,000 lines each. I have looped through them, index-searched them for .gif and .jpg extensions, and written the results into a single log repository. This process takes 3 hours. I would like to know what the best approach would be. Time is my main concern. Also, the above-mentioned code did not work. Please help. Thanks
Re^3: Filter File Using HASH by holli (Abbot) on Feb 01, 2005 at 13:05 UTC
Can we have sample data, please? Several significant lines are enough. holli, regexed monk
Re: Filter File Using HASH by holli (Abbot) on Feb 01, 2005 at 13:52 UTC
"I have given only a part of the code, but I think this will do. Thanks"
No, it will not. We have no clue what your logfile looks like, nor what output you want to achieve. Especially since you deleted the original part of your question. holli, regexed monk
Re^2: Filter File Using HASH by pr19939 (Initiate) on Feb 01, 2005 at 14:01 UTC
# @fileArray, $count, $today and $yesterday are set earlier in the script
my $totlogfile     = "$today-TotalLogFile";
my $totlogfile1    = "$today-TotalLogFile1";
my $totlogfilebkup = "TotalLogFileBkup";

open(total, '>', $totlogfilebkup) || die("Could not open out file $totlogfilebkup: $!");
opendir(DIR, "logfiles") or die "Couldn't open logs: $!";
while ( defined( $filename = readdir(DIR) ) ) {
    # Only process files whose name contains yesterday's date
    next unless index($filename, $yesterday) > -1;

    $fileArray[$count] = $filename;
    $count = $count + 1;
    print "The log file name is $filename.\n";

    # readdir returns bare names, so the directory has to be prefixed
    open(logfile, '<', "logfiles/$filename") || die("Could not open file logfiles/$filename: $!");
    while ( $line = <logfile> ) {
        chomp($line);
        # Keep the line unless it is a request for a static asset or a monitoring servlet
        unless (   ( $line =~ /\.gif/i )
                || ( $line =~ /\.jpg/i )
                || ( $line =~ /\.jpeg/i )
                || ( $line =~ /\.js/i )
                || ( $line =~ /\.css/i )
                || ( $line =~ /tickerServlet/i )
                || ( $line =~ /nagios/i )
                || ( $line =~ /statusservlet/i ) )
        {
            print total "$line\n";
        }
    }
    close logfile;
}
closedir(DIR);
close total;

My file will have lines as follows:

3.77.65.36 - - 16/Jan/2005:00:01:08 -0500 "GET /images/spacer.gif HTTP/1.0" 200 43 0 "-" "Mozilla/3.01 (compatible;)" "-"
3.45.14.157 - - 16/Jan/2005:00:02:22 -0500 "HEAD /portal/site/energy/ HTTP/1.1" 200 - 0 "-" "libwww-perl/5.11" "-"

Lines similar to the above. I want line 2 but not line 1, because line 1 contains gif. Thanks
Re^3: Filter File Using HASH by holli (Abbot) on Feb 01, 2005 at 14:11 UTC
while ( $line = <logfile> ) {
    unless ( $line =~ /(\.gif|\.jpg|\.jpeg|\.js|\.css|tickerServlet|nagios|statusservlet)/i ) {
        print total $line;
    }
}

holli, regexed monk
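For anyone who wants to try the single-regex approach without the log directory, here is a self-contained sketch run against the two sample lines posted above. The names `$skip`, `@lines` and `@kept` are made up for the example, and the pattern is compiled once with `qr//` and keeps the `/i` flag from the original chained tests; treat it as one possible arrangement, not a tuned solution.

```perl
use strict;
use warnings;

# One alternation instead of eight separate matches. Like the original
# /\.js/i test, "\.js" here also matches ".jsp"; tighten it if that matters.
my $skip = qr/\.(?:gif|jpe?g|js|css)|tickerServlet|nagios|statusservlet/i;

# The two sample lines from the thread.
my @lines = (
    '3.77.65.36 - - 16/Jan/2005:00:01:08 -0500 "GET /images/spacer.gif HTTP/1.0" 200 43 0 "-" "Mozilla/3.01 (compatible;)" "-"',
    '3.45.14.157 - - 16/Jan/2005:00:02:22 -0500 "HEAD /portal/site/energy/ HTTP/1.1" 200 - 0 "-" "libwww-perl/5.11" "-"',
);

# Keep every line that does not match the exclusion pattern.
my @kept = grep { $_ !~ $skip } @lines;

print "$_\n" for @kept;
```

Only the HEAD request for /portal/site/energy/ is printed; the spacer.gif request is dropped.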
Re^4: Filter File Using HASH by pr19939 (Initiate) on Feb 01, 2005 at 14:29 UTC
Re^5: Filter File Using HASH by holli (Abbot) on Feb 01, 2005 at 15:46 UTC
Re: Filter File Using HASH by pr19939 (Initiate) on Feb 01, 2005 at 13:19 UTC
Hi, thanks for your response. I will be precise about what I want to achieve. I have 10 log files of 35,000 lines each. I have looped through them, index-searched them for .gif and .jpg extensions, and written the results into a single log repository. This process takes 3 hours. I would like to know what the best approach would be. Time is my main concern. Also, the above-mentioned code did not work. Please help. Thanks
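Putting the pieces of the thread together, a single pass over the matching files could be sketched as below. `filter_logs` is a hypothetical helper, not something from the posted script, and its three arguments (directory, date tag to look for in the file name, output path) are assumptions. It makes no claim about where the three hours go; it only shows each line being read once and tested against one precompiled pattern, with the directory prefixed to each name because `readdir` returns bare file names.

```perl
use strict;
use warnings;

# Exclusion pattern, compiled once and reused for every line.
my $skip = qr/\.(?:gif|jpe?g|js|css)|tickerServlet|nagios|statusservlet/i;

# Hypothetical helper: copy every non-matching line from the files in $dir
# whose name contains $tag into $out_path. Returns the number of lines kept.
sub filter_logs {
    my ( $dir, $tag, $out_path ) = @_;

    opendir( my $dh, $dir ) or die "Could not open directory $dir: $!";
    my @files = sort grep { index( $_, $tag ) > -1 } readdir($dh);
    closedir($dh);

    open( my $out, '>', $out_path ) or die "Could not open $out_path: $!";
    my $kept = 0;
    for my $name (@files) {
        my $path = "$dir/$name";    # readdir gives the name without the directory
        next unless -f $path;
        open( my $in, '<', $path ) or die "Could not open $path: $!";
        while ( my $line = <$in> ) {
            next if $line =~ $skip;
            print {$out} $line;
            $kept++;
        }
        close($in);
    }
    close($out) or die "Could not close $out_path: $!";
    return $kept;
}

# Runs only when three arguments are given: directory, date tag, output file.
if ( @ARGV == 3 ) {
    my $n = filter_logs(@ARGV);
    print "kept $n lines\n";
}
```

The date tag would be whatever `$yesterday` holds in the posted script, with `logfiles` as the directory and `TotalLogFileBkup` as the output file.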