Whoa gnarly dude! That is totally way faster!
Here are the details on what I am doing, in case anyone spots something totally stupid on my part ...
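#!/usr/local/bin/perl -w
use strict;

my %seen;    # normalized line => file it was first seen in (or the latest dup message)
my @logfiles = glob("*access_log");

foreach my $logfile (@logfiles) {
    # Record progress so a long run can be monitored.
    open( my $log, '>>', 'progress_safe' ) or die "Cannot append to progress_safe: $!";
    print $log "Doing $logfile\n";
    close($log);
    process_file($logfile);
}

sub process_file {
    my $fn = shift;
    open( my $fh, '<', $fn ) or die "Cannot read $fn: $!";
    while (<$fh>) {
        chomp;
        s/\W/_/g;    # replace every non-word character with an underscore
        my $target = substr( $_, 0, 200 );    # key on the first 200 characters
        if ( $seen{$target} && ( $fn ne $seen{$target} ) ) {
            $seen{$target} = "$seen{$target} and $fn both have :$target\n";
            open( my $dup, '>>', 'dups_found_safe' ) or die "Cannot append to dups_found_safe: $!";
            print $dup $seen{$target};
            close($dup);
        }
        else {
            $seen{$target} = $fn;
        }
    }
    close($fh);
}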
... thanks Joost and Monks!