Sorry about the last 2 posts... somehow wasn't logged in.
I don't want to remove other line breaks, only when the regex matches. EX:
<meta name="revision" content=
"Mon, 05 Jul 2006 23:59:59 GMT">
I need to get rid of that line break. my output:
changing to C:/
*Processing: C:/working
file type not processed
*Processing: C:/working/CIPP_en.html
matched: <meta name="revision" content=
*Processing: C:/working/Copy of CIPP_en.html
matched: <meta name="revision" content=
*Processing: C:/working/Copy of index_en.html
*Processing: C:/working/index_en.html
log file updated. 4 files - 2 matches
aaah, i know why. It's chomping it in memory... i have to overwrite the file... ugh
So, if i'm going line by line, chomp it. Now I need to rewrite the file right? How can i do that? I need the entire file contents to do that. Am I supposed to slurp and read line x line or just do a switch?
---- disregard, i get it now. I have to read it into an array and then write the array
YEEEHAAA, GOT IT.... any suggestions on tightening it up?
#!C:\Perl\bin\perl.exe
use strict;
use File::Find;
use File::Slurp;
use Time::Local;
print "\nRunning ... \n\n";
my $root = "C:/"; # use forward slash, you can use mapped drives.
print "changing to $root\n";
chdir $root;
my $no_switches=0;
my $no_files=0;
my @log = ();
my $dir;
# find (\&Wanted, "department", "managers", "mybranch", "mycity", "myinfo", "resources");
find (\&Wanted, "working"); # directories - comma-delimited
sub Wanted {
print "*Processing: $root$File::Find::name \n";
if ($_ =~ /\.htm(l)?$/i) {
open(xFILE, $_) or die "ERROR: couldn't open file";
my @file = <xFILE>;
foreach my $line (@file) {
close (xFILE);
if ($line =~ m/(<meta[\s\r\n\t]+name="revision"[\s\r\n\t]+content=[\n]+)/i) {
open(FILE, ">$_") or die "ERROR: Can't open $_";
chomp($line);
print FILE "@file"; # @file array is entire file.
close (FILE);
print "file overwritten\n";
push @log,"$root$File::Find::name \n\n";
$no_switches++;
} # end if
}
$no_files++;
} #end if matches filetype
else { print "file type not processed\n\n"; } # add $_ if you want to see URL of file not processed.
} # end sub
# add timestamp, # files, # matches to log
open(LOG, ">H:/Web/perl/log.txt") or die "ERROR: Can't open log.txt";
my $timestamp = localtime();
print LOG "$no_files files - $no_switches matches - $timestamp\n";
foreach(@log) { print LOG; }
close (LOG);
print "log file updated. $no_files files - $no_switches matches";
Well hey, if it's working, what's to fix? But if it were my script, and the purpose is simply to make sure that two particular lines from an input file become a single line in the output file, I'd do something like this in the "Wanted" function. Read the input one line at a time and push each line onto an array. If the current line matches the special pattern, attach the next line to it. When all the input lines have been read, write the set back out only if that line-joining logic actually applied (if it didn't, there's no need to rewrite the file).
sub Wanted
{
    if ( !/\S\.html?$/i ) {
        warn "$root$File::Find::name skipped\n";
    }
    elsif ( !open( IN, "<", $_ )) {
        warn "open for read failed on $root$File::Find::name : $!\n";
    }
    else {
        my @lines = ();
        my $line;
        my $altered = 0;
        while ( defined( $line = <IN> )) {
            if ( $line =~ /<meta\s+name="revision"\s+content=\s*$/i ) {
                $line =~ s/\s*$//;   # strip the trailing line break
                $line .= <IN>;       # attach the next line
                $altered++;
            }
            push @lines, $line;
        }
        close IN;
        if ( $altered ) {
            open( OUT, ">", $_ ) or die "open for write failed on $root$File::Find::name : $!\n";
            print OUT @lines; ## NB: no need for quotes here
            close OUT;
        }
        warn "$root$File::Find::name finished with $altered changes\n";
    }
}
(untested)
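If you want to try the joining logic without touching any real files, here is a minimal sketch that runs it against an in-memory filehandle. The sub name join_revision_lines and the sample text are made up for illustration, and the check for a missing next line is an addition that the code above doesn't have:

```perl
use strict;
use warnings;

# Join a line ending in content= to the line that follows it.
# Returns a reference to the resulting lines and the number of joins made.
# (join_revision_lines is a hypothetical helper name, not from the thread.)
sub join_revision_lines {
    my ($fh) = @_;
    my @lines;
    my $altered = 0;
    while ( defined( my $line = <$fh> ) ) {
        if ( $line =~ /<meta\s+name="revision"\s+content=\s*$/i ) {
            $line =~ s/\s*$//;                 # strip the trailing line break
            my $next = <$fh>;
            $line .= $next if defined $next;   # attach the next line, if any
            $altered++;
        }
        push @lines, $line;
    }
    return ( \@lines, $altered );
}

# Sample input modelled on the example in the question.
my $sample = qq{<head>\n<meta name="revision" content=\n"Mon, 05 Jul 2006 23:59:59 GMT">\n</head>\n};

# Opening a reference to a scalar gives a filehandle that reads from the string.
open( my $in, '<', \$sample ) or die "in-memory open failed: $!\n";
my ( $lines, $altered ) = join_revision_lines($in);
close $in;

print @$lines;
```

Once that behaves the way you expect, the same sub can be handed a real filehandle inside Wanted.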
That assumes you can redirect STDERR to a file, which any decent shell can do (e.g. ksh or bash, both available for ms-windows), to keep all the "warn" outputs for logging. Or you can open a log file in the main routine and always print the logging messages to that. If you just push them onto an array like you were doing (to be printed when the find() function finished), none of the log data would get written in the event of a "die" condition.
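The "open a log file in the main routine" option could look roughly like this. It's only a sketch: the path log.txt, the helper name log_msg, and the sample message are placeholders, not something from your script.

```perl
use strict;
use warnings;
use IO::Handle;

# Placeholder path; point this wherever the log should live.
my $log_path = 'log.txt';

# Open the log once, before find() is called, and append to it.
open( my $log, '>>', $log_path )
    or die "open for append failed on $log_path : $!\n";
$log->autoflush(1);    # push each message out to the file right away

# Hypothetical helper: write one timestamped message to the log.
sub log_msg {
    my ($msg) = @_;
    print {$log} scalar(localtime), " $msg\n";
}

# Inside Wanted you would call log_msg() wherever the code above uses warn.
log_msg("C:/working/index_en.html finished with 0 changes");

close $log;
```

Because every message is written as it happens, whatever was logged before a "die" is already in the file.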
Note that when you print an array inside double quotes, as in print "@array", Perl inserts the list separator $" (a single space by default) between the elements of the array, and you might not want that.
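A quick way to see the difference, using a made-up three-line array:

```perl
use strict;
use warnings;

my @lines = ( "first\n", "second\n", "third\n" );

# Interpolating the array in double quotes joins the elements with $"
# (the list separator), which is a single space unless you change it.
my $quoted = "@lines";

# print @lines writes the elements back to back when $, is left unset,
# which is the same as joining them on the empty string.
my $unquoted = join '', @lines;

print $quoted;      # second and third lines come out with a leading space
print $unquoted;    # lines come out exactly as they were read
```

In an HTML file the extra leading spaces are mostly harmless, but they do change every line after the first each time the file is rewritten.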