I got a little bogged down with changing several things on a suite of pages on a remote server, so I wrote this. Right hand is not interpolated, but perhaps someone out there has a nice hack to fix that (I'm suffering from Perl blindness right now :).

Script takes a file containing a list of Regular Expressions with (currently!) literal replacements and applies them to files in a recursive set of directories. Output can be previewed, or original appended to rolling backup to allow (currenlty manual ): recovery from bad errors.

When done, backups can be removed simply

Comments and suggestions welcomed

cLive ;-)

#!/usr/bin/perl -w use strict; $|++; use File::Find; # # MULTIPLE SEARCH & REPLACE # # Search a directory and sub directory for ASCII files and perform lis +t of regexps # on them based on file input # # usage: # perl treeswitch.pl /directory/to/search /regexp/list/file backup # perl treeswitch.pl /directory/to/search /regexp/list/file preview |l +ess # # see __DATA__ below for sample data file (delete before installing) # # backup - *append* original 'file' to 'file.bak' for each file amend +ed # (useful for backtracking - may write a reverser one day :) # preview - don't change file(s) - just output changed # # perl treeswitch.pl /directory/to/search delete_backups # remove the .bak files found (used for clean-up when done) # # designed for web pages, but amend below as you see fit # file extensions to ignore (binary files are also ignored if correctl +y identified) my @ignore = qw(bak jpg jpeg gif png pl pm cgi); # must contain 'bak' or backups get backed up when backup option selec +ted # # Notes: # - script assumes you have read write permissions on all files that w +ill be changed # - if using preview, script assumes you have 'less' installed # - performs all rexexps with /gis modifiers - like I said, aimed at w +eb pages :) # (I think using eval would be big performance hit and would need be +tter safeguards) # - lack of eval also means that replace string is not interpolated - +ie match # variables (eg $1) etc won't work # beginning of script proper my @delete_files=(); my $search_dir = $ARGV[0] || die 'No searchdir specified'; -e $search_dir || die "Directory $search_dir does not exist"; -d $search_dir || die "$search_dir is not a directory"; # cleanup? ($ARGV[1] eq 'delete_backups') && delete_backups(); my $regexp_file = $ARGV[1] || die 'No regexp file specified'; -e $regexp_file || die "File $regexp_file does not exist"; -T $regexp_file || die "$regexp_file is not a text file"; my $option = $ARGV[2] || ''; if ($option && ($option !~ /^backup|preview$/) ) { die "Invalid option + - backup or preview only" } # read in regular expressions into hash my %regexp = (); open(REG,$regexp_file) || die $!; while (<REG>) { next unless (m/^\s*\/(.*)\/\s+->\s+\/(.*)\/\s*$/); $regexp{$1} = $2; } close(REG); # create hash for quicker testing of file extensions my %ignore = map { $_, 1 } @ignore; # grab text files - not 100%, so use @ignore above to filter out other +s! my @text_files = (); find(\&wanted, $search_dir); for (@text_files) { # read in file open(FILE,$_) || die $!; my $file_content = join '', (<FILE>); close(FILE); my $backup_file = $file_content; # perform changes for (keys %regexp) { $file_content =~ s/$_/$regexp{$_}/gis; # <- amend these modifiers if + situation requires } # update file or output if ($option eq 'preview') { print '-' x 60, "\nAmended file: $_\n", '-' x 60, "\n$file_content\n +"; } else { # create backup if requested if ($option eq 'backup') { open(FILE,">>${_}.bak") || die "$! (${_}.bak)"; print FILE "$backup_file\n\n", '-' x 60, "\n\n"; close(FILE); print "Backup ${_}.bak appended\n"; } open(FILE,">$_") || die "$! ($_)"; print FILE $file_content; close(FILE); print "$_ updated\n"; } } sub wanted { -T && (/\.(\w+)$/ && !$ignore{$1}) && push @text_files, $File::Find::name; } sub delete_backups { find(\&delete_wanted, $search_dir); for (@delete_files) { unlink $_; print "$_ deleted\n"; } exit(0); } sub delete_wanted { -T && ($File::Find::name =~ /\.bak$/) && push @delete_files, $File::Find::name; } __DATA__ Here's a sample regexp file - note that right hand is not interpolated ------------------------------------- # comments are ok in RegExp file, as long as they don't contain replac +e regexp # general rule /interpolated_pattern1/ -> /literal_replacement/ # spacing before or after elements can be added for legibility # this works /two words/ -> /words two/ # this doesn't - just replaces with '$2 $1' /(two) (words)/ -> /$2 $1/ # slashes are no problem - just escape them on LHS ONLY /http:\/\// -> /HTTP:///

--
seek(JOB,$$LA,0);

Replies are listed 'Best First'.
Re: recursive multiple search and replace
by samtregar (Abbot) on May 23, 2002 at 23:34 UTC
    I usually use a combination of 'find', 'xargs' and 'perl -i.bak -pe' to do this. For example, change all the p's to q's in a tree of HTML documents starting in /htdocs and creating backups to .bak files:

    $ find /htdocs -name '*.html' | xargs perl -i.bak -pe 's(p)(q)g'

    Shazam!

    -sam