in reply to URL search and replace in the files

Well, clearly, as you already wrote,

    $lines =~ s{http://www.abc.com}{http://www.test.com}

does not work because it is too generic.

Options:

1. You pause at each substitution and ask if it should be replaced. You cache the answer, so you only ask once per URL. Takes a while.

2. You dump all found URLs into a single, sorted file, then peruse it. Find the things that need to stay the same (blacklist) and the things that should be changed (whitelist). For whatever falls in between, you use the $ans = <STDIN> trick to decide interactively.

Sample code for 1:

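    #!/usr/bin/perl
    use strict;
    use warnings;

    my (%YES, %NO);    # cached answers, so each URL is asked about only once

    my $text = 'pat http://www.abc.com/test.gif ma http://www.abc.com/hello.html http://www.abc.com/test.gif ';

    # /e calls ask() for every URL found; /x allows whitespace in the pattern
    $text =~ s{(http://[\w\.\-\?\&\;\#\/]+)}{ask($1)}gexi;
    print "$text\n";

    sub ask {
        my ($url) = @_;
        # index() returns -1 when the host is absent, so compare explicitly
        return $url if index($url, 'www.abc.com') < 0;
        # add more "return $url if condition;" here (blacklist)
        if ($YES{$url}) {
            $url =~ s/www\.abc\.com/www.test.com/;
            return $url;
        }
        elsif ($NO{$url}) {
            return $url;
        }
        else {
            print "substitute $url ? ";
            my $ans = <STDIN>;
            if (defined $ans && $ans =~ m/^\s*y/i) {
                ++$YES{$url};
            }
            else {
                ++$NO{$url};
            }
            return ask($url);    # the answer is cached now, so this returns at once
        }
    }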
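For option 2, the URL dump could be sketched roughly like this. It is only a minimal sketch: the sub name collect_urls is made up for illustration, and the character class is simply copied from the sample above, so it is an assumption about what your URLs look like.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect every distinct URL found in a text, counting how often it occurs.
# The character class is an assumption (copied from the option 1 sample);
# widen it if your URLs contain other characters.
sub collect_urls {
    my ($text, $seen) = @_;
    $seen->{$1}++ while $text =~ m{(http://[\w\.\-\?\&\;\#\/]+)}gi;
    return $seen;
}

# Only walk through files when some were given on the command line.
if (@ARGV) {
    my %seen;
    for my $filename (@ARGV) {
        open my $fh, '<', $filename or die "Cannot read '$filename': $!\n";
        my $content = do { local $/; <$fh> };    # slurp the whole file
        close $fh;
        collect_urls(defined $content ? $content : '', \%seen);
    }
    # One URL per line, sorted, prefixed with its count.
    printf "%6d %s\n", $seen{$_}, $_ for sort keys %seen;
}
```

Saved under a name of your choosing (say list_urls.pl), you would run it as "perl list_urls.pl *.html > urls.txt" and then go through urls.txt by hand to build the blacklist and the whitelist.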

3. You already know what you will replace, and it does not match other things:

    use strict;
    use warnings;
    use File::Slurp;

    # regexp pattern => replacement (dots escaped so they match literally)
    my %PATTERNS = (
        'http://www\.abc\.com/test\b'         => 'http://www.test.com/twist',
        'http://www\.abc\.com/(?:test[\d])\b' => 'http://www.test.com/',
    );

    # patterns to regexps (case-insensitive, like the substitution below)
    my @REGEXPS = map { qr/$_/i } keys %PATTERNS;

    # read from commandline
    die "usage: $0 <filenames> ...\n" unless @ARGV;

    for my $filename (@ARGV) {
        die "NOT A FILE! '$filename'\n"   unless -f $filename;
        die "NOT READABLE! '$filename'\n" unless -r $filename;

        # read the whole file into a single string
        my $lines = read_file($filename);

        # skip this file (but keep going) if none of the patterns match
        unless (grep { $lines =~ $_ } @REGEXPS) {
            print "no changes for $filename\n";
            next;
        }

        # keep the original as a backup
        rename $filename, $filename . ".bak"
            or die "cannot rename '$filename': $!\n";

        for my $r (keys %PATTERNS) {
            my $s = $PATTERNS{$r};
            $lines =~ s/$r/$s/gi;
        }

        # write out a whole file
        write_file($filename, $lines);
        print "Modified $filename\n";
    }

4. You take the URL and request the new page; if it exists, the link needs to be renamed. (curl -I fetches only the headers and not the content; in those you look for the 200 status.)

    my $result = `curl -s -I "$url"`;
    # matches "HTTP/1.1 200 OK" as well as the HTTP/2 form "HTTP/2 200"
    if ($result =~ m{^HTTP/\S+ 200\b}) {
        # proceed to rename
    }
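If the new site answers with redirects, the first status line is not the one you care about. A slightly more careful version of the check might look like the sketch below. The sub names last_status and url_exists are invented for this example, and it assumes a curl binary on the PATH and a platform where Perl's list-form pipe open works. The -L flag makes curl follow redirects, so the last status line in the output belongs to the final response.

```perl
use strict;
use warnings;

# Return the status code of the last response found in a block of
# "curl -I" output, or undef if there is no status line at all.
sub last_status {
    my ($headers) = @_;
    my @codes = $headers =~ m{^HTTP/[\d.]+\s+(\d{3})}mg;
    return @codes ? $codes[-1] : undef;
}

# Hypothetical helper: true if the URL finally answers with a 2xx status.
sub url_exists {
    my ($url) = @_;
    # List-form pipe open: the URL never passes through a shell.
    open my $curl, '-|', 'curl', '-s', '-I', '-L', $url or return 0;
    my $headers = do { local $/; <$curl> };
    close $curl;
    my $code = last_status(defined $headers ? $headers : '');
    my $ok   = defined $code && $code =~ /^2/;
    return $ok ? 1 : 0;
}
```

You would then write something like "if (url_exists($new_url)) { ... }" before doing the substitution, and probably cache the result per URL as in the sample for option 1, since one request per link gets slow.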

5. Lots more options, but I'm tired now.