Re^24: search and replace strings in different files in a directory

I notice you have this:

while( <$infh> ) {
        s{&}{&amp;}g; ## In some case does not match as intended
        s{&amp;amp;}{&amp;}g;
        ...
}

presumably because, when the input line already contains &amp;, the first substitution changes it to &amp;amp;, so the second substitution is needed to change it back again! Better to replace these two substitutions with a single substitution using a negative look-ahead assertion (?!...). Proof-of-concept:

14:25 >perl -wE "my @s = ('Fred & Wilma', 'Barney &amp; Betty'); for (@s) { s{&(?!amp;)}{&amp;}g }; say for @s;"
Fred &amp; Wilma
Barney &amp; Betty

14:25 >

See “Look-Around Assertions” in perlre#Extended-Patterns.
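The same idea can be stretched to cover the other predefined XML entities and numeric character references, so that none of them gets double-escaped. This is only a sketch: the helper name escape_bare_amp and the list of entities to protect are assumptions, not something taken from the code posted in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): escape only those ampersands
# that do not already start an entity or a character reference.
sub escape_bare_amp {
    my ($text) = @_;
    $text =~ s{
        &                                        # an ampersand
        (?!                                      # not followed by
            (?: amp | lt | gt | quot | apos ) ;  #   a predefined XML entity name
          | \# [0-9]+ ;                          #   a decimal character reference
          | \# x [0-9A-Fa-f]+ ;                  #   a hex character reference
        )
    }{&amp;}gx;
    return $text;
}

print escape_bare_amp($_), "\n"
    for 'Fred & Wilma', 'Barney &amp; Betty', 'a &lt; b && c &#38; d';
```

Because already-escaped text is left alone, running the substitution a second time over the same file should change nothing, which is what makes the separate "undo" substitution unnecessary.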

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Re^25: search and replace strings in different files in a directory by PitifulProgrammer (Acolyte) on Sep 10, 2014 at 11:02 UTC
Dear Athanasius,

Thanks a mil for posting your regular expression. It is quite funny: the line that caught your interest is no longer part of the code, so I must have posted this particular version by accident. However, this will surely resolve some issues to come.

Your help is much appreciated. Thanks a mil again. I will bookmark the extended regex patterns; I am sure I will be needing them soonish.

Kind regards
C.
Re^26: search and replace strings in different files in a directory by PitifulProgrammer (Acolyte) on Sep 12, 2014 at 10:43 UTC
Dear Monks,

As promised last time, I ran the code on new files that needed to be checked at work. Given the previous comments, examples and the lovely testing script one of you provided, I found out that some of the directories had not been touched because of Unicode characters (mostly umlauts) and whitespace in the actual file names.

I went through previous posts in the forum and on the web looking for answers. One suggestion was to use the Encode module for the file names I am reading from the text file. I took a look at the module, but I am a bit at a loss as to how to use it in the subroutines and modules that are already in use. I assume that the GetPaths sub needs some editing, provided I am on the right track.

It would be grand if you guys could give me a hint on
a) whether Encode is the right module for reading umlaut-and-whitespace-packed file names;
b) if not, any other solution to the issue.

Thanks a mil in advance
Kind regards
C

#!/usr/bin/perl --
use 5.014;
use strict;
use warnings;
use Path::Tiny qw/ path /;
use POSIX();
use autodie qw/ close /;
use File::BOM;
use Carp::Always;
use Data::Dump qw/ dd /;
use Encode qw(encode decode);

Main( @ARGV );
exit( 0 );

sub Main {
    #my( $infile_paths ) = @_; #if run via
    my( $infile_paths ) = 'C:\dev\test_paths.txt';
    chomp $infile_paths;

    my @paths = GetPaths( $infile_paths );

    for my $path ( @paths ){
        RetrieveAndBackupXML( $path );
    }
    return @paths;
} ## end sub Main

sub GetPaths {
    use File::BOM;
    ## my @paths = path( shift )->lines_utf8;
    my @paths = path( shift )->lines( { binmode => ":via(File::BOM)" } );
    s/\s+$// for @paths; # "chomp"
    return @paths;
} ## end sub GetPaths

sub RetrieveAndBackupXML {
    my( $directory ) = shift; ## same as shift @_ ##
    my $date = POSIX::strftime( '%Y-%m-%d', localtime ); #suffix for the backup-file, e.g. 2014-08-01
    my $bak = "$date.bak";

    my @xml_files = path( $directory )->children( qr/\.xml$/ );

    for my $file ( @xml_files ) {
        Replace( $file, "$file-$bak" );
    }
} ## end sub RetrieveAndBackupXML

# Fix xml entities and create a copy of the original file before editing
sub Replace {
    my( $in, $bak ) = @_;
    path( $in )->copy( $bak ); #create a copy of $in with the ending(s) specified in $bak
    my $infh = path( $bak )->openr_raw;
    my $outfh = path( $in )->openrw_raw;
    while( <$infh> ) {
        s{&}{&amp;}g; ## In some case does not match as intended
        s{\s>\s}{>}g;
        s{\s<\s}{<}g;
        print $outfh $_;
    }
    close $infh;
    close $outfh;
} ## end sub Replace
Re^27: search and replace strings in different files in a directory by Anonymous Monk on Sep 13, 2014 at 07:27 UTC
Path::Tiny has already done the decode step for you ... see Re^17: search and replace strings in different files in a directory, Re: links in pathnames under Windows 8 and figure it out
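For readers unsure what "the decode step" refers to, here is a rough sketch of the two directions the core Encode module offers. The sample path and the choice of cp1252 as the target code page are assumptions made purely for illustration; the thread does not say which encoding the file system calls on that machine expect.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Example data only: the bytes of a path as stored in a UTF-8 text file.
my $utf8_bytes = "C:\\dev\\M\xC3\xBCller Akten";

# Step 1: bytes -> characters. Reading the list with lines_utf8 or a
# decoding binmode layer has already performed this step.
my $chars = decode( 'UTF-8', $utf8_bytes );

# Step 2: characters -> bytes in the code page the OS calls expect.
# 'cp1252' is an assumption; substitute the code page actually in use.
my $native = encode( 'cp1252', $chars );

printf "%d bytes in, %d characters, %d native bytes\n",
    length $utf8_bytes, length $chars, length $native;
```

If the names were decoded on input, decoding them again with Encode would not help; the step that may still be missing is the encode in the other direction before the name is handed to the file system.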
Re^28: search and replace strings in different files in a directory by PitifulProgrammer (Acolyte) on Sep 18, 2014 at 08:50 UTC
Re^29: search and replace strings in different files in a directory by PitifulProgrammer (Acolyte) on Oct 01, 2014 at 09:55 UTC