PitifulProgrammer has asked for the wisdom of the Perl Monks concerning the following question:

Dear Members of the Perl Community,

I am new to this forum and quite new to programming. I picked up Perl to facilitate certain annoying tasks such as the following:

A given folder x contains a number of files.

The aim of the script is to open each file in the folder and replace the strings as shown in the code.

So far, the script opens the folder and reads the files into the array. The files are printed to my command line.

However, something seems to be wrong in the substitution section, since the files are not touched (same date of creation, etc.) and apparently no replacement has been effected and no backup files have been created, either.

I would be delighted if you could shed some light on my code, since I am not getting any error messages.

$^I = ".bak";

my $directory = "c:/temp/";

opendir( DIR, $directory ) or die "Unable to open dir!";
my @xml_files = grep( /\.xml$/, readdir(DIR) );

say "files found";
print "\n\n";
say for @xml_files;
print "\n\n";
say "Replacing strings";
print "\n\n";

foreach my $file (@xml_files) {
    open( IN, "+>", $file ) or die $!;
    while( <IN> ){
        $_ =~ s{&}{&amp;}g;
        $_ =~ s{&amp;amp;}{&amp;}g;
        $_ =~ s{\s>\s}{&gt;}g;
        $_ =~ s{\s<\s}{&lt;}g;
        print IN $file;
    }
    close ( IN );
}

closedir (DIR);
say for @xml_files;

Thanks a mil in advance for your support. C.

Replies are listed 'Best First'.
Re: search and replace strings in different files in a directory
by McA (Priest) on Aug 07, 2014 at 10:23 UTC

    Hi,

    have a look at perldoc perlrun to see some ways of calling Perl to do some very convenient tasks on the command line.

    In your case I'm pretty sure you can do your annoying job with:

    perl -pi.bak -e 's{&}{&amp;}g; s{&amp;amp;}{&amp;}g; s{\s>\s}{&gt;}g; s{\s<\s}{&lt;}g;' *.xml

    Regards
    McA
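
The -p and -i.bak switches in McA's one-liner can also be written out as a script. Doing so shows why setting $^I alone in the original program had no effect: in-place editing only applies to files that are read through the <> operator from @ARGV. What follows is a minimal sketch, assuming the four substitutions from the thread; the sub names fix_line and edit_in_place are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Apply the four substitutions from the thread to one line of text.
sub fix_line {
    my ($line) = @_;
    $line =~ s{&}{&amp;}g;
    $line =~ s{&amp;amp;}{&amp;}g;
    $line =~ s{\s>\s}{&gt;}g;
    $line =~ s{\s<\s}{&lt;}g;
    return $line;
}

# Edit the given files in place, keeping "name.xml.bak" backups.
sub edit_in_place {
    my @files = @_;
    return unless @files;
    local $^I   = '.bak';    # same effect as the -i.bak switch
    local @ARGV = @files;    # <> reads the files named in @ARGV
    while ( my $line = <> ) {
        print fix_line($line);    # print goes to the rewritten file
    }
    return;
}

edit_in_place(@ARGV);
```

Saved under a name of your choice, it would be called with the file names as arguments, just like the one-liner.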

      Dear McA,

      Thanks a mil for your help with command line editing and sorry for not saying thanks a wee bit earlier.

      The script ran smoothly and did everything as expected. However, I've been running the script via cygwin and was wondering if I could just transform it into a batch file (for all those who do not use cygwin).

      I am a bit concerned if *.xml would be interpreted correctly by a WIN/DOS system.

      Thanks a mil again for the cmd-solution.

      Looking forward to your reply

      Kind regards

      C.
        I am a bit concerned if *.xml would be interpreted correctly by a WIN/DOS system.

        Good point. DOS, Windows and OS/2 leave resolving wildcards to the application, whereas Unix and friends resolve wildcards in the shell. (See also Re^3: Perl Rename.) Luckily, someone has already taken care of this: Win32::Autoglob. Win32::Autoglob also "just works" (by doing nothing) on other platforms than Windows. (Unfortunately, this means Win32::Autoglob does NOT work as expected for DOS and OS/2, but as both platforms went the way of the dodo, this does not harm many people.)

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
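
To illustrate the point about wildcards: a script can expand them itself, so it does not matter whether the shell has already done so. The sketch below uses the core module File::Glob. It is not how Win32::Autoglob is implemented, only an assumption-laden illustration of the same idea, and expand_wildcards is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Expand arguments that still contain wildcards; pass all others through.
# Under cmd.exe "*.xml" arrives literally, under a Unix shell it is
# usually expanded already and this becomes a no-op.
sub expand_wildcards {
    my @args = @_;
    return map { /[*?]/ ? bsd_glob($_) : $_ } @args;
}

@ARGV = expand_wildcards(@ARGV);
print "$_\n" for @ARGV;
```

Note also that cmd.exe does not treat single quotes as quoting characters, so the -e program of the one-liner would need double quotes in a batch file.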
Re: search and replace strings in different files in a directory (Path::Tiny)
by Anonymous Monk on Aug 07, 2014 at 10:08 UTC

    You've fallen for the basic readdir trap

    Also, you're not using $^I correctly (you really don't want to use it anyway)

    Make your life easier by using Path::Tiny, improve this program

    #!/usr/bin/perl --
    ##
    ##
    ##
    ## perltidy -olq -csc -csci=3 -cscl="sub : BEGIN END if " -otr -opr -ce -nibc -i=4 -pt=0 "-nsak=*"
    ## perltidy -olq -csc -csci=10 -cscl="sub : BEGIN END if " -otr -opr -ce -nibc -i=4 -pt=0 "-nsak=*"
    #!/usr/bin/perl --
    use strict;
    use warnings;
    use Path::Tiny qw/ path /;
    use POSIX();
    use autodie qw/ close /;

    Main( @ARGV );
    exit( 0 );

    sub Main {
        my $date = POSIX::strftime( '%Y-%m-%d', localtime );
        my $bak = "$date.bak";
        my @xml_files = path( $directory )->children( qr/\.xml$/ );
        for my $file ( @xml_files ) {
            Diddle( $file, "$file-$bak" );
        }
    } ## end sub Main

    sub Diddle {
        my( $in, $bak ) = @_;
        path( $in )->move( $bak );
        my $infh  = path( $bak )->openr_raw;
        my $outfh = path( $in )->openrw_raw;
        while( <$infh> ) {
            s{&}{&amp;}g; ## will match more than what you want fix it
            s{&amp;amp;}{&amp;}g;
            s{\s>\s}{&gt;}g;
            s{\s<\s}{&lt;}g;
            print $outfh $_;
        }
        close $infh;
        close $outfh;
    } ## end sub Diddle
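
The "readdir trap" mentioned in this reply is that readdir returns bare file names without the directory part. A later open() therefore looks for them in the current working directory unless the directory is prepended first. Here is a minimal sketch using only core modules (so not the Path::Tiny approach of this reply); xml_files_in is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Spec;

# Return the full paths of the *.xml files in $directory.
sub xml_files_in {
    my ($directory) = @_;
    opendir( my $dh, $directory )
        or die "Unable to open '$directory': $!";
    my @names = grep { /\.xml\z/ } readdir $dh;    # bare names only
    closedir $dh;
    # Prepend the directory so the names can be opened from anywhere.
    return map { File::Spec->catfile( $directory, $_ ) } sort @names;
}

# Directory from the command line, defaulting to the current one.
print "$_\n" for xml_files_in( shift // '.' );
```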
      Dear Anonymous Monk

      Sorry for not having replied earlier. I was working on something else. However, last week, I was asked to have a go at the script again.

      What has changed is that the directories and files should now be accessed individually.

      The idea was to have a text file containing the pathnames in a network folder.

      My issue, and the reason for reopening the question, is how to pass the different pathnames from the text file, which are stored in the array, to the subroutine.

      I am having some difficulties with the line containing 'directory', because none of the changes I made actually worked.

      I figure the pathnames have to go in there.

      I've been getting the following error messages:

      c:\dev>perl search_and_replace.pl
      Global symbol "$directory" requires explicit package name at search_and_replace.pl line 33.
      Execution of search_and_replace.pl aborted due to compilation errors.

      c:\dev>perl search_and_replace.pl
      Path::Tiny paths require defined, positive-length parts at search_and_replace.pl line 33

      c:\dev>perl search_and_replace.pl
      Variable "$path" is not imported at hornbach_mit_verz_suche.pl line 34.
      (Did you mean &path instead?)
      Global symbol "$path" requires explicit package name at search_and_replace.pl line 34.

      c:\dev>perl search_and_replace.pl
      ': Filename too long at search_and_replace.pl line 34.

      I can explain some of the errors, such as the one about variable scope (I don't think I can just use $path inside the sub), but the question remains how to get the pathnames into the sub. I have not yet looked at subroutines in detail.

      I am sorry for all these questions; this script has evolved in a way I could not have anticipated, and the guys at work are eager for a solution.

      Oh yes, before I forget, this is the latest version of the script

      use 5.014;
      use strict;
      use warnings;
      use File::Slurp qw(:all);
      use Path::Tiny qw/ path /;
      use POSIX();
      use autodie qw/ close /;

      my $infile_paths = 'C:\temp\test_folder\paths.txt';
      my @paths = read_file $infile_paths, { binmode => ':utf8' } or die $!;
      chomp(@paths);
      # say for @paths;

      foreach my $path (@paths){
          Main($path),
      }

      Main( @ARGV );
      exit( 0 );

      sub Main {
          my $directory = "";
          my $date = POSIX::strftime( '%Y-%m-%d', localtime );
          my $bak = "$date.bak";
          my @xml_files = path( @paths )->children( qr/\.xml$/ );
          for my $file ( @xml_files ) {
              Replace( $file, "$file-$bak" );
          }
      } ## end sub Main

      sub Replace {
          my( $in, $bak ) = @_;
          path( $in )->move( $bak );
          my $infh  = path( $bak )->openr_raw;
          my $outfh = path( $in )->openrw_raw;
          while( <$infh> ) {
              s{&}{&amp;}g; ## will match more than what you want fix it
              s{&amp;amp;}{&amp;}g;
              s{\s>\s}{&gt;}g;
              s{\s<\s}{&lt;}g;
              print $outfh $_;
          }
          close $infh;
          close $outfh;
      } ## end sub Replace

      Thank you all so much for going to the trouble of explaining stuff to a beginner. You guys rock, keep it going! I wish I had more time to play with the code and read up on certain aspects.

      Thank you so much. Looking forward to your reply.

      Please let me know if you need additional information. I still need to get used to describing my problems in detail.

      Kind regards C.
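
On the question of getting a pathname into the sub: arguments passed in a call such as Main($path) arrive in @_ and can be copied into a lexical variable on the sub's first line. In the script above, path( @paths ) instead hands the whole array to Path::Tiny, which joins all elements into a single path. That may well be behind the "Filename too long" message, though a stray carriage return left in paths.txt is another candidate. The sketch below uses only core modules; read_paths and process_directory are hypothetical names, and with Path::Tiny the opendir part would shrink to path( $directory )->children( qr/\.xml$/ ) as in the earlier reply.

```perl
use strict;
use warnings;

# Read one directory name per line; strip LF or CRLF and skip blank lines.
sub read_paths {
    my ($list_file) = @_;
    open( my $fh, '<:encoding(UTF-8)', $list_file )
        or die "Cannot open '$list_file': $!";
    my @paths;
    while ( my $line = <$fh> ) {
        $line =~ s/\r?\n\z//;
        push @paths, $line if length $line;
    }
    close $fh;
    return @paths;
}

# The directory is passed in as an argument and picked up from @_.
sub process_directory {
    my ($directory) = @_;
    opendir( my $dh, $directory )
        or die "Unable to open '$directory': $!";
    my @xml_files = map { "$directory/$_" } grep { /\.xml\z/ } readdir $dh;
    closedir $dh;
    # The thread's Replace( $file, "$file-$bak" ) call would go here.
    return sort @xml_files;
}

# Usage: perl script.pl paths.txt
if (@ARGV) {
    for my $directory ( read_paths( $ARGV[0] ) ) {
        print "$_\n" for process_directory($directory);
    }
}
```

Calling the sub once per directory, inside the loop, keeps each call working on exactly one path.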
Re: search and replace strings in different files in a directory
by aitap (Curate) on Aug 07, 2014 at 11:28 UTC
    By the way, unless you are sure that there are no other XML/SGML-unsafe characters in your files, it could be useful to switch to the HTML::Entities::encode_entities_numeric function to encode your files.
      Dear aitap,

      Thanks a mil for your hint about the module. I will surely try it out. However, in the present case I cannot be sure that I am just dealing with wrong entities. The script is more or less a general replacer script for my colleagues.

      Thanks a lot for your help.

      Kind regards C.
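
Related to aitap's remark and to the "will match more than what you want" comment in the code above: the pair s{&}{&amp;}g and s{&amp;amp;}{&amp;}g repairs &amp; but still turns an existing &lt; into &amp;lt;. One dependency-free way around this is a negative lookahead that leaves anything already shaped like an entity or character reference alone. This is only a sketch for the ampersand case, not a replacement for HTML::Entities, and escape_bare_ampersands is a hypothetical name.

```perl
use strict;
use warnings;

# Escape "&" unless it already starts something like "&name;",
# "&#123;" or "&#x1F;".
sub escape_bare_ampersands {
    my ($text) = @_;
    $text =~ s/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    return $text;
}

print escape_bare_ampersands("Tom & Jerry &amp; Co &#38; &lt;tag&gt;"), "\n";
```

The check is purely syntactic: it does not verify that a name such as &foo; is actually defined for the XML files in question.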