LF has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a script to insert 3 words into some xml files.

$dir contains many xml files with the same format.

The tag <dc:description> appears 4 times in each file.

This script needs to go through the directory and, for each of the xml files, do nothing with the first <dc:description> tag, insert "Notes:" immediately after the 2nd tag, "Director:" after the 3rd tag, and "Actors:" after the 4th tag.

I am having trouble with the code for inserting the 3 words. Here are a couple of things I tried at the ***insert***:
1.
s/(?:$search>){3}()\w+\s+/Notes:/;
s/(?:$search>){5}()\w+\s+/Director:/;
s/(?:$search>){7}()\w+\s+/Actors:/;

2.
Someone suggested below (I'm updating this post) using
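my $data = do { local $/; <OLD> };  # slurp the whole file into $data
my @replace_list = ('', 'Notes:', 'Director:', 'Actors:');
# add the next word after each opening tag; /g already visits every match, so no loop is needed
$data =~ s/(?<=<$search>)/@replace_list ? shift @replace_list : ''/eg;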

(Thanks.) But I'm not familiar with some of this syntax and am not sure how to modify it.

Here is my code. Can you show me either how to modify one of these methods or suggest another way to insert words after the repeated tags?

#!/usr/bin/perl -w
use warnings;
use strict;

my $record_count = 0;
my $search = 'dc:description';
my ($dir) = @ARGV;
defined($dir) || usage();
chop($dir) if $dir =~ m#/$#;

opendir(DIR, $dir) || die "Can't open $dir\n";
my $temp = 'temp.xml';
my $file;
while (defined($file = readdir(DIR))) {
    # skip ".", "..", the temp file, and anything that is not an xml file
    next if $file eq $temp or $file !~ /\.xml$/;
    print "Defined $dir/$file\n";
    open(OLD, "< $dir/$file") or die "can't open $dir/$file: $!";
    open(TEMP, ">> $dir/$temp") or die "can't open $dir/$temp: $!";
    while (<OLD>) {
        #***insert***
        print TEMP $_ or die "can't write $temp: $!";
    }
    close(OLD) or die "can't close $dir/$file: $!";
    close(TEMP) or die "can't close $dir/$temp: $!";
    rename("$dir/$file", "$dir/$file.orig") or die "can't rename $file to $file.orig: $!";
    rename("$dir/$temp", "$dir/$file") or die "can't rename $temp to $file: $!";
    $record_count++;
}
closedir(DIR);
msg("Processed $record_count files from $dir");

sub msg { print @_, "\n"; }
sub usage { msg("Usage: $0 <directory>"); exit(1); }
Thanks!

Replies are listed 'Best First'.
Re: Inserting Text into Files within a Directory
by davorg (Chancellor) on Aug 24, 2004 at 14:30 UTC

    For the problems you're having with "open", perhaps you should ask Perl to tell you what the problem is - by including $! in your error message.
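
    A minimal sketch of what including $! gives you. The path below is hypothetical and chosen so that the open fails; the lexical filehandle $fh and the eval wrapper are only there so the message can be printed instead of ending the script.

    ```perl
    use strict;
    use warnings;

    # Hypothetical path, picked so that the open fails.
    my $path = '/no/such/dir/file.xml';

    # Run the open inside eval so the script survives to show the message.
    my $ok = eval {
        open(my $fh, '<', $path) or die "can't open $path: $!\n";
        close($fh);
        1;
    };
    my $error = $ok ? '' : $@;

    # The text after the last colon is the operating system's reason for the failure.
    print $error;
    ```

    Without $! the message would only say that the open failed, not why.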

    For parsing and changing XML, you should probably be using an XML parsing module rather than regexes. Sounds like XML::TreeBuilder would be best for this job.

    --
    <http://www.dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Re: Inserting Text into Files within a Directory
by bgreenlee (Friar) on Aug 24, 2004 at 14:47 UTC

    Maybe your open is failing because you're not looking in $dir? Try this:

    open(OLD,"<$dir/$file") or die "can't open $dir/$file: $!";

    As for your substitutions, here's one way to do it (note: untested code):

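    my $data = do { local $/; <OLD> };  # slurp the whole file into $data
    my @replace_list = ('', 'Notes:', 'Director:', 'Actors:');
    # add the next word after each opening tag; /g already visits every match, so no loop is needed
    $data =~ s/(?<=<$search>)/@replace_list ? shift @replace_list : ''/eg;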
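
    The idea of handing out one word per tag can be tried on a string before touching real files. This is only a sketch: insert_labels and the sample record are made up for illustration. It matches the complete opening tag <dc:description>, so closing tags are never touched, and it relies on /g to visit each opening tag exactly once.

    ```perl
    use strict;
    use warnings;

    my $search = 'dc:description';

    # Hypothetical helper: insert one label after each opening tag, in order.
    # An empty label leaves that tag untouched.
    sub insert_labels {
        my ($data) = @_;
        my @labels = ('', 'Notes:', 'Director:', 'Actors:');
        # /g visits every opening tag once; /e evaluates the replacement as code.
        # Once the labels run out, further tags get nothing added.
        $data =~ s{(<\Q$search\E>)}{ $1 . (@labels ? shift @labels : '') }ge;
        return $data;
    }

    # Made-up sample record, only for illustration.
    my $sample = join "\n",
        '<dc:description>Summary</dc:description>',
        '<dc:description>Good print</dc:description>',
        '<dc:description>Jane Doe</dc:description>',
        '<dc:description>John Roe</dc:description>';

    my $result = insert_labels($sample);
    print "$result\n";
    ```

    In the real script, the slurped contents of OLD would take the place of $sample and the result would be printed to the new file.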

    BTW, I can understand doing it the quick-and-dirty way with s///, but if this is more than just a one-off thing, you might consider doing it with XML::Parser.

    -b

      How do I create a temporary file? It cannot open $new because it is not defined...

        Your best bet is to step through your code with the debugger (run it with perl -d) and see what the values of $dir and $file are before the call to open. See perldebug if you're unfamiliar with the debugger.

        Also, you have use warnings and use strict commented out in your program. Uncomment them and see what happens.

        -b

        First of all, don't ask a new question by editing your previous one and replacing the text. It makes my above reply look completely irrelevant.

        As for creating a temporary file, there are lots of ways to do it. One that might work for you is to just add an extension to the name of the existing file:

        open(NEW,">$dir/$file.new") or die "Can't open $dir/$file.new: $!";
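
        Another of those ways is the core File::Temp module, which picks an unused name for you. A rough sketch, assuming the temporary file should live in the same directory as the original so that rename works; the directory, the file example.xml, and the template name are stand-ins created just for this demo.

        ```perl
        use strict;
        use warnings;
        use File::Temp qw(tempfile tempdir);

        # Stand-ins for the real directory and file, created here for the demo.
        my $dir  = tempdir(CLEANUP => 1);
        my $file = 'example.xml';
        open(my $setup, '>', "$dir/$file") or die "can't create $dir/$file: $!";
        print {$setup} "<dc:description>text</dc:description>\n";
        close($setup) or die "can't close $dir/$file: $!";

        # tempfile() picks an unused name inside $dir and returns an open handle
        # together with the full path of the new file.
        my ($tmp_fh, $tmp_name) = tempfile('insertXXXXXX', DIR => $dir, SUFFIX => '.tmp');

        open(my $old, '<', "$dir/$file") or die "can't open $dir/$file: $!";
        while (my $line = <$old>) {
            # ... modify $line here ...
            print {$tmp_fh} $line or die "can't write $tmp_name: $!";
        }
        close($old)    or die "can't close $dir/$file: $!";
        close($tmp_fh) or die "can't close $tmp_name: $!";

        # Keep the original as a backup, then move the new file into place.
        rename("$dir/$file", "$dir/$file.orig") or die "can't rename $file: $!";
        rename($tmp_name, "$dir/$file")         or die "can't rename $tmp_name: $!";
        ```

        Because the name is generated, a leftover file from an earlier failed run cannot get mixed into the new output.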

        -b