bryank has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I am writing a script that does the following:

1. traverses directory(ies), looking for a file called 'metadata.xml'

2. Looks for a tag called <merchant>

3. Replaces underscores with spaces in the value assigned to the <merchant> tag.

My script seems to work, but I could use help/tips on the following:

1. creating a backup file of any file that I alter.

2. the directories I look at are originally in zip format. Right now I manually unzip before updating, and then rezip. I'd like to implement a process that unzips the original zip, traverses the directories and modifies the metadata.xml files, rezips the files, and removes any clutter.

3. Any tips on making the code more efficient, elegant, and less redundant.

Thanks!

#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

@ARGV = ('.') unless @ARGV;
my $dir = shift @ARGV;

find(\&edits, $dir);

sub edits() {
    my $seen = 0;
    my $file = $_;
    if ($file eq 'metadata.xml') {
        open (my $file_fh, $file) || die "Can't open $file!\n $!";
        my @lines = <$file_fh>;
        close $file_fh;
        open $file_fh, ">$file";
        foreach my $line ( @lines ) {
            if ($line =~/merchant/) {
                $line =~s/_/ /g;
            }
            print $file_fh $line;
            $seen++;
        }
        close $file_fh;
    }
    print "Updated $File::Find::name\n" if $seen > 0;
}

Replies are listed 'Best First'.
Re: Recursive editing of a single xml tag..
by Anonymous Monk on Jun 21, 2009 at 16:42 UTC
    You should use an XML parser, like XML::Twig. You could adapt its "Building an XML filter" example like this:
    use XML::Twig;
    use autodie 1.999;
    use POSIX 'strftime';

    my $orig = 'doc.xml';
    my $back = $orig . strftime( '-%Y-%m-%d', localtime );
    rename $orig, $back;
    open my $newfh, '>', $orig;

    my $t = XML::Twig->new(
        twig_roots => {
            'qpass:merchant' => sub {
                my ( $t, $price ) = @_;
                {
                    my $ra = $price->text;
                    $ra =~ s/_/ /g;
                    $price->set_text($ra);
                }
                $price->print($newfh);
            },
        },
        twig_print_outside_roots => $newfh,
    );
    $t->parsefile($back);
    $t->flush;    #don't forget
    undef $t;     # close $newfh;
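
On the backup question (item 1), here is a minimal core-modules-only sketch. It is not a real XML parser: it assumes the tag is written exactly as <merchant>...</merchant> with no attributes or namespace prefix, and that each element sits on one line. The helper name fix_merchant_file and the ".bak" suffix are made up for illustration. Unlike the original script, it only touches the text between the tags and only writes (and backs up) a file that actually changed.

```perl
use strict;
use warnings;
use File::Copy qw(copy);

# Hypothetical helper: replace underscores with spaces inside
# <merchant>...</merchant>, keeping the original as "$file.bak".
# Returns the number of lines changed (0 means the file was left alone).
sub fix_merchant_file {
    my ($file) = @_;

    open my $in, '<', $file or die "Can't read $file: $!";
    my @lines = <$in>;
    close $in;

    my $changed = 0;
    for my $line (@lines) {
        my $before = $line;
        # Rewrite only the element text, not the rest of the line.
        $line =~ s{(<merchant>)(.*?)(</merchant>)}{
            my ($open, $text, $close) = ($1, $2, $3);
            $text =~ tr/_/ /;
            "$open$text$close";
        }ge;
        $changed++ if $line ne $before;
    }
    return 0 unless $changed;

    # Back up first, then overwrite the original.
    copy($file, "$file.bak") or die "Can't back up $file: $!";
    open my $out, '>', $file or die "Can't write $file: $!";
    print {$out} @lines;
    close $out or die "Can't close $file: $!";

    return $changed;
}
```

From a File::Find callback this could be called as fix_merchant_file($_) if $_ eq 'metadata.xml'; since File::Find has already changed into the file's directory by default.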
Re: Recursive editing of a single xml tag..
by ww (Archbishop) on Jun 21, 2009 at 18:41 UTC

    ...and on the theme of Modules (aka 'don't reinvent the wheel'), a quick search of CPAN reveals:

    • Nicholas Clark / ex-lib-zip
    • Steve Peters / Archive-Zip-1.16
      and
    • Archive::Zip::MemberRead

    each of which has a brief summary suggesting it may save you the manual unzipping.

    If you're using ActiveState, ppm search zip will provide an alternate source (albeit, with a number of modules aimed at zipcodes).

    Update 14:43: Applied missing close quote in the first graf
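
To make the Archive::Zip suggestion concrete, here is a rough sketch of editing metadata.xml members inside the archive, so nothing is unpacked to disk and there is no clutter to remove. It assumes Archive::Zip is installed from CPAN, that the tag appears as a plain <merchant>...</merchant>, and it uses a regex rather than an XML parser to keep the example short. The helper name fix_zip and the ".bak" suffix are invented for illustration.

```perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);
use File::Copy qw(copy);

# Hypothetical helper: fix every metadata.xml member of $zipfile in place,
# keeping the untouched archive as "$zipfile.bak".
# Returns the number of members rewritten.
sub fix_zip {
    my ($zipfile) = @_;

    my $zip = Archive::Zip->new;
    $zip->read($zipfile) == AZ_OK or die "Can't read $zipfile";

    my $changed = 0;
    for my $member ( $zip->membersMatching('(^|/)metadata\.xml$') ) {
        my $xml = $zip->contents($member);
        my $new = $xml;
        # Swap underscores for spaces in the element text only.
        $new =~ s{(<merchant>)(.*?)(</merchant>)}{
            my ($open, $text, $close) = ($1, $2, $3);
            $text =~ tr/_/ /;
            "$open$text$close";
        }gse;
        next if $new eq $xml;
        $zip->contents( $member, $new );
        $changed++;
    }
    return 0 unless $changed;

    copy( $zipfile, "$zipfile.bak" ) or die "Can't back up $zipfile: $!";
    # overwrite() writes to a temporary file and then replaces the original.
    $zip->overwrite == AZ_OK or die "Can't rewrite $zipfile";

    return $changed;
}
```

Calling fix_zip('bundle.zip') (the file name is only an example) would leave bundle.zip updated and bundle.zip.bak as the original.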