codewalker has asked for the wisdom of the Perl Monks concerning the following question:

<contrib-group>
  <contrib contrib-type="author"> . some code . <x>, </x></contrib>
  <contrib contrib-type="author"> . some code . <x>, </x></contrib>
  <contrib contrib-type="author"> . some code . <x>, </x></contrib>
  <contrib contrib-type="author"> . some code . </contrib>
</contrib-group>

To be coded as:

<contrib-group>
  <contrib contrib-type="author"> . some code . <x>, </x></contrib>
  <contrib contrib-type="author"> . some code . <x>, </x></contrib>
  <contrib contrib-type="author"> . some code . <x> and </x></contrib>
  <contrib contrib-type="author"> . some code . </contrib>
</contrib-group>

Re: To find the last occurence and replace in perl
by Laurent_R (Canon) on Dec 30, 2014 at 10:55 UTC
    An XML parsing module such as the one mentioned in the posts above is probably the right way to go. However, for such a simple change (and assuming that's really all you need to do), you may try to do it manually. The easiest way might be to reverse the lines (you may have to store them in an array first, if that is not already the case), make the change on the first occurrence of what you are looking for, and then reverse the lines again.

    Update: Fixed a typo in the parenthesized sentence about having to store the data in an array.
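
The reverse-then-replace idea in this reply might look like the following minimal sketch. It assumes each `<contrib>` element sits on its own line, and the helper name `replace_last_occurrence` is made up for illustration. If a single line held several matches, this would change the first match on that line rather than the last one.

```perl
use strict;
use warnings;

# Hypothetical helper: change only the last line that matches $pattern.
# Works by reversing the list of lines, substituting on the first hit,
# and reversing back. Assumes one candidate match per line.
sub replace_last_occurrence {
    my ($lines, $pattern, $replacement) = @_;
    my @reversed = reverse @$lines;
    for my $line (@reversed) {
        # s/// returns true when it substituted, so stop after the first hit
        last if $line =~ s/$pattern/$replacement/;
    }
    return [ reverse @reversed ];
}

my @lines = (
    '<contrib-group>',
    '<contrib contrib-type="author"> . some code . <x>, </x></contrib>',
    '<contrib contrib-type="author"> . some code . <x>, </x></contrib>',
    '<contrib contrib-type="author"> . some code . <x>, </x></contrib>',
    '<contrib contrib-type="author"> . some code . </contrib>',
    '</contrib-group>',
);

my $fixed = replace_last_occurrence( \@lines, qr{<x>, </x>}, '<x> and </x>' );
print "$_\n" for @$fixed;
```

Only the third `<contrib>` line should differ in the printed result; the input array itself is left untouched because the work is done on a reversed copy.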

Re: To find the last occurence and replace in perl
by Anonymous Monk on Dec 30, 2014 at 08:19 UTC
      #!/usr/bin/perl --
      use strict;
      use warnings;
      use XML::LibXML 1.70; ## for load_html/load_xml/location

      my $xml = q{<contrib-group>
      <contrib contrib-type="author"> . some code . <x>, </x></contrib>
      <contrib contrib-type="author"> . some code . <x>, </x></contrib>
      <contrib contrib-type="author"> . some code . <x> and </x></contrib>
      <contrib contrib-type="author"> . some code . </contrib>
      </contrib-group>
      };

      my $dom = XML::LibXML->new(qw/ recover 2 /)->load_xml(
      #~     location => $filepath_or_http,
          string => $xml,
      );

      print $dom->find(q{ //contrib/x[ contains( . , ' and ' ) ] })->shift->nodePath, "\n";

      if( my $cx = $dom->findnodes(q{ //contrib/x })->pop ){
          print $cx->nodePath, "\n";
          print "$cx\n";
          my $tx = $cx->textContent;
          $tx =~ s/ and / , /;
          $cx->removeChildNodes;
          $cx->appendText( $tx );
          print "$cx\n\n\n";
      }
      print "$dom\n";
      __END__
      /contrib-group/contrib[3]/x
      /contrib-group/contrib[3]/x
      <x> and </x>
      <x> , </x>
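
For the direction the original question asks about (turning the last `<x>, </x>` into `<x> and </x>`), a variation on the code above could look like this. It is only a sketch: the sub name `join_last_with_and` is hypothetical, and it assumes the separator to change is always the last `<x>` under a `<contrib>` in document order.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: set the text of the last <x> under any <contrib>
# to ' and ' and return the serialized document.
sub join_last_with_and {
    my ($xml_string) = @_;
    my $dom = XML::LibXML->load_xml( string => $xml_string );

    # The parentheses matter: (//contrib/x)[last()] picks the final <x>
    # in the whole document, not the last <x> within each <contrib>.
    my ($last_x) = $dom->findnodes('(//contrib/x)[last()]');
    if ($last_x) {
        $last_x->removeChildNodes;
        $last_x->appendText(' and ');
    }
    return $dom->toString;
}

my $xml = <<'XML';
<contrib-group>
<contrib contrib-type="author"> . some code . <x>, </x></contrib>
<contrib contrib-type="author"> . some code . <x>, </x></contrib>
<contrib contrib-type="author"> . some code . <x>, </x></contrib>
<contrib contrib-type="author"> . some code . </contrib>
</contrib-group>
XML

print join_last_with_and($xml);
```

Because the node is located through the parsed tree, the result does not depend on how the elements are spread across lines.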
Re: To find the last occurence and replace in perl
by locked_user sundialsvc4 (Abbot) on Dec 30, 2014 at 15:40 UTC

    IMHO, “an XML-parsing module” is the only way to go, and XML::LibXML is an excellent choice, since it employs a binary module that is quite likely to be the same one that was used to produce the file.   A very real consideration when dealing with XML files “manually” is that, sooner or sooner, the line-by-line format of the file will vary from whatever your hand-rolled program was built to expect.   (And it is most likely to do so at 4:45 PM on the evening of your child’s piano recital ... don’t miss it, Dad ...)

    Therefore:   use a module/library.   Period.   There are actually several good ones to choose from ... XML::Twig is another good war-horse aimed especially at b-i-g files.   And leave us not forget XML::Simple.   Treat the XML file as an abstract thing, which the module magically knows how to manipulate for you.   You will be rewarded with a durable, long-lasting solution to the problem that won’t have to be revisited, no matter what the next XML file looks like.

      “an XML-parsing module” is the only way to go ...
      Yeah, right, I agree in principle, but what are you gonna do when this otherwise very fine module chokes on an error in the XML syntax? This is not a rhetorical question (sh*t happens, you know); I have seen that happen several times, although I don't remember precisely which module it was (and also with HTML modules, BTW).
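
One possible answer, offered as a sketch rather than something tested against any particular broken file: try a strict parse first, and fall back to libxml2's recover mode (the same `recover` option the XML::LibXML example in this thread passes to `new`) only when that fails. The sub name `parse_or_recover` is hypothetical, and whether the recovered tree is usable depends entirely on how the input is broken.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns ($dom, $mode), where $mode is 'strict'
# or 'recovered'. Dies if neither attempt yields a document.
sub parse_or_recover {
    my ($xml_string) = @_;

    # First attempt: normal parsing, which dies on malformed input.
    my $dom = eval { XML::LibXML->load_xml( string => $xml_string ) };
    return ( $dom, 'strict' ) if $dom;
    my $strict_error = $@;

    # Second attempt: recover => 2 asks the parser to carry on past
    # errors without printing warnings about them.
    $dom = eval { XML::LibXML->load_xml( string => $xml_string, recover => 2 ) };
    return ( $dom, 'recovered' ) if $dom;

    die "could not parse even in recover mode: $strict_error";
}

# A deliberately malformed snippet: <b> is never closed.
my ( $dom, $mode ) = eval { parse_or_recover('<a><b>text</a>') };
print defined $mode ? "parsed in $mode mode\n" : "gave up: $@";
```

Reporting which mode succeeded lets the caller log or reject recovered documents instead of silently trusting them.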