Colin_R has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks!

I am trying to extract data from an .xml file. The field I need is normally written as body="..."

In cases where the body content contains quotation marks, the format changes to body=' ...".." ...'

body= is always the sixth field, so its content is preceded by five matched pairs of the form fieldname="..."

I would be grateful for any hints on how to harvest the body content.

Thank you,

Colin

P.S. My current awk one-liner fails to recognise the body='...' delimiter:

awk -F \" '{print $12}' < infile.xml > outfile.xml
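
To see why the one-liner breaks, here is a minimal Perl sketch that mimics splitting on the double quote. The element name `msg` and the attribute names `a` through `e` are hypothetical; the real file's names are not shown in the post. When the body is wrapped in single quotes, the double quotes inside it become the field separators, so the twelfth field is only a fragment of the body.

```perl
use strict;
use warnings;

# Hypothetical sample records: the element name and the first five
# attribute names are invented for illustration only.
my $plain  = q{<msg a="1" b="2" c="3" d="4" e="5" body="plain text" />};
my $quoted = q{<msg a="1" b="2" c="3" d="4" e="5" body='he said "hi" twice' />};

# Rough equivalent of: awk -F \" '{print $12}'
# awk fields are 1-based and Perl arrays are 0-based, so $12 is element 11.
sub twelfth_field {
    my ($line) = @_;
    my @fields = split /"/, $line;
    return $fields[11];
}

print twelfth_field($plain),  "\n";   # plain text
print twelfth_field($quoted), "\n";   # hi
```

In the second record the fragment `hi` comes back instead of the full body, which matches the failure described above.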

Re: conditional input field separator?
by ww (Archbishop) on Feb 10, 2012 at 12:52 UTC
    It might be worth your while (if there's any significant chance that your requirements will change/expand) to explore the various XML::... modules on CPAN.

    Update - typos fixed: s/change/chance/; s/,/\)/. Pre-caffeine, AM blindness.

      Thank you ww. I'm quite new to Perl, so hoping to ease my way in fairly gently. I'll keep the XML tip in mind if I need to be more flexible.
        Not to disagree with your plan, but CPAN and modules (or even just the core modules, the ones installed with your Perl) are so nearly intrinsic to programming in Perl that you won't be well served by postponing their use for too long.

        And beyond their utility for your production needs, reading a module (pick some!) and its documentation can be really enlightening/educational.

Re: conditional input field separator?
by JavaFan (Canon) on Feb 10, 2012 at 12:09 UTC
    Untested:
    my ($content) = $str =~ /body=(?|"([^"]*)"|'([^']*)')/;

      Thank you JavaFan!
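
For anyone who wants to try the untested pattern above, here is a minimal sketch that wraps it in a function and runs it on two sample lines. The sample records, the element name `msg`, the attribute names `a` through `e`, and the function name `body_content` are all hypothetical. The branch-reset group `(?|...)` makes both alternatives capture into `$1`, and it needs Perl 5.10 or later.

```perl
use strict;
use warnings;
use 5.010;    # the branch-reset group (?|...) needs Perl 5.10 or later

# Return the content of the body attribute, whichever quote style is used.
# Returns undef when the line has no body="..." or body='...' attribute.
sub body_content {
    my ($str) = @_;
    my ($content) = $str =~ /body=(?|"([^"]*)"|'([^']*)')/;
    return $content;
}

# Hypothetical sample records, invented for illustration only.
my @samples = (
    q{<msg a="1" b="2" c="3" d="4" e="5" body="plain text" />},
    q{<msg a="1" b="2" c="3" d="4" e="5" body='he said "hi" twice' />},
);

for my $line (@samples) {
    my $content = body_content($line);
    print defined $content ? $content : '(no body found)', "\n";
}
```

A one-liner in the spirit of the original awk command might look like `perl -nle 'print $1 if /body=(?|"([^"]*)"|\x27([^\x27]*)\x27)/' infile.xml > outfile.xml`, where `\x27` stands in for the single quote so the shell quoting stays intact. Two caveats: the pattern would also match an attribute whose name merely ends in `body` (anchoring with `\bbody=` narrows that), and any XML entities in the content are returned undecoded.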

Re: conditional input field separator?
by choroba (Cardinal) on Feb 10, 2012 at 15:54 UTC