dbmathis has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I have a line that looks like the following...

<Prompt><![CDATA[Please describe your experience maintaining batteries, associated power equipment and diesel. . (If you do not have this experience, please enter ?none? and continue to the next question).  ]]></Prompt>

Rather than using split, is there a one liner that will parse the following from the above...

Please describe your experience maintaining batteries, associated power equipment and diesel. . (If you do not have this experience, please enter ?none? and continue to the next question).

Thanks

After all this is over, all that will really have mattered is how we treated each other.

Replies are listed 'Best First'.
Re: Parsing a line from a line with one line of perl code.
by erroneousBollock (Curate) on Nov 09, 2007 at 01:06 UTC
    While the regular expression to achieve this may be simple, I implore you to use XPath (say with XML::XPath) to access data in XML documents.

    -David
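A minimal sketch of that suggestion, assuming the CPAN module XML::XPath is installed and that <Prompt> is the root element, as it is in the sample line. The helper name prompt_text is hypothetical and the sample text is shortened. The XML parser unwraps the CDATA section itself, so no regex is involved.

```perl
use strict;
use warnings;
use XML::XPath;    # CPAN module, not part of core Perl

# Hypothetical helper: returns the text content of the root <Prompt> element.
sub prompt_text {
    my ($xml) = @_;
    my $xp = XML::XPath->new(xml => $xml);
    # findvalue evaluates the XPath expression; concatenating with ""
    # forces the result into a plain string.
    return "" . $xp->findvalue('/Prompt');
}

my $xml = '<Prompt><![CDATA[Please describe your experience.  ]]></Prompt>';
print "[", prompt_text($xml), "]\n";
```

Trailing whitespace inside the CDATA section is preserved, so trim the result if that matters.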

Re: Parsing a line from a line with one line of perl code.
by zer (Deacon) on Nov 09, 2007 at 00:51 UTC
    ($_)=<DATA>=~/<!\[CDATA\[(.*?)]]/;
    __DATA__
    <Prompt><![CDATA[Please describe your experience maintaining batteries, associated power equipment and diesel. . (If you do not have this experience, please enter ?none? and continue to the next question). ]]></Prompt>

    The <DATA> can be replaced with a variable that holds the line. The ($_) on the left puts the match in list context, where a successful match returns the list of captured groups. So the first group (the value $1 would hold) gets placed into $_.
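The same list-context capture can be sketched with an ordinary variable instead of <DATA>. The helper name extract_cdata is hypothetical, the sample text is shortened, and the pattern also anchors on the closing ]]> as an extra precaution that is not in the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text inside the first CDATA section
# of $line, or undef if the line contains none.
sub extract_cdata {
    my ($line) = @_;
    # In list context a successful match returns its captures, so the
    # parenthesised group lands directly in $text. /s lets . match newlines.
    my ($text) = $line =~ /<!\[CDATA\[(.*?)\]\]>/s;
    return $text;
}

my $line = '<Prompt><![CDATA[Please describe your experience.  ]]></Prompt>';
my $text = extract_cdata($line);
print defined $text ? "[$text]\n" : "no CDATA section found\n";
```

Checking the result with defined matters here: a failed match leaves $text undefined instead of raising an error.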

Re: Parsing a line from a line with one line of perl code.
by RaduH (Scribe) on Nov 09, 2007 at 01:07 UTC
    The problem is a little underspecified ... I will assume you are looking for the text between the last [ and the first ]. You need to:
    ignore until you find [
    read in buffer until you find [ or ]
    if you found ] the buffer is your result
    if you found [ reset buffer
    {optional - if you reach the end and there's no ], scream "betrayal!"}
    In Perl you do
    my $input = '...your text here...';
    my ($result) = $input =~ /.*\[([^\[\]]*)\]/;
    Match anything between [ and ] that does not itself contain a bracket; otherwise you'd pick up everything between the first [ and the first ].
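The scanning steps listed above can also be sketched as an explicit loop. This is an illustration under assumptions, not code from the reply: the helper name innermost_bracket_text is hypothetical, and it returns undef where the steps say to scream "betrayal!".

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text of the first [...] pair that
# contains no further '[', or undef if no such pair is ever closed.
sub innermost_bracket_text {
    my ($input) = @_;
    my $buffer;                       # stays undef until the first '[' is seen
    for my $char (split //, $input) {
        if ($char eq '[') {
            $buffer = '';             # reset the buffer at every '['
        }
        elsif ($char eq ']') {
            return $buffer if defined $buffer;   # first ']' after a '[' ends the scan
        }
        elsif (defined $buffer) {
            $buffer .= $char;         # collect characters once inside brackets
        }
    }
    return undef;                     # reached the end without a closing ']'
}

my $result = innermost_bracket_text('<Prompt><![CDATA[Please ... question). ]]></Prompt>');
print defined $result ? "[$result]\n" : "betrayal!\n";
```

On input with a single CDATA section this yields the same text as the regex. With several sections the loop stops at the first closed pair, whereas a pattern with a leading greedy .* prefers the last one.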

    Hope this helps!

Re: Parsing a line from a line with one line of perl code.
by Anonymous Monk on Nov 09, 2007 at 01:02 UTC
    I'm not sure just what you're after here, but the following regex satisfies the requirement for the specific example given.

    my $string = '<Prompt><![CDATA[Please ... question). ]]></Prompt>';
    my ($extract) = $string =~ m{ \[CDATA\[ (.*) ]]> }xms;
      • Don't parse (arbitrary) XML with a regex; you're asking for a world of hurt.
      • If you're going to parse relatively fixed XML input (a one-off, or a very static source that's not going to change how it creates its XML output), you most likely want to use non-greedy quantifiers, as has already been recommended elsewhere in this thread. It's extremely easy to write something that "looks" like it'll work but is brittle.
      $ cat foo.pl
      use strict;
      use warnings;

      my $string = qq{<Prompt><![CDATA[Please ... question ). ]]> <![CDATA[[ FAIL. ]]></Prompt>};
      my( $fail ) = $string =~ m{ \[CDATA\[ (.*) ]]> }xms;
      print $fail, "\n";
      my( $better ) = $string =~ m{ \[ CDATA \[ (.*?) \]\] }x;
      print $better, "\n";
      $ perl foo.pl
      Please ... question ). ]]> <![CDATA[[ FAIL.
      Please ... question ).

      Addendum: I mean it's not like this topic gets discussed every other week around here or something . . .