catfish1116 has asked for the wisdom of the Perl Monks concerning the following question:

I got my script to parse out an XML message. However, I did receive a few error messages. Below are the code and the errors:

    my $line = <>;
    print "This is what line looks like $line \n";
    #while ($line = !<>) {
    chomp($line);
    print "This is what line looks like $line \n";
    my @items = (split /></, $line);
    printf "\n\n Item 1 $items[1], \n Item 2 $items[2], \n Item 3 $items[3]\n";
    # }

    Missing argument in printf at ./XML_parse line 19, <> line 1.
    Invalid conversion in printf: "%20R" at ./XML_parse line 19, <> line 1.

Also, how could I parse a file that has more than 2 messages? I.e., what would be the message delimiter? I have been 'asked' not to download the XML module on my workstation, so I am trying to be creative. :)

TIA
The Catfish

Replies are listed 'Best First'.
Re: XML parsing
by choroba (Cardinal) on Dec 11, 2019 at 16:42 UTC
    These are not error messages, but warnings. Are you sure you wanted printf and not just print? Interpolating user-supplied data into a printf template is a security risk; don't do that (see e.g. %n).

    Parsing XML with regular expressions is crazy. If the format of the input is not always the same, using an XML-aware module is the only possible path.

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: XML parsing
by Fletch (Bishop) on Dec 11, 2019 at 17:12 UTC

    Unless you can guarantee a very limited scope of XML input (e.g. something only ever produced by a single source that will always produce the exact same formatting / layout / character encoding / ...) you're just asking for trouble. Even with such constraints your code is going to be "brittle" because a valid XML document that varies from those constraints will break your parsing. And inevitably you'll get said valid-but-unexpected document at just the wrong time.

    If you're going to need to handle arbitrary XML you want to use a real parser.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

Re: XML parsing
by haukex (Archbishop) on Dec 11, 2019 at 17:31 UTC
Re: XML parsing
by stevieb (Canon) on Dec 11, 2019 at 17:24 UTC
    "I have been 'asked' not to download the XML module on my workstation"

    I'd be asking about the reasoning behind this nonsense.

    Regarding your printf warnings, it appears that the code you posted is not the code generating them, since one of the warnings names an actual format while your code contains none. printf requires a format template string, followed by a list of items to inject into it. Here's an example:

    printf "Hello %s, your number is %d\n", $name, $num;

    See the sprintf documentation for the list of valid formats.

    In your case, you don't need printf, as your variables interpolate just fine with print. Besides, as choroba says, sending user data into printf without any validation checks can be problematic.
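A minimal sketch of the two safe alternatives, assuming the fields look roughly like the hypothetical values below (the real input was never posted). Either use plain print, or keep the printf/sprintf template fixed and pass the data as arguments. The helper name format_items is made up for illustration.

```perl
use strict;
use warnings;

# Build the report with a fixed template. The data only ever appears as
# arguments, so any % inside it is copied through untouched.
sub format_items {
    my @items = @_;
    return sprintf "Item 1 %s,\nItem 2 %s,\nItem 3 %s\n", @items[1 .. 3];
}

# Hypothetical fields, standing in for the result of split /></, $line.
my @items = ('<msg', 'id>42</id', 'note>50% done</note', 'url>a%20Rb</url>');

# Option 1: plain print, which treats % as an ordinary character.
print "Item 1 $items[1],\nItem 2 $items[2],\nItem 3 $items[3]\n";

# Option 2: fixed template, data passed as arguments.
print format_items(@items);
```

Both options print the same text, and neither should warn, whatever the fields contain.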

      The printf statement in the OP has one argument which includes interpolated strings from some arbitrary XML. It's quite possible that those contain unescaped % symbols. With certain input I'm sure such warnings could be generated by catfish1116's code.
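To illustrate the point, here is a small sketch that interpolates fields containing % into a printf template and collects the resulting warnings. The field values are invented for the demonstration (a URL-encoded space followed by "R", and a literal "50% done"); whether the OP's data looked anything like this is pure assumption.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical field values; the real XML was never posted.
my @items = (
    '',
    'url>http://example.com/a%20Report</url',   # contains "%20R"
    'note>50% done</note',                      # contains "% d"
);

# Write to an in-memory filehandle so only the warnings matter here.
open my $out, '>', \my $buffer or die "Cannot open in-memory handle: $!";

# Same pattern as the original post: data interpolated into the template.
printf {$out} "Item 1 $items[1],\nItem 2 $items[2]\n";
close $out;

print "Warning: $_" for @warnings;
```

Here "%20R" is read as a conversion with width 20 and the unknown type R, while "% d" is a valid integer conversion that has no argument to consume, so the two fields trigger the two kinds of warning quoted in the original post.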

      However since catfish1116 still hasn't deigned to provide us with the input data, despite numerous requests in the other thread, it's all just guesswork.

Re: XML parsing
by Jenda (Abbot) on Dec 12, 2019 at 13:00 UTC

    You were asked not to download. Mkay. But did you check whether you have any already?
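One way to check, sketched below: try to require each candidate module inside an eval and report which ones load. The list of module names is only a set of commonly used XML modules chosen for illustration, and is_installed is a hypothetical helper name.

```perl
use strict;
use warnings;

# Example candidates only; extend or trim the list as needed.
my @candidates = qw(XML::LibXML XML::Twig XML::Parser XML::Simple XML::Rules);

# Return 1 if the module can be loaded from @INC, 0 otherwise.
sub is_installed {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # XML::LibXML -> XML/LibXML.pm
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (@candidates) {
    printf "%-12s %s\n", $module,
        is_installed($module) ? 'available' : 'not found';
}
```

For a single module, the one-liner perl -MXML::LibXML -e 1 does the same job: it exits silently if the module is present and dies with a "Can't locate" message if it is not.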

    Jenda
    1984 was supposed to be a warning,
    not a manual!