japh2963 has asked for the wisdom of the Perl Monks concerning the following question:

Is there a module/script for converting xhtml to xml?

I've been vaguely asked to do something along those lines, although every time it comes up I honestly think they are merely referring to the RSS feed and somehow think it is the same thing as the actual web page.

Re: converting xhtml to xml (xhtml2xml)
by Corion (Patriarch) on Mar 12, 2016 at 15:34 UTC

    HTML::TreeBuilder and XML::LibXML can parse HTML and turn it into a tree structure that you can then output as (HTML-tagged) XML.
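
    A minimal sketch of the HTML::TreeBuilder route, assuming the page is already in a string. The sample markup in $html is made up for illustration; as_XML is the HTML::Element method that serializes the parsed tree as XML.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample input; in practice this would be the fetched page.
my $html = '<html><head><title>Demo</title></head>'
         . '<body><p>Hello<br>world</p></body></html>';

# Parse the (possibly sloppy) HTML into a tree of HTML::Element nodes.
my $tree = HTML::TreeBuilder->new_from_content($html);

# Serialize the tree as XML: tags stay HTML-named, but elements are
# properly closed and text is escaped.
my $xml = $tree->as_XML;

# Free the tree explicitly; older HTML::TreeBuilder versions need this.
$tree->delete;

print $xml;
```

    The result is still HTML vocabulary, just well-formed; mapping it onto some other XML schema would be a separate transformation step.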

    Maybe you want to go back to your askers and ask them what XML schema the resulting document should conform to.

      Sadly, that would just confuse them, but thank you very much for the modules :)
Re: converting xhtml to xml (xhtml2xml)
by Anonymous Monk on Mar 12, 2016 at 16:48 UTC

    XHTML is XML, so.... rename -v -n 's/\.html$/.xml/i' * (remove the -n for this to actually do something).
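
    Renaming only works if the files really are well-formed XML. A rough way to check, sketched with XML::LibXML; the helper name is_well_formed_xml and the sample strings are invented for this example, and the parser options simply keep it from fetching external DTDs.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns 1 if $text parses as well-formed XML, else 0.
sub is_well_formed_xml {
    my ($text) = @_;
    my $doc = eval {
        XML::LibXML->load_xml(
            string       => $text,
            load_ext_dtd => 0,   # do not fetch the XHTML DTD
            no_network   => 1,   # never go to the network while parsing
        );
    };
    return defined $doc ? 1 : 0;
}

# Made-up samples: the first is proper XHTML, the second is plain HTML
# with an unclosed <br>, which is not well-formed XML.
my $good = '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head>'
         . '<body><p>Hi<br/></p></body></html>';
my $bad  = '<html><body><p>Hi<br></p></body></html>';

print "good: ", is_well_formed_xml($good), "\n";
print "bad:  ", is_well_formed_xml($bad),  "\n";
```

    Files that fail this check are HTML rather than XHTML, and would need a parser such as HTML::TreeBuilder before they can be written out as XML.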

    But, seriously, as you said, RSS is XML too. So you're going to have to interrogate your clients better. Ask them what they want the output to look like or what spec it should conform to, ask them to show you an example, etc. - whatever questions it takes for them to give you enough information to do your task.