Ovid has asked for the wisdom of the Perl Monks concerning the following question:

While testing some XSLT code I've been working on, I'm trying to produce valid XHTML and use Test::XML to verify the results. Unfortunately, XML::LibXSLT appears to be generating an unclosed meta tag for the HTML content type. Even if I include my own meta tag, it removes mine and substitutes the invalid one.

As a workaround, I've stripped the meta tag from the resulting HTML and used is_xml() in my test, but I'd much rather have my code produce a valid meta tag. A minimal test case is below, with my output in the __DATA__ section. Does anyone get valid XHTML results? For comparison, I'm running this on OS X, and xslt-config reports version 1.1.11.

#!/usr/bin/perl
use strict;
use warnings;

use XML::LibXML;
use XML::LibXSLT;

my $parser    = XML::LibXML->new;
my $doc       = $parser->parse_string(_xml());
my $style_doc = $parser->parse_string(_stylesheet());

my $xslt  = XML::LibXSLT->new;
my $sheet = $xslt->parse_stylesheet($style_doc);
my $html  = $sheet->transform($doc);
print $sheet->output_string($html);

sub _xml {
    return <<'END_XML';
<?xml version="1.0"?>
<resources>
  <description>Available instances</description>
  <resource id="foo"/>
  <resource id="bar"/>
  <resource id="bar"/>
</resources>
END_XML
}

sub _stylesheet {
    return <<'END_STYLE_SHEET';
<xsl:stylesheet version="1.0" xmlns:xsl="" xmlns:fo="">
  <xsl:template match="/">
    <html>
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
        <title><xsl:value-of select="/resources/description" /></title>
      </head>
      <body>
        <table>
          <xsl:for-each select="resources">
            <xsl:apply-templates select="./resource" />
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="resource">
    <tr>
      <td>
        <xsl:value-of select="@id" />
      </td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
END_STYLE_SHEET
}

__DATA__
<html xmlns:fo="">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Available instances</title>
</head>
<body><table>
<tr><td>foo</td></tr>
<tr><td>bar</td></tr>
<tr><td>bar</td></tr>
</table></body>
</html>


New address of my CGI Course.

Replies are listed 'Best First'.
Re: XML::LibXSLT generates invalid XHTML
by derby (Abbot) on Jul 19, 2005 at 00:22 UTC
    You're defaulting to html output. Set the output to XHTML by using the xml output method with an XHTML doctype:
    <xsl:stylesheet version="1.0"
        xmlns:xsl=""
        xmlns="">
      <xsl:output method="xml"
          doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
          doctype-system=""
          indent="yes"/>

Re: XML::LibXSLT generates invalid XHTML
by arturo (Vicar) on Jul 19, 2005 at 00:35 UTC
    Well, the same happens using the default jEdit XSLT Plugin, which (surprise surprise) doesn't use XML::LibXSLT, so it's likely not the processor. I decided to take a gander at the XSLT spec, and I found a nasty little surprise:

    The default for the method attribute is chosen as follows. If

    • the root node of the result tree has an element child,

    • the expanded-name of the first element child of the root node (i.e. the document element) of the result tree has local part html (in any combination of upper and lower case) and a null namespace URI, and

    • any text nodes preceding the first element child of the root node of the result tree contain only whitespace characters,

    then the default output method is html; otherwise, the default output method is xml.
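    The quoted rule can be sketched as a small decision function. This is only an illustration of the three conditions, not how any XSLT processor actually implements them: default_output_method and its named arguments (local_name, namespace_uri, leading_text) are hypothetical, and the XHTML namespace URI in the second call is an assumption about what the stylesheet would declare.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 'html' or 'xml' according to the three
# conditions quoted above. It takes plain values, not a real result tree.
sub default_output_method {
    my (%arg) = @_;
    my $local = $arg{local_name};           # local part of the document element, undef if there is none
    my $ns    = $arg{namespace_uri};        # undef or '' stands for a null namespace URI
    my $text  = $arg{leading_text} // '';   # text preceding the first element child

    return 'xml' unless defined $local;               # root has no element child
    return 'xml' unless lc($local) eq 'html';         # local part must be "html" in any case mix
    return 'xml' if defined $ns && length $ns;        # namespace URI must be null
    return 'xml' if $text =~ /[^\x20\x09\x0D\x0A]/;   # only whitespace may precede the element
    return 'html';
}

# <html> in no namespace: html is the default method
print default_output_method(local_name => 'html'), "\n";    # prints "html"

# <html> in the (assumed) XHTML namespace: xml is the default method
print default_output_method(
    local_name    => 'html',
    namespace_uri => 'http://www.w3.org/1999/xhtml',
), "\n";                                                     # prints "xml"
```

    Read this way, the stylesheet in the question lands in the first case: its html element is in no namespace, so the html method is chosen and the meta tag is written without a closing slash.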

    I guess they wanted to have a 'sensible' default and this made sense to a sufficient number of people voting on the spec. I don't agree: it seems like an unnecessary bit of processing, and it can be surprising.

    Anyhow, since you're wantin' to output XHTML, you should set the default namespace on your stylesheet element to the XHTML namespace anyway. What would also "work", from the point of view of closing the meta tag, would be to explicitly declare an xsl:output element with a method attribute of xml. I suggest you do both, because you have to in order to generate valid XHTML.

    Update: Actually, what the spec implies is that you only have to add the default namespace in order to get XHTML output. I like adding the xsl:output element anyway, because I usually like to output doctype declarations and set the output encoding explicitly.
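    Putting both suggestions together, a trimmed version of the original test case might look like the sketch below. The namespace URIs are assumptions filled in for illustration (the standard XSLT and XHTML namespaces), the doctype attributes are left out to keep it short, and the exact serialization may vary with the installed libxslt version.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::LibXML;
use XML::LibXSLT;

# Assumed namespaces: the XSLT namespace on the xsl prefix, and the XHTML
# namespace as the default for the literal result elements.
my $stylesheet = <<'END_STYLE_SHEET';
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/">
    <html>
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
        <title><xsl:value-of select="/resources/description"/></title>
      </head>
      <body>
        <table>
          <xsl:apply-templates select="resources/resource"/>
        </table>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="resource">
    <tr><td><xsl:value-of select="@id"/></td></tr>
  </xsl:template>
</xsl:stylesheet>
END_STYLE_SHEET

my $xml = <<'END_XML';
<?xml version="1.0"?>
<resources>
  <description>Available instances</description>
  <resource id="foo"/>
  <resource id="bar"/>
</resources>
END_XML

my $parser = XML::LibXML->new;
my $xslt   = XML::LibXSLT->new;
my $sheet  = $xslt->parse_stylesheet( $parser->parse_string($stylesheet) );
my $result = $sheet->transform( $parser->parse_string($xml) );

# With method="xml", empty elements such as meta are written self-closed.
my $out = $sheet->output_string($result);
print $out;
```

    Because the output is now well-formed XML, it should be possible to pass it to is_xml() without stripping the meta tag first.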

    If not P, what? Q maybe?
    "Sidney Morgenbesser"