isync has asked for the wisdom of the Perl Monks concerning the following question:

Hi!

Is there anything special when outputting XML from perl?

I am trying to submit an OpenSearch description document to A9 at http://a9.com/-/opensearch/ and it says it could not find a "valid XML header in file", although I successfully validated it at feedvalidator.org.

My XML is utf-8 like this:
<?xml version="1.0" encoding="UTF-8"?> <OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/"> ... </OpenSearchDescription>
and I remember that printing text this way:
print "Content-Type: application/opensearchdescription+xml\n\n"; print $xmlstuff;
does not really output binary data ("XML is binary data that happens to look like text"). My only remaining idea is that the printed content is not properly encoded binary data, since I just printed it and treated it like text before. So how can I add the first magic XML byte to my output? Something like print binmode($xmlstuff); or something with pack??
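
For reference, a minimal sketch of emitting the document as explicit UTF-8 bytes from a CGI script, assuming $xmlstuff holds a Perl character string. The helper name build_response and the sample ShortName value are hypothetical and only there for illustration. Note that binmode is applied to the filehandle (STDOUT), not to the string. Whether this alone satisfies A9 is not established anywhere in this thread.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: build the complete CGI response as a byte string.
sub build_response {
    my ($xml) = @_;                      # $xml is a Perl character string
    my $body = encode('UTF-8', $xml);    # characters -> UTF-8 bytes
    my $head = "Content-Type: application/opensearchdescription+xml\r\n"
             . "Content-Length: " . length($body) . "\r\n\r\n";
    return $head . $body;
}

# Sample document; the ShortName value is made up for this sketch.
my $xmlstuff = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
             . qq{<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">\n}
             . qq{  <ShortName>Caf\x{e9}</ShortName>\n}
             . qq{</OpenSearchDescription>\n};

binmode STDOUT;    # raw bytes: no encoding layer, no newline translation
print build_response($xmlstuff);
```

Encoding before computing the length matters because Content-Length counts bytes, not characters.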

Replies are listed 'Best First'.
Re: Output of XML: "valid XML header needed"-error (OpenSearch)?
by clinton (Priest) on Aug 24, 2007 at 17:53 UTC
    There is no magic XML byte, and XML is text. Here are the things I would do:

    • Check the headers that are actually being sent by your server. You could use wget -S http://yoururl or a Firefox plugin like LiveHeaders

    • Check that there is no whitespace before your XML declaration

    • Check that your XML is valid with XML::LibXML::Schema - the online validator doesn't check schemas

    • Check that your text is UTF-8: print utf8::valid($text); print utf8::is_utf8($text)

    • Try with plain ASCII data, perhaps with a static file, just to see if that works
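
Two of the checks above (nothing before the XML declaration, and valid UTF-8) can be sketched on the raw response bytes as follows. The helper names leading_problem and is_valid_utf8 are hypothetical. The sketch assumes you have the body as an undecoded byte string, for example from a saved wget download.

```perl
use strict;
use warnings;

# Hypothetical helper: describe anything that precedes "<?xml" in raw bytes.
sub leading_problem {
    my ($bytes) = @_;
    return 'UTF-8 byte order mark' if $bytes =~ /\A\xEF\xBB\xBF/;
    return 'leading whitespace'    if $bytes =~ /\A\s+/;
    return 'no XML declaration'    unless $bytes =~ /\A<\?xml\s/;
    return '';    # empty string: nothing in front of the declaration
}

# Hypothetical helper: true if the bytes are well-formed UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;                 # utf8::decode modifies its argument
    return utf8::decode($copy) ? 1 : 0;
}

my $sample = qq{\n<?xml version="1.0" encoding="UTF-8"?><a/>};
print "problem: ", leading_problem($sample) || 'none', "\n";
print "valid UTF-8: ", is_valid_utf8($sample), "\n";
```

A stray newline from an earlier print, or a byte order mark, would be invisible in most editors, which is why the check works on bytes rather than on what the text looks like.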

    If all of the above fails, try posting more of the code, plus a link to the URL that you are sending to OpenSearch, so we can look at it

    Clint

Re: Output of XML: "valid XML header needed"-error (OpenSearch)?
by moritz (Cardinal) on Aug 24, 2007 at 16:25 UTC
    Are you sure that the client doesn't expect a Content-Type: text/xml; charset=utf-8 header?

    And do you really print UTF-8? Did you do an Encode::decode("utf8", $string); before you wrote it?

    And which client produces the error message?

    Have you tried to run your script locally and validated the output against a schema?

      Yes, I am sure. (Although I tried all kinds of headers - no result.)

      I tried Encode::decode and had negative result with and without it.

      It's Amazon's client. YouTube has exactly the same OpenSearch file layout (except for the values). I even tried copying and pasting their content and sent it to Amazon A9. Result: Error: no valid XML header *in* file.

      Theirs works: http://a9.com/-/opensearch/?url=http%3A%2F%2Fwww.youtube.com%2Fopensearch
      What is the magic to properly identify an XML file...?
        Try to write your XML to a file and configure Apache to send the application/opensearchdescription+xml Content-Type (that's what youtube uses).

        Try to validate your XML, at least for well-formedness (for example with xmlstarlet), and use wget --server-response $url to check that your webserver delivers the right headers.

        Could you tell us your URL, so that I can take a look as well?

Re: Output of XML: "valid XML header needed"-error (OpenSearch)?
by isync (Hermit) on Aug 24, 2007 at 18:58 UTC
    The tip to just save it to the filesystem and let Apache handle it was great! It worked. Although I don't see why...:
    Headers of the working response:
    Status: 200 OK
    Connection: close
    Date: Fri, 24 Aug 2007 18:29:38 GMT
    Accept-Ranges: bytes
    ETag: "13dae1c-2b2-2c196200"
    Server: Apache/2.0.54 (Unix) PHP/4.4.7 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2 SVN/1.4.2
    Content-Length: 690
    Content-Type: application/xml
    Last-Modified: Fri, 24 Aug 2007 18:28:24 GMT
    Client-Date: Fri, 24 Aug 2007 18:29:43 GMT
    Client-Response-Num: 1

    Headers of the non-working response:
    Status: 200 OK
    Connection: close
    Date: Fri, 24 Aug 2007 18:41:50 GMT
    Accept-Ranges: none
    Server: Apache/2.0.54 (Unix) PHP/4.4.7 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2 SVN/1.4.2
    Content-Length: 690
    Content-Type: application/xml
    Last-Modified: Fri, 24 Aug 2007 18:28:24 GMT
    Client-Date: Fri, 24 Aug 2007 18:41:57 GMT
    Client-Response-Num: 1

    Is ETag that important? (come on..)
      ETag is irrelevant, it has to do with your data
Re: Output of XML: "valid XML header needed"-error (OpenSearch)?
by Anonymous Monk on Aug 24, 2007 at 19:36 UTC
    Filed under Mystery. For now I use the "real file" hack... Thanks everyone!
Re: Output of XML: "valid XML header needed"-error (OpenSearch)?
by isync (Hermit) on Aug 30, 2007 at 09:05 UTC
    No http-response experts among the Monks??
        Arg! I posted two different tests... They were exactly 4773, both!
Re: Output of XML: "valid XML header needed"-error (OpenSearch)?
by Anonymous Monk on Aug 29, 2007 at 18:36 UTC
    Hey Monks!

    I need to revisit this topic: my script produces XML data which produces an error on A9. When I save the output and let Apache serve it, it works.
    Today I tried again to get it to work and did a whole slew of tests:

    1. I downloaded it with LWP::UserAgent and compared the response objects: besides some minor stuff in the header (which shouldn't matter) - identical!

    2. I did a wget on the resources and compared with the diff command - no diffs, which means - identical!

    3. I had a look at the headers again and also compared mine with a completely different set of headers that also works fine. The headers are below.

    Connection: close
    Date: Wed, 29 Aug 2007 17:27:13 GMT
    Accept-Ranges: bytes
    ETag: "367930c-12a5-ca8adf00"
    Server: Apache/2.0.54 (Unix) PHP/4.4.7 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2 SVN/1.4.2
    Content-Length: 4773
    Content-Type: application/xml
    Last-Modified: Wed, 29 Aug 2007 17:19:24 GMT
    Client-Date: Wed, 29 Aug 2007 17:27:14 GMT
    Client-Response-Num: 1
    - working! (data produced by my script and saved to filesystem. then served by apache)

    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Connection: close
    Date: Wed, 29 Aug 2007 17:35:19 GMT
    Pragma: no-cache
    Server: Apache/2.0.59 (CentOS)
    Content-Type: application/xml; charset=UTF-8
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Client-Date: Wed, 29 Aug 2007 17:35:19 GMT
    Client-Response-Num: 1
    Set-Cookie: PHPSESSID=flmm8b47j1d44dkmsiqjm71l74; path=/
    X-Powered-By: PHP/5.1.6
    - working! (another site, not my data, but validated ok)

    Connection: close
    Date: Wed, 29 Aug 2007 18:04:31 GMT
    Accept-Ranges: none
    Server: Apache/2.0.54 (Unix) PHP/4.4.7 mod_ssl/2.0.54 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2 SVN/1.4.2
    Content-Length: 4753
    Content-Type: application/xml
    Client-Date: Wed, 29 Aug 2007 18:04:33 GMT
    Client-Response-Num: 1
    Set-Cookie: LX=ID=nth2ctfs6fpg&T=1188410672&L=en; expires=Sun, 17-Jan-2038 10:00:00 GMT; path=/; domain=.lumerias.com
    - error (my data directly produced by the script but doesn't pass the test)

    Where is the invisible difference in my data??
    As you can see both responses have the same length, content is completely identical (according to diff) and even the important headers (here: application/xml etc.) are the same...
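
Since diff reports identical bodies, one way to narrow this down is to list mechanically which headers differ between the file-served and script-served responses. The sketch below uses a subset of the header values posted above (Date, Server, Content-Length and the Client-* lines are left out). The helper names parse_headers and diff_headers are hypothetical. It only shows what differs. It does not establish which header, if any, A9 actually objects to.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "Name: value" lines into a hash (names lowercased).
sub parse_headers {
    my ($text) = @_;
    my %h;
    for my $line (split /\r?\n/, $text) {
        next unless $line =~ /^\s*([^:\s]+):\s*(.*?)\s*$/;
        $h{ lc $1 } = $2;
    }
    return \%h;
}

# Hypothetical helper: list headers that are missing on one side or differ.
sub diff_headers {
    my ($x, $y) = @_;
    my %seen = map { $_ => 1 } keys %$x, keys %$y;
    my @out;
    for my $name (sort keys %seen) {
        my $l = exists $x->{$name} ? $x->{$name} : '(absent)';
        my $r = exists $y->{$name} ? $y->{$name} : '(absent)';
        push @out, "$name: $l | $r" if $l ne $r;
    }
    return @out;
}

# Served by Apache from the saved file (working).
my $working = parse_headers(<<'END');
Accept-Ranges: bytes
ETag: "367930c-12a5-ca8adf00"
Content-Type: application/xml
Last-Modified: Wed, 29 Aug 2007 17:19:24 GMT
END

# Produced directly by the script (error).
my $failing = parse_headers(<<'END');
Accept-Ranges: none
Content-Type: application/xml
Set-Cookie: LX=ID=nth2ctfs6fpg&T=1188410672&L=en; expires=Sun, 17-Jan-2038 10:00:00 GMT; path=/; domain=.lumerias.com
END

print "$_\n" for diff_headers($working, $failing);
```

Each differing header can then be added to or removed from the script's output one at a time and the submission retried.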