
Well, this is very interesting. Please update the OP or the thread with your final solution once you have one.


Re^4: Caching Entities with XML::LibXML
by ikegami (Patriarch) on Feb 25, 2010 at 02:04 UTC
    • XML::LibXML doesn't cache the parsed DTDs it finds via catalogs or any other means (as far as I can tell). Nothing can be done about this.
    • A catalog tells the parser where to find any number of DTDs.
    • DTDs named in catalogs can be stored locally.
    • The location of a DTD can be relative to the catalog, or an absolute URL.
    • XML::LibXML parses the catalog once per process.
    • XML::LibXML loads the DTDs it finds in catalogs on demand.
    • Together, the last two points mean a catalog can list DTDs that are rarely used, or never used at all.
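
For anyone who wants to try the catalog approach by hand, here is a minimal sketch of the points above. It is not the code from any module. The temp directory, the file names, the `-//EXAMPLE//DTD Demo 1.0//EN` public identifier, the `example.invalid` URL and the DTD content are all made up for illustration. It assumes an XML::LibXML recent enough to accept parser options in the constructor, and it relies on `load_catalog`, which is documented in XML::LibXML::Parser.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use XML::LibXML;

# Everything below (directory, file names, identifiers, DTD content) is
# hypothetical and exists only to make the sketch self-contained.
my $dir = tempdir(CLEANUP => 1);

sub write_file {
    my ($name, $content) = @_;
    my $path = File::Spec->catfile($dir, $name);
    open(my $fh, '>', $path) or die "Can't write $path: $!";
    print {$fh} $content;
    close($fh) or die "Can't close $path: $!";
    return $path;
}

# A tiny local DTD that declares one entity.
write_file('demo.dtd', <<'DTD');
<!ELEMENT doc (#PCDATA)>
<!ENTITY greeting "hello">
DTD

# The catalog maps public and system identifiers to the local copy.
# The uri is relative, so it is resolved against the catalog's location.
my $catalog = write_file('catalog.xml', <<'CATALOG');
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//EXAMPLE//DTD Demo 1.0//EN" uri="demo.dtd"/>
  <system systemId="http://example.invalid/demo.dtd" uri="demo.dtd"/>
</catalog>
CATALOG

my $parser = XML::LibXML->new(
    load_ext_dtd    => 1,  # read the external DTD subset
    expand_entities => 1,  # replace &greeting; with its value
    no_network      => 1,  # never fall back to fetching over the network
);

# Register the catalog; the DTD itself is only read when a document asks for it.
$parser->load_catalog($catalog);

my $doc = $parser->parse_string(<<'XML');
<!DOCTYPE doc PUBLIC "-//EXAMPLE//DTD Demo 1.0//EN" "http://example.invalid/demo.dtd">
<doc>&greeting;</doc>
XML

print $doc->documentElement->textContent, "\n";
```

With `no_network` set, the parse should only succeed if the catalog lookup finds the local file, which makes it a handy check that the catalog is actually being consulted.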

    I'm going to create XML::Catalogs (common code) and XML::Catalogs::HTML (installs and loads catalog of HTML DTDs). All you'll need to do to prevent the download of HTML DTDs will be:

    use XML::Catalogs::HTML -libxml;
Re^4: Caching Entities with XML::LibXML
by ikegami (Patriarch) on Feb 28, 2010 at 07:02 UTC

    For users unable to alter their system configuration,
    for users unaware of the need to alter their system configuration,
    for the simplicity of installing a Perl package,
    for integration with Perl's dependency system,

    XML::Catalogs and XML::Catalogs::HTML are now on CPAN.

    (I got POD errors, I misspelled "dependency", and I could improve the description of the module's purpose. Let me know if you have comments or if you want more features.)