User Questions
XML::DOM
3 direct replies
by mirod
on Sep 05, 2000 at 12:10

    Description

    XML::DOM is a Perl implementation of W3C's DOM Level 1.

    It is one of the most widely used Perl XML modules. It is included in libxml-enno.

    XML::DOM adds some Perl-specific features to the W3C recommendation.

    Why use XML::DOM?

    • you want to follow the W3C recommendation
    • you want to use one of the most stable XML modules in Perl
    • you already know the DOM, or you want to be able to use the same API in Java and in Perl
    • in the future you want to interface easily with XML databases
    • you are seriously masochistic!

    Why NOT use XML::DOM?

    • you have to process huge documents
    • you need speed
    • the DOM API is ugly!

    Related Modules

    • XML::XQL::DOM adds XQL support to XML::DOM.
    • XML::DOM::ValParser uses XML::Checker to validate documents at parse time.
    • XML::EasyOBJ is a module built on top of XML::DOM with a simpler and more Perlish interface. This kind of module is an excellent idea: it gives you the ease of programming of the Perl way while preserving DOM compatibility.
    • XML::DOM::Twig: a little module I wrote (it's not yet on CPAN) to emulate some of XML::Twig's functions with XML::DOM, making it easier (and safer) to use.

    Personal notes

    I don't like the DOM API at all! I think the model is clean, but the interface is clunky and too verbose. It is extremely Java-oriented, from the names of the methods to the types of objects they return.

    That said, XML::DOM is a robust module, widely used and well designed. Plus, the DOM is generally well documented and some nice tutorials are available.
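
    To give a feel for the verbosity mentioned above, here is a minimal sketch that lists users and their first group with XML::DOM. The sample document and the variable names are made up for illustration, and the script assumes XML::DOM is installed.

```perl
use strict;
use warnings;
use XML::DOM;

# Made-up sample document, used only for this illustration.
my $xml = '<config><user id="user1"><group>root</group></user></config>';

my $parser = XML::DOM::Parser->new;
my $doc    = $parser->parse($xml);

# DOM style: ask for a NodeList, then walk it by index.
my $users = $doc->getElementsByTagName('user');
my @found;
for my $i (0 .. $users->getLength - 1) {
    my $user  = $users->item($i);
    my $group = $user->getElementsByTagName('group')->item(0);
    # The text of an element lives in a child text node.
    my $text  = $group->getFirstChild->getNodeValue;
    push @found, $user->getAttribute('id') . ": $text";
}
print "$_\n" for @found;

# XML::DOM trees hold circular references; dispose() releases them.
$doc->dispose;
```

    Note how reaching the text of a single element takes three method calls; this is the kind of thing the simpler wrappers listed under Related Modules try to hide.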

    Warning: there are compatibility problems between various versions of XML::DOM and XML::Parser. The valid combinations are:

    • XML::Parser 2.30 + XML::DOM 1.33 and above
    • XML::Parser 2.30 + XML::DOM 1.29 (included in libxml-enno)
    • XML::Parser 2.27 (included in ActiveState Perl on Windows) + XML::DOM 1.25 (the stand-alone version, not included in libxml-enno)

DBIx::XML_RDB
2 direct replies
by mirod
on Sep 01, 2000 at 12:20

    Description

    DBIx::XML_RDB - Perl extension for creating XML from existing DBI datasources. DBIx::XML_RDB comes with the sql2xml tool which simply dumps a table in a database to an XML file.

    Why use DBIx::XML_RDB?

    • you want to export database data as XML
    • you prefer to process the data using XML tools rather than as a table

    Why NOT use DBIx::XML_RDB?

    • you are not using a DBI-supported database!
    • you want to generate something more complex than just record-oriented XML, with nested structures for sub-tables for example

    Example

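    use DBIx::XML_RDB;
    # $datasource, $userid, $password and $dbname are placeholders for your own settings
    my $xmlout = DBIx::XML_RDB->new($datasource,
                  "ODBC", $userid, $password, $dbname) || die "Failed to make new xmlout";
    # run the query, then print the whole result set as XML
    $xmlout->DoSql("select * from MyTable");
    print $xmlout->GetData;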
    

    Personal Notes

    I haven't used DBIx::XML_RDB too often (the first script I wrote with DBI did exactly the same thing!) but it looks like the right tool to generate XML out of a relational table.

XML::Twig
No replies
by mirod
on Sep 01, 2000 at 11:17

    Full disclosure: I am the author of the module!

    Description

    XML::Twig is a module designed for efficient processing of XML.

    XML::Twig offers tree as well as stream based processing. It allows loading only parts of the document in order to keep memory requirements to a minimum.

    XML::Twig is very Perlish: fast, efficient and it offers many different ways to perform a task.
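
    As an illustration of processing a document one part at a time, here is a minimal sketch with a handler on each user element. The sample document is made up, and the code assumes a reasonably recent XML::Twig that accepts the twig_handlers spelling of the option.

```perl
use strict;
use warnings;
use XML::Twig;

# Made-up sample document, used only for this illustration.
my $xml = <<'END_XML';
<config>
  <user id="user1"><group>root</group><group>webadmin</group></user>
  <user id="user2"><group>staff</group></user>
</config>
END_XML

my %groups_for;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a complete <user> element has been parsed.
        user => sub {
            my ($t, $user) = @_;
            $groups_for{ $user->att('id') } =
                [ map { $_->text } $user->children('group') ];
            # Free the part of the tree that has been parsed so far.
            $t->purge;
        },
    },
);
$twig->parse($xml);

for my $id (sort keys %groups_for) {
    print "$id: @{ $groups_for{$id} }\n";
}
```

    Because each user element is purged once it has been handled, only one of them needs to be held in memory at a time, which is what makes this style suitable for huge documents.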

    Why use XML::Twig?

    • you need to do complex processing of huge documents, fast
    • XML::Simple is not enough for you but you don't like XSLT and DOM
    • you like the interface

    Why NOT use XML::Twig?

    • you can live with the constraints of the DOM API and you want to be able to access XML databases in the future
    • XML::Simple works for you
    • you don't like the interface

    Additional information

    You can get more information in the documentation or in the tutorial; a Quick Reference is also available. Kip Hampton also wrote about it in Using XML::Twig on xml.com.

    A list of nodes that include examples of using XML::Twig:

    Personal Notes

    I use XML::Twig a lot ;--)

    It might have some problems with mod_perl; I have not tested it in that environment.

    Suggestions, bug reports and comments welcome!

XML::PYX
1 direct reply
by mirod
on Sep 01, 2000 at 10:07

    Description

    XML::PYX, based on XML::Parser, is the Perl implementation of the Pyx notation. It comes with 3 tools: pyx (non-validating) and pyxv (validating) output the Pyx version of the document, and pyxw writes an XML version of a Pyx flow. See XML.com - Pyxie for a description of Pyxie.

    Why use PYX?

    • you don't want to know too much about XML
    • you are used to, and like, line-oriented tools
    • you are just extracting some data from the XML document
    • you are doing simple XML transformation

    Why NOT use XML::PYX?

    • you want to do complex transformations
    • you are more at ease with tree-processing
    • you don't like writing all those regexps with \(
    • you need some information from the XML documents that Pyx does not provide (comments, entity declarations...)

    Related Module

    XML::TiePYX is easier to use on a Windows system.

    Personal notes

    Pyx is really cool for extracting information from an XML file, or for performing simple transformations on simple XML files. The module is mature (it is quite simple so there shouldn't be too many bugs in it). I never actually use the module, only pyx, which I pipe to a perl -n or perl -p script.

    Examples

    Print all the elements used in an XML document, with the number of occurrences.

    pyx file.xml | perl -n -e '$nb{$1}++ if m/\A\((.*)\n/;
                                END { print "$_ used $nb{$_} time(s)\n" for sort keys %nb }'
      

    Warn in case of duplicate ID:

    pyx file.xml | perl -n -e '($id)=( m/^Aid (.*)\n/) or next; print "duplicate id: $id\n" if($id{$id});  $id{$id}=1;'
      

    Change a tag name (class to color):

    pyx wine.xml | perl -p -e 's/^([()])class/$1color/' | pyxw
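
    For readers who have never seen Pyx output, here is a stand-alone sketch of the first two one-liners above. The @pyx lines are a hand-written sample of what pyx might print for a small wine file, not real output, and the variable names are made up.

```perl
use strict;
use warnings;

# Hand-written sample in Pyx notation:
#   (name  start tag      )name  end tag
#   Aname value  attribute    -text  character data
my @pyx = (
    "(wines\n",
    "(wine\n", "Aid w1\n", "Aclass red\n",   "-Merlot\n",   ")wine\n",
    "(wine\n", "Aid w1\n", "Aclass white\n", "-Riesling\n", ")wine\n",
    ")wines\n",
);

my (%count, %seen, @duplicates);
for (@pyx) {
    # A line starting with "(" opens an element: count it by name.
    $count{$1}++ if m/\A\((.*)\n/;
    # A line starting with "Aid " carries an id attribute.
    if (my ($id) = m/\AAid (.*)\n/) {
        push @duplicates, $id if $seen{$id}++;
    }
}

print "$_ used $count{$_} time(s)\n" for sort keys %count;
print "duplicate id: $_\n" for @duplicates;
```

    Since every piece of markup sits on its own line with a one-character prefix, a single anchored regexp per line is all the parsing that is needed.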
      
XML::Simple
3 direct replies
by mirod
on Sep 01, 2000 at 09:35

    Description

    XML::Simple - Trivial API for reading and writing XML (esp config files)

    XML::Simple loads an XML file into memory as a convenient Perl data structure that can be accessed, updated and then written back out as XML.
    A number of options allow users to specify how the structure should be built. It can also be cached using Storable.

    Why use XML::Simple?

    • XML configuration files, small table, data-oriented XML
    • simple XML processing
    • you don't care much about XML but find it convenient as a standard file format, to replace csv or a home-brewed format

    Why NOT use XML::Simple?

    • your XML data is too complex for XML::Simple to deal with:
      - it includes mixed content (<elt>th<is>_</is>_ mixed content</elt>)
      - your documents are too big to fit in memory
      - you are dealing with document-oriented XML rather than data-oriented XML
    • you want to use a standard-based module (XML::DOM for example)

    Personal notes

    I don't use XML::Simple in production but the module seems quite mature, and very convenient for "light" XML: config files, tables, generally data-oriented, shallow XML (the XML tree is not really deep), as opposed to document-oriented XML.

    Update: make sure you read the documentation about the forcearray option, or you might get bitten by repeated elements being turned into an array (which is OK) _except_ when there is only one of them, in which case they become just a scalar hash value (bad!).
    For example, this document:

    <config dir="/usr/local/etc" log="/usr/local/log">
      <user id="user1">
        <group>root</group>
        <group>webadmin</group>
      </user>
      <user id="user2">
        <group>staff</group>
      </user>
    </config>

    when loaded with XMLin without the forcearray option becomes:

    { 'dir'  => '/usr/local/etc',
      'log'  => '/usr/local/log',
      'user' => { 'user1' => { 'group' => ['root', 'webadmin'] },
                  'user2' => { 'group' => 'staff' }
                }
    };

    Note the 2 different ways the group elements are processed.
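
    If you are stuck with a structure like this one, a small helper can hide the difference. The sketch below is plain Perl working on a copy of the structure shown above; groups_of is a made-up helper, not part of XML::Simple, and the forcearray option remains the cleaner fix.

```perl
use strict;
use warnings;

# Copy of the structure XMLin returned above, without forcearray.
my $config = {
    dir  => '/usr/local/etc',
    log  => '/usr/local/log',
    user => {
        user1 => { group => [ 'root', 'webadmin' ] },
        user2 => { group => 'staff' },
    },
};

# Hypothetical helper: always return the groups as a list,
# whether XMLin stored an array reference or a single scalar.
sub groups_of {
    my ($user) = @_;
    my $group = $user->{group};
    return () unless defined $group;
    return ref $group eq 'ARRAY' ? @$group : ($group);
}

for my $id (sort keys %{ $config->{user} }) {
    my @groups = groups_of($config->{user}{$id});
    print "$id: @groups\n";
}
```

    Without such a check, code that dereferences user2's group as an array would die under strict, which is exactly the trap described above.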

    I also found that XML::Simple can be a little dangerous in that it leads to writing XML that is a little too simple. Often when using it I end up with an XML structure that's as shallow as I can possibly make it, which might not be really clean.

Gimp
4 direct replies
by Aighearach
on Jul 10, 2000 at 08:57
    Gimp.pm was written by Marc Lehmann.

    Gimp.pm provides a Perl interface to all aspects of the GNU Image Manipulation Program. This allows not only for writing GIMP filters, plugins, and extensions, but also for network access to the GIMP.

    There are no programming tasks that I enjoy more than those involving Gimp. Whether I'm just writing a one-off thumbnail program on STDIN, or generating dynamic images and special effects, there is nothing that is as flexible.

    It does still have a way to go before it is ready for primetime, however. One example is that it is not fully compatible with the strict pragma, generating lots of warnings. When scripts run under Apache, this needlessly pollutes the web logs.

    The version of Gimp.pm being packaged with the latest GIMP distribution has a number of bugs. One is that Gimp::init() doesn't work as advertised. This forces the use of a callback that can't return anything. The older versions had cleaner implementations of this feature.

    Another bug is that a GIF format file with layer data outside the image borders will cause a GIMP crash, instead of being cropped. This is not consistent with the behavior of the normal GIMP UI.

    In addition, the manual has sections marked as "outdated," pointing to other areas of the documentation that don't cover everything marked as outdated. The source is available, but on reading it, it becomes obvious that the author's primary language is C. I have nothing against C, but in this case I think better documentation is called for.

    UPDATE: As of version 1.2, everything is working great except the documentation.

CPAN.pm
2 direct replies
by Aighearach
on May 21, 2000 at 16:03

    The Module of Modules: CPAN

    Modules make possible in an afternoon what would take many months without them. Because of this, the time needed to learn to use them is well spent. Once you learn to use modules effectively, you will wonder how you ever got by without them. Switching from Perl without modules to Perl with modules is as big a step in ease of use and project clarity as moving from C to Perl was in the first place.

    So, my favorite module is the module that brings me my modules: CPAN.pm

    This should have come with your perl dist. I won't get into how to use it in your program, as most modules are used; no, there is a very special power within CPAN that no monk can live without. This is the CPAN shell. To invoke, issue the following incantation:

    perl -MCPAN -e'shell'

    It will probably tell you some things, like that there is a newer version of the spell, and that you should fetch some network Bundle:: 's. Follow these instructions, always.

    Now, to look for a module, light your purifying incense, and chant:

    i /keyword/

    Give it a try. If you don't know what to search for, try i /lingua/.

    When you find something that sounds way-super-cool, concentrate on:

    install package::name

    For example, you might want to install Lingua::EN::Gender. This will download the tarball, configure it, make it, make test and make install it for you, saving you the trouble. Why should a Perl hacker have to wrestle with gcc? I say, let's leave the C coding to St. Wall, and his Disciples.

    When you first run it, it will ask you some config questions; I recommend asking the Gods to install any dependencies for you; I have found them to be better at knowing these things than I am.

    It might help to run this as root.

    If make test fails, you can probably go to the package's directory under $HOME/.cpan/build and run make install by hand. Usually when make test fails, it is because no time was spent developing the test, and instead of making it always pass, they like to make it always fail... but this is rare. Most packages are written well.

    -- Aighearach

    jdporter - changed title from single word "CPAN"