The first version of the module XML::Smart is in its final stage. Before releasing it on CPAN I would like to show it at the monastery, to see what the monks think of this new way to access/create XML data, and to test it, since I had good results the last time I did this, for the module Object::MultiType.

But what is XML::Smart? It is a smart, easy and powerful way to access/create XML files/data. It is easier than XML::Simple, more dynamic, and integrated with the Perl syntax. But only after trying it will you really know how good it can be! At first look it seems simple, and it should, but what is inside to make this crazy and funny way work is very complex! But let's talk about the code:

I will try to explain how it works here, but the better way is to try it and read the POD inside, which has some code examples (links at the bottom).

Everyone who has tried to use a HASH tree to access XML has had big problems, especially when you don't know whether at some point you have a HASH key, an ARRAY ref or another HASH ref. This is why, when you create an XML::Simple object, you have a lot of options to force ARRAYs, cut the root node, etc. But you still need to know what structure your XML tree has, and to do that we generally look at the XML file (which is really wrong), or write a lot of code to check!

To fix that and make everything automatic and easy, I thought of a way to make a Perl object work at the same time as a HASH, ARRAY, SCALAR and CODE. This now lives in the module Object::MultiType, and it is there for future ideas too, and for any monk. ;-P
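
As a rough illustration of the idea (this is not the actual Object::MultiType code), Perl's overload pragma lets a single object answer hash, array, code and string access. The class name MultiSketch and its field names are made up for this sketch. The object is a blessed scalar ref, and `${}` is deliberately left un-overloaded so the accessors can reach the internal store without recursing:

```perl
use strict;
use warnings;

package MultiSketch;    # hypothetical class, for illustration only

use overload
    '%{}'    => sub { ${ $_[0] }->{as_hash} },      # $obj->{key}
    '@{}'    => sub { ${ $_[0] }->{as_array} },     # $obj->[0]
    '&{}'    => sub { ${ $_[0] }->{as_code} },      # $obj->(...)
    '""'     => sub { ${ $_[0] }->{as_string} },    # "$obj"
    fallback => 1;

sub new {
    my ($class, %parts) = @_;
    my $store = {%parts};         # plain hash holding one value per data type
    return bless \$store, $class; # scalar-ref object, so %{} is free to overload
}

package main;

my $obj = MultiSketch->new(
    as_hash   => { os => 'linux' },
    as_array  => [ '192.168.0.1', '192.168.0.2' ],
    as_string => 'content',
    as_code   => sub { "called with @_" },
);

print $obj->{os}, "\n";          # hash access
print $obj->[1], "\n";           # array access
print "$obj\n";                  # string access
print $obj->( 'x', 'y' ), "\n";  # code access
```

Each overload handler only returns the matching reference (or string) from the internal store, so the calling code cannot tell it from a real HASH, ARRAY or CODE ref.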

A good example is to compare how to access this data in XML::Simple (a HASH tree), and XML::Smart (a HASH tree too, but actually you are accessing a tied HASH and a tied ARRAY that work together):

    ## The data:
    <?xml version="1.0" encoding="iso-8859-1"?>
    <hosts>
      <server os="linux" type="redhat" version="8.0">
        <address>192.168.0.1</address>
        <address>192.168.0.2</address>
      </server>
      <server address="192.168.2.100" os="linux" type="conectiva" version="9.0"/>
    </hosts>

Now let's say that you want to get the first address of the 2 servers (in XML::Simple):

    my $xml = new XML::Simple();
    ## KeepRoot => 1 keeps the <hosts> root node, so {hosts} exists in the tree.
    my $tree = $xml->XMLin(join('', <DATA>), KeepRoot => 1);

    my $addr0 = $tree->{hosts}{server}[0]{address}[0] ;
    my $addr1 = $tree->{hosts}{server}[1]{address} ;

Note that in the first case ($addr0) you have an ARRAY ref in the key (since you have 2 addresses), so you need to use [0], but in the second you have a normal key (1 address), so you just use the key name. But if you don't know when and where you have a HASH or an ARRAY, you need to check the reference first, or you can get errors and your script will die().
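
To make the "check the reference first" step concrete, here is a minimal sketch of the kind of guard code a plain HASH tree forces on you. The helper name first_value and the hand-built tree are assumptions for illustration; the tree only imitates the shape described above and is not produced by XML::Simple here:

```perl
use strict;
use warnings;

# Return the first value whether $value is a plain scalar or an ARRAY ref.
sub first_value {
    my ($value) = @_;
    return ref($value) eq 'ARRAY' ? $value->[0] : $value;
}

# Hand-built tree with the same shape as the example data.
my $tree = {
    hosts => {
        server => [
            { type => 'redhat',    address => [ '192.168.0.1', '192.168.0.2' ] },
            { type => 'conectiva', address => '192.168.2.100' },
        ],
    },
};

for my $server ( @{ $tree->{hosts}{server} } ) {
    print first_value( $server->{address} ), "\n";
}
```

Every place that reads a node needs a check like this, which is exactly the boilerplate the post wants to avoid.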

With XML::Smart it is easy, since each point in the HASH tree is at the same time a HASH, an ARRAY, a SCALAR and a CODE! So you can do this:

    my $XML = XML::Smart->new(DATA) ;

    my $addr0 = $XML->{hosts}{server}[0]{address}[0] ; ## return {address}[0]
    ## ...or...
    my $addr0 = $XML->{hosts}{server}{address} ;       ## return {address}[0]

    my $addr1 = $XML->{hosts}{server}[1]{address} ;    ## return {address}
    ## ...or..
    my $addr1 = $XML->{hosts}{server}[1]{address}[0] ; ## return {address}

You can see that when you have a normal key, and use [0], you actually get the {key}. Or when you have an ARRAY and call the {key} you get the [0] of the array! And when you access [1] but have a {key} you get UNDEF.
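
The index rules above can be imitated with a small read-only tied array. This is only a sketch of the behaviour, not XML::Smart's implementation; the package name KeyOrList is hypothetical, and Tie::Array is the core base class for tied arrays:

```perl
use strict;
use warnings;

package KeyOrList;    # hypothetical name, not part of XML::Smart
use parent 'Tie::Array';

sub TIEARRAY {
    my ( $class, $value ) = @_;
    return bless { value => $value }, $class;
}

# A plain value counts as a list of one element.
sub FETCHSIZE {
    my ($self) = @_;
    my $value = $self->{value};
    return scalar @$value if ref($value) eq 'ARRAY';
    return defined($value) ? 1 : 0;
}

# [0] on a plain value returns the value itself; any other index is undef.
sub FETCH {
    my ( $self, $index ) = @_;
    my $value = $self->{value};
    return $value->[$index] if ref($value) eq 'ARRAY';
    return $index == 0 ? $value : undef;
}

package main;

tie my @single, 'KeyOrList', '192.168.2.100';
tie my @many,   'KeyOrList', [ '192.168.0.1', '192.168.0.2' ];

my $second = $single[1];
print "$single[0]\n";
print "second: ", ( defined($second) ? $second : 'undef' ), "\n";
print "$many[0]\n";
```

The caller always writes `[0]`, and the tied class decides whether there is a real array behind it.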

Or let's say that you want to get the address of the server with the 'type' 'conectiva', but don't know if it's the 1st or second in the DATA. So, you can use a search selection:

my $addr = $XML->{hosts}{server}('type','eq','conectiva'){address} ;
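
For comparison, the same kind of selection over a plain HASH tree could be sketched like this. The function select_nodes, the %compare table and the hand-built data are assumptions for illustration, and only 'eq' and 'ne' are handled; XML::Smart's own search may support more operators and works differently inside:

```perl
use strict;
use warnings;

# Comparison operators this sketch understands.
my %compare = (
    eq => sub { $_[0] eq $_[1] },
    ne => sub { $_[0] ne $_[1] },
);

# Return the nodes whose $key satisfies "$op $wanted".
# $nodes may be a single HASH ref or an ARRAY ref of HASH refs.
sub select_nodes {
    my ( $nodes, $key, $op, $wanted ) = @_;
    my $test = $compare{$op} or die "unsupported operator '$op'";
    my @list = ref($nodes) eq 'ARRAY' ? @$nodes : ($nodes);
    return grep { defined $_->{$key} && $test->( $_->{$key}, $wanted ) } @list;
}

my $servers = [
    { type => 'redhat',    address => [ '192.168.0.1', '192.168.0.2' ] },
    { type => 'conectiva', address => '192.168.2.100' },
];

my ($match) = select_nodes( $servers, 'type', 'eq', 'conectiva' );
print $match->{address}, "\n";
```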

To get the CONTENT you don't need to use a key called content, you get it as a string:

    ## Data:
    <foo port="80">content<i>a</i><i>b</i></foo>

    ## Tree of the data:
    $HASH = {
      foo => {
        i       => ['a','b'] ,
        CONTENT => 'content' ,
      }
    };

To get the content you just get foo, and use it as a string:

    my $cont = $XML->{foo} ;
    ## ...or...
    my $cont = $XML->{foo}->content ;

    print "<<$cont>>\n" ; ## print: <<content>>

    $cont .= 'x' ; ## Append data.

And to save your new data:

    my $data = $XML->data ;

    ## Directly to the file:
    my $data = $XML->save('new.xml') ;

You can download it at: http://www.inf.ufsc.br/~gmpassos/XML-Smart-1.0.tar.gz
And the module it depends on, which makes everything possible: http://www.inf.ufsc.br/~gmpassos/Object-MultiType-0.01.tar.gz (whoever already got it needs to get it again, since it was updated)

See the POD and test.pl in the package! I will appreciate any type of feedback (including your opinions and/or suggestions). ;-P

Graciliano M. P.
"The creativity is the expression of the liberty".

Replies are listed 'Best First'.
Re: XML::Smart - Development in final stage. (beta is out)
by vladb (Vicar) on May 13, 2003 at 05:59 UTC
    You are correct, I did run into the problems you are describing here. However, I found a temporary remedy by instantiating an XML::Simple object with certain options that simplify the resulting Perl structure somewhat. Here's an example code snippet:

        use strict;
        use XML::Simple;
        use Data::Dumper;

        my $xs = new XML::Simple(forcearray => 1, forcecontent => 1,
                                 contentkey => '_content', keyattr => []);

        my $xml;
        {
            local $/ = undef;
            $xml = <DATA>;
        }

        my $xmlHash = $xs->XMLin($xml);
        print "xmlHash:\n" . Dumper($xmlHash);

        __DATA__
        <?xml version="1.0" encoding="iso-8859-1"?>
        <hosts>
          <server os="linux" type="redhat" version="8.0">
            <address>192.168.0.1</address>
            <address>192.168.0.2</address>
          </server>
          <server address="192.168.2.100" os="linux" type="conectiva" version="9.0"/>
        </hosts>

    And the output is

        xmlHash:
        $VAR1 = {
                  'server' => [
                                {
                                  'os' => 'linux',
                                  'address' => [
                                                 {
                                                   '_content' => '192.168.0.1'
                                                 },
                                                 {
                                                   '_content' => '192.168.0.2'
                                                 }
                                               ],
                                  'version' => '8.0',
                                  'type' => 'redhat'
                                },
                                {
                                  'os' => 'linux',
                                  'address' => '192.168.2.100',
                                  'version' => '9.0',
                                  'type' => 'conectiva'
                                }
                              ]
                };

    Note that I force the '_content' key, which ensures that there'll be much less inconsistency in the XML structure (HASHREF or ARRAYREF etc).

    I was also wondering if you had done any benchmarking on your module? I have a sizable script (~4000 lines) that makes extensive use of the XML::Simple module and was wondering if converting to, say, your smart module would slow it down considerably (and this being part of a larger web application may be the least I'd want to have :)?

    Otherwise, your attempt does look worthwhile and promising :-)

    _____________________
    "We've all heard that a million monkeys banging on a million typewriters will eventually reproduce
    the entire works of Shakespeare. Now, thanks to the Internet, we know this is not true."

    Robert Wilensky, University of California

      Thanks for liking it! ;-P

      I'm open to new ideas for XML::Smart too, since you share the same problems working with the previous resources for XML!

      About benchmarks, I can't say that I have made tests yet, since this is the last thing to do! But loading XML files is faster than with XML::Simple! XML::Simple builds a tree and then parses the whole HASH tree again to put it in the right format. XML::Smart builds the tree directly in the right format. What can be slower, of course, is the access to keys or indexes in the object, since this passes through a TIEHASH and a TIEARRAY, which can't be compared to the direct access of HASHes and ARRAYs!

      But note that since XML::Smart is easier to use, you write cleaner code that works directly on what you want, which helps a lot with speed.

      But if you have experience with high volumes of XML data, you can help me with the benchmark! If you could make some bench tests for XML::Simple, especially with difficult problems, I can compare them with XML::Smart and do everything possible to make it faster. I would appreciate your help with this a lot!

      Graciliano M. P.
      "The creativity is the expression of the liberty".

Re: XML::Smart - Development in final stage. (beta is out) (avoid "Smart")
by tye (Sage) on May 13, 2003 at 14:52 UTC

    Please don't use adjectives conveying a value judgement in module names. Words like "Smart", "Cool", and "Wonderful" should not be used in module names. We're sure you'd think your module is smart or cool or wonderful or you'd probably not bother writing it or sharing it with the world.

    That you think your way of doing XML is "smart" doesn't really tell us anything about your module. Use a name that describes the module more precisely. XML::MultiType would be a much better name, IMO.

                    - tye
      I understand what you say. Well, it is a smart way to access XML, and this is the main idea. But you only really know what a module does if you read at least the NAME section of the POD.

      I chose XML::Smart following the style of XML::Simple... and I still like the name XML::Smart. Unless I find a better name, one that also has a better meaning for its use, I will keep it.

      For now I don't think that XML::MultiType really shows what it does. But thanks for the opinion. ;-P

      Graciliano M. P.
      "The creativity is the expression of the liberty".

        Although "Simple" shares some of the same problems with "Smart", there is at least the possibility of making an unbiased judgement that the module provides a much simpler interface (and so is also not as powerful). So "Simple" conveys something useful: "Use this if you find the other modules too complicated and you don't need as much power". And "Simple" should only be used when the degree of simplification is quite large.

        You also see that "Simple" is not saying the module is "Better".

        Naming modules can be difficult. It is best to take the time and effort to come up with an appropriate name before you release it. "Unless I find a better name" doesn't make it sound like you are spending the effort to come up with a better one.

        I'm glad you "like" the name "Smart". Why is that not a surprise? "My module is smart" is not something I expect an author to find distasteful. Also note that it is easy to have a "blind spot" toward one's own work.

        Please make the effort to come up with a more descriptive name for your module rather than "punting" with a name that praises your own work in a most generic way.

                        - tye

        I strongly dislike "XML::Smart" too.

        I think the most succinct and expressive name for this module would be XML::DWIM.

        There is no XML::Easy yet either. That name is distinct from XML::Simple enough to distinguish them (what's easy is not necessarily simple, and vice versa).

        Both of these actually describe your goal, whereas "smart" does not.

        Makeshifts last the longest.

Re: XML::Smart - Development in final stage. (beta is out)
by grantm (Parson) on May 13, 2003 at 06:05 UTC

    Very cool concept. Shall I set up my intray to forward to you? :-)

Re: XML::Smart - Development in final stage. (beta is out)
by diotalevi (Canon) on May 13, 2003 at 15:33 UTC

    All I hear is radio gaga... No wait, make that XPath. Does your code ensure that node ordering is retained? It's a critical concern to me that if I read an XML file in, its elements are written back in exactly the same order. Without order preservation DTD validation is impossible.

    Oh and I'm very unhappy about the name ::Smart. It communicates nothing.

      Yes, it keeps the order! Note that you actually have a HASH tree inside the object, which holds your XML data, exactly like XML::Simple. When you have multiple nodes, they are inside an array, and if you have a key and add a new node, this key is converted to an array, with the previous key as the first element.

      The only order that is changed is the order of arguments in the tag. I don't know whether that can make a difference in the use of XML.

      Graciliano M. P.
      "The creativity is the expression of the liberty".

        Yes, it keeps the order!
        The only order that is changed is the order of arguments in the tag. I don't know whether that can make a difference in the use of XML.
        I don't know the difference between the two orders you specified. Can you be clearer about which order is kept and which is not? Most likely the only order in question is sibling element order which is what I'm stating is critical and is also what is thrown away by many of the other XML->perl hash translations out there.

Re: XML::Smart - Development in final stage. (beta is out)
by yoz (Initiate) on May 13, 2003 at 15:31 UTC

    By an astonishing coincidence, in the last 24 hours Aaron Swartz released a Python module that works in almost the same way - and his module name is almost an anagram of yours. :)

    I was looking around because I was wondering about implementing a Perl equivalent. Glad to see I don't have to! The thing that threw me the most, though, was how to have an object that you can access as both a hash and an array (and anything else) - how'd you do that?

    -- Yoz

      how'd you do that?

      See the module Object::MultiType (link in the main node). To make it work I have used overload, where you can use methods to return the value of an object when it's accessed as ${}, @{}, %{}, &{}, *{}, and other ways (see the overload POD).

      I have used a Saver object inside the main object too, where the different data types are saved. Another cool thing is that all the different objects returned from XML::Smart share the same XML tree, so when an object comes from another (clone), it's the same object, but pointing to a different part of the tree, which uses less memory.

      Then I linked the overload resource with a tied HASH and ARRAY that work together to work Smart with the XML tree. ;-P

      See the source of the module; it's at least educational, but I think it's cool too. ;)

      Graciliano M. P.
      "The creativity is the expression of the liberty".