PerlMonks
Hello XML fans, it's time to do some Prolog-like search and query on a
small XML database. What is shown below is an adjacency map: an XML
document that records which cities are next to which other cities.
Such a document is useful if, say, a person has an inter-city travel
ticket and wants to look up which cities are next to his, which cities
are two away, and so on.
While you could use normal nested Perl data structures to deal with this, XML is becoming en vogue, and as a result we have to be just as fashionable. Actually, this isn't true: we can always use Gisle Aas' Data::XMLDumper to convert XML to and from Perl nested data structures. But for the purpose of this tutorial, we will act like that module doesn't exist. So without further ado, I present the XML document detailing the (far too windy) part of the world I currently live in (and will be escaping from as soon as Christmas is here):
So, now that I have shown the data, it is time to grok it, munge it, eat it for breakfast as a meal replacement and basically bring it to its knees to do our bidding.

Program One: find all cities next to menlo park

Ok, here is a program to grok this XML-base for all cities next to menlo park:

and here is the pretty output:
all done

The program was documented, so it should make sense, but let's take a closer look at candidate_generator().

It consists of two nested greps and hence can be a little confusing. Depending on the way you think, you might want to consider the outer grep first and then the inner grep, or vice versa. It is only fitting that I discuss both methods of program comprehension.

Let's do top-down first. The outer grep is basically saying: take all the XML records and return only the ones which satisfy the inner search criteria. The inner search criteria take each individual XML record, look at each of its children (each child is a city), and test the child's text for equality with the text being searched for, or concretely speaking, menlo park.

Ok, now bottom-up. The innermost expression is $_->text eq $search_text, and what this does is take an XML element, get its text, and compare it to a normal Perl string. So if $elt were an XML::Twig::Elt representing an element whose text content is boise, then $elt->text would return boise.

Now we work out a bit more. A bit further out is grep { YADAYADA } $_->children. Here we take advantage of the fact that the XML is structured so that neighboring cities are both children of the same pair element, and we are just checking to see whether either child has the text we are looking for.

And now we finally make it to the outer grep, and the first sentence in the top-down description says what that is doing.

th-th-th-that's the first post, folks

Anyway, that was the first in a series of 3 posts. The next two will do slightly more advanced searching and in the process introduce a call or two more from the XML::Twig API.

In reply to Adjacency List Processing in XML::Twig by princepawn
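
The nested-grep search described above can be sketched roughly as follows. This is a minimal illustration, not the original program: the XML data, the map and city tag names, and the neighbors() helper are all assumptions made up for the example. Only the pair element, the candidate_generator() name, and the $_->text eq $search_text test come from the discussion above.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical adjacency map. The <map> and <city> tag names and the
# city list are assumptions for illustration, not the original data.
my $xml = <<'XML';
<map>
  <pair><city>menlo park</city><city>palo alto</city></pair>
  <pair><city>menlo park</city><city>atherton</city></pair>
  <pair><city>palo alto</city><city>mountain view</city></pair>
  <pair><city>redwood city</city><city>atherton</city></pair>
</map>
XML

my $twig = XML::Twig->new;
$twig->parse($xml);

# Outer grep: keep only the pair elements that pass the inner test.
# Inner grep: count the children of one pair whose text equals
# $search_text; a non-zero count means the pair is kept.
sub candidate_generator {
    my ($root, $search_text) = @_;
    return grep {
        my $pair = $_;
        grep { $_->text eq $search_text } $pair->children;
    } $root->children('pair');
}

# Hypothetical helper: from each matching pair, collect the city
# that is not the one we searched for.
sub neighbors {
    my ($root, $search_text) = @_;
    my @found;
    for my $pair (candidate_generator($root, $search_text)) {
        push @found,
            grep { $_ ne $search_text } map { $_->text } $pair->children;
    }
    return @found;
}

my @next_to = neighbors($twig->root, 'menlo park');
print "$_\n" for @next_to;
```

With the sample data above, this should print palo alto and atherton, one per line. The inner grep is evaluated in boolean context inside the outer grep's block, so it acts as a "does any child match?" test rather than returning a list.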