comment on

From your later replies I can now see the problem that you are trying to solve, which I admit I didn't really appreciate at first.

I tried to see how to apply your structure to an XML file that contained repetitous elements

Eg.

<users>
    <user>
        <id>fred</id>
        <pass>crude</pass>
    </user>
    <user>
        <id>barney</id>
        <pass>polite</pass>
    </user>
</users>
[download]

And I think I could see it working for a simple repetitious grouping as shown here, once you start getting groups within groups, you end up with a huge, tautological mess that isn't even easy to see how to construct or search using nested recursion. It certainly won't be quick or easy.

The only reasonably simply solution that I can think of is to build an single-level array, with each unique path in the data represented using a string representation of its path, and a second array of arrays that is a fully inverted index of the first.

my @paths = (
    "/colours/paprika/1/good/red",
    "/colours/paprika/1/bad/green",
    "/colours/paprika/2/good/yellow",
    "/colours/paprika/2/bad/brown",
    "/colours/banana/yellow"
);
my %invert = (
    colours=>[0,1,2,3,4],
    paprika=>[0,1,2,3],
    banana=>[4],
    good=>[0,2],
    bad=>[1,3],
    red=>[0],
    green=>[1],
    yellow=>[2,4],
    brown=>[3],
);
[download]

It's fairly easy to see how for any given query path, you can split it into its sub elements, look up each part in the array of arrays, do a little set arithmetic to locate those paths in the simple array that contains all the requisite matches, and then use the results to find the associated data.

Building the inverted index can be achieved on-the-fly as the paths array is being constructed, or en-mass afterwards. The only slightly tricky part is remembering the next available index for groups of data, especially when these can be nested, but the final representation should be considerably less memory hungry than the equivalent hash and array tree, and the lookup of partial match query paths quite a lot faster too.

Examine what is said, not who speaks.

1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
3) Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke.

In reply to Re: Idea for XPath implementation by BrowserUk
in thread Idea for XPath implementation by Jaap

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.