in reply to Re: Perl Data Structure Validation Part 2
in thread Perl Data Structure Validation Part 2

>> Maybe Data::Validate::Structure is worth a look too. But

Just looked at it, thanks for the tip. Data::Validate::Structure works a little like a recursive descent parser, so the approach is similar to Data::Rx. Data::Rx seems much more thought out: it has a spec, and it is designed like a traditional recursive descent parser (so if you've seen one, you know how to extend it). Recursive descent is good for validating recursive patterns, and data that is localized to a subtree, because the validation model is a DFS traversal of the tree. But imagine trying to compare two distant parts of a tree with DFS... not very natural. This is why I mention the need for post-processing with something that lets me build sets from dispersed regions of the tree (XPath-like, but not XPath) and do set operations on them.
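For concreteness, here's a minimal Data::Rx sketch of that recursive descent style (the schema and data are invented for illustration):

    use strict;
    use warnings;
    use Data::Rx;

    my $rx = Data::Rx->new;

    # Each nested type validates its own subtree, descending one
    # level at a time -- the recursive descent / DFS model.
    my $schema = $rx->make_schema({
        type     => '//rec',
        required => {
            name  => '//str',
            ports => { type => '//arr', contents => '//int' },
        },
    });

    # Hypothetical data structure, for illustration only.
    my $config = { name => 'web01', ports => [ 80, 443 ] };

    print $schema->check($config) ? "valid\n" : "invalid\n";

This is great as long as each rule only needs to see the subtree it's standing on; it's exactly the cross-subtree comparisons that fall outside the model.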

>> when you want to use XPath like functionality I don't see the point of doing it in a non-XML way

There are more than a few passing differences between the syntax of Perl data structures and XML. The only reason to convert is if we were actually using XML, or if conversion gave us access to a particular XML tool. I'd actually turn the argument on its head: XPath, while an XML-related technology, has obvious analogues in a non-XML world. The underpinnings of XPath are fairly abstract, more like mathematics than markup. Data::DPath is an example of this; at the top of the module's POD the author gives about ten basic points explaining why you wouldn't want to use XPath itself on Perl data structures (differences and structural reasons).

Also, the heavyweight nature of converting everything to XML and processing it is, IMHO, not a good fit for internal data structure processing. Another issue is the visual density of XML. Imagine changing all of the data structures in your code to look like XML: having readable code trumps many other concerns (a reason I prefer the terseness of Perl). XML, while it started out being human readable, doesn't really reach that goal in many practical cases (this is one reason the RELAX NG compact schema syntax became popular: it is roughly a tenth the size of an equivalent XML Schema).
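As a rough sketch of what I mean by XPath-like selection on native Perl structures (the path and data here are invented for illustration):

    use strict;
    use warnings;
    use Data::DPath 'dpath';

    # Hypothetical nested structure, for illustration only.
    my $data = {
        hosts => [
            { name => 'web01', role => 'frontend' },
            { name => 'db01',  role => 'backend'  },
        ],
    };

    # Collect values from dispersed regions of the tree with an
    # XPath-like path expression -- no conversion to XML needed.
    my @names = dpath('/hosts/*/name')->match($data);

    print "host: $_\n" for @names;

The results of a couple of such queries are exactly the sets I want to run set operations over.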

I've done plenty of XML with schema checking (trang, RELAX NG compact syntax, XML Schemas).

>> Maybe you have to rethink the problem. Depending on the complexity of your data structure generating XML out of it might be straightforward (XML is hierarchical). Then you can use all the XML tooling you like. With XMLSchema you can do powerful validations (there are limitations of course). The requirement you describe: a set of keys is dependent on a set of keys in another part of the structure sounds a bit tricky but could (maybe) be handled by using key, keyref and unique constructs. It works much like the primary key/foreign key concept in a RDBMS. A small sample taken from W3C to illustrate:

That's an interesting approach. The analogy would be to do left/right outer joins and look for the NULLs. However, I'd argue set operations plus a little code are more powerful: "setA is a subset of setB" is more compact and direct than the equivalent left outer join. Note you'd also have to start jamming in all of the other SQL machinery (unions, etc.) to have a chance of competing with the set operations that something XPath-like gives you.
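For example, a minimal sketch of the subset check in plain Perl (set contents invented for illustration):

    use strict;
    use warnings;
    use List::Util qw(all);    # all() needs List::Util 1.33+

    # Two sets gathered from different regions of a structure.
    my %set_a = map { $_ => 1 } qw(db01 web01);
    my %set_b = map { $_ => 1 } qw(cache01 db01 web01);

    # "setA is a subset of setB" -- one line, no outer join needed.
    my $is_subset = all { exists $set_b{$_} } keys %set_a;

    print $is_subset ? "subset holds\n" : "subset violated\n";

Once the sets are hashes, union, intersection, and difference are equally short one-liners, which is the whole appeal.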

>> There are also other schema languages like relaxng and good old DTDs (well more old than good:).

I've used both of these on past projects with success, when applied to the right problem (e.g. validating web service XML requests). But I'd argue this isn't the right problem for those technologies.
