in reply to Research App Optimization

Looking for such patterns might come under the heading of data mining. Why am I not surprised that CPAN has a module in this area? Data::Mining::AssociationRules (no idea on quality, I'm afraid). It might pay to look at this module and, if it suits your needs, store your data in whatever format it wants natively.
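
To give a feel for what association-rule mining computes, here is a rough Python sketch that counts support and confidence for pairs of items. It is not the API of the CPAN module (which I haven't checked); the respondent data and the names `transactions` and `pair_rules` are made up purely for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each set holds the question=answer items of one respondent.
transactions = [
    {"q1=yes", "q2=no", "q3=yes"},
    {"q1=yes", "q3=yes"},
    {"q1=no", "q2=no"},
    {"q1=yes", "q2=no", "q3=yes"},
]

def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) tuples for item pairs."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of respondents containing both items
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules
```

Calling `pair_rules(transactions)` returns only the pairs that occur in at least half of the responses, each with the confidence of the rule in both directions. A real module would also handle larger item sets, which this sketch ignores.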

By the way, having xxx1, xxx2 in code or data structures is often a warning sign that you might want to consider some kind of container or list structure. Also, it is as useful to put effort into good naming in data models as it is for variables. If something is the answer to a question, then it might be better named answer (apologies if I've misunderstood your data structure).
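
As a small illustration of the container point, here is a Python sketch. The names `answers` and `describe` are hypothetical, and I'm assuming answers are keyed by question number.

```python
# Numbered names force one line of code per question:
#   answer1 = "I like cheese"; answer2 = "Halloween is scary"; ...
# A dict keyed by question number handles any number of questions.
answers = {1: "I like cheese", 2: "Halloween is scary"}

def describe(answers):
    """Format every answer without knowing how many questions exist."""
    return [f"Q{q}: {text}" for q, text in sorted(answers.items())]
```

Adding a third question then means adding one entry to `answers`, not a new variable and new code to handle it.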

In your XML (and I agree with others that XML is primarily a data transfer language rather than a data storage/querying language) it might be better to have:

<responses>
  <response id="1">
    <answer question="1">I like cheese</answer>
    <answer question="2">Halloween is scary</answer>
    ...
  </response>
  ...
</responses>

Depending on anonymity, you might add other attributes to identify a particular <response>, such as name, date etc.

If you're writing code (or XPath queries) to examine your original data structure, you'll end up duplicating logic to look for each of the different tag types.
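
With the uniform <answer> layout suggested above, one query covers every question. Here is a minimal Python sketch using the standard library's ElementTree. The second response in `SAMPLE` is invented filler, and `answers_to` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The two answers for response 1 come from the example above;
# response 2 is invented so the query has more than one match.
SAMPLE = """
<responses>
  <response id="1">
    <answer question="1">I like cheese</answer>
    <answer question="2">Halloween is scary</answer>
  </response>
  <response id="2">
    <answer question="1">I prefer crackers</answer>
    <answer question="2">Halloween is fun</answer>
  </response>
</responses>
"""

def answers_to(root, question):
    """Map response id -> answer text for one question number."""
    return {
        resp.get("id"): ans.text
        for resp in root.findall("response")
        for ans in resp.findall(f"answer[@question='{question}']")
    }

root = ET.fromstring(SAMPLE)
```

The same function works for question 1, 2, or 200, because the question number is data (an attribute) rather than part of a tag name.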

I bring this up because most of the same issues apply to data modelling in SQL as well, so it's worth thinking about at this stage.
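
For what it's worth, the same shape in SQL might look like the sketch below, using Python's built-in sqlite3 module. The table and column names (`response`, `answer`, `answer_text`) are my own guesses for illustration, not anything from your schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE response (id INTEGER PRIMARY KEY);
    -- One row per (response, question), not columns q1, q2, q3, ...
    CREATE TABLE answer (
        response_id INTEGER NOT NULL REFERENCES response(id),
        question    INTEGER NOT NULL,
        answer_text TEXT    NOT NULL,
        PRIMARY KEY (response_id, question)
    );
""")
conn.execute("INSERT INTO response (id) VALUES (1)")
conn.executemany(
    "INSERT INTO answer (response_id, question, answer_text) VALUES (?, ?, ?)",
    [(1, 1, "I like cheese"), (1, 2, "Halloween is scary")],
)

def answers_for_question(conn, question):
    """Return (response_id, answer_text) rows for one question number."""
    rows = conn.execute(
        "SELECT response_id, answer_text FROM answer "
        "WHERE question = ? ORDER BY response_id",
        (question,),
    )
    return rows.fetchall()
```

Adding a question to the survey then needs no schema change, and one parameterised query serves every question.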

Re^2: Research App Optimization
by sskohli (Initiate) on Nov 01, 2006 at 05:42 UTC
    >> By the way, having xxx1, xxx2 in code or data structures is often a warning sign that you might want to consider some kind of container or list structure.
    >> Also, it is as useful to put effort into good naming in data models as it is for variables.

    You are so right. My boss and my teachers used to eat my head about proper naming conventions. When I started this project, I rewrote the code two times because I forgot what the variables meant and what each code section did. Now I am putting in adequate comments and using long, descriptive variable names.
    Learning this the hard way.
    thanks
    sandeep