in reply to Re: Re: How will I retrieve values from a POS-tagged question
in thread How will I retrieve values from a POS-tagged question
Sorry I did not mention it before, but the file with my rules (patterns) should be associated with another file containing one or more possible answer(s). Thus, when a Formula-Rule matches (is true), I have to retrieve the sentences (questions) and the answer(s) associated with that rule.
Now, only the question is described by the rule, right? Are the question and answer to be stored in the same file or in different ones? For the moment I'll assume a total of two files, one for rules and one for question/answer pairs.
Okay, let's break the problem into two parts: data storage and data manipulation.
For storage your options are rather open, provided that you keep a trustworthy correlation between a rule in one file and the sentences which that rule describes in the other. Thus, for any data set such as <s>/SYM Who/WP is/VBZ the/DT author/NN of/IN the/DT book/NN... ?/. </s>/SYM, you'll have two files, each containing a different subset of the data, namely, tags and sentences. This will work fine as long as your files don't get tampered with, because you'll be depending on the order in which the data appears to know which question belongs to which rule. If you're concerned about this, you could supply an index for each entry so that rule 0 corresponds to question/answer pair 0 in your other file. This is still far from unbreakable, but it's a little better. (As an aside, consider looking at something like DB_File if your data collection is going to get large.)
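To make the indexing idea concrete, here is a minimal sketch. The tab-separated line format, the variable names and the sample data are all my own assumptions for illustration; nothing in the thread fixes them. The lines are held in arrays here so the example runs on its own, but they could just as well be read from your two files.

```perl
use strict;
use warnings;

# Hypothetical line formats (an assumption, not taken from the thread):
#   rules file:           "<index>\t<rule>"
#   question/answer file: "<index>\t<question>\t<answer>"
# The sample data below is made up.
my @rule_lines = (
    "0\tWP VBZ DT NN IN DT NN",
    "1\tWRB VBZ DT NN",
);
my @qa_lines = (
    "1\tWhere is the library?\tanswer B",
    "0\tWho is the author of the book?\tanswer A",
);

# Key both data sets by the explicit index, so line order no longer matters.
my %rule_for;
for my $line (@rule_lines) {
    my ($idx, $rule) = split /\t/, $line, 2;
    $rule_for{$idx} = $rule;
}

my %qa_for;
for my $line (@qa_lines) {
    my ($idx, $question, $answer) = split /\t/, $line, 3;
    # An index may carry several question/answer pairs, so store a list.
    push @{ $qa_for{$idx} }, [ $question, $answer ];
}

# Walk the rules in index order and print whatever pairs share that index.
for my $idx (sort { $a <=> $b } keys %rule_for) {
    for my $pair (@{ $qa_for{$idx} || [] }) {
        print "$rule_for{$idx} => $pair->[0] / $pair->[1]\n";
    }
}
```

Note that the question/answer lines are deliberately out of order above; the explicit index is what ties each pair back to its rule.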
Now, as to the data structure for doing your actual lookups: yes, I still think a hash of arrays is a good place to start. You'll need the arrays to handle cases of multiple questions/answers per rule, since a hash key can hold only one value and a second assignment to the same key would overwrite the first. Of course, in your text files you can have as many duplicate entries as you want because they're just text files! Probably you'll end up slurping both files into arrays and then combining them into a hash using some code along the lines I provided in my first post. Then, as you run through a list of rules for which you wish to find question/answer pairs, you just have to do the hash lookup.
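A rough sketch of that combining step, assuming the two files have already been slurped into parallel arrays. The names @rules, @pairs, %pairs_for and lookup() are hypothetical, and the sample data is made up.

```perl
use strict;
use warnings;

# Parallel arrays, as if slurped from the two files (sample data is made up).
# Entry $i of @rules describes entry $i of @pairs.
my @rules = ('WP VBZ DT NN', 'WRB VBZ DT NN', 'WP VBZ DT NN');
my @pairs = (
    'Who is the author? | answer A',
    'Where is the library? | answer B',
    'What is the title? | answer C',
);

die "rule and pair counts differ\n" unless @rules == @pairs;

# Hash of arrays: a rule that appears more than once keeps all of its pairs.
my %pairs_for;
for my $i (0 .. $#rules) {
    push @{ $pairs_for{ $rules[$i] } }, $pairs[$i];
}

# Return every question/answer pair stored under a rule (empty list if none).
sub lookup {
    my ($rule) = @_;
    return @{ $pairs_for{$rule} || [] };
}

print "$_\n" for lookup('WP VBZ DT NN');
```

Using push onto an array reference rather than a plain assignment is what keeps the duplicate rule from clobbering its earlier entry.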
Good luck!