I'm basically trying to parse text into Noun, Verb, and Preposition Phrases. Nouns are the most troublesome at the moment, because although the individual patterns match what I want in test output, when they are OR'd together, their output is not always correct.

Thanks

$noun ="(?: *[A-Za-z0-9._]+\/NN[PS]*)";
$det ="( *[A-Za-z]+\/DT)";
$adj ="( *[A-Za-z]+\/JJ[RS]?)";
$gen ="( *[A-Za-z]+\/POSS)";
$adv="( *[A-Za-z\']+\/RB[RS]?)";
$inf =" *to\/TO";
$adv="( *[A-Za-z\']+\/RB[RS]?)";
$np1="(?:$det|$gen)";
$np2 ="(?:$adj|$num|$conj|$adv|$inf)";
$np3="(?:$np1*\s*(?:$noun)*\s*$np2*\s*(?:$noun)+\s*$adj*)";
$np4="((?:$noun)+\s*$np2+\s*(?:$noun)+)";
$np5="(?:$np1*\s*$adj+\s*($noun)+)";
# more complex noun and prep phrases
$NP = "(?:(?:$np1)*\s*(?:$np3)+)";
$NP1 = "(?:$np3)+\s*(?:$np2)\s*(?:$np3)+";
$NP2 = "(?:(?:$np1)+\s*(?:$np3)+\s*(?:$np4)+)";
$NP3 ="(?:$np1*\s*$noun+\s*[^INV]+\s*(?:$noun)+)";
$NP4 ="$np1+\s*[^NV]+\s*$noun+";
$nps= "(?:($NP1)|($NP2)|($NP3)|($NP4)|($NP))";
$extnp="(?:($pro(?!\$))|($np5))";
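One thing that may be relevant, offered as a guess rather than a diagnosis: Perl's alternation is ordered, so at a given starting position the first branch that matches wins, even when a later branch would match more text. The sketch below uses two simplified, hypothetical sub-patterns ($short and $long are illustrative names, not the $np/$NP patterns above) to show how swapping the order of the OR'd branches changes what gets captured.

```perl
use strict;
use warnings;

# Toy tagged input in the same word/TAG style as the patterns above.
my $text = "the/DT big/JJ dog/NN";

# Hypothetical, simplified sub-patterns for illustration only.
# $short: a determiner on its own.
my $short = qr{(?: *[A-Za-z]+/DT)};
# $long: a determiner, any adjectives, then a noun.
my $long  = qr{(?: *[A-Za-z]+/DT)(?: *[A-Za-z]+/JJ)*(?: *[A-Za-z]+/NN)};

# Alternation tries branches left to right at each start position and
# keeps the first branch that succeeds, even if a later one is longer.
my ($short_first) = $text =~ /($short|$long)/;
my ($long_first)  = $text =~ /($long|$short)/;

print "short first: '$short_first'\n";
print "long first:  '$long_first'\n";
```

With the short branch first, only 'the/DT' is captured; with the long branch first, the capture is 'the/DT big/JJ dog/NN'. If something similar is happening in $nps, listing the most specific (longest) alternatives first might be worth trying.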
In reply to RE: Re: greedy and lazy
by Anonymous Monk
in thread greedy and lazy
by Anonymous Monk