perlquestion
erix
<p>Hi all,
<p>I am going to build a collection of regexes that capture specific English-language constructs. These can then be used to parse, index, and search texts. If such a collection is large and general enough, it should be possible to collect
and organise the matched constructs <i>without</i> knowing the precise form of the text beforehand. My experience with science-like articles (which are the target) is that their text and style are often repetitive, almost monotonous (not meant
negatively here).
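<p>To make the idea concrete, here is a minimal sketch of the kind of collection I have in mind. The construct names, the patterns, and the <tt>find_constructs</tt> helper are all made up for illustration and do not come from any existing module; real patterns would need to be far more robust.

```perl
use strict;
use warnings;

# Hypothetical collection: each name maps to a regex for one English construct.
# Names and patterns are illustrative assumptions, not an existing module.
my %construct = (
    # Passive reporting: "X was/were <verb>ed by/using/with Y"
    passive_report => qr/\b(?<subject>[\w\s-]+?)\s+(?:was|were)\s+(?<verb>\w+ed)\s+(?:by|using|with)\s+(?<agent>[^.,;]+)/i,
    # Cross-references: "as shown in Figure 3", "see Table 2"
    cross_ref => qr/\b(?:as\s+shown\s+in|see)\s+(?<kind>Fig(?:ure|\.)|Table)\s*(?<num>\d+)/i,
    # Quantities: a number followed by one of a few units, e.g. "12.5 mg"
    quantity => qr/\b(?<value>\d+(?:\.\d+)?)\s*(?<unit>mg|g|kg|mL|L|mm|cm|nm)\b/,
);

# Return a list of [name, \%captures] pairs for every construct found in $text.
sub find_constructs {
    my ($text) = @_;
    my @hits;
    for my $name (sort keys %construct) {
        while ($text =~ /$construct{$name}/g) {
            push @hits, [ $name, +{ %+ } ];   # copy the named captures
        }
    }
    return @hits;
}

my $text = 'The samples were analysed using mass spectrometry, as shown in Figure 3. '
         . 'Each vial contained 12.5 mg of protein.';

for my $hit (find_constructs($text)) {
    my ($name, $cap) = @$hit;
    print "$name: ", join(', ', map { "$_=$cap->{$_}" } sort keys %$cap), "\n";
}
```

<p>Keeping the patterns in a hash keyed by construct name means the hits can be indexed by construct type, whatever the surrounding text looks like.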
<p>My question is: does something like a natural-language regex collection already exist? I know Regexp::Common and the like, but they all seem far more specialized than what I was hoping to find.
<p>I'd be thankful for pointers or further ideas.