I was wondering if you could explain the regex you used. I am now trying to identify a single occurrence of the term in a line of text so that I can work out the inverse document frequency (IDF).
So far I have worked out that you are looking for the term using a non-capturing group, (?:pattern), i.e. (?:\W). I haven't a clue what this actually does, nor what the part after \E does ..... (?:(?=\W).
I know that (?=\W) is a look-ahead for a non-word character, but I am not sure what the outer ?: is doing.
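For context, this is roughly what I am aiming for, as a minimal sketch only. The sub names document_frequency and idf are ones I made up, the pattern is my own guess at a boundary check and not the regex from your reply, and I am assuming that each line counts as one document and that IDF is log(N / df).

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines that contain $term at least once.
sub document_frequency {
    my ($term, @lines) = @_;

    # \Q...\E quotes any regex metacharacters in $term (e.g. "c++").
    # (?:\A|(?<=\W)) : start of string, or preceded by a non-word character.
    # (?:(?=\W)|\z)  : followed by a non-word character, or end of string.
    # The (?: ... ) groups only bundle the alternatives; they capture nothing.
    my $re = qr/(?:\A|(?<=\W))\Q$term\E(?:(?=\W)|\z)/;

    # grep in scalar context returns the number of matching lines,
    # so a line with several occurrences is still counted once.
    return scalar grep { $_ =~ $re } @lines;
}

# Hypothetical helper: IDF as log(N / df), with 0 for an unseen term.
sub idf {
    my ($term, @lines) = @_;
    my $n  = scalar @lines;
    my $df = document_frequency($term, @lines);
    return $df ? log($n / $df) : 0;
}

my @lines = ('the cat sat', 'cat cat cat', 'concatenate', 'a dog');
printf "df=%d idf=%.3f\n", document_frequency('cat', @lines), idf('cat', @lines);
```

The look-arounds check the characters next to the term without consuming them, which is why "concatenate" should not count as a hit for "cat".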
cheers,
MonkPaul.
In reply to Re^4: Regex word boundries by MonkPaul, in thread Regex word boundries by MonkPaul