Hi Ken, thank you very much for the suggested improvements. I am going to have a look at all of them.
I have since realised that I have another problem to solve, though: my code only prints the last tag found for a token, not all of them. If the same token appears more than once in the tagset, with a different ID for each occurrence, my output contains only the last ID rather than a list of all the IDs assigned to that token.
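To make the problem concrete, here is a minimal sketch in Python. It is not my actual code, and the tokens and IDs are invented purely for illustration. The first loop shows the overwriting behaviour I think I am seeing, and the second shows the kind of per-token list I am after:

```python
from collections import defaultdict

# Hypothetical tagset: (token, ID) pairs. The same token may occur
# several times, each time with a different ID.
tagset = [
    ("hot dog", "F1"),
    ("red tape", "Z3"),
    ("hot dog", "O2"),
]

# Current behaviour: one ID per token, so each new assignment
# overwrites the ID stored earlier for the same token.
last_id = {}
for token, tag_id in tagset:
    last_id[token] = tag_id

# Desired behaviour: collect every distinct ID per token,
# in order of first appearance.
all_ids = defaultdict(list)
for token, tag_id in tagset:
    if tag_id not in all_ids[token]:
        all_ids[token].append(tag_id)

print(last_id["hot dog"])   # only the last ID survives
print(all_ids["hot dog"])   # every ID assigned to the token
```

I assume the same idea (mapping each token to a list of IDs rather than to a single ID) would carry over to my real code, but I am not sure it is the best approach.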
Any suggestions on how to handle this would be most appreciated.
In reply to Re^2: Finding multiword units in a corpus by veg_running, in thread Finding multiword units in a corpus by veg_running