in reply to Tagging a corpus with multiple tags
kcott suggested a good approach in their reply to Re^2: Finding multiword units in a corpus. How does that approach fail for you?