I just need to grab the text from certain elements to build document term vectors for querying. All I want is the "words" and an id.
The problem is that I'm parsing thousands of XML files from various external sources. I don't have entity lists for all of them, and I can't predict which entities will appear. I don't need the entities anyway.
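To make it concrete, here is a rough sketch of the kind of workaround I have in mind, in Python with the standard library's ElementTree. It is not a definitive solution: it blanks out any named entity reference other than the five that XML predefines before parsing, so the parser never sees an undeclared entity. The function name `extract_words`, the element names `title`, `body` and `id`, and the idea that the id lives in a child element are all assumptions made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Named entity references other than the five XML predefines
# (amp, lt, gt, quot, apos). Numeric references such as &#233; start
# with '#', so they do not match and are left for the parser to resolve.
_UNKNOWN_ENTITY = re.compile(r"&(?!(?:amp|lt|gt|quot|apos);)[A-Za-z_][\w.\-]*;")


def extract_words(xml_text, text_tags, id_tag):
    """Return (doc_id, words) for one XML document given as a string.

    text_tags and id_tag are hypothetical element names; adjust them to
    whatever each source actually uses.
    """
    # Replace every unknown entity reference with a space so parsing
    # cannot fail on an undeclared entity.
    cleaned = _UNKNOWN_ENTITY.sub(" ", xml_text)
    root = ET.fromstring(cleaned)

    # Assumption: the id is the text of a descendant element.
    doc_id = root.findtext(f".//{id_tag}")

    words = []
    for tag in text_tags:
        for elem in root.iter(tag):
            # itertext() also collects text nested inside child elements.
            words.extend("".join(elem.itertext()).split())
    return doc_id, words


sample = (
    "<doc><id>42</id>"
    "<title>Caf&eacute; &amp; bar</title>"
    "<body>Hello &nbsp;world</body></doc>"
)
doc_id, words = extract_words(sample, ["title", "body"], "id")
```

The obvious costs: a word containing an entity gets cut short ("Caf&eacute;" becomes "Caf"), and the substitution also touches entity-like text inside CDATA sections. That would be acceptable for my term vectors, but I'd prefer a parser option that simply skips undeclared entities if one exists.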
Thanks,
Rob