in reply to Re^3: a question about making a word frequency matrix
in thread a question about making a word frequency matrix
I'd just like to spread information about internationalization for the American monk who naïvely thinks other languages all use 8859_1, just a handful of accented letters.
I apologize. This was indeed a bit rude of me:
I admit this encoding problem is just a minor nit, and that it's not central to the OP's problem. There's just one reason why I mentioned it: you included accented characters in your examples.
It was a 5-minute throwaway script I just tossed off to give an idea of how the problem could be approached. Sorry I can't live up to the high standards of ambrus, who naively believes that every quick and dirty one-off script should be perfect in every way and cover every eventuality.
Very true. I often fall into this mistake.