I'd just like to spread information about internationalization for the American monk who naïvely thinks other languages all use 8859_1, just a handful of accented letters.
I apologize. This was indeed a bit rude of me:
I admit this encoding problem is just a minor nit, and that it's not central to the problem of the OP. There's just one reason why I mentioned it: you included accented characters in your examples.
It was a 5-minute throwaway script I just tossed off to give an idea of how the problem could be approached. Sorry I can't live up to the high standards of ambrus, who naively believes that every quick-and-dirty one-off script should be perfect in every way and cover every eventuality.
Very true. I often fall into this mistake.
In reply to Re^4: a question about making a word frequency matrix
by ambrus
in thread a question about making a word frequency matrix
by peacekorea