Hi allolex. There is a problem that nobody has yet mentioned. It concerns this line:
next if $element =~ /[^A-Za-zĄ-’]/;
This is doing a lot more than you want it to, I think. Basically, it means "skip any $element containing a character not in the set defined between the square brackets". It is therefore discarding any 'word' with punctuation attached. For example, in a sentence such as:
"Shut up!" he said.
you are throwing away three quarters of your 'words' (everything except he)! And you are also, of course, ignoring hyphenated words.
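Here is a minimal sketch you can run to see the effect. It assumes your script splits the text on whitespace (I haven't checked how you actually build $element), and I've left the accented-letter range out of the character class to keep it ASCII-only:

```perl
use strict;
use warnings;

# Assumption: elements come from a plain whitespace split.
my @elements = split ' ', '"Shut up!" he said.';

my @kept;
foreach my $element (@elements) {
    # Same test as in the script, minus the accented-letter range:
    # skip any element containing a character that is not a letter.
    next if $element =~ /[^A-Za-z]/;
    push @kept, $element;
}

print "kept: @kept\n";    # prints "kept: he"
```

Only he gets through, because the other three elements each carry a quote mark, an exclamation mark or a full stop.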
It also means that the line:
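$element =~ s/[\s\,\!\?\.\-\_\;\)\(\"\']//g;
never actually does anything, with or without the surplus backslashes: any $element that gets as far as this line contains only characters from the set above, so there is nothing left for the substitution to remove.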
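One possible way round this, purely as a sketch and not necessarily what you want, is to strip the surrounding punctuation first and only then test what is left. The sub name clean_words is made up for illustration, the character classes are ASCII-only, and allowing an internal hyphen is my assumption about how you would like hyphenated words treated:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the 'words' in $text with leading and
# trailing punctuation removed. ASCII letters only, for simplicity.
sub clean_words {
    my ($text) = @_;
    my @words;
    foreach my $element (split ' ', $text) {
        # Strip non-letters from both ends of the element first...
        $element =~ s/^[^A-Za-z]+|[^A-Za-z]+$//g;
        next if $element eq '';
        # ...then skip anything that still holds a character other
        # than a letter or an internal hyphen.
        next if $element =~ /[^A-Za-z-]/;
        push @words, $element;
    }
    return @words;
}

print join(' ', clean_words('"Shut up!" he said.')), "\n";
# prints "Shut up he said"
```

Done in this order, the substitution has something to work on and the test only rejects elements with odd characters in the middle.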
hth
dave
In reply to Re: Constructive criticism of a dictionary / text comparison script
by Not_a_Number
in thread Constructive criticism of a dictionary / text comparison script
by allolex