Given that this has ultimately come from a web page, I suspect that it might be some sort of non-core character. What I would suggest is that you work out which characters are allowable.
Then use a regular expression like /^([\w\-]+)/ and take $1 to extract the string of interest. Ideally you should make your regular expression match what is allowable as closely as possible. I doubt that anything malicious is going on here, but in general, suspicious characters entered via web pages are a common form of attack on the internet. Taint mode (-T) is the usual defence in the Perl world, and you may want to read up on it even if you decide it is overkill in your case.
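To make that concrete, here is a minimal sketch of the whitelist-and-capture idea. The sub name extract_clean and the sample input (a non-breaking space sitting in the middle of the string) are made up for illustration; your real data and your set of allowable characters will differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the leading run of allowed characters
# (word characters and hyphens), or undef if the input does not start with one.
sub extract_clean {
    my ($raw) = @_;
    return unless defined $raw;

    # Same pattern as suggested above; tighten the character class
    # to match exactly what your application allows.
    if ( $raw =~ /^([\w\-]+)/ ) {
        return $1;
    }
    return;
}

# Assumed example input: a non-breaking space (0xA0) picked up from a web page.
my $input = "some-name\x{A0}trailing";
my $clean = extract_clean($input);

print defined $clean ? "clean: $clean\n" : "rejected\n";
```

If you do run under -T, capturing with a regex like this is also the standard way to untaint a value, which is one more reason to keep the pattern as strict as you can.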