Weird underscore/whitespace failing regex

d-napizzle has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Weird underscore/whitespace failing regex by JavaFan (Canon) on May 12, 2009 at 23:03 UTC
Also, when I paste that string into non-emacs things, as you can see, there is no underscore, but there is whitespace. Look, puny mortal, are you questioning Emacs? The Holy Developers Environment written by God¹ Himself? If Emacs shows a red underscore, there is a red underscore; do not question Emacs by using software written by false gods² which cannot display red underscores. The correct way to replace a red underscore is: $str =~ s/_/-/; Getting a red underscore in Emacs is easy - just hit the right 7 keys at once, and Emacs will start up the underscore wizard. You find the keys to hit in the info page (never use the manual page!). Inferior editors will not be able to do this. ¹Beard, long hair, open arms, must be a god. ²Shaven, has seen a haircutter recently, black turtleneck, cannot be a true god.
Re^2: Weird underscore/whitespace failing regex by Fletch (Bishop) on May 13, 2009 at 16:57 UTC
What does god need with a ~~starship~~ editor?</Kirk> The cake is a lie. The cake is a lie. The cake is a lie.
Re: Weird underscore/whitespace failing regex by almut (Canon) on May 12, 2009 at 22:50 UTC
Just a wild guess... Maybe there really is no mysterious character in between 6 and 1, and it's just some ultra-clever emacs mode (inadvertently activated) that's trying to alert you that there should be something other than a space in between 6 and 1 (maybe it's expecting an operator, or some such?)... What happens when you delete the weird character and enter a new space (instead of an underscore)?
Re: Weird underscore/whitespace failing regex by moritz (Cardinal) on May 12, 2009 at 22:39 UTC
`use Data::Dumper; $Data::Dumper::Useqq = 1; print Dumper($str);` This will show you what's inside your `$str`.
Re: Weird underscore/whitespace failing regex (nbsp) by tye (Sage) on May 12, 2009 at 23:56 UTC
I bet it is a non-breaking space. Try `s/\xa0/-/`. Instead of emacs, use something that doesn't mind making things uglier but clearer, such as recommended elsewhere in this thread. - tye
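A minimal sketch of the substitution suggested above, assuming the mystery character really is a single non-breaking space (0xA0) sitting between the 6 and the 1. The sample string and the variable names are made up for illustration, and the `/g` flag is added here so that every occurrence is replaced rather than only the first.

```perl
use strict;
use warnings;

# Hypothetical sample: "6", a non-breaking space (0xA0), then "1/2".
my $str = "6\xa01/2";

# Copy the string, then replace every non-breaking space with a hyphen.
(my $fixed = $str) =~ s/\xa0/-/g;

print "$fixed\n";   # prints 6-1/2
```

If the data were stored as UTF-8 encoded bytes rather than Latin-1, the non-breaking space would be two bytes (0xC2 0xA0), so it is worth checking a dump of the string before settling on the pattern.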
Re: Weird underscore/whitespace failing regex by przemo (Scribe) on May 12, 2009 at 22:20 UTC
I'm not sure I understand correctly, but please send us a hex dump of the file (e.g. with `hd -C file` on Linux). Then it will be evident what mysterious bytes live under your text.
Re^2: Weird underscore/whitespace failing regex by ikegami (Patriarch) on May 12, 2009 at 22:40 UTC
Since we're dealing with a var, `use Data::Dumper qw( Dumper ); $Data::Dumper::Useqq = 1; print(Dumper($str));` or `use Devel::Peek qw( Dump ); Dump($str);`
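For a quick look without any modules, the same idea can be sketched by hand: print each character of the string as a hex code so that anything outside plain ASCII stands out. The helper name `char_codes` and the sample string are hypothetical, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: render each character of a string as a hex code.
sub char_codes {
    my ($s) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $s;
}

# Sample string with a non-breaking space (0xA0) between "6" and "1".
my $str = "6\xa01/2";

print char_codes($str), "\n";   # prints 36 A0 31 2F 32
```

An ordinary space would show up as 20 in that output, so the A0 is easy to spot.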
Re: Weird underscore/whitespace failing regex by d-napizzle (Initiate) on May 13, 2009 at 13:54 UTC
Wow, I didn't think anyone would actually respond to this. @JavaFan Best. Response. Ever. I tried running `use Data::Dumper; $Data::Dumper::Useqq = 1; print Dumper($str);` from the command line with `print Dumper("6 1/2")` but what's interesting is that every time I would copy/paste "6 1/2" from the Most Holiest of Editors Emacs, it would come out as "1/26" on the command line. Then Dumper would say it just contains "1/26", which is not helpful. So I gave up on that and put the code right in my script. The results from that were much more interesting: `$VAR1 = "6\2401\\/2";` So, it looks like I have a \240 running amok in my data. Googling tells me this should actually show up as ð but The Most Righteous of Editors was displaying my data in iso-latin-1-unix encoding, as opposed to straight-up UTF-8. Thanks so much, monks.
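One note on reading that output, offered as a sketch rather than a diagnosis: in a Perl double-quoted string, a backslash followed by digits is an octal escape, so `\240` would be character 0xA0 (decimal 160, the non-breaking space guessed earlier in the thread), while ð is character 0xF0 (decimal 240). The arithmetic can be checked directly; the variable name below is made up for illustration.

```perl
use strict;
use warnings;

# Interpret "240" as an octal number, the way "\240" is read in a string.
my $octal_240 = oct('240');

printf "octal 240 = decimal %d = hex %X\n", $octal_240, $octal_240;
# prints: octal 240 = decimal 160 = hex A0

printf "decimal 240 = hex %X\n", 240;
# prints: decimal 240 = hex F0
```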