http://qs1969.pair.com?node_id=996483


in reply to Re^3: This looks like whitespace in my CSV but doesn't seem to be
in thread This looks like whitespace in my CSV but doesn't seem to be

I need to get a newer perl as I'm on 5.10.1 and /u throws an error.

Cheers though, I'll look into how I can do this.

Walking the road to enlightenment... I found a penguin and a camel on the way.....
Fancy a yourname@perl.me.uk? Just ask!!!

Re^5: This looks like whitespace in my CSV but doesn't seem to be
by Anonymous Monk on Sep 30, 2012 at 09:35 UTC

    You can always use unicode-regex-range-character-class.pl

    space => [\u0009-\u000D\u0020\u0085\u00A0\u1680\u180E\u2000-\u200A\u2028-\u2029\u202F\u205F\u3000]

    so

    $ perl -pe " s{\\u(....)}{\\x{$1}}g "
    [\u0009-\u000D\u0020\u0085\u00A0\u1680\u180E\u2000-\u200A\u2028-\u2029\u202F\u205F\u3000]
    [\x{0009}-\x{000D}\x{0020}\x{0085}\x{00A0}\x{1680}\x{180E}\x{2000}-\x{200A}\x{2028}-\x{2029}\x{202F}\x{205F}\x{3000}]
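    The double quoting in that one-liner suits a Windows shell; under a Unix shell the "$1" would be expanded before perl sees it. A minimal sketch of the same conversion as a script, which avoids shell quoting altogether (u_to_x is a made-up helper name, not part of unicode-regex-range-character-class.pl):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: rewrite \uXXXX escapes as Perl-style \x{XXXX} escapes.
# Uses the copy-then-substitute idiom so it also works on 5.10 (no /r flag).
sub u_to_x {
    my ($class) = @_;
    (my $out = $class) =~ s{\\u([0-9A-Fa-f]{4})}{\\x{$1}}g;
    return $out;
}

# Single quotes keep the backslashes literal in the input.
print u_to_x('[\u0009-\u000D\u0020\u0085\u00A0]'), "\n";
```

    The result can be pasted straight into a regex character class.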

    Thus

    #!/usr/bin/perl --
    use warnings; use strict;
    use Data::Dump;
    $_ = qq{\xC2\xA01.00};
    utf8::decode($_);
    dd[$_];
    s{^[\x{0009}-\x{000D}\x{0020}\x{0085}\x{00A0}\x{1680}\x{180E}\x{2000}-\x{200A}\x{2028}-\x{2029}\x{202F}\x{205F}\x{3000}]+}{}g;
    dd[$_];
    __END__
    ["\xA01.00"]
    ["1.00"]

    Although, in 5.10 you could probably just use  s{^\p{space}+}{}g;
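    A minimal sketch of that shorter variant, assuming the CSV field arrives as raw UTF-8 bytes (strip_leading_space is a hypothetical helper name, and the sample field is the same one used in the script above):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: strip leading Unicode whitespace from a UTF-8 byte string.
sub strip_leading_space {
    my ($bytes) = @_;
    my $text = $bytes;
    utf8::decode($text);          # bytes -> characters; "\xC2\xA0" becomes U+00A0
    $text =~ s{^\p{Space}+}{};    # \p{Space} covers NBSP and the other Unicode spaces
    return $text;
}

my $field = "\xC2\xA01.00";       # no-break space followed by 1.00
print "[", strip_leading_space($field), "]\n";
```

    This does not need the /u modifier, so it should not trip the error seen on 5.10.1.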