in reply to Re^2: Hacker News titles using U+2013 EN DASH
in thread Hacker News titles using U+2013 EN DASH

Thanks. I hadn't done the character conversion correctly. I needed s/\x{2016}/.../. There were three other characters I found in the recent feed as well: U+2019, U+201C, and U+201D. If you see any others needing to be converted, please drop me a note. Thanks!

Re^4: Hacker News titles using U+2013 EN DASH
by NERDVANA (Priest) on Jan 11, 2024 at 04:19 UTC
    Maybe hit the remainder with Text::Unidecode if you don't want any UTF-8 output?
      Text::Unidecode

      Is that in the standard library? I don't have the ability to install modules on this system. (Maybe Corion can do it.)

      And would this be all I need? i.e. will it cover the four that I've already done?

      Thanks!

      Today's latest and greatest software contains tomorrow's zero day exploits.
        Definitely a CPAN module.
        perl -E 'use Text::Unidecode; say unidecode("\x{2016} \x{2019} \x{201C} \x{201D}")'
        || ' " "

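        Are you sure you had the right codepoint above for 2016? U+2016 is DOUBLE VERTICAL LINE (hence the "||" in the output); the EN DASH from the thread title is U+2013.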
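
For a system where CPAN modules cannot be installed, the four conversions discussed in this thread can be sketched in core Perl alone. This is only a minimal sketch under some assumptions: the helper name asciify and the %ascii_for table are hypothetical, the dash is taken to be U+2013 (as in the thread title) rather than 2016, the ASCII replacements are one reasonable choice, and the input is assumed to be an already decoded character string rather than raw UTF-8 bytes.

```perl
use strict;
use warnings;

# Hypothetical lookup table: typographic character => ASCII stand-in.
# Assumes the dash is U+2013 EN DASH, per the thread title.
my %ascii_for = (
    "\x{2013}" => '-',    # EN DASH
    "\x{2019}" => "'",    # RIGHT SINGLE QUOTATION MARK
    "\x{201C}" => '"',    # LEFT DOUBLE QUOTATION MARK
    "\x{201D}" => '"',    # RIGHT DOUBLE QUOTATION MARK
);

# Build one alternation from the table keys so new entries need no regex edits.
my $pattern = join '|', map { quotemeta } keys %ascii_for;

# Hypothetical helper: replace every listed character in a decoded string.
sub asciify {
    my ($title) = @_;
    $title =~ s/($pattern)/$ascii_for{$1}/g;
    return $title;
}

print asciify("Ask HN \x{2013} \x{201C}Perl\x{2019}s future\x{201D}"), "\n";
```

Any character not in the table passes through unchanged, so further codepoints found in the feed would need their own entries; Text::Unidecode covers far more if it can be installed.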