Hacker News titles using U+2013 EN DASH

Re: Hacker News titles using U+2013 EN DASH by hippo (Archbishop) on Jan 09, 2024 at 10:19 UTC
Tangentially, does anyone know why they are using this character in preference to the universally unproblematic `HYPHEN-MINUS` (0x2D) in the first place? 🦛
Re^2: Hacker News titles using U+2013 EN DASH by Anonymous Monk on Jan 09, 2024 at 14:15 UTC
Why not?
Re^3: Hacker News titles using U+2013 EN DASH by bliako (Abbot) on Jan 11, 2024 at 13:36 UTC
`perl -MEncode -e 'print "because it idiotically takes ".length(encode_utf8("\x{2013}"))." bytes to say what ".length(encode_utf8("\x{2d}"))." byte can say as clearly, dummy.\n"'`
Re: Hacker News titles using U+2013 EN DASH by jdporter (Paladin) on Jan 09, 2024 at 15:33 UTC
I've put in a 'fix' - won't know if it works until HN puts out a title with the offensive string in it again. Thanks!
Re^2: Hacker News titles using U+2013 EN DASH by kcott (Archbishop) on Jan 10, 2024 at 00:43 UTC
Thanks. Both Slashdot nodelet and HackerNews nodelet are rendering without content. I'll keep monitoring. — Ken
Re^3: Hacker News titles using U+2013 EN DASH by jdporter (Paladin) on Jan 10, 2024 at 22:02 UTC
Thanks. I hadn't done the character conversion correctly. I needed `s/\x{2016}/.../`. There were three other characters I found in recent feed as well: 2019, 201C, and 201D. If you see any others needing to be converted, please drop me a note. Thanks!
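One way to spot further characters that might need converting is to list every non-ASCII code point in a title along with its Unicode name. The following is only a sketch: the sub name `non_ascii_report` and the sample title are made up for illustration, and it assumes the title has already been decoded into Perl characters.

```perl
use strict;
use warnings;
use charnames ();

# Return a sorted list of "U+XXXX NAME" strings, one per distinct
# non-ASCII character found in the given (already decoded) string.
sub non_ascii_report {
    my ($text) = @_;
    my %seen;
    $seen{ ord $1 }++ while $text =~ /([^\x00-\x7F])/g;
    return map {
        sprintf 'U+%04X %s', $_, charnames::viacode($_) // '(unnamed)'
    } sort { $a <=> $b } keys %seen;
}

# Hypothetical title, for illustration only.
my $title = "Show HN: \x{201C}Foo\x{201D} \x{2013} it\x{2019}s a parser";
print "$_\n" for non_ascii_report($title);
```

For the sample title this lists U+2013, U+2019, U+201C and U+201D, each followed by its name.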
Re^4: Hacker News titles using U+2013 EN DASH by NERDVANA (Priest) on Jan 11, 2024 at 04:19 UTC
Re^5: Hacker News titles using U+2013 EN DASH by jdporter (Paladin) on Jan 11, 2024 at 04:36 UTC
Re: Hacker News titles using U+2013 EN DASH by bliako (Abbot) on Jan 11, 2024 at 13:49 UTC
How about converting it to just a plain ascii hyphen `\x{2D}`? Content will still be the same but more accessible and smaller.
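A minimal sketch of that ASCII-only idea, using `tr///`. Extending it from the en dash to the em dash and the curly quotes is my assumption, not something proposed above, and `to_ascii_punct` is a made-up name. The input is assumed to be a decoded character string.

```perl
use strict;
use warnings;

# Map a few typographic characters to their closest ASCII equivalents:
# en dash and em dash -> hyphen-minus, curly single quotes -> apostrophe,
# curly double quotes -> straight double quote.
sub to_ascii_punct {
    my ($text) = @_;
    # The hyphens in the replacement list are escaped so that tr///
    # does not read them as a range operator.
    $text =~ tr/\x{2013}\x{2014}\x{2018}\x{2019}\x{201C}\x{201D}/\-\-''""/;
    return $text;
}

print to_ascii_punct("\x{201C}Foo\x{201D} \x{2013} it\x{2019}s a parser"), "\n";
# prints: "Foo" - it's a parser
```

The trade-off is that the typographic distinction is lost, which may or may not matter for a nodelet of headlines.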
Re^2: Hacker News titles using U+2013 EN DASH by jdporter (Paladin) on Jan 11, 2024 at 15:41 UTC
Here are the conversions currently as implemented: `s/\x{2013}/&ndash;/g; s/\x{2019}/'/g; s/\x{201C}/"/g; s/\x{201D}/"/g;` This seems like a suboptimal approach to me. Does anyone have any better ideas?
Today's latest and greatest software contains tomorrow's zero day exploits.
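One possible alternative, offered only as a sketch: rather than listing characters one by one, replace every non-ASCII character with an HTML numeric character reference. This assumes the nodelet output is HTML and that the feed text has been decoded into Perl characters. The sub name `escape_non_ascii` is hypothetical.

```perl
use strict;
use warnings;

# Replace every character outside the ASCII range with an HTML numeric
# character reference, e.g. U+2013 becomes "&#x2013;".
sub escape_non_ascii {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf '&#x%X;', ord $1/ge;
    return $text;
}

print escape_non_ascii("HN \x{2013} \x{201C}title\x{201D}"), "\n";
# prints: HN &#x2013; &#x201C;title&#x201D;
```

This needs no maintenance when a new character turns up in the feed, at the cost of less readable markup than named entities.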
Re^3: Hacker News titles using U+2013 EN DASH by kcott (Archbishop) on Jan 11, 2024 at 19:16 UTC
This is a response to what you have here plus other posts throughout this thread.

- Slashdot nodelet and HackerNews nodelet both have content now. (ref. #11156828)
- I checked `U+2013 EN DASH` in a number of places: all seem to be rendered correctly. (ref. OP)
- In #11156847 you wrote "I needed `s/\x{2016}/.../`.": that's a "`‖`" character. `U+2016 DOUBLE VERTICAL LINE` may be needed but, from the context, and the fact that this is only mentioned once, I wondered if this might be a typo.
- `U+201C LEFT DOUBLE QUOTATION MARK` and `U+201D RIGHT DOUBLE QUOTATION MARK` are the pair "`“`" & "`”`". Instead of converting both to "`"`", perhaps using "`&ldquo;`" & "`&rdquo;`" might be a better option. (ref. #11156847 and #11156884)
- `U+2019 RIGHT SINGLE QUOTATION MARK` is perhaps being used as a fancy apostrophe; I'm not seeing an example at the time of writing. Similar to the last dot point, you might want to proactively address the potential `U+2018 LEFT SINGLE QUOTATION MARK` and `U+2019 RIGHT SINGLE QUOTATION MARK`, being the pair "`‘`" & "`’`". And in the same vein, "`&lsquo;`" & "`&rsquo;`" might be better options. (ref. #11156847 and #11156884)

Rather than a whole bank of individual `s///g`, each of which needs to be run for every string, I'd be more inclined to use a lookup table and a single `s///g`, which only needs to be run once for every string. Something along these lines:

    $ perl -Mutf8 -C -E '
        my %ent_for_char = (
            "\x{2013}" => "&ndash;",
            "\x{2018}" => "&lsquo;",
            "\x{2019}" => "&rsquo;",
            "\x{201c}" => "&ldquo;",
            "\x{201d}" => "&rdquo;",
        );
        my $test_str = "“fancy double” – ‘fancy single’ – fancy’apostrophe";
        say $test_str;
        $test_str =~ s/(.)/exists $ent_for_char{$1} ? $ent_for_char{$1} : $1/eg;
        say $test_str;
    '
    “fancy double” – ‘fancy single’ – fancy’apostrophe
    &ldquo;fancy double&rdquo; &ndash; &lsquo;fancy single&rsquo; &ndash; fancy&rsquo;apostrophe

You can modify the table (e.g. add `"\x{2014}" => "&mdash;",`) without requiring any changes to the code doing the processing.

— Ken
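A possible refinement of the lookup-table idea, again only a sketch: build the character class from the table's keys, so the substitution matches only characters that have an entry and does not run the replacement code for every character. The entity names in this table and the sub name `to_entities` are my assumptions.

```perl
use strict;
use warnings;

# Lookup table: typographic character => HTML entity.
my %ent_for_char = (
    "\x{2013}" => '&ndash;',
    "\x{2018}" => '&lsquo;',
    "\x{2019}" => '&rsquo;',
    "\x{201C}" => '&ldquo;',
    "\x{201D}" => '&rdquo;',
);

# Build a character class such as [\x{2013}\x{2018}...] from the keys,
# so adding a table entry automatically extends the pattern.
my $class = join '', map { sprintf '\x{%X}', ord } keys %ent_for_char;
my $re    = qr/([$class])/;

sub to_entities {
    my ($text) = @_;
    $text =~ s/$re/$ent_for_char{$1}/g;
    return $text;
}

print to_entities("\x{201C}fancy double\x{201D} \x{2013} fancy\x{2019}apostrophe"), "\n";
# prints: &ldquo;fancy double&rdquo; &ndash; fancy&rsquo;apostrophe
```

The table remains the only thing to edit when another character needs handling.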
Re^4: Hacker News titles using U+2013 EN DASH by jdporter (Paladin) on Jan 11, 2024 at 20:19 UTC