Hi Crosis,

I'm going to write provocatively about a series of topics, one comment per topic. My thesis for each will be that Perl is strong in some particular area in which Python is weak.

This one's about text processing that involves text segmentation (i.e. character or substring processing) of Unicode text.

In a nutshell, Perl is a world leader in getting this right. The Perl 5 community has blazed a trail in supporting devs who have to deal with all the fiddly details, in as practical a manner as it could manage given its existing runtime and standard library functions. Perl 6 has blazed a trail by developing a new runtime and standard library that make it easy for mere mortals to get the right results without needing a degree in Emoji data science.

In the meantime, the Python language, string type, standard library, and docs all entirely ignore the pieces necessary for getting text segmentation right per Unicode Standard Annex #29 (linked above), so it is all but impossible for an ordinary dev to correctly segment arbitrary Unicode text in Python 3.7.
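
To make that concrete, here's a rough Python sketch of the gap I mean. It shows that `len()` and slicing work on code points rather than on what a user would call a character. The `naive_clusters` helper is something I made up for illustration. It only glues combining marks onto the preceding code point, so it is nowhere near a real Annex #29 implementation, and the flag example shows one place where it falls over.

```python
import unicodedata


def naive_clusters(text):
    """Group each code point with any combining marks that follow it.

    Hypothetical helper for illustration only. It is NOT an implementation
    of the Annex #29 rules: it knows nothing about regional indicator
    pairs, ZWJ emoji sequences, Hangul jamo, and so on.
    """
    clusters = []
    for ch in text:
        # A non-zero canonical combining class marks a combining character.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: four user-visible characters.
decomposed = "cafe\u0301"
len_decomposed = len(decomposed)    # counts code points, not characters
reversed_str = decomposed[::-1]     # the accent gets detached from its 'e'
naive = naive_clusters(decomposed)  # the helper handles this simple case

# Two regional indicator code points that are displayed as a single flag.
flag = "\U0001F1E9\U0001F1EA"
len_flag = len(flag)
naive_flag = naive_clusters(flag)   # the naive helper splits the flag in two
```

The helper copes with the accented "e", but it still cuts the flag in half, and handling that case properly means implementing the annex's rules and data tables yourself.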

Feel free to ask what the heck I'm talking about if it's not obvious from what I've written and the link I provided.

If you follow up on this comment I'll post another topic so we can keep things rolling. And if you comment on that, I'll post on another topic. I think I've got maybe 10 if you've got the stamina...

Hi monks, hope you're all doing well.


In reply to Re: Curious about Perl's strengths in 2018 by raiph
in thread Curious about Perl's strengths in 2018 by Crosis
