in reply to Re^9: Any good ways to handle NARROW NO-BREAK SPACE characters in regex in newer versions of Perl?

Why do you use literals in your source code?

You have UTF-8 in your source code:

my $blah = "Screenshot-2024-02-23-at-1.05.14 AM.png";

... but you never tell Perl (with `use utf8;`) that your source code should be read as UTF-8.

Don't use UTF-8 in your source code unless you also tell Perl about it.
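
One way to sidestep the problem is to keep the source pure ASCII and spell the character as an escape. This is only a sketch: it assumes the character between "14" and "AM" is U+202F NARROW NO-BREAK SPACE, and `$name` is a made-up variable name.

```perl
use strict;
use warnings;

# Keep the source pure ASCII: \x{202F} is U+202F NARROW NO-BREAK SPACE.
# (Assumption: this is the character that sits between "14" and "AM".)
my $name = "Screenshot-2024-02-23-at-1.05.14\x{202F}AM.png";

# Alternative, not shown literally here: put "use utf8;" at the top of the
# file, save the file as UTF-8, and paste the character in directly.

# The regex uses the same escape, so it looks for one character,
# not for the three bytes of its UTF-8 encoding.
my $matched = $name =~ /\d\x{202F}(?:AM|PM)\.png\z/ ? 1 : 0;

print "matched: $matched, length: ", length($name), "\n";
```

With the escape there is no invisible character in the file to lose or mangle when copying code around.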

Also, you will already have noticed that your regular expression matches against the decoded filename.
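
To illustrate the difference between the raw and the decoded filename, here is a minimal sketch. It assumes the name arrives as UTF-8 encoded bytes (as it might from `readdir`); the shortened filename and the variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Bytes as a filesystem call might return them (hypothetical name).
# E2 80 AF is U+202F NARROW NO-BREAK SPACE encoded as UTF-8.
my $raw = "1.05.14\xE2\x80\xAFAM.png";

# Pattern written in terms of characters, not bytes.
my $re = qr/\x{202F}(?:AM|PM)/;

# Against the undecoded bytes there is no character U+202F to find.
my $raw_matches = $raw =~ $re ? 1 : 0;

# Decode the UTF-8 bytes into Perl characters, then match again.
my $decoded         = decode('UTF-8', $raw);
my $decoded_matches = $decoded =~ $re ? 1 : 0;

print "raw bytes: ", ( $raw_matches     ? "match" : "no match" ), "\n";
print "decoded:   ", ( $decoded_matches ? "match" : "no match" ), "\n";
```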

Re^11: Any good ways to handle NARROW NO-BREAK SPACE characters in regex in newer versions of Perl?
by nysus (Parson) on Aug 13, 2024 at 17:49 UTC

    The PerlMonks website added them in; I just forgot to delete them. They are narrow no-break space characters, though.

    $PM = "Perl Monk's";
    $MC = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest Vicar Parson";
    $nysus = $PM . ' ' . $MC;
    Click here if you love Perl Monks

      Yes, but those are what I mean by "literals". If your source code contains any byte above 127 and is not Latin-1, you need to tell Perl what encoding it is in.

      You have something that is UTF-8, but you are not telling Perl that your source code contains UTF-8.
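
      A small sketch of what that means in practice, assuming the file itself is saved as UTF-8. The "é" stands in for any character above 127, and the variable names are made up:

      ```perl
      use strict;
      use warnings;

      # Same literal twice; only the pragma in scope differs.
      # Without "use utf8", Perl sees the two bytes of the UTF-8 encoding.
      my $as_bytes = do { no utf8; "é" };

      # With "use utf8", Perl decodes the literal into one character.
      my $as_chars = do { use utf8; "é" };

      printf "without use utf8: length %d\n", length $as_bytes;
      printf "with use utf8:    length %d\n", length $as_chars;
      ```

      Because `use utf8` is lexically scoped, it has to be in effect where the literal (or the regex containing it) is written.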

        Well, through all this frustration I learned something. I mistakenly assumed utf8 was turned on with `use v5.36`. And when I did try `use utf8`, other problems in my code masked the issue.
