htmanning has asked for the wisdom of the Perl Monks concerning the following question:

Monks, I've done this a thousand times before, but for some reason this isn't working. I'm trying to escape, or preferably replace, double quotes that are copied from a website. I'm doing it like this, but Perl is not recognizing the quotes at all.
$string = 'copied from a web page looking like this: “Test.” ';
$string =~ s/“//g;
$string =~ s/”//g;
Am I just having a brain fart moment? Perl seems to not even recognize the quotes and just lets them go through. I have ruled out simple things like the string name being wrong or something.

Re: Escaping or removing quotes
by LanX (Saint) on Jun 17, 2021 at 22:43 UTC
    Copying your code from PM works for me, but I'm pretty sure you have a Unicode issue.

    compare

    use strict;
    use warnings;
    use Data::Dump qw/pp dd/;
    #use utf8;
    my $string = 'copied from a web page looking like this: “Test.” ';
    dd $string;
    $string =~ s/“//g;
    $string =~ s/”//g;
    dd $string;

    Without utf8, strings are an octet stream:

    C:/Strawberry/perl/bin\perl.exe -w d:/tmp/pm/replace_quotes.pl
    "copied from a web page looking like this: \xE2\x80\x9CTest.\xE2\x80\x9D "
    "copied from a web page looking like this: Test. "

    With utf8, they are in Perl's internal Unicode form, one code point per character:

    C:/Strawberry/perl/bin\perl.exe -w d:/tmp/pm/replace_quotes.pl
    "copied from a web page looking like this: \x{201C}Test.\x{201D} "
    "copied from a web page looking like this: Test. "

    You need to make sure that the characters in your file (i.e. the way it's saved by your editor) and the web input are both in the same encoding, because “ and ” are not ASCII characters.
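One way to sidestep the editor-encoding question is to decode the incoming bytes and match the quotes by code point instead of typing them literally. The following is only a minimal sketch: it assumes the web input arrives as UTF-8 bytes, and strip_curly_quotes is a hypothetical helper name, not something from the original post.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes UTF-8 encoded bytes (assumed input encoding)
# and returns UTF-8 encoded bytes with curly double quotes removed.
sub strip_curly_quotes {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);    # bytes -> characters
    $text =~ s/[\x{201C}\x{201D}]//g;      # U+201C and U+201D, written as escapes
    return encode('UTF-8', $text);         # characters -> bytes
}

# The same bytes that the dump without utf8 showed
my $raw = "looking like this: \xE2\x80\x9CTest.\xE2\x80\x9D ";
print strip_curly_quotes($raw), "\n";
```

Because the pattern uses \x{...} escapes, the source file itself stays pure ASCII, so it no longer matters how the editor saves it.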

    see perlunitut

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      Thanks. I added the following to the page that has the form on it.
      <meta http-equiv="Content-Type" content="text/html;charset=utf-8" >
        Please also add use utf8; if you want to use literal UTF-8 characters in your source.
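As a rough illustration of that advice, the sketch below keeps the literal curly quotes in the source and replaces them with plain ASCII double quotes, which is what the original question preferred over removal. It assumes the script file itself is saved as UTF-8; straighten_quotes is a hypothetical name chosen for this example.

```perl
use strict;
use warnings;
use utf8;                                # literals in this file are UTF-8 characters
binmode(STDOUT, ':encoding(UTF-8)');     # encode characters again on output

# Hypothetical helper: map curly double quotes to the ASCII quote (0x22).
sub straighten_quotes {
    my ($text) = @_;
    $text =~ tr/“”/""/;                  # U+201C -> ", U+201D -> "
    return $text;
}

print straighten_quotes('looking like this: “Test.”'), "\n";
```

Without the use utf8 line, each literal quote would be seen as three separate bytes and tr would no longer do a one-to-one mapping.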


Re: Escaping or removing quotes
by hippo (Archbishop) on Jun 17, 2021 at 22:41 UTC

    Are you sure they are actually the normal double-quote character (0x22) and not some stylized alternative at a different code point?
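A quick way to check is to print the code point of every non-ASCII character in the string. This is a small diagnostic sketch, assuming the string holds UTF-8 encoded bytes; non_ascii_code_points is a hypothetical helper name.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode UTF-8 bytes (assumed encoding) and list
# each non-ASCII character as U+XXXX.
sub non_ascii_code_points {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);
    return map { sprintf('U+%04X', ord($_)) } $text =~ /([^\x00-\x7F])/g;
}

my $raw = "looking like this: \xE2\x80\x9CTest.\xE2\x80\x9D ";
print join(' ', non_ascii_code_points($raw)), "\n";    # prints "U+201C U+201D"
```

If the list is empty, the quotes really are the plain 0x22 character and the problem lies elsewhere.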


    🦛