PerlMonks  

constants in an RE

by BernieC (Pilgrim)
on May 15, 2023 at 15:44 UTC ( [id://11152190]=perlquestion )

BernieC has asked for the wisdom of the Perl Monks concerning the following question:

I know how to interpolate a variable into an RE, but it looks like doing it with a constant doesn't work. My code looks something like:
use constant APOSTROPHE => "\xE2\x80\x99" ;
my $string =~ s/APOSTROPHE/'/ ;
and it doesn't seem to be working. The string I was playing with is "wjzw’xl cowuxze."

Replies are listed 'Best First'.
Re: constants in an RE
by choroba (Cardinal) on May 15, 2023 at 16:01 UTC
    The m// and s/// operators interpolate like double-quoted strings, and you can't interpolate a constant into double quotes this way, either: the bareword is taken as literal text.
    print "THIS APOSTROPHE MATCHES" =~ /APOSTROPHE/; # 1

    There are tricks for doing it, though:

    use constant APOSTROPHE => chr 39;
    print chr(39) =~ /${ \ APOSTROPHE }/; # 1

    Update: s/(?<=t)(?=icks)/r/

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: constants in an RE
by haukex (Archbishop) on May 15, 2023 at 15:58 UTC

    That's (IMHO) one of the downsides of constants. Take a look e.g. at the "baby cart" operator from perlsecret.
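
For illustration, here is a minimal sketch of the baby cart applied to the substitution from the question. The constant is the one from the original post; the sample string and the $fixed variable are made up for this example.

```perl
use strict;
use warnings;

# Byte string holding the UTF-8 encoding of U+2019, as in the original question.
use constant APOSTROPHE => "\xE2\x80\x99";

# Assumed sample input: the same three bytes embedded in a byte string.
my $string = "wjzw\xE2\x80\x99xl cowuxze.";

# @{[ ... ]} builds an anonymous array from the expression and dereferences it,
# so the constant's value is interpolated into the pattern.
(my $fixed = $string) =~ s/@{[ APOSTROPHE ]}/'/;

print "$fixed\n";
```

If the constant could ever contain regex metacharacters, wrapping the interpolation in \Q...\E would be the cautious choice.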

Re: constants in an RE
by BillKSmith (Monsignor) on May 15, 2023 at 21:30 UTC
    Using Unicode character names (Creating Unicode) avoids your problem. It has the further advantage that it eliminates confusion between similar-looking characters.
    use strict;
    use warnings;
    use utf8;
    use Test::More tests=>1;

    my $string   = "wjzw’xl cowuxze.";
    my $required = "wjzw\N{APOSTROPHE}xl cowuxze."; #"wjzw'xl cowuxze.";
    $string =~ s/\N{RIGHT SINGLE QUOTATION MARK}/'/ ;
    is( $string, $required, 'APOSTROPHE' );

    UPDATE: Added link

    Bill
Re: constants in an RE
by afoken (Chancellor) on May 15, 2023 at 18:32 UTC

    Use Readonly instead of constant.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
      > Use Readonly instead of constant.

      Years ago I started doing exactly that, only to discover strange bugs. I then completely abandoned Readonly, opting to use normal variables in all caps - it gives variables that interpolate and that can be overridden for testing. And since I mostly publish libraries where such variables are typically undocumented package variables, if someone chooses to reach into my package and break something by changing "constants", it's their fault. And yes, this philosophy only works when you don't have many users and those users are programmers...

        The module ReadonlyX claims to be a drop-in replacement for Readonly which fixes many of the bugs and avoids the time penalty. Unfortunately, it is not an exact "drop-in". It may be a good solution to your problem.
        Bill
        > such variables are typically undocumented package variables,

        FWIW: There is an old poor man's trick to make package vars readonly, by assigning refs of literals to globs.

        DB<35> *RO=\5
        DB<36> p $RO
        5
        DB<37> $RO=6
        Modification of a read-only value attempted at (eval 51) ...
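
The same glob trick can be sketched as a standalone script. This is only an illustration: $matched, $ok and $error are names invented here, and the "route 5" string is an arbitrary sample.

```perl
use strict;
use warnings;

# Declare the package variable, then alias its scalar slot to a
# reference to a literal. Literals are read-only, so $RO becomes read-only.
our $RO;
*RO = \5;

print "RO is $RO\n";

# Unlike a `use constant` name, the scalar interpolates into a pattern.
my $matched = "route 5" =~ /$RO/ ? 1 : 0;

# Assigning to it dies at run time; eval traps the error so it can be shown.
my $ok    = eval { $RO = 6; 1 };
my $error = $ok ? '' : $@;
print "assignment failed: $error" unless $ok;
```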

        HTH! :)

        Cheers Rolf
        (addicted to the 𐍀𐌴𐍂𐌻 Programming Language :)
        Wikisyntax for the Monastery

Re: constants in an RE
by bliako (Monsignor) on May 15, 2023 at 21:39 UTC

    Far fetched but you can execute perl code in BOTH a regex-match (e.g. /HERE/) and the match-side of a regex substitution (e.g. s/HERE/../) and use its return value to shape the final regex pattern (see (??{ code })). So, using (??{ code }), you can do:

    use constant APOSTROPHE => "\xE2\x80\x99" ;
    my $string = "...";
    $string =~ s/(??{ APOSTROPHE })/'/ ;

    If you want to call perl code in the replacement side of a substitution (e.g. s/../HERE/) simply add the /e (for eval) regex modifier, i.e. $string =~ s/.../APOSTROPHE/e
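
A small sketch combining both sides, assuming a recent Perl: (??{ code }) on the match side and /e on the replacement side. ASCII_APOS, the sample string and $fixed are hypothetical names introduced only for this example.

```perl
use strict;
use warnings;

use constant APOSTROPHE => "\xE2\x80\x99";
use constant ASCII_APOS => q{'};   # assumed second constant for the replacement

# Assumed sample input containing the three bytes of the fancy apostrophe.
my $string = "wjzw\xE2\x80\x99xl cowuxze.";

# Match side: (??{ ... }) runs the code and uses its return value as a pattern.
# Replacement side: /e evaluates the replacement as Perl code, so the
# constant is called instead of being treated as literal text.
(my $fixed = $string) =~ s/(??{ APOSTROPHE })/ASCII_APOS/e;

print "$fixed\n";
```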

    Perl code executes within the baby-cart operator suggested by haukex and also in choroba's ref-deref trick.

    Edit: OP's code my $string =~ s/...//; has been modified above, thanks choroba.

    bw, bliako

      Note that (??{ ... }) turns off at least some regex optimizations, since the Perl compiler has (at least in the general case) no idea what the code will actually return. Whether this is a problem depends on what you are doing.

Re: constants in an RE
by kcott (Archbishop) on May 16, 2023 at 00:03 UTC

    G'day BernieC,

    My code looks something like:

    use constant APOSTROPHE => "\xE2\x80\x99" ;
    my $string =~ s/APOSTROPHE/'/ ;

    Unlikely. Probably closer to:

    use constant APOSTROPHE => "\xE2\x80\x99" ;
    my $string = "wjzw’xl cowuxze.";
    $string =~ s/APOSTROPHE/'/ ;

    "it doesn't seem to be working"

    That's a useless error report which, after 16 years and 850 posts, you really should know.

    You've been shown an impressive number of hoops that you could jump through to force the constant into a regex. Perl's string handling functions do not have the problem you've encountered with a regex; for instance,

    my $pos = index $string, APOSTROPHE;
    substr($string, $pos, length APOSTROPHE) = q{'};
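
One thing worth guarding against in the index()/substr() approach is the no-match case. A hedged sketch follows; replace_first_apostrophe is a hypothetical helper name, not something from the thread.

```perl
use strict;
use warnings;

use constant APOSTROPHE => "\xE2\x80\x99";

# Hypothetical helper: returns a copy of the string with the first
# APOSTROPHE replaced by an ASCII apostrophe, or unchanged if there is none.
sub replace_first_apostrophe {
    my ($string) = @_;

    my $pos = index $string, APOSTROPHE;

    # index() returns -1 when the substring is absent; without this check,
    # substr() would treat -1 as an offset from the end of the string.
    return $string if $pos < 0;

    substr($string, $pos, length APOSTROPHE) = q{'};
    return $string;
}

print replace_first_apostrophe("wjzw\xE2\x80\x99xl cowuxze."), "\n";
print replace_first_apostrophe("no fancy quote here"), "\n";
```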

    [If it's important to you, they're probably also faster (Benchmark to check).]

    You should probably handle the potential case of a string containing more than one "’". If you have to deal with more than one string, you should abstract the solution into a subroutine so that you only need to code it once. If you have this requirement in more than one script, you should put that subroutine into a module so that all scripts can share the one solution.

    Here's an example of what that subroutine might look like. Note that I've done away with the constant altogether; however, if you felt that its inclusion was necessary, you could use index() and substr() as I've shown above.

    #!/usr/bin/env perl

    use 5.010;
    use strict;
    use warnings;
    use utf8;
    use open OUT => qw{:encoding(UTF-8) :std};

    my $string1 = 'abc’def’ghi’jkl';
    my $string2 = 'mno’pqr’stu’vwx';

    say '*** BEFORE ***';
    say "\$string1[$string1]";
    say "\$string2[$string2]";

    ($string1, $string2) = @{fancy_apos_to_ascii_apos(\($string1, $string2))};

    say '*** AFTER ***';
    say "\$string1[$string1]";
    say "\$string2[$string2]";

    sub fancy_apos_to_ascii_apos {
        my (@fancies) = @_;

        state $fancy_apos = '’';
        state $fancy_apos_len = length $fancy_apos;

        my $asciis = [];

        for my $string (map $$_, @fancies) {
            while ((my $pos = index $string, $fancy_apos) >= 0) {
                substr($string, $pos, $fancy_apos_len) = q{'};
            }
            push @$asciis, $string;
        }

        return $asciis;
    }

    Output:

    *** BEFORE ***
    $string1[abc’def’ghi’jkl]
    $string2[mno’pqr’stu’vwx]
    *** AFTER ***
    $string1[abc'def'ghi'jkl]
    $string2[mno'pqr'stu'vwx]

    — Ken
