KSURi has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

is there any way to get modifiers which has been applied to a compiled regular expression? Without XS.
Concretely I want to detect if a regexp is case-insensitive:
my $r1 = qr/hello/i; my $r2 = qr/hello/; is_ci($r1); # true is_ci($r2); # false

Replies are listed 'Best First'.
Re: Get regexp modifiers
by ikegami (Patriarch) on Jan 06, 2010 at 02:03 UTC

    You can tell if the case-insensitive modifier was used pretty easily:

    sub is_ci { my ($pat) = @_; my ($pos_mods) = $pat =~ /^\(\?(\w*)(?:-\w*)?:.*\)/s or return 0; return $pos_mods =~ /i/; }

    Or even the following in Perl ≥ 5.10:

    use re qw( regexp_pattern ); sub is_ci { (regexp_pattern(shift))[1] =~ /i/ }

    But you can't tell if a pattern is case-sensitive or not.

    The following will all come up as case-sensitive even though they're not:

    qr/(?i:hello)/ qr/(?i)hello/ qr/[hH][eE][lL][lL][oO]/ qr/1234/

    The following will all come up as case-insensitive even though they're not:

    qr/(?-i)hello/i qr/(?-i:hello)/i qr/he(?-i:l)lo/i

    Update Forgot to escape ? in my pattern. Fixed.

      Nice! Thank you.
Re: Get regexp modifiers
by Perlbotics (Archbishop) on Jan 06, 2010 at 00:43 UTC

    I've got the gut-feeling, that this is bad advice and there's a better way to do it... but the following works here...

    use strict; # print RE into variable and examine it... sub is_ci { my $check_re = shift; die "$check_re is not a RE" if ref $check_re ne 'Regexp'; open my $redir , '>', \my $output; print $redir $check_re; close $redir; return $output =~ /i.*?\-/; # i before '-' } my $r1 = qr/hello/i; my $r2 = qr/hello/; print "true is " , is_ci($r1) ? 'true' : 'false' , " ?\n"; # true print "false is " , is_ci($r2) ? 'true' : 'false' , " ?\n"; # false print "false is " , is_ci(\"nonsense") ? 'true' : 'false' , " ?\n"; # + die!
    HTH

    Update: Ah trustworthy guts ;-) See ikegamis response below...

    Reading this thread, I second what ww questions here - if one needs to check regexp modifiers a posteriori, that seems to be an indicator for a design problem. But maybe the regexp comes from user input... ?

      I think ref($check_re) ne 'Regexp' can fail in modern Perls. You want to use

      use re qw( is_regexp ); is_regexp($check_re)
      Or if you want to support Perl < 5.10,
      BEGIN { eval 'use re qw( is_regexp )'; if (!defined(&is_regexp)) { my $re_class = ref qr//; *is_regexp = sub($) { local *__ANON__ = 'is_regexp'; return UNIVERSAL::isa($_[0], $re_class); }; } } is_regexp($check_re)

      That said, I don't think you should be checking if it's a compiled regex pattern at all. Just checking if it looks like one (as I did) is much more tolerant.

        I think ref($check_re) ne 'Regexp' can fail in modern Perls.

        Could you explain that? (Or post a link to the relevant documentation?)

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Get regexp modifiers
by Anonymous Monk on Jan 06, 2010 at 00:49 UTC
    #!/usr/bin/perl -- use strict; use warnings; for my $regex ( qr/hello/i, qr/hello/ ){ my( $left, $right ) = $regex =~ /\(\?([^-:]*)-([^-:]*):/; print "$regex "; print "is_ci" if $left =~ /i/i; print "\n"; } __END__ (?i-xsm:hello) is_ci (?-xism:hello)
    See also YAPE::Regex
Re: Get regexp modifiers
by ww (Archbishop) on Jan 06, 2010 at 01:56 UTC
    Wouldn't it be easier to look through the source for the "qr/..." and see what it says?

    Patently, you can't do that if you don't have the script... but if you don't have the script, the whole evolution is a no op.