PerlMonks  

Posix regexes in Perl

by davies (Prior)
on Oct 07, 2021 at 18:40 UTC ( [id://11137310]=perlquestion )

davies has asked for the wisdom of the Perl Monks concerning the following question:

YAPE::Regex::Explain states that it handles regular expressions up to Perl 5.6. I am doing some work involving Postgres, which uses Posix regexes. I have found documentation indicating that all Posix character classes are in the current Perl version. I have been struggling to find anything indicating whether this has always been the case and what the position is with other aspects of regexes. It would be ideal if Perl 5.6 were a superset of Posix. It would be good if I knew which Posix features Perl 5.6 did not support and when or if support was added to Perl. Can anyone lighten my darkness, please?

Regards,

John Davies

Replies are listed 'Best First'.
Re: Posix regexes in Perl
by afoken (Chancellor) on Oct 07, 2021 at 19:12 UTC
Re: Posix regexes in Perl
by LanX (Saint) on Oct 07, 2021 at 22:32 UTC
    Hi John

    There are various alternatives for YAPE::Regex::Explain

    Could you give us a sample regex and tell us where and how it fails?

    update

    > It would be ideal if Perl 5.6 were a superset of Posix

    not sure about that, but the supported character classes are listed in the head of the module's source.

    does this help?

    my $valid_POSIX = qr{
      alpha | alnum | ascii | cntrl | digit | graph |
      lower | print | punct | space | upper | word | xdigit
    }x;
    ...
    my %macros = (
      # utf8/POSIX macros
      alpha  => 'letters',
      alnum  => 'letters and digits',
      ascii  => 'all ASCII characters (\000 - \177)',
      cntrl  => 'control characters (those with ASCII values less than 32)',
      digit  => 'digits (like \d)',
      graph  => 'alphanumeric and punctuation characters',
      lower  => 'lowercase letters',
      print  => 'alphanumeric, punctuation, and whitespace characters',
      punct  => 'punctuation characters',
      space  => 'whitespace characters (like \s)',
      upper  => 'uppercase letters',
      word   => 'alphanumeric and underscore characters (like \w)',
      xdigit => 'hexadecimal digits (a-f, A-F, 0-9)',
    );
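The class names in that list can be checked against whatever perl is installed. The following is a minimal sketch, not part of YAPE::Regex::Explain: supports_posix_class is a hypothetical helper that simply tries to compile [[:NAME:]] and reports whether it worked. It only tells you about the perl you run it on, so it says nothing about Perl 5.6 unless you run it there.

```perl
use strict;
use warnings;

# Class names copied from the $valid_POSIX list quoted above.
my @posix_classes = qw(
    alpha alnum ascii cntrl digit graph lower
    print punct space upper word xdigit
);

# Return 1 if the running perl compiles [[:NAME:]] without error, else 0.
# (supports_posix_class is a hypothetical helper, not a module function.)
sub supports_posix_class {
    my ($name) = @_;
    my $pattern = '[[:' . $name . ':]]';
    my $re = eval { qr/$pattern/ };   # an unknown class dies at compile time
    return defined $re ? 1 : 0;
}

for my $name (@posix_classes) {
    printf "%-7s %s\n", $name,
        supports_posix_class($name) ? 'ok' : 'not supported';
}
```

Note that word and ascii are Perl extensions rather than POSIX classes, so a pattern using them may not be accepted by a strictly POSIX engine.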

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Re: Posix regexes in Perl (Tcl Advanced Regular Expressions)
by LanX (Saint) on Oct 08, 2021 at 00:22 UTC
    Hi again

    I skimmed thru https://www.postgresql.org/docs/14/functions-matching.html and found some features which were new to me.

    For example, I had never heard of the metacharacters \m or \M. I tested them, and Perl only sees the escaped characters, i.e. a literal m and M.
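That test can be reproduced with a few lines. This is a minimal sketch; perl_match is a hypothetical helper and the sample strings are made up for illustration. With warnings enabled Perl normally reports the unrecognized escape, so the regexp warning category is switched off here.

```perl
use strict;
use warnings;
no warnings 'regexp';   # silence "Unrecognized escape \m passed through"

# Hypothetical helper: return 1 if $text matches $re, else 0.
sub perl_match {
    my ($text, $re) = @_;
    return $text =~ $re ? 1 : 0;
}

# In a Postgres/Tcl regex, \mcat would mean "cat at the start of a word".
# Perl has no \m escape, so it reads this as the literal string "mcat".
my $start = qr/\mcat/;

# "a cat" has a word starting with "cat" but no "mcat", so Perl does not match.
print perl_match('a cat',  $start) ? "match\n" : "no match\n";

# "tomcat" contains the literal "mcat", so Perl matches.
print perl_match('tomcat', $start) ? "match\n" : "no match\n";
```

So a pattern written for Postgres can compile in Perl and still mean something quite different.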

    > It would be ideal if Perl 5.6 were a superset of Posix.

    I'm afraid Postgres has another incompatible superset of Posix

    Further searching revealed that PG is apparently using the TCL RE-engine and

    > There are a number of important differences between Tcl Advanced Regular Expressions and Perl-style regular expressions. Tcl uses \m, \M, \y and \Y for word boundaries. Perl and most other modern regex flavors use \b and \B. In Tcl, these last two match a backspace and a backslash, respectively.
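If the goal is to feed a Postgres pattern to Perl tools, the four word-boundary escapes named in that quote could be rewritten first. The following is a rough sketch under stated assumptions: tcl_to_perl and %perl_for are hypothetical names, the lookaround equivalents for \m and \M are my own choice, and the code needs Perl 5.14 or later for the /r flag. It does not touch Tcl's \b and \B (backspace and backslash), does not treat bracket expressions specially, and ignores any difference in what the two engines count as a word character.

```perl
use strict;
use warnings;

# Hypothetical mapping from Tcl ARE word-boundary escapes to Perl equivalents.
my %perl_for = (
    'm' => '\b(?=\w)',    # start of word
    'M' => '\b(?<=\w)',   # end of word
    'y' => '\b',          # word boundary
    'Y' => '\B',          # not a word boundary
);

# Rewrite \m, \M, \y and \Y; leave every other escape untouched.
# Consuming escapes pairwise keeps "\\m" (escaped backslash, then m) intact.
sub tcl_to_perl {
    my ($tcl) = @_;
    return $tcl =~ s/\\(.)/exists $perl_for{$1} ? $perl_for{$1} : "\\$1"/gesr;
}

my $perl_src = tcl_to_perl('\mcat\M');
my $re       = qr/$perl_src/;

print "$perl_src\n";
for my $text ('a cat sat', 'tomcat', 'cats') {
    printf "%-10s %s\n", $text, $text =~ $re ? 'match' : 'no match';
}
```

Only 'a cat sat' contains cat as a whole word, so it is the only one of the three sample strings the translated pattern should match.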

    see here for more:

    good luck with that using Perl tools. :|

    I think you will be better off using something from the TCL toolchain.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      "I think you will be better off using something from the TCL toolchain"

      Patching YAPE::Regex::Explain could be easier, it's a rather compact module.

      - Ron
Re: Posix regexes in Perl
by Anonymous Monk on Oct 08, 2021 at 09:23 UTC
