harangzsolt33 has asked for the wisdom of the Perl Monks concerning the following question:

I am so frustrated. I have been reading the RegExp help for a while, and I just don't get it. All I want to do is find out whether a string is made up only of characters from a given set, and print either 1 or 0 (true or false).

Here is the character set:

0-9 - all numbers
a-zA-Z - all letters
. - the decimal point
\ - backslash
/ - forward slash
- - the dash
_ - the underline
% - percent sign
$ - dollar sign
' - single quote
() - parenthesis
{} - opening and closing brackets
& - the and sign
! - the exclamation point
~ - wave
` - tick or whatever this is
@ - the at sign
# - the comment character
^ - this guy

Any other character is NOT allowed. So, if the string contains a character that is not allowed, then I want the match to evaluate to 1, otherwise 0. This way I can tell whether the string has any illegal characters in it.

Here is the code I have written, which doesn't do anything. :-P :-(

my $STR1 = '///?///';
my $STR2 = 'ABC_ABC';
print "\n" . ($STR1 =~ /(\d\w.\/~-_!@#$%\^&{}\(\).'`)/);
print "\n" . ($STR2 =~ /(\d\w.\/~-_!@#$%\^&{}\(\).'`)/);

Re: How would you write this RegExp?
by BrowserUk (Patriarch) on Jul 26, 2016 at 21:26 UTC

    Test this:

    my $re = qr[^[0-9a-zA-Z.\\/_%$'(){}&!~`@#^-]+$];;
    print $string !~ $re ? 1 : 0;;


      Wow. This is awesome! Thank you very much!!!!!

      All I had to do was put a backslash before the $ sign, and it worked beautifully!

      $re = qr[^[0-9a-zA-Z.\\/_%\$'(){}&!~`@#^-]+$];
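For anyone wondering why the backslash matters: inside a pattern, an unescaped `$` followed by `'` may be taken by Perl as the special variable `$'` rather than as two literal characters. Below is a minimal sketch of the escaped form wrapped in a helper. The name `has_illegal` is made up for illustration and is not from the thread, and `/` is used as the delimiter, so the forward slash inside the class is escaped too.

```perl
use strict;
use warnings;

# Allowed set from the question. The $ and @ are escaped so Perl does not
# try to interpolate them; the hyphen is last, so it is a literal hyphen.
my $allowed = qr/^[0-9a-zA-Z.\\\/_%\$'(){}&!~`\@#^-]+$/;

# Hypothetical helper: 1 if the string is not made up entirely of
# allowed characters, 0 if it is.
sub has_illegal {
    my ($str) = @_;
    return $str =~ $allowed ? 0 : 1;
}

print has_illegal('ABC_ABC'), "\n";   # only allowed characters
print has_illegal('///?///'), "\n";   # '?' is not in the allowed set
```

Because the class is followed by `+`, an empty string does not match the pattern and is therefore reported as illegal by this sketch.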

Re: How would you write this RegExp?
by AnomalousMonk (Archbishop) on Jul 27, 2016 at 00:28 UTC

    I might approach the problem slightly differently, with an inverted character class:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use constant NOT_ALLOWED => qr{ [^-\x27\x23\$\@0-9a-zA-Z.\\/_%(){}&!~`^] }xms;
     ;;
     for my $str ('', qw(ABC_ABC 'A' ///?/// =*+| ABC=), ' ', qq{\t\r\n}) {
       my $status = 0 + $str =~ NOT_ALLOWED;
       print qq{[$str] $status};
       }
    "
    [] 0
    [ABC_ABC] 0
    ['A'] 0
    [///?///] 1
    [=*+|] 1
    [ABC=] 1
    [ ] 1
    [ ] 1

    Note that:
    • A ^ (caret) in the first position in a character class inverts the contents of the class.
    • A - (hyphen) in the first position in a character class (or in the first position after an initial caret, if there is one), or in the last position in the class, is a literal hyphen character. Everywhere else it is a character range operator, as in A-Z and 0-9 in the example class.
    • The $ and @ characters should always be escaped, as otherwise they will often be interpreted as the start of an interpolated variable.
    • Finally, \x27 and \x23 are how I have to represent the ' and # characters, respectively, because my REPL doesn't like these literal characters in qr// and similar operators. (Have to fix that someday.)
    (And a lot more test cases wouldn't be a bad idea.)
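As a rough sketch of what "a lot more test cases" might look like, here is the same inverted-class idea run over a small table of inputs and expected results. The helper name `has_illegal` and the particular cases are assumptions chosen for illustration, not something from the post above.

```perl
use strict;
use warnings;

# Inverted class: matches any single character outside the allowed set.
# The hyphen comes right after the caret, so it is a literal hyphen.
my $not_allowed = qr/[^-0-9a-zA-Z.\\\/_%\$'(){}&!~`\@#^]/;

# Hypothetical helper: 1 if any disallowed character is present, else 0.
sub has_illegal {
    my ($str) = @_;
    return 0 + ($str =~ $not_allowed);
}

# Each case is [ input, expected result ].
my @cases = (
    [ 'ABC_ABC', 0 ],
    [ "'A'",     0 ],
    [ 'a-b^c',   0 ],   # literal hyphen and caret are both allowed
    [ '',        0 ],   # nothing to object to in an empty string
    [ '///?///', 1 ],
    [ 'ABC=',    1 ],
    [ ' ',       1 ],
);

for my $case (@cases) {
    my ($str, $expected) = @$case;
    my $got = has_illegal($str);
    printf "[%s] got %d, expected %d%s\n",
        $str, $got, $expected, $got == $expected ? '' : '  <-- MISMATCH';
}
```

Note that with the inverted class an empty string counts as clean, which is a different answer from the anchored `[...]+` approach; which one is right depends on what the caller wants.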



Re: How would you write this RegExp?
by Anonymous Monk on Jul 26, 2016 at 21:32 UTC

    This guy is called a caret. :-)

    You will need the following:

    • 0+ arithmetic to make the result a 0/1 boolean
    • Anchors ^ and $ around the regex pattern, to force a full match
    • A character class [] in the regex for your accepted set
    • A repetition * (or +) after the character class
    Did you read the documentation?
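The four ingredients listed above can be put together roughly like this. It is only a sketch: the character class is deliberately shortened to keep it readable, and the sub names `all_allowed` and `has_illegal` are invented for the example.

```perl
use strict;
use warnings;

# 1 if every character in the string is in the (shortened) accepted set.
sub all_allowed {
    my ($str) = @_;
    # [...]  character class listing the accepted characters
    # +      repetition: one or more characters from the class
    # ^ $    anchors, so the whole string has to match
    # 0 +    arithmetic that turns the match result into a plain 0 or 1
    return 0 + ($str =~ /^[0-9a-zA-Z._-]+$/);
}

# The question wants the opposite sense (1 means "something illegal"),
# which the negated binding operator !~ gives directly.
sub has_illegal {
    my ($str) = @_;
    return 0 + ($str !~ /^[0-9a-zA-Z._-]+$/);
}

print all_allowed('abc_123'), "\n";
print has_illegal('abc?123'), "\n";
```

One caveat worth knowing: `$` also matches just before a trailing newline, so a string read from input and not chomped would still pass; `\z` is the stricter end-of-string anchor.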

      Well, I have read the documentation, but it's so long that by the time I get to the end, I have forgotten everything else. lol I downloaded an Android app called RegEx GUI, which asks a bunch of questions and is supposed to help you build a regex, but it didn't help me. :-(

      I think the most complicated part of learning the Perl language is understanding how to write and interpret regular expressions. I still don't understand how to do it. It's a big mess, and every time somebody writes a long !$,\\.m./*(/&/\/%(\/_\/*!/@Y\/%/\@\\\\$H&*(P!//(/@*&/P/@/#!/$/^/-//-^xqfa34s*(67)_w/0//P@\/UI\.\.\ I am like oh, wow!!! what is that!? :-O Haha :D

        And don't forget Damian Conway's Regexp::Debugger. It does a very good job of showing, in full, glorious, mind-numbing detail, every step of the regexp processing sequence.

        --MidLifeXis

Re: How would you write this RegExp?
by ww (Archbishop) on Jul 27, 2016 at 12:42 UTC

    This may give you a (slightly golfed) alternate start (on the letters; you have a fine solution for including non-a-zA-Z characters):

    C:>perl -E "print 'Enter a char: '; my $char = <>; print $char; print +$char =~ /[AEIOU]/i ? 'char is a vowel' : 'char is not a vowel';"
    Enter a char: v
    v
    char is not a vowel
    C:>perl -E "print 'Enter a char: '; my $char = <>; print $char; print +$char =~ /[AEIOU]/i ? 'char is a vowel' : 'char is not a vowel';"
    Enter a char: E
    E
    char is a vowel
    C:>
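The same vowel check can be written as a small script instead of a one-liner, which makes it easier to try several inputs at once. This is only a sketch, and the sub name `vowel_status` is invented here.

```perl
use strict;
use warnings;

# Same test as the one-liner: a character class with the /i flag,
# so both upper- and lower-case vowels match.
sub vowel_status {
    my ($char) = @_;
    return $char =~ /[AEIOU]/i ? 'char is a vowel' : 'char is not a vowel';
}

print "$_: ", vowel_status($_), "\n" for qw(v E);
```

Since the pattern has no anchors, it reports a vowel whenever one appears anywhere in the input, not only when the input is a single vowel character.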