Last semester there was a riddle: write a regex that matches valid strings, where a valid string meets the following conditions:

1. The length of the string can be zero or more.
2. It can contain any printable ASCII char (see the definition below), except the following chars:
- Backslash: \
- Double quote: "
- LF char: \n (when it appears as a single raw char)
- CR char: \r (when it appears as a single raw char)
These are valid only when they appear as part of a valid escape sequence.
3. Valid escape sequences: \\, \", \n, \t, \r, \0, \xdd (where each d is a hexadecimal digit)
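
To make the three conditions concrete, here is a minimal sketch of a "valid string" pattern, written in Python rather than Perl so it can be run as-is. The names PLAIN, ESCAPE, VALID and is_valid are made up for illustration, and it assumes that a raw tab (0x09) is allowed inside a string while raw LF and CR are not. It is one possible reading of the rules, checked against the example strings from the question, not an official solution.

```python
import re

# Assumption: a raw tab (0x09) may appear inside the string; raw LF/CR may not.
PLAIN = r'[\t\x20\x21\x23-\x5B\x5D-\x7E]'     # printable ASCII minus " and \
ESCAPE = r'\\(?:[\\"ntr0]|x[0-9A-Fa-f]{2})'   # \\ \" \n \t \r \0 \xdd
VALID = re.compile('"(?:' + PLAIN + '|' + ESCAPE + ')*"')


def is_valid(s):
    """Return True when the whole of s is one well-formed quoted string."""
    return VALID.fullmatch(s) is not None


# The example strings from the question (raw literals keep the backslashes).
VALID_EXAMPLES = [
    '"hello"',
    '"hi \'Hello\'"',
    r'"Hey there\n"',
    r'"hi1 \x10"',
    r'"hi2 \x3A"',
    r'"hi\thow\tare\tyou\tdoing"',
]
INVALID_EXAMPLES = [
    '\'bad"',
    '"bad',
    '"multi-line bad\nstring"',   # contains a real line feed
    '"inner-"-bad"',
    r'"bad escape \"',
]

if __name__ == "__main__":
    for s in VALID_EXAMPLES + INVALID_EXAMPLES:
        print(repr(s), is_valid(s))
```

The body is a repetition of two alternatives that can never overlap (a plain char is never a backslash, an escape always starts with one), so the pattern does not backtrack badly on long inputs.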

Examples of valid strings:
"hello"
"hi 'Hello'"
"Hey there\n"
"hi1 \x10"
"hi2 \x3A"
"hi\thow\tare\tyou\tdoing"

Examples of invalid strings:
'bad"
"bad
"multi-line bad
string"
"inner-"-bad"
"bad escape \"

The code to fill:
if (<REGEX1>) { print("Valid String"); }
else if (<REGEX2>) { print("Invalid char"); }
else if (<REGEX3>) { print("Close the string!"); }
else if (<REGEX4>) { print("Invalid escape"); }
Valid ASCII values: any value between 0x20 and 0x7E, plus the whitespace chars 0x09, 0x0A and 0x0D.
You can change the order of the if-else statements, but you can't use a bare else (without an if), meaning you have to write a regex for each statement. The riddle didn't have a solution, but I'm interested in seeing one.

The first statement should check whether the string is valid (meets all the conditions described above).
The second statement should check whether it contains an invalid char (see Valid ASCII values).
The third statement should check whether it is an unclosed string.
The fourth statement should check whether it contains an invalid escape (not one of \\, \", \n, \t, \r, \0, \xdd).
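
One way the four checks could fit together is sketched below in Python. Everything here is an assumption layered on the riddle: the helper classify and the names PLAIN, ESCAPE and UNIT are hypothetical, a raw tab is treated as allowed, a raw LF or CR inside the quotes is reported as an unclosed string, and an input that fits none of the four patterns (for example "inner-"-bad") simply gets no message, since a bare else is not allowed. The idea is that checks 3 and 4 first consume a well-formed prefix and then look at whatever stopped the scan.

```python
import re

# Assumption: a raw tab (0x09) is allowed inside a string; raw LF/CR are not.
PLAIN = r'[\t\x20\x21\x23-\x5B\x5D-\x7E]'     # printable ASCII minus " and \
ESCAPE = r'\\(?:[\\"ntr0]|x[0-9A-Fa-f]{2})'   # \\ \" \n \t \r \0 \xdd
UNIT = '(?:' + PLAIN + '|' + ESCAPE + ')'

# 1. Whole input is an opening quote, valid units, and a closing quote.
REGEX1 = re.compile(r'\A"' + UNIT + r'*"\Z')
# 2. Any char outside 0x20-0x7E, tab, LF, CR, anywhere in the input.
REGEX2 = re.compile(r'[^\t\n\r\x20-\x7E]')
# 3. A well-formed prefix that runs into a raw line break or the end of input.
REGEX3 = re.compile(r'\A"' + UNIT + r'*(?:[\n\r]|\Z)')
# 4. A well-formed prefix followed by a backslash that starts no valid escape.
REGEX4 = re.compile(r'\A"' + UNIT + r'*\\(?![\\"ntr0]|x[0-9A-Fa-f]{2})')


def classify(s):
    """Return the riddle's message for s, or None when no pattern matches."""
    if REGEX1.search(s):
        return "Valid String"
    elif REGEX2.search(s):
        return "Invalid char"
    elif REGEX3.search(s):
        return "Close the string!"
    elif REGEX4.search(s):
        return "Invalid escape"
    return None


if __name__ == "__main__":
    for s in ['"hello"', '"bad', r'"bad \q"', '"bad \x01 char"']:
        print(repr(s), classify(s))
```

Anchoring check 4 at the start matters: a free-floating search for "backslash not followed by a valid escape char" would wrongly flag the second backslash of \\ when it is followed by an ordinary letter. Note also that under this reading "bad escape \" comes out as unclosed, because \" is a valid escape that swallows the final quote.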

How would you do it?

EDIT: As I understand it, I could use the regex ([\x00-\x09\xB-\xC\xE-\x21\x23-\x5B\x5D-\xFF]) to catch the valid chars and (\\x[0-9A-Fa-f]{2}) for the hex escape. What I'm struggling with is that every one of the checks seems to need something like "match everything but ...". Checking for an unclosed string looks easy (I think), because it's just \".*. On the other hand, that pattern also catches valid strings like "aaa".
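
The over-matching of \".* can be seen directly, and a negated character class is one way to express "everything but an unescaped quote". The sketch below is in Python; NAIVE and UNCLOSED are names invented for this illustration, and the pattern only looks at quoting, so it does not validate escapes or the allowed char range.

```python
import re

# The pattern from the EDIT: an opening quote followed by anything.
NAIVE = re.compile(r'".*')

# Hypothetical alternative: after the opening quote allow only chars that are
# not a quote, backslash, CR or LF, or a backslash plus one more char, and
# require the input to end there, so a closing quote can never be consumed.
UNCLOSED = re.compile(r'\A"(?:[^"\\\r\n]|\\.)*\Z')

if __name__ == "__main__":
    for s in ['"aaa"', '"aaa', r'"aaa\"']:
        print(repr(s), bool(NAIVE.match(s)), bool(UNCLOSED.match(s)))
```

NAIVE matches all three inputs, including the valid "aaa", while UNCLOSED rejects "aaa" and accepts the other two (in the last one the final quote is escaped, so the string is still open).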

In reply to Riddle with regex by ovedpo15
