A single, 'pure and simple' regex is not a good approach to this problem because it has great difficulty handling the degenerate cases of zero-length and single-character strings: either some fancy footwork is needed within the regex, or some sort of post-match fixup must be done in these cases.

For a single-character string in particular, one is asking the regex to match twice on the same character! Under /g, each match attempt begins where the previous one ended (the position reported by the pos built-in), so a character consumed by one match cannot be consumed again by the next; and after a zero-width match, the regex engine will not accept another zero-width match at that same position. (The 5.10 regex 'backtracking control verbs' may offer a way around this problem, but I'm not familiar enough with them to know.)
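To make the point about the match position concrete, here is a small sketch (the variable names are just for illustration) that applies only the first two alternatives to a one-character string. The lone character is consumed by the first match, so there should be nothing left for a second match to consume.

```perl
use strict;
use warnings;

my $str = 'a';

# In list context, /g returns every successive match of the pattern.
# The first alternative consumes the only character, so the second
# alternative never gets a character to match.
my @hits = $str =~ m{ \A . | . \z }xmsg;
print scalar(@hits), " match(es): @hits\n";

# In scalar context, pos() reports where the next /g attempt would start.
my $next_start;
if ($str =~ m{ \A . }xmsg) {
    $next_start = pos $str;
    print "next attempt would start at offset $next_start\n";
}
```

On a one-character string this should report a single match and a next starting offset of 1, i.e. the end of the string.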

The following is the best I can do with a regex. It uses post-match fixup to finish the job. Note that the order of the alternatives in the ordered alternation
    \A . | . \z | \z
is important: the lone \z alternative must be last.

    >perl -wMstrict -le
    "for my $str (@ARGV) {
         printf qq{string '$str': };
         my ($first, $last) = $str =~ m{ \A . | . \z | \z }xmsg;
         $last = $first if not defined $last or $last eq q{};
         print qq{first '$first', last '$last'};
     }
    " "" "a" "ab" "abc" "abcd"
    string '': first '', last ''
    string 'a': first 'a', last 'a'
    string 'ab': first 'a', last 'b'
    string 'abc': first 'a', last 'c'
    string 'abcd': first 'a', last 'd'
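For comparison, here is one possible form of the 'fancy footwork within the regex' alternative, offered only as a sketch and not as the approach used above. It assumes a lookahead is acceptable: the lookahead captures the first character without consuming it, so the same character is still available to be captured as the last one. The helper name first_and_last is hypothetical, and the zero-length string is still handled outside the regex.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original post.
sub first_and_last {
    my ($str) = @_;

    # The zero-length case is handled outside the regex.
    return ('', '') if $str eq '';

    # (?= (.) ) captures the first character without consuming it,
    # so (.) \z can capture that same character again when the
    # string is only one character long.
    my ($first, $last) = $str =~ m{ \A (?= (.) ) .* (.) \z }xms;
    return ($first, $last);
}

for my $str ('', 'a', 'ab', 'abc', 'abcd') {
    my ($first, $last) = first_and_last($str);
    print "string '$str': first '$first', last '$last'\n";
}
```

Because both characters come from capture groups rather than from successive /g matches, no comparison of $last against an empty value is needed for the one-character case.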

In reply to Re: regex for string by AnomalousMonk
in thread regex for string by saranperl
