Interesting problem. This is the best I can do so far: it extracts all singleton characters (characters that occur exactly once) from a string. It needs Perl 5.10 regex extensions, but I think those are kosher. The conditional  (?(condition)yes-pattern) is used with a  (?{ code }) block as the (condition), and I'm not sure whether the stink of  (?{ code }) is dispelled by using it inside a regex conditional. Of course, the most damning thing is the use of a hash to keep track of characters already seen, but I can't get around this (update: yet). (I'm running under 5.10, so I have to use a  local our %seen hash, but I understand that 5.18+ at last supports my variables in regex code blocks.)
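To show the code-block conditional on its own, here is a minimal sketch that uses only the "seen before" guard, so it keeps the first occurrence of every character instead of just the singletons. The name first_occurrences is made up for this illustration and is not part of the script below.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper for illustration only (not part of the original post).
# Keeps the first occurrence of each character and rejects repeats.
sub first_occurrences {
    my ($string) = @_;

    local our %seen;    # package hash, as in the post, so this also runs on 5.10

    return [
        $string =~ m{
            (?=
                (.)                               # capture one character
                (?(?{ $seen{$^N}++ }) (*FAIL))    # true count => already seen => fail
            )
        }xmsg
    ];
}

say "@{ first_occurrences('cpcdeqe') }";    # prints: c p d e q
```

The post-increment returns the old count, so the condition is false the first time a character is met and true on every later meeting.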

File singleton_chars_1.pl:

    use 5.010;  # need regex extensions

    use warnings;
    use strict;

    use Test::More 'no_plan';  # safer to use done_testing()
    use Test::NoWarnings;

    # test datasets ####################################################

    use constant TEST_VECTOR_SET_1 => (
        "each contain one or more single character",
        [ qw(a        a)     ],
        [ qw(ab       a b)   ],
        [ qw(abc      a b c) ],
        [ qw(aba      b)     ],
        [ qw(abb      a)     ],
        [ qw(aab      b)     ],
        [ qw(cpcdeqe  p d q) ],
        [ qw(pcdcq    p d q) ],
        [ qw(cpdqc    p d q) ],
        [ qw(aapdq    p d q) ],
        [ qw(apadq    p d q) ],
        [ qw(apdaq    p d q) ],
        [ qw(apdqa    p d q) ],
        [ qw(paadq    p d q) ],
        [ qw(padaq    p d q) ],
        [ qw(padqa    p d q) ],
        [ qw(pdaaq    p d q) ],
        [ qw(pdaqa    p d q) ],
        [ qw(pdqaa    p d q) ],

        "expected results from LanX pm#1201799",
        [ qw(aaaab  b) ],
        [ qw(aaaba  b) ],
        [ qw(aabaa  b) ],
        [ qw(abaaa  b) ],
        [ qw(abbbb  a) ],
        [ qw(baaaa  b) ],
        [ qw(babbb  a) ],
        [ qw(bbabb  a) ],
        [ qw(bbbab  a) ],
        [ qw(bbbba  a) ],

        "none of these contain any single character",
        [ 'aa'       ],
        [ 'aaa'      ],
        [ 'aabb'     ],
        [ 'aaabbb'   ],
        [ 'abcabc'   ],
        [ 'abcxcbax' ],
        [ 'xabccbax' ],
        [ 'abcxxcba' ],
        [ 'abacbc'   ],
        );

    # functions under test #############################################

    sub rx_1 {
        my ($string, ) = @_;

        local our %seen;

        my $singleton = qr{  # captures a singleton
            (.)                               # capture/test this char
            (?(?{ $seen{$^N}++ }) (*FAIL))    # fail if seen before
            (?(?= .*? \g{-1})     (*FAIL))    # fail if seen later in string
            }xms;

        # extract all singletons.
        return [ $string =~ m{ (?= $singleton) }xmsg ];
        }

    # testing, testing... ##############################################

    FUNT:
    for my $ar_funt (
        # function
        #  name    comment
        [ 'rx_1', '1st try - not "pure"', ],
        ) {
        my ($func_name, $func_note) = @$ar_funt;

        *singletons = do { no strict 'refs'; *$func_name; };

        defined $func_note
            ? note "\n $func_name() -- $func_note \n\n"
            : note "\n $func_name() \n\n"
            ;

        VECTOR:
        for my $ar_vector (TEST_VECTOR_SET_1) {

            if (not ref $ar_vector) {  # comment string if not vector ref.
                note $ar_vector;
                next VECTOR;
                }

            my ($string, @expected) = @$ar_vector;

            is_deeply singletons($string), \@expected,
                qq{'$string' -> (@expected)},
                ;

            }  # end for VECTOR

        }  # end for FUNT

    note "\n done testing functions \n\n";

    done_testing();

    exit;

    # utility functions ################################################

    # none

Output:

    c:\@Work\Perl\monks\LanX>perl singleton_chars_1.pl
    #
    #  rx_1() -- 1st try - not "pure"
    #
    # each contain one or more single character
    ok 1 - 'a' -> (a)
    ok 2 - 'ab' -> (a b)
    ok 3 - 'abc' -> (a b c)
    ok 4 - 'aba' -> (b)
    ok 5 - 'abb' -> (a)
    ok 6 - 'aab' -> (b)
    ok 7 - 'cpcdeqe' -> (p d q)
    ok 8 - 'pcdcq' -> (p d q)
    ok 9 - 'cpdqc' -> (p d q)
    ok 10 - 'aapdq' -> (p d q)
    ok 11 - 'apadq' -> (p d q)
    ok 12 - 'apdaq' -> (p d q)
    ok 13 - 'apdqa' -> (p d q)
    ok 14 - 'paadq' -> (p d q)
    ok 15 - 'padaq' -> (p d q)
    ok 16 - 'padqa' -> (p d q)
    ok 17 - 'pdaaq' -> (p d q)
    ok 18 - 'pdaqa' -> (p d q)
    ok 19 - 'pdqaa' -> (p d q)
    # expected results from LanX pm#1201799
    ok 20 - 'aaaab' -> (b)
    ok 21 - 'aaaba' -> (b)
    ok 22 - 'aabaa' -> (b)
    ok 23 - 'abaaa' -> (b)
    ok 24 - 'abbbb' -> (a)
    ok 25 - 'baaaa' -> (b)
    ok 26 - 'babbb' -> (a)
    ok 27 - 'bbabb' -> (a)
    ok 28 - 'bbbab' -> (a)
    ok 29 - 'bbbba' -> (a)
    # none of these contain any single character
    ok 30 - 'aa' -> ()
    ok 31 - 'aaa' -> ()
    ok 32 - 'aabb' -> ()
    ok 33 - 'aaabbb' -> ()
    ok 34 - 'abcabc' -> ()
    ok 35 - 'abcxcbax' -> ()
    ok 36 - 'xabccbax' -> ()
    ok 37 - 'abcxxcba' -> ()
    ok 38 - 'abacbc' -> ()
    #
    #  done testing functions
    #
    1..38
    ok 39 - no warnings
    1..39
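For cross-checking the expected values in the test vectors, a plain non-regex reference can be handy. This is only a sketch under my own assumptions: the name singletons_by_count is invented here, and it makes no attempt to answer the regex-only challenge. It just counts every character and keeps those with a count of one, in order of appearance.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical reference helper (not from the original post): returns the
# characters that occur exactly once in $string, in order of appearance.
sub singletons_by_count {
    my ($string) = @_;

    my %count;
    $count{$_}++ for split //, $string;    # tally every character

    return [ grep { $count{$_} == 1 } split //, $string ];
}

say "@{ singletons_by_count('cpcdeqe') }";    # prints: p d q
```

Because it returns an array reference just as rx_1() does, it could be compared against the regex version with is_deeply over the same vectors.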


Give a man a fish:  <%-{-{-{-<


In reply to Re: Regex: matching character which happens exactly once by AnomalousMonk
in thread Regex: matching character which happens exactly once by LanX
