in reply to Re^2: Nonrepeating characters in an RE (updated)
in thread Nonrepeating characters in an RE

> But I believe your example is also for "all distinct letters", it isn't clear to me how you'd use it for BernieC's templates.

I find the word "template" confusing; I thought it was about constructing a complicated regex from templates, as is done with HTML.

If "template" is supposed to mean a character class $chars = "abc.." whose characters are never repeated anywhere in the string, a negated match is probably the simplest approach:

$str !~ / ([$chars]) .* \1 /x

Edit:

use v5.12;
use warnings;

my @words = qw"abc aab abb aba abcd abca";
my $chars = "ad";

for (@words) {
    say "$_" if $_ !~ / ([$chars]) .* \1 /x
}

abc
abb     # NB: b wasn't in chars
abcd
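
A minimal sketch of the same negated test wrapped in a reusable sub. The name `no_repeats` is hypothetical, and the `\Q...\E` quoting, the `/s` flag and the empty-string guard are assumptions added here for the case where `$chars` contains characters that are special inside a bracketed class (such as `-` or `]`) or the string contains newlines.

```perl
use v5.12;
use warnings;

# Hypothetical helper: true if no character listed in $chars
# occurs more than once in $str.
sub no_repeats {
    my ($str, $chars) = @_;
    return 1 unless length $chars;    # an empty class [] would not compile
    # \Q..\E escapes class metacharacters; /s lets . cross newlines
    return $str !~ / ([\Q$chars\E]) .* \1 /xs;
}

say for grep { no_repeats($_, "ad") } qw(abc aab abb aba abcd abca);
```

Run on the word list above, this prints the same three words: abc, abb and abcd.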

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Re^4: Nonrepeating characters in an RE (simple)
by hv (Prior) on Aug 16, 2022 at 12:54 UTC

    When the OP talks about "a template that might look like abcdefa", I interpret that to represent a pattern that words should follow, as in a substitution cypher: the first and seventh letters should be the same, all other letters should be different from those and from each other. Thus it should match "suitors" and "realtor", but not "bracken" or "suffers" or "straits" or "albania".

    So the regexp translation becomes: for each letter in the template, if it has not been seen before, introduce a new capture and insist that it doesn't match any of the previous captures; if it has been seen before, just match the appropriate previous capture:

    sub template_to_regexp {
        my($template) = @_;
        my $seen_count;
        my %seen_at;
        my $regexp = '';
        for my $i (0 .. length($template) - 1) {
            my $chr = substr($template, $i, 1);
            my $seen = $seen_at{$chr};
            if (defined $seen) {
                $regexp .= "\\$seen";
                next;
            }
            # else it's a new template character
            $regexp .= sprintf '(?!%s)', join '|', map "\\$_", 1 .. $seen_count
                if $seen_count;
            $seen_at{$chr} = ++$seen_count;
            $regexp .= '(.)';
        }
        return qr{$regexp};
    }

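
A short usage sketch for the sub above, run against the example words from this reply. The sub is repeated so the snippet runs on its own. The `\A` and `\z` anchors are an assumption added here: the sub returns an unanchored pattern, so without them a longer word containing a matching seven-letter run would also be accepted.

```perl
use v5.12;
use warnings;

# template_to_regexp as posted above, repeated so this snippet is self-contained
sub template_to_regexp {
    my ($template) = @_;
    my $seen_count;
    my %seen_at;
    my $regexp = '';
    for my $i (0 .. length($template) - 1) {
        my $chr  = substr($template, $i, 1);
        my $seen = $seen_at{$chr};
        if (defined $seen) {
            # repeated template letter: must equal the earlier capture
            $regexp .= "\\$seen";
            next;
        }
        # new template letter: must differ from every earlier capture
        $regexp .= sprintf '(?!%s)', join '|', map "\\$_", 1 .. $seen_count
            if $seen_count;
        $seen_at{$chr} = ++$seen_count;
        $regexp .= '(.)';
    }
    return qr{$regexp};
}

my $re = template_to_regexp('abcdefa');

# Anchoring is an assumption, not part of the original sub
for my $word (qw(suitors realtor bracken suffers straits albania)) {
    say "$word: ", ($word =~ /\A$re\z/ ? "match" : "no match");
}
```

With these anchors, "suitors" and "realtor" are reported as matches and the other four words as non-matches, in line with the description above.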
      I agree with your interpretation. It seems to imply that the length of the target must match the length of the 'template'. It sounds like a tool for solving Jumble by searching a dictionary for valid words that match the given pattern.
      Bill

      > I interpret that to represent a pattern that words should follow, as in a substitution cypher: the first and seventh letters should be the same,

      or he made a typo. :)

      best to wait for clarification...

      Cheers Rolf