G'day Willi,

Firstly, I'd recommend that you move away from the idea of adjusting code whenever the search criteria change: it's tedious, error-prone, and likely to involve a lot of duplicated effort. Instead, write one script that takes the criteria as data and does everything for you.

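You've already got a good start in this direction with your letter patterns. In the code below, I changed "abCCdEEfgh" to "..AA.BB..." and "ABcBAefghijh" to "AB.BA.......". A dot stands for any character and a repeated capital letter stands for a repeated character; I think this makes it a bit clearer without changing the underlying principle. The pattern's length also doubles as a length check: a word can only match if it has the same number of characters as the pattern.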

The code below shows a technique: you'll need to make some changes to suit your needs. I've added comments that indicate the types of modifications that might be required.

    #!/usr/bin/env perl

    use 5.010;
    use strict;
    use warnings;
    use autodie;

    # Possibly read from command line, database, file, etc.
    my @in_patterns = qw{..AA.BB... AB.BA.......};

    # Prefer exclusion (blacklist) over inclusion (whitelist).
    # Here, exclude proper nouns
    # and any words with non-alphabetic characters.
    # Note: in other scenarios, whitelists are better;
    # e.g. only allow access to X, Y & Z.
    my $blacklist_re = qr{(?:^[A-Z]|[^A-Za-z])};

    # Point to your preferred dictionary. Some on my system:
    #my $dict = '/usr/share/dict/words';  # --> linux.words
    my $dict = '/usr/share/dict/strine';  # --> australian-english

    open my $dict_fh, '<', $dict;

    for my $in_pat (@in_patterns) {
        say "*** Input Pattern: $in_pat";

        my $len = length $in_pat;
        my $match_re = '';
        my %seen;       # placeholder letter => capture group number
        my $count = 0;

        # Build the regex one pattern character at a time.
        for my $char (split //, $in_pat) {
            if ($char eq '.') {
                # Wildcard: any single character.
                $match_re .= '.';
            }
            elsif (! $seen{$char}) {
                # First use of this letter: open a new capture group.
                $match_re .= '(.)';
                $seen{$char} = ++$count;
            }
            else {
                # Repeat of this letter: backreference to its group.
                $match_re .= "\\$seen{$char}";
            }
        }

        say "*** Match Pattern: $match_re";

        my $qr_re = qr{^$match_re$};

        say "*** QR Regex: $qr_re";

        # Rewind the dictionary so each pattern scans it from the top.
        seek $dict_fh, 0, 0;

        while (<$dict_fh>) {
            chomp;
            next unless length($_) == $len;   # numeric comparison
            next if $_ =~ $blacklist_re;
            next unless $_ =~ $qr_re;
            say;
        }
    }

Abridged output:

    *** Input Pattern: ..AA.BB...
    *** Match Pattern: ..(.)\1.(.)\2...
    *** QR Regex: (?^:^..(.)\1.(.)\2...$)
    barrelling
    barrenness
    ...
    tunnellers
    tunnelling
    *** Input Pattern: AB.BA.......
    *** Match Pattern: (.)(.).\2\1.......
    *** QR Regex: (?^:^(.)(.).\2\1.......$)
    minimisation
    monomaniacal
    ...
    reverberates
    secessionist
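If you want to try the pattern-to-regex step without a dictionary file, here's a minimal sketch that pulls it out into a subroutine. The name pattern_to_regex and the sample word list are my own inventions for illustration (they're not in the script above), and I've anchored with \A and \z rather than ^ and $; treat it as one possible arrangement, not the definitive one.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the script above):
# convert a letter pattern such as '..AA.BB...' into an anchored regex.
sub pattern_to_regex {
    my ($pattern) = @_;

    my $regex = '';
    my %group_for;      # placeholder letter => capture group number
    my $groups = 0;

    for my $char (split //, $pattern) {
        if ($char eq '.') {
            $regex .= '.';                      # any single character
        }
        elsif (!$group_for{$char}) {
            $regex .= '(.)';                    # first use: new capture group
            $group_for{$char} = ++$groups;
        }
        else {
            $regex .= "\\$group_for{$char}";    # repeat: backreference
        }
    }

    # Anchoring both ends means the word length must equal the pattern length.
    return qr{\A$regex\z};
}

# Made-up sample list standing in for a dictionary file.
my @sample_words = qw(barrelling programmer tunnelling minimisation);

for my $pattern (qw(..AA.BB... AB.BA.......)) {
    my $re   = pattern_to_regex($pattern);
    my @hits = grep { $_ =~ $re } @sample_words;
    print "$pattern: @hits\n";
}
```

Because the regex is anchored at both ends, a separate length test isn't strictly needed here; in the full script it's still a cheap way to skip most dictionary words before the regex runs.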

— Ken


In reply to Re: Words, no consecutive doubled letters but repeated letters by kcott
in thread Words, no consecutive doubled letters but repeated letters by wlegrand
