
G'day Willi,

Firstly, I'd recommend that you move away from the idea of adjusting code whenever the search criteria change: it's tedious, error-prone, and likely to involve a lot of duplicated effort. Instead, write one script that does everything for you.

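You've already got a good start in this direction with your letter patterns. In the code below, I changed abCCdEEfgh to ..AA.BB... and ABcBAefghijh to AB.BA.......: a dot stands for any character, and a repeated uppercase letter marks positions that must hold the same character. I think this makes it a bit clearer without changing the underlying principle. The length of the pattern also serves as a length check on candidate words.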

The code below shows a technique: you'll need to make some changes to suit your needs. I've added comments that indicate the types of modifications that might be required.

#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;
use autodie;

# Possibly read from command line, database, file, etc.
my @in_patterns = qw{..AA.BB... AB.BA.......};

# Prefer exclusion (blacklist) over inclusion (whitelist).
# Here, exclude proper nouns
#   and any words with non-alphabetic characters.
# Note: in other scenarios, whitelists are better;
#   e.g. only allow access to X, Y & Z.
my $blacklist_re = qr{(?:^[A-Z]|[^A-Za-z])};

# Point to your preferred dictionary. Some on my system:
#my $dict = '/usr/share/dict/words'; # --> linux.words
my $dict = '/usr/share/dict/strine'; # --> australian-english

open my $dict_fh, '<', $dict;

for my $in_pat (@in_patterns) {
    say "*** Input Pattern: $in_pat";

    my $len = length $in_pat;
    my $match_re = '';
    my %seen;
    my $count = 0;

    for my $char (split //, $in_pat) {
        if ($char eq '.') {
            $match_re .= '.';
        }
        elsif (! $seen{$char}) {
            $match_re .= '(.)';
            $seen{$char} = ++$count;
        }
        else {
            $match_re .= "\\$seen{$char}";
        }
    }

    say "*** Match Pattern: $match_re";
    my $qr_re = qr{^$match_re$};
    say "*** QR Regex: $qr_re";

    seek $dict_fh, 0, 0;

    while (<$dict_fh>) {
        chomp;
        next unless length($_) == $len;    # numeric comparison for lengths
        next if $_ =~ $blacklist_re;
        next unless $_ =~ $qr_re;
        say;
    }
}

Abridged output:

*** Input Pattern: ..AA.BB...
*** Match Pattern: ..(.)\1.(.)\2...
*** QR Regex: (?^:^..(.)\1.(.)\2...$)
barrelling
barrenness
...
tunnellers
tunnelling
*** Input Pattern: AB.BA.......
*** Match Pattern: (.)(.).\2\1.......
*** QR Regex: (?^:^(.)(.).\2\1.......$)
minimisation
monomaniacal
...
reverberates
secessionist
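
As one example of the kind of modification mentioned in the comments, here is a minimal sketch that wraps the pattern translation in a sub and takes the patterns from the command line. The sub name pattern_to_regex and its $distinct flag are hypothetical and not part of the script above. Note that in the script above, two different letters (say A and B) are free to match the same character; the optional flag adds a negative lookahead so they cannot, which is only an assumption about what you might want. The sketch also uses \g{N} rather than \N for backreferences, purely to keep multi-digit group numbers unambiguous.

```perl
#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

# Hypothetical helper: turn a letter pattern such as 'AB.BA'
# into a compiled, anchored regex.
# With a true $distinct flag, different letters must match
# different characters (an assumed requirement, off by default).
sub pattern_to_regex {
    my ($in_pat, $distinct) = @_;

    # Accept only dots and uppercase ASCII letters.
    die "Bad pattern '$in_pat'\n" unless $in_pat =~ /\A[A-Z.]+\z/;

    my $match_re = '';
    my %seen;
    my $count = 0;

    for my $char (split //, $in_pat) {
        if ($char eq '.') {
            $match_re .= '.';
        }
        elsif (! $seen{$char}) {
            # Negative lookahead: a new letter may not repeat
            #   any earlier capture.
            if ($distinct && $count) {
                $match_re .= '(?!'
                    . join('|', map { "\\g{$_}" } 1 .. $count)
                    . ')';
            }
            $match_re .= '(.)';
            $seen{$char} = ++$count;
        }
        else {
            $match_re .= "\\g{$seen{$char}}";
        }
    }

    return qr{\A$match_re\z};
}

# Patterns from the command line; fall back to the two used above.
my @in_patterns = @ARGV ? @ARGV : qw{..AA.BB... AB.BA.......};

for my $in_pat (@in_patterns) {
    my $re = pattern_to_regex($in_pat, 1);
    say "$in_pat => $re";
}
```

Because the regex is anchored at both ends and contains one position per pattern character, it already enforces the word length, so the separate length test becomes an optimisation rather than a requirement. The blacklist check from the script above would still be applied to each word as before.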

— Ken

In reply to Re: Words, no consecutive doubled letters but repeated letters by kcott
in thread Words, no consecutive doubled letters but repeated letters by wlegrand
