Re^2: Upwords IV: creating words with underpatterns

Something like this?

Yes, indeed, this seems to hit it on the head. It's all I can do to unravel it, but I have output, source and questions in readmore tags:

The following is a version of your script that allows me to see what's going on. The output is 176k and overwhelms STDOUT if not written to a log:

#!/usr/bin/perl

use strict;    # https://perlmonks.org/?node_id=11116556
use warnings;
use Path::Tiny;
use 5.016;

### adding grown-up perl logging

use Log::Log4perl;

my $log_conf4 = "/home/hogan/Documents/hogan/logs/conf_files/4.conf";

# get rid of old one
my $file = '/home/hogan/Documents/hogan/logs/4.log4perl.txt';
unlink $file or warn "Could not unlink $file: $!";

# init new one
Log::Log4perl::init($log_conf4);    #info
my $logger = Log::Log4perl->get_logger();
$logger->info($0);

my $dict_path = '/home/hogan/Documents/hogan/my_data/upwords.10';

my $dictionary = path($dict_path);
my @tiles      = map +( 'a' .. 'z' )[ rand 26 ], 1 .. 7;
print "tiles: @tiles\n";
$logger->info("tiles: @tiles\n");
my $string = 'w  r  t';

$logger->info("============");
say "$string";
$logger->info($string);
my $pat     = join '', map "$_?", sort @tiles, $string =~ /\w/g;
my $tilepat = join '', map "$_?", sort @tiles;
my $letters = join '', @tiles, $string =~ /\w/g;
print "$pat\n$tilepat\n";
$logger->info("$pat\ntilepat: $tilepat\nletters: $letters\n");
my @matches =
  grep { ( join '', sort split // ) =~ /^$pat$/ }
  $dictionary->slurp =~ /^[$letters]{2,}$/gm;

my @places;
( $string =~ tr/ /./r ) =~ /(?<!\w).{2,}(?!\w)(?{ push @places, $& })(
+*FAIL)/;
say "after =~ string is $string";
@places = grep /\w/, @places;
use Data::Dump 'dd';
dd \@places;

my @matches1 = extension(1);
print "@matches1\n\n";
$logger->info("matches1================\n @matches1");
my @matches2 = extension(2);
print "@matches2\n\n";
$logger->info("matches2================\n @matches2");

sub extension {
  my $id = shift;
  my @found;
  for my $placepat ( $id == 1 ? @places : map expand($_), @places ) {
    for my $match (@matches) {
      $logger->info("id: $id match: $match");
      $match =~ /^$placepat$/ or next;

      $logger->info("id: $id placepat: $placepat");
      $logger->info("matched: $match");
      my $newtiles = $match & ( $placepat =~ tr/.a-z/\xff\0/r );
      my ($hex) = unpack( 'H*', $newtiles );
      $logger->info("hex is: $hex");
      ( join '', sort $newtiles =~ /\w/g ) =~ /^$tilepat$/
        and push @found, $match;
    }
  }
  return @found;
}

sub expand {
  grep /\w/, glob join '', map { /\w/ ? "{$_,.}" : $_ } split //, shif
+t;
}
[download]

Excerpts with typical outputs:

2020/05/10 00:21:33 INFO ./2.3.t89.pl 
2020/05/10 00:21:33 INFO tiles: a j a y x l n
 
2020/05/10 00:21:33 INFO ============ 
2020/05/10 00:21:33 INFO w  r  t 
2020/05/10 00:21:33 INFO a?a?j?l?n?r?t?w?x?y?
tilepat: a?a?j?l?n?x?y?
letters: ajayxlnwrt
 
2020/05/10 00:21:33 INFO id: 1 match: aa 
2020/05/10 00:21:33 INFO id: 1 match: aal 
2020/05/10 00:21:33 INFO id: 1 match: ajar
[download]

@matches is the set of all words that match the tiles and the underpattern put together. Very few of these matches gets past the line

$match =~ /^$placepat$/ or next;

but it does happen:

2020/05/10 00:21:33 INFO id: 1 placepat: .r. 
2020/05/10 00:21:33 INFO matched: try 
2020/05/10 00:21:33 INFO hex is: 740079
[download]

, but 'try' can't make it into @found because of this line:

      ( join '', sort $newtiles =~ /\w/g ) =~ /^$tilepat$/
        and push @found, $match;
[download]

as seen here:

2020/05/10 00:21:33 INFO matches1================
 ar rant at
[download]

, but as I stare at that line, I don't understand it. I see that the mask is 00 at the position of the 'r', and I realize that the t only comes from the underpattern, so it is not available to stick in front of the r.

Q1) Can someone "talk me through" this line?

      ( join '', sort $newtiles =~ /\w/g ) =~ /^$tilepat$/
        and push @found, $match;
[download]

extension(2) iterates over the same @matches:

2020/05/10 00:21:33 INFO id: 2 match: aa 
2020/05/10 00:21:33 INFO id: 2 match: aal 
2020/05/10 00:21:33 INFO id: 2 match: ajar 
2020/05/10 00:21:33 INFO id: 2 match: al 
2020/05/10 00:21:33 INFO id: 2 match: ala
[download]

but @matches2 is likely to be a superset of @matches1

2020/05/10 00:21:33 INFO id: 2 match: alant 
2020/05/10 00:21:33 INFO id: 2 placepat: ....t 
2020/05/10 00:21:33 INFO matched: alant
[download]

We see how the the underpattern only need have one of the letters in this scheme. The net for words gets cast wider:

2020/05/10 00:21:33 INFO matches2================
 wanly alary waly wany waxy ajar alar alant ar rant raja raya at
[download]

So far, so good.

But then, I try to extend the script, so that I can supply more than one underpattern. Sounds pretty reasonable, right? What this does is that it takes @places and @matches out of scope for the subroutines. So I try to pass them instead, and I can't get there. Here's what that looks like:

#!/usr/bin/perl

use strict;    # https://perlmonks.org/?node_id=11116556
use warnings;
use Path::Tiny;
use 5.016;

### adding grown-up perl logging

use Log::Log4perl;

# get rid of old one
my $file = '/home/hogan/Documents/hogan/logs/4.log4perl.txt';
unlink $file or warn "Could not unlink $file: $!";
my $log_conf4 = "/home/hogan/Documents/hogan/logs/conf_files/4.conf";
Log::Log4perl::init($log_conf4);    #info
my $logger = Log::Log4perl->get_logger();
$logger->info($0);

my $dict_path  = '/home/hogan/Documents/hogan/my_data/upwords.10';
my $dictionary = path($dict_path);

my @trials = ( '  f   e ', 'w   r  e  ', '  w   r  e', 'h og', );

for my $string (@trials) {
  $logger->info("============");
  say "$string";
  $logger->info($string);
  my @tiles = map +( 'a' .. 'z' )[ rand 26 ], 1 .. 7;
  print "tiles: @tiles\n";
  $logger->info("tiles: @tiles\n");

  my $pat     = join '', map "$_?", sort @tiles, $string =~ /\w/g;
  my $tilepat = join '', map "$_?", sort @tiles;
  my $letters = join '', @tiles, $string =~ /\w/g;
  print "pat: $pat\ntilepat: $tilepat\nletters: $letters\n";
  $logger->info("$pat\ntilepat: $tilepat\nletters: $letters\n");
  my @matches =
    grep { ( join '', sort split // ) =~ /^$pat$/ }
    $dictionary->slurp =~ /^[$letters]{2,}$/gm;

  say "matches are @matches";

  my @places;
  ( $string =~ tr/ /./r ) =~ /(?<!\w).{2,}(?!\w)(?{ push @places, $& }
+)(*FAIL)/;

  @places = grep /\w/, @places;
  use Data::Dump 'dd';
  dd \@places;

  my @matches1 = extension( 1, @places, @matches, $tilepat );
  print "@matches1\n\n";
  say "==============";
  my @matches2 = extension( 2, @places, @matches, $tilepat );
  print "@matches2\n\n";
  say "==============";
}

sub extension {
  my ( $id, @places, @matches, $tilepat ) = @_;
  my @found;
  for my $placepat ( $id == 1 ? @places : map expand($_), @places ) {
    for my $match (@matches) {
      $logger->info("id: $id match: $match");
      $match =~ /^$placepat$/ or next;

      $logger->info("id: $id placepat: $placepat");
      $logger->info("matched: $match");
      my $newtiles = $match & ( $placepat =~ tr/.a-z/\xff\0/r );
      my ($hex) = unpack( 'H*', $newtiles );
      $logger->info("hex is: $hex");
      ( join '', sort $newtiles =~ /\w/g ) =~ /^$tilepat$/
        and push @found, $match;
    }

  }
  return @found;
}

sub expand {
  grep /\w/, glob join '', map { /\w/ ? "{$_,.}" : $_ } split //, shif
+t;
}
[download]

2020/05/10 01:07:11 INFO ./7.t89.pl 
2020/05/10 01:07:11 INFO ============ 
2020/05/10 01:07:11 INFO   f   e  
2020/05/10 01:07:11 INFO tiles: e c i e i q l
 
2020/05/10 01:07:11 INFO c?e?e?e?f?i?i?l?q?
tilepat: c?e?e?i?i?l?q?
letters: ecieiqlfe
 
2020/05/10 01:07:11 INFO ============ 
2020/05/10 01:07:11 INFO w   r  e   
2020/05/10 01:07:11 INFO tiles: s m e v o a u
 
2020/05/10 01:07:11 INFO a?e?e?m?o?r?s?u?v?w?
tilepat: a?e?m?o?s?u?v?
letters: smevoauwre
 
2020/05/10 01:07:11 INFO ============ 
2020/05/10 01:07:11 INFO   w   r  e 
2020/05/10 01:07:11 INFO tiles: x h m k o x e
 
2020/05/10 01:07:11 INFO e?e?h?k?m?o?r?w?x?x?
tilepat: e?h?k?m?o?x?x?
letters: xhmkoxewre
 
2020/05/10 01:07:11 INFO ============ 
2020/05/10 01:07:11 INFO h og 
2020/05/10 01:07:11 INFO tiles: h y k x k r n
 
2020/05/10 01:07:11 INFO g?h?h?k?k?n?o?r?x?y?
tilepat: h?k?k?n?r?x?y?
letters: hykxkrnhog
[download]

Q2) What gives? How do I drive this from main with a loop?

Q3) Finally, I understand what the expand subroutine does, but I can't tease it out of this one-liner. Can someone talk through this:

grep /\w/, glob join '', map { /\w/ ? "{$_,.}" : $_ } split //, shift;

So, bravo, tybalt89, this script is amazingly compact and effective. I just need a little help figuring it out as I test....

Comment on Re^2: Upwords IV: creating words with underpatterns Select or Download Code