No, that's a good point. I started following my own suggestion and ran into exactly that problem: the output didn't match the expected pattern because too much of the common text got duplicated, so I wound up with a{1b{3c,4c},2b{3c,4c}} rather than a{1,2}b{3,4}c. It now "works" in that it produces the same names for 5 of the 6 test cases (both patterns are expanded with glob before comparing), but it's not what I'd call an optimal glob representation, since I think there's still too much duplication.
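To make the duplication concrete, here is a minimal sketch that expands both patterns from the paragraph above and prints their lengths. It assumes only core Perl's glob, which expands brace-only patterns without touching the filesystem; the variable names are mine, for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# The pattern the trie walk produced, and the factored form I was aiming for.
my $duplicated = 'a{1b{3c,4c},2b{3c,4c}}';
my $factored   = 'a{1,2}b{3,4}c';

# In list context glob returns every brace expansion of the pattern.
my @dup = glob($duplicated);
my @opt = glob($factored);

print "$duplicated => @dup\n";
print "$factored => @opt\n";

# Same names either way; the difference is only in how long the pattern is.
printf "pattern lengths: %d vs %d\n", length $duplicated, length $factored;
```

Both patterns name the same four strings, so the only thing lost by the duplicated form is compactness.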

#!/usr/bin/env perl
use warnings;
use strict;
use 5.032;

use Test::More;
use YAML::XS qw( Dump );

## Copied starting with Regexp::Trie
package GlobTrie {

    sub new {
        return bless {} => shift;
    }

    sub add {
        my $self = shift;
        my $str  = shift;
        my $ref  = $self;
        for my $char ( split //, $str ) {
            $ref->{$char} //= {};
            $ref = $ref->{$char};
        }
        $ref->{''} = 1;
        return $self;
    }

    sub _glob {
        my $self = shift;
        return if $self->{''} and scalar keys %{$self} == 1;
        my ( @alt, @cc );
        my $q = 0;
        for my $char ( sort keys %{$self} ) {
            my $qchar = $char =~ m{[*?]} ? quotemeta $char : $char;
            if ( ref $self->{$char} ) {
                if ( defined( my $recurse = _glob( $self->{$char} ) ) ) {
                    push @alt, $qchar . $recurse;
                }
                else {
                    push @cc, $qchar;
                }
            }
            else {
                $q = 1;
            }
        }
        my $cconly = !@alt;
        @cc and push @alt, @cc == 1 ? $cc[0] : '{' . join( ',', @cc ) . '}';
        my $result = @alt == 1 ? $alt[0] : '{' . join( q{,}, @alt ) . '}';
        ##$q and $result = $cconly ? "$result?" : "(?:$result)?";
        return $result;
    }
};

sub list2glob {
    my @pats = @_;
    my $ret  = GlobTrie->new;
    for my $item (@pats) {
        $ret->add($item);
    }
    ## say STDERR Dump( $ret );
    return $ret->_glob;
}

is glob( list2glob( 'a', 'b' ) ), glob('{a,b}');
is glob( list2glob( 'ab', 'ac' ) ), glob('a{b,c}');
is glob( list2glob( 'aXb', 'aYb' ) ), glob('a{X,Y}b');
is glob( list2glob(qw( a1b3c a1b4c a2b3c a2b4c )) ), glob('a{1,2}b{3,4}c');

is glob(
    list2glob(
        qw(
            /ab/ef/ij/kl /ab/ef/ij/mn /ab/ef/ij
            /ab/gh/ij/kl /ab/gh/ij/mn /ab/gh/ij
            /cd/ef/ij/kl /cd/ef/ij/mn /cd/ef/ij
            /cd/gh/ij/kl /cd/gh/ij/mn /cd/gh/ij
        )
    )
  ),
  glob('/{ab,cd}/{ef,gh}/ij{/{kl,mn},}');

is glob(
    list2glob(
        qw(
            abdel abdelmn abdelmo
            abdfgkl abdfgklmn abdfgklmo
            abdfghkl abdfghklmn abdfghklmo
            abdfgijkl abdfgijklmn abdfgijklmo
            acdel acdelmn acdelmo
            acdfgkl acdfgklmn acdfgklmo
            acdfghkl acdfghklmn acdfghklmo
            acdfgijkl acdfgijklmn acdfgijklmo
        )
    )
  ),
  glob('a{b,c}d{e,fg{,h,ij}k}l{,m{n,o}}');

done_testing();
__END__
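One caveat about the tests above, as far as I can tell: Test::More declares is with a ($$;$) prototype, so each glob call there runs in scalar context and yields only the first expansion. A hedged sketch of comparing the complete expansions instead, using is_deeply; the expansions helper is hypothetical and not part of the code above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper: expand a brace pattern in list context and
# return the sorted names as an array reference.
sub expansions {
    my ($pattern) = @_;
    my @names = glob($pattern);
    return [ sort @names ];
}

# Compare every expanded name, not just the first one.
is_deeply(
    expansions('a{1b{3c,4c},2b{3c,4c}}'),
    expansions('a{1,2}b{3,4}c'),
    'duplicated and factored patterns expand to the same names',
);

done_testing();
```

Sorting before comparing means two patterns count as equal when they name the same set, whatever order the braces happen to expand in.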

The cake is a lie.


In reply to Re^3: Challenge: Generate a glob patterns from a word list by Fletch
in thread Challenge: Generate a glob patterns from a word list by choroba
