in reply to Re: Challenge: Generate a glob patterns from a word list
in thread Challenge: Generate a glob patterns from a word list

AFAIK are Trie algorithms effectively building a tree.

But now we have a search graph with diamond structures.

IOW or-clauses in a trie are nested and never chained.

(Unless I'm missing something)

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

  • Comment on Re^2: Challenge: Generate a glob patterns from a word list

Replies are listed 'Best First'.
Re^3: Challenge: Generate a glob patterns from a word list
by Fletch (Bishop) on May 04, 2021 at 14:33 UTC

    No that's a good point. I started following my own suggestion and had just that problem. I'd not match the expected in that I'd get "too much" commonality duplicated and wound up with a{1b{3c,4c},2b{3c,4c}} rather than a{1,2}b{3,4}c. Now it "works" and produces the same names for 5/6 outputs (expanding those with glob before testing), but it's not what I'd call an optimal glob representation (in that I think there's too much duplicated).

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

Re^3: Challenge: Generate a glob patterns from a word list
by tybalt89 (Monsignor) on May 04, 2021 at 14:20 UTC

    I think you are right, What we have here is a N-way diff (where N is the number of strings).
    My guess is that it is horribly explosive in N.

    A 2-way diff is bad enough that it takes caching to perform adequately. N-way would be much, much worse.

      In fact, my idea was to construct a finite state machine that accepted all the input words and minimise it. I was able to do that, but I was then unable to correctly interpret the states and transitions to create the output string.

      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
      > My guess is that it is horribly explosive in N.

      I doubt it.

      Intuitively I'd say a two phase approach should work.

      • Build a tree like with a Trie
      • Parse the tree bottom-up and collapse it into a graph by using look-up hashes for duplicated sub-trees
      Since a Trie-tree isn't "exploding", this should be safe.

      Untested, of course.

      UPDATE

      Sorry, I should have read the examples in the OP more closely.

      I wasn't aware that

      • Or-groups were nested a{b,{c,d}}e (not even that it's possible)
      • empty alternatives were allowed a{b{,c}}d
      Without that my approach above should work.

      Now I'm rather pessimistic.

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        Maybe a suffix tree could help?
        map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]