No that's a good point. I started following my own suggestion and had just that problem. I'd not match the expected in that I'd get "too much" commonality duplicated and wound up with a{1b{3c,4c},2b{3c,4c}} rather than a{1,2}b{3,4}c. Now it "works" and produces the same names for 5/6 outputs (expanding those with glob before testing), but it's not what I'd call an optimal glob representation (in that I think there's too much duplicated).
The cake is a lie.
The cake is a lie.
The cake is a lie.
| [reply] [d/l] [select] |
I think you are right, What we have here is a N-way diff (where N is the number of strings).
My guess is that it is horribly explosive in N.
A 2-way diff is bad enough that it takes caching to perform adequately. N-way would be much, much worse.
| [reply] |
In fact, my idea was to construct a finite state machine that accepted all the input words and minimise it. I was able to do that, but I was then unable to correctly interpret the states and transitions to create the output string.
map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
| [reply] [d/l] |
> My guess is that it is horribly explosive in N.
I doubt it.
Intuitively I'd say a two phase approach should work.
- Build a tree like with a Trie
- Parse the tree bottom-up and collapse it into a graph by using look-up hashes for duplicated sub-trees
Since a Trie-tree isn't "exploding", this should be safe.
Untested, of course.
UPDATE
Sorry, I should have read the examples in the OP more closely.
I wasn't aware that
- Or-groups were nested a{b,{c,d}}e (not even that it's possible)
- empty alternatives were allowed a{b{,c}}d
Without that my approach above should work.
Now I'm rather pessimistic.
| [reply] [d/l] [select] |
Maybe a suffix tree could help?
map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
| [reply] [d/l] |