in reply to combining array items to match a certain string

salutations, thank you for the replies. quester, your idea is good, but, just when we were trying to implement it, GrandFather implemented it to us (we think it is the same principle, isn't it?). anyway, it worked very well, just the way we wanted. poolpi, what you said is true, but the algorithm we use can't be adapted for this. anyway, problem solved, thank you all. GrandFather, if it isn't much trouble, could you explain us how your algorithm really operates?
  • Comment on Re: combining array items to match a certain string

Replies are listed 'Best First'.
Re^2: combining array items to match a certain string
by GrandFather (Saint) on Dec 22, 2007 at 23:13 UTC

    First turn on strictures (you always use strictures don't you?), then set up the target string and the search list. Perform the search using the starting parameters.

    use strict; use warnings; my $target = "cowboycatdog"; my @parts = qw(cow cowboy boy cat at do dog); search ($target, [@parts], []);

    Assign the parameters passed to the sub to local variables. Note that the arrays are passed as references so that two arrays can be passed.

    sub search { my ($target, $parts, $used) = @_;

    Check to see if there is anything left to match. If the target string is zero length then we have a successful combination in @$used. Print the result and return.

    unless (length $target) { print join ("-", @$used), "\n"; return; }

    otherwise loop over the remaining unused words looking for matches with the start of the remaining target string.

    for my $part (@$parts) {

    if there is no match for the current word at the start of the target string skip to the next word

    next unless 0 == index $target, $part;

    otherwise isolate the part of the target string following the current word.

    my $remainder = substr $target, length $part;

    Recursively search the remaining part of the target string using the word list passed in excluding the current word and passing a list of the words used so far plus the current word.

    search ($remainder, [grep {$part ne $_} @$parts], [@$used, $pa +rt]); } }

    Perl is environmentally friendly - it saves trees

      salutations, GrandFather,

      the algorithm works well, but there is a small problem: if we try to segmentate:
      cowboycatdogcat
      it won't work, because it won't segmentate as, for example:
      cow-boy-cat-dog-cat
      because "cat" and "cat" are repeated morphemes, even if the array itself has the items "cowboy cow boy cat do dog cat" with "cat" repeated. how can we solve it? thank you in advance.

        Just adding an extra cat to the bag doesn't do the trick because the grep removes both cats when the first one is used. The code needs to be reworked so that we know how many of each part there are and so that they are removed one at a time. Consider:

        use strict; use warnings; my $target = "cowboycatdogcat"; my @partsList = qw(cow boy cat dog cat); my %partsLu; ++$partsLu{$_} for @partsList; search ($target, {%partsLu}, []); sub search { my ($target, $partsLu, $used) = @_; unless (length $target) { print join ("-", @$used), "\n"; return; } for my $part (keys %$partsLu) { next unless 0 == index $target, $part; my $remainder = substr $target, length $part; delete $partsLu->{$part} unless --$partsLu->{$part}; search ($remainder, {%$partsLu}, [@$used, $part]); } }

        Prints:

        cow-boy-cat-dog-cat

        Perl is environmentally friendly - it saves trees