in reply to Re: combining array items to match a certain string
in thread combining array items to match a certain string

First turn on strictures (you always use strictures don't you?), then set up the target string and the search list. Perform the search using the starting parameters.

use strict; use warnings; my $target = "cowboycatdog"; my @parts = qw(cow cowboy boy cat at do dog); search ($target, [@parts], []);

Assign the parameters passed to the sub to local variables. Note that the arrays are passed as references so that two arrays can be passed.

sub search { my ($target, $parts, $used) = @_;

Check to see if there is anything left to match. If the target string is zero length then we have a successful combination in @$used. Print the result and return.

unless (length $target) { print join ("-", @$used), "\n"; return; }

otherwise loop over the remaining unused words looking for matches with the start of the remaining target string.

for my $part (@$parts) {

if there is no match for the current word at the start of the target string skip to the next word

next unless 0 == index $target, $part;

otherwise isolate the part of the target string following the current word.

my $remainder = substr $target, length $part;

Recursively search the remaining part of the target string using the word list passed in excluding the current word and passing a list of the words used so far plus the current word.

search ($remainder, [grep {$part ne $_} @$parts], [@$used, $pa +rt]); } }

Perl is environmentally friendly - it saves trees

Replies are listed 'Best First'.
Re^3: combining array items to match a certain string
by pc2 (Beadle) on Dec 23, 2007 at 15:22 UTC

    salutations, GrandFather,

    the algorithm works well, but there is a small problem: if we try to segmentate:
    cowboycatdogcat
    it won't work, because it won't segmentate as, for example:
    cow-boy-cat-dog-cat
    because "cat" and "cat" are repeated morphemes, even if the array itself has the items "cowboy cow boy cat do dog cat" with "cat" repeated. how can we solve it? thank you in advance.

      Just adding an extra cat to the bag doesn't do the trick because the grep removes both cats when the first one is used. The code needs to be reworked so that we know how many of each part there are and so that they are removed one at a time. Consider:

      use strict; use warnings; my $target = "cowboycatdogcat"; my @partsList = qw(cow boy cat dog cat); my %partsLu; ++$partsLu{$_} for @partsList; search ($target, {%partsLu}, []); sub search { my ($target, $partsLu, $used) = @_; unless (length $target) { print join ("-", @$used), "\n"; return; } for my $part (keys %$partsLu) { next unless 0 == index $target, $part; my $remainder = substr $target, length $part; delete $partsLu->{$part} unless --$partsLu->{$part}; search ($remainder, {%$partsLu}, [@$used, $part]); } }

      Prints:

      cow-boy-cat-dog-cat

      Perl is environmentally friendly - it saves trees

        great!

        thank you, it worked.
        just for information, we are using a slightly modified version of the search code, which divides the words in commas and excludes similar outputs with doubled commas to divide the words, and, also, it puts the output in an @input array instead of printing it:
        sub search { my ($target, $partsLu, $used) = @_; unless (length $target) { $v = join (",", @$used); if (substr($v, 0, 1) eq "," or $v =~ /,,/) {} else { push(@input, join (",", @$used))}; return; } for my $part (keys %$partsLu) { next unless 0 == index $target, $part; my $remainder = substr $target, length $part; delete $partsLu->{$part} unless --$partsLu->{$part}; search ($remainder, {%$partsLu}, [@$used, $part]); } }