b4swine has asked for the wisdom of the Perl Monks concerning the following question:

I have a $target string eg: "PEARLYGATES" and I want to check if another $word eg: "EASYPEASY" can be made from it. This means that the count of each letter in the second word must be less than or equal to the count of the same letter in the first word. Since there are two S's and two Y's in EASYPEASY, it is not inclulded in PEARLYGATES.

I have a way, but I keep thinking there must be a better way. What better way to find out and learn than to ask here?

At the end of the following code, $ok tells if it is contained or not.

$_ = $target; my $ok=1; for my $let (split //,$word) { $ok=0, last unless s/$let//; }

Replies are listed 'Best First'.
Re: Check if a word is included in another
by moritz (Cardinal) on Apr 18, 2010 at 13:50 UTC
    It all depends on what you mean by "better" :-)

    One way is to count number of occurrences of characters in a hash:

    sub count { my $str = shift; my %h; $h{$_}++ for split //, $str; return \%h; }

    Then you can compare those hashes:

    sub check { my ($target, $word) = @_; return 0 if length($target) < length($word); my %target = %{ count($target) }; my %word = %{ count($word) }; for (keys %word) { no warnings 'unintialized'; if ($target{$_} < $word{$_}) { print "'$word' can not be made out of letters of '$target', +because it has $word{$_} $_'s\n"; return 0; } } return 1; }

    That's a lot longer than your code, but maybe a bit faster if you want to check lots of words against a common target (in which case you can cache %target).

    Perl 6 - links to (nearly) everything that is Perl 6.
Re: Check if a word is included in another
by BrowserUk (Patriarch) on Apr 18, 2010 at 17:45 UTC
    sub wordCheck{ my $s = join '', sort split'', $_[0]; my $re = join'.*', sort split'', $_[1]; return $s =~ $re; }

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      This is not counting the number of times a letter appears.

        Did you try it?

        print wordCheck( 'onlyones', $_ ) for qw[ none noon noone lyoness ]; 1 1 1

        Perhaps this will clarify:

        sub wordCheck{ my $s = join '', sort split'', $_[0]; my $re = join'.*', sort split'', $_[1]; print "doing: $s =~ $re"; return $s =~ $re; } print wordCheck( 'onlyones', $_ ) for qw[ none noon noone lyoness ];; doing: elnnoosy =~ e.*n.*n.*o 1 doing: elnnoosy =~ n.*n.*o.*o 1 doing: elnnoosy =~ e.*n.*n.*o.*o 1 doing: elnnoosy =~ e.*l.*n.*o.*s.*s.*y
Re: Check if a word is included in another
by PeterPeiGuo (Hermit) on Apr 18, 2010 at 15:04 UTC

    Here is one way:

    1. You split your $target, count each character and store it in a hash;
    2. You split your $word into an array, loop through each letter;
      • For each letter in the array, you go to the hash and decrement the count by 1. If you encounter any letter that has no count in that hash or its count in the hash is zero, the process is terminated right the way and the answer is "not ok".

    Peter (Guo) Pei

Re: Check if a word is included in another
by jwkrahn (Abbot) on Apr 18, 2010 at 16:14 UTC
    $ perl -le' my $target = "PEARLYGATES"; my $word = "EASYPEASY"; ( my $temp = $target ) =~ s/[^$word]//g; print join( "", sort split //, $temp ) eq join( "", sort split //, $wo +rd ) ? "Match" : "No match"; ' No match

      Problem with this approach is that it doesn't take the numbers of each letter into account.  For example, EASY can be made from PEARLYGATES, yet it prints "No match", because it's checking AAEESY eq AESY.

      Interesting thought, just need a a little bit tweak:

      use warnings; use strict; print check("YLPP", "PEARLYGATES"), "\n"; print check("YLP", "PEARLYGATES"), "\n"; print check("EASYPEASY", "PEARLYGATES"), "\n"; sub check { my ($word, $target) = @_; foreach my $letter (split //, $word) { return "not ok" if ($target =~ s/$letter// == 0); } return "ok"; }

      Peter (Guo) Pei

        Isn't this the same as the original algorithm in the first post?
Re: Check if a word is included in another
by b4swine (Pilgrim) on Apr 19, 2010 at 01:28 UTC
    Timings

    Several interesting ideas are in the replies. The timings for one million random pairs of words from and English dictionary are:

    11.617664 secs for the code which I just tried shown below.
    12.778731 secs for the code id=835331 in the first post by me.
    15.012858 secs for the code id=835332 by moritz.
    18.555034 secs for the code id=835347 by BrowserUk.
    75.051250 secs for the code id=835372 by LanX.

    sub included{ my %h; $h{$_}++ for split //,$_[1]; for (split //,$_[0]) { return 0 if --$h{$_} < 0 } return 1; }

    Interesting that in this last instance, creating a hash every time, is still faster than the other methods. I made all the above codes into subs, and otherwise tried to keep the code identical to what has been posted.

      I think my solution is relatively fast only because of the length comparison at the beginning - you could try to add that to the other solutions, it might reduce the run time for pairs of random lengths significantly.
Re: Check if a word is included in another
by LanX (Saint) on Apr 18, 2010 at 21:33 UTC
    sub is_included { my ($a,$b)=@_; # while ($a) { # UPDATED since buggy while (length $a) { my $c=substr($a,0,1); return 0 if $a =~ s/$c//g > $b =~ s/$c//g ; } return 1; } print is_included(qw/ababa babababab/); # 1 print is_included(qw/babababab ababa/); # 0 print is_included(qw/EASYPEASY PEARLYGATES/); # 0
    HTH! :)

    Cheers Rolf

      I like it. Straightforward, easy to read, and probably more efficient than making hashes, etc. Need to speed test to make sure. This is most like what I was looking for.
        Thanx, I like my code too! ;-)

        Unfortunately there is a typical "too clever for perl" bug 8-(

        please replace:

        # while ($a ) { while (length $a ) {

        otherwise qw/0 a/ will be a false positive!

        If you want performance you may wanna try tr///d instead, but you will need to eval.

        Cheers Rolf

Re: Check if a word is included in another
by shawnhcorey (Friar) on Apr 19, 2010 at 14:26 UTC

    Try this one:

    #!/usr/bin/perl use strict; use warnings; my $target = 'PEARLYGATES'; for my $word ( qw( EASYPEASY GATES PERL ) ){ if( check_word( $target, $word ) ){ print "$target contains $word\n"; }else{ print "$target does not contain $word\n"; } } sub check_word { my $target = shift @_; my $word = shift @_; my @target = sort split //, uc $target; my @word = sort split //, uc $word; my $tidx = 0; for my $char ( @word ){ while( $char gt $target[$tidx] ){ $tidx ++; return 0 if $tidx >= @target; } if( $char eq $target[$tidx] ){ $tidx ++; }else{ return 0; } } return 1; }