anadem has asked for the wisdom of the Perl Monks concerning the following question:

This is probably laughably elementary, but feel free to ridicule anyway ...

I need to examine numerous collections of items (strings), and if specific items (the same 6 items throughout, only 3 shown below) are missing from a collection then they must be added (but added only if missing). Each collection has from five to thirty items in it.

Something like this ungainly semi-pseudocode:

for each collection { $have-apple = 0; $have-orange = 0; $have-pear = 0; #examine about 30 items: for each item { if( $item eq "apple" ) { $have-apple = 1; } if( $item eq "orange" ) { $have-orange = 1; } if( $item eq "pear" ) { $have-pear = 1; } } # then, when all items examined unless( $have-apple ) { add apple; } unless( $have-orange ) { add orange; } unless( $have-pear ) { add pear; } }
My bones tell me that perl can do this much more elegantly ... please suggest how, kind brothers

Replies are listed 'Best First'.
Re: are they all there?
by jhourcle (Prior) on Aug 22, 2006 at 16:44 UTC

    Hashes and arrays are your friend:

    my @required_items = qw( apple orange pear ); foreach my $collection (@collections) { my %seen = (); foreach my $item ( ... ) { # however you get the items out of $colle +ction $seen{$item} = undef; } foreach my $item (@required_items) { if ( ! exists($seen{$item}) ) { # do something to insert it } } }
Re: are they all there?
by ikegami (Patriarch) on Aug 22, 2006 at 17:01 UTC

    Is your collection an array-like (not keyed) or hash-like (keyed)? If your collection is hash-like, this would be trivial.

    my @must_haves = qw( apple orange pear ); foreach my $collection (@collections) { foreach my $must_have (@must_haves) { $collection->{$must_have} = ... if not exists $collection->{$must_have}; } }

    If your collection is array-like, it's simplest to create a hash from it. See jhourcle's reply for an implementation.

      thank you both -- the collections are array-like so I'll use jhourcle's suggestion. (I'm a perl novice, and have been away from perl for 18 months too, so wheels quite rusty -- thanks for helping get them moving!)
Re: are they all there?
by Not_a_Number (Prior) on Aug 22, 2006 at 18:16 UTC

    Since, as you say, your 'collections' are strings, you could consider something like the following:

    my @required = qw ( apple banana orange pear ); my @collections = ( 'foo pear bar apple', 'apple foo pear orange banana', 'pineapple appearance', ); foreach my $collection ( @collections ) { foreach my $item ( @required ) { $collection .= " $item" unless $collection =~ /\b$item\b/; } } print "$_\n" for @collections;

    Update: ikegami++ (see below). I haven't changed the code in this node, but please use the suggested improvement.

      Nice, but instead of compiling {number of collections} * {number of required items} regexps, swap your loops to only compile {number of required items} regexps.

      foreach my $item ( @required ) { foreach my $collection ( @collections ) { $collection .= " $item" unless $collection =~ /\b$item\b/; } }

        I was asked

        How do we know that perl is smart enough not to recompile the regexp with every iteration of the inner loop nonetheless?

        We can use use re 'debug'; to see the difference.

        use re 'debug'; for my $i (1..2) { for my $re (qw( abc def )) { $i =~ /$re/; } }

        ouputs

        Compiling REx `abc' <------------ size 3 first at 1 1: EXACT <abc>(3) 3: END(0) anchored `abc' at 0 (checking anchored isall) minlen 3 Freeing REx: `abc' Compiling REx `def' <------------ size 3 first at 1 1: EXACT <def>(3) 3: END(0) anchored `def' at 0 (checking anchored isall) minlen 3 Freeing REx: `def' Compiling REx `abc' <------------ size 3 first at 1 1: EXACT <abc>(3) 3: END(0) anchored `abc' at 0 (checking anchored isall) minlen 3 Freeing REx: `abc' Compiling REx `def' <------------ size 3 first at 1 1: EXACT <def>(3) 3: END(0) anchored `def' at 0 (checking anchored isall) minlen 3 Freeing REx: `def'

        while

        use re 'debug'; for my $re (qw( abc def )) { for my $i (1..2) { $i =~ /$re/; } }

        outputs

        Compiling REx `abc' <------------ size 3 first at 1 1: EXACT <abc>(3) 3: END(0) anchored `abc' at 0 (checking anchored isall) minlen 3 Freeing REx: `abc' Compiling REx `def' <------------ size 3 first at 1 1: EXACT <def>(3) 3: END(0) anchored `def' at 0 (checking anchored isall) minlen 3 Freeing REx: `def'

        If the interpolated variables are still the same as they were the last time the regexp was compiled, the regexp is not recompiled.

        Update: We get the same results with /\b$re\b/.

Re: are they all there?
by jwkrahn (Abbot) on Aug 22, 2006 at 18:02 UTC
    Something like this should do what you want:
    my @required = qw/ apple orange pear grape banana /; my %missing = map { $_, 1 } @required; my @items = examine_items(); delete @missing{ @items }; push @items, keys %missing;

Re: are they all there?
by shmem (Chancellor) on Aug 22, 2006 at 20:57 UTC
    Since you didn't say how your collections are stored, I deliberately decide they are stored in an array of anonymous arrays. The must-be items are stored in an array.

    #!/usr/bin/perl # sample data my @required = a..f; my @collections = ([b..g],[c..h],[d..i],[e..j],[f..k]); my $elems = scalar @required; # number of elements foreach my $coll (@collections) { # foreach collection my %missing = (); # assume that all @missing{@required} = (1) x $elems; # items are missing # reset missing flag for elements in collection # and push the missing ones onto the collection $missing{$_} = 0 for grep {exists $missing{$_}} @$coll; push @$coll, grep {$missing{$_}} @required; } foreach my $coll (@collections) { print join (',', @$coll), "\n"; } __END__ output: b,c,d,e,f,g,a c,d,e,f,g,h,a,b d,e,f,g,h,i,a,b,c e,f,g,h,i,j,a,b,c,d f,g,h,i,j,k,a,b,c,d,e

    Something between jhourcle's and jwkrahn's solution.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: are they all there?
by artist (Parson) on Aug 22, 2006 at 17:28 UTC
    use Array::Lookup; my @required_items = qw(apple orange pear); @collections = ( [ 'apple', 'pear' ], ['banana'] ); foreach my $collection (@collections) { foreach my $required_item (@required_items) { lookup $required_item, $collection, \&notfound; } } sub notfound { my $required_item = shift; my $collection = shift; push @{$collection}, $required_item; } foreach my $collection (@collections) { print "@{$collection}\n"; }
    --Artist

      Your solution loops {number of collections} * {number of required items} * {number of items per collection} times.

      jhourcle's solution loops {number of collections} * ( {number of items per collection} + {number of required items} ) times.

      We weren't given any numbers, but I bet your code will get execute many more times than jhourcle's, making it much slower.

      Update: Here are some very conservative sample numbers:

      Number of collections: 5 Number of items per collection: 8 Number of items required: 10 artist: 5*8*10 = 400 iterations jhourcle: 5*(8+10) = 90 iterations

      Update: The OP gave some numbers.

      Number of collections: 5 (Conservative guess.) Number of items per collection: 12 (Middle of provided "5 to 30".) Number of items required: 6 (Provided.) artist: 5*12*6 = 360 iterations jhourcle: 5*(12+6) = 90 iterations