Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Re: Fast Way to find one array in second array

by Athanasius (Archbishop)
on Aug 14, 2016 at 07:13 UTC ( [id://1169738]=note: print w/replies, xml ) Need Help??


in reply to Fast Way to find one array in second array

Hello melmoth,

Here’s a more Perlish version of your code, using Set::Tiny for the convenience of its is_subset method:

use strict; use warnings; use Set::Tiny 'set'; my @paths = ( [ qw(A B F G) ], [ qw(A B) ], [ qw(A B X) ], [ qw(A B C D E F G) ], [ qw(I J K L M) ], ); my @sets; push @sets, set(@$_) for @paths; @sets = sort { $a->size <=> $b->size } @sets; # UPDATED my @filtered; LINE: for my $i (0 .. $#sets) { my $set_i = $sets[$i]; for my $j ($i + 1 .. $#sets) { next LINE if $set_i->is_subset($sets[$j]); } push @filtered, $set_i; } print $_->as_string, "\n" for @filtered;

<Update> Added sort line marked # UPDATED as per BrowserUk’s post below. </Update>

Now (other things being equal), the larger the set J, the more likely it is that J is a superset of a given set I. So an obvious speedup to try is to test I against the larger sets first, by reverseing the inner loop:

for my $j (reverse($i + 1 .. $#sets))

Of course, you’ll need to Benchmark the code to determine whether this makes a significant difference for your input data.

Hope that helps,

Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Replies are listed 'Best First'.
Re^2: Fast Way to find one array in second array
by BrowserUk (Patriarch) on Aug 14, 2016 at 16:50 UTC

    That doesn't seem to produce the correct results. Given this input:

    C:\test>1169736-st.pl [ ["S", "N", "H", "A", "C", "J", "D", "E", "Z", "L"], ["S", "Y", "Y", "O", "D", "M", "G", "W", "F", "U"], ["Z", "Z", "P", "K", "G", "H", "V", "A", "J", "C"], ["P", "E", "E", "V", "M", "E", "N", "T", "K", "H"], ["I", "L", "H", "Z", "H", "T", "O", "F", "T", "V"], ["I", "Y", "H", "I", "D", "T", "V", "S", "P"], ["G", "D", "A", "B", "U", "W", "F", "D", "O"], ["L", "J", "B", "P", "U", "U", "N", "H"], ["B", "A", "X", "H", "H", "P", "R", "V"], ["M", "F", "T", "M", "L", "Y", "T", "C"], ["L", "C", "Y", "X", "O", "I", "M", "J"], ["K", "T", "P", "O", "J", "D", "F"], ["R", "T", "S", "M", "D", "J", "V"], ["D", "Y", "D", "X", "S", "H"], ["N", "O", "P", "P", "E"], ["U", "N", "Z", "T", "I"], ["B", "Z", "R", "D", "W"], ["N", "X", "A", "Z", "O"], ["T", "R", "O", "B", "L"], ["X", "V", "T", "E"], ["V", "P", "M"], ["V", "Q", "R"], ["V", "D", "C"], ["S", "H", "L"], ["P", "N", "Q"], ["A", "A"], ["R", "M"], ["O"], ["S"], ["N"], ["N"], ["C"], ]

    It produces:

    [ bless({ A => undef, C => undef, D => undef, E => undef, H => undef, +J => undef, L => undef, N => undef, S => undef, Z => undef }, "Set::T +iny"), bless({ D => undef, F => undef, G => undef, M => undef, O => undef, +S => undef, U => undef, W => undef, Y => undef }, "Set::Tiny"), bless({ A => undef, C => undef, G => undef, H => undef, J => undef, +K => undef, P => undef, V => undef, Z => undef }, "Set::Tiny"), bless({ E => undef, H => undef, K => undef, M => undef, N => undef, +P => undef, T => undef, V => undef }, "Set::Tiny"), bless({ F => undef, H => undef, I => undef, L => undef, O => undef, +T => undef, V => undef, Z => undef }, "Set::Tiny"), bless({ D => undef, H => undef, I => undef, P => undef, S => undef, +T => undef, V => undef, Y => undef }, "Set::Tiny"), bless({ A => undef, B => undef, D => undef, F => undef, G => undef, +O => undef, U => undef, W => undef }, "Set::Tiny"), bless({ B => undef, H => undef, J => undef, L => undef, N => undef, +P => undef, U => undef }, "Set::Tiny"), bless({ A => undef, B => undef, H => undef, P => undef, R => undef, +V => undef, X => undef }, "Set::Tiny"), bless({ C => undef, F => undef, L => undef, M => undef, T => undef, +Y => undef }, "Set::Tiny"), bless({ C => undef, I => undef, J => undef, L => undef, M => undef, +O => undef, X => undef, Y => undef }, "Set::Tiny"), bless({ D => undef, F => undef, J => undef, K => undef, O => undef, +P => undef, T => undef }, "Set::Tiny"), bless({ D => undef, J => undef, M => undef, R => undef, S => undef, +T => undef, V => undef }, "Set::Tiny"), bless({ D => undef, H => undef, S => undef, X => undef, Y => undef } +, "Set::Tiny"), bless({ E => undef, N => undef, O => undef, P => undef }, "Set::Tiny +"), bless({ I => undef, N => undef, T => undef, U => undef, Z => undef } +, "Set::Tiny"), bless({ B => undef, D => undef, R => undef, W => undef, Z => undef } +, "Set::Tiny"), bless({ A => undef, N => undef, O => undef, X => undef, Z => undef } +, "Set::Tiny"), bless({ B => undef, L => undef, O => undef, R => undef, T => undef } +, "Set::Tiny"), bless({ E => undef, T => undef, V => undef, X => undef }, "Set::Tiny +"), bless({ M => undef, P => undef, V => undef }, "Set::Tiny"), bless({ Q => undef, R => undef, V => undef }, "Set::Tiny"), bless({ C => undef, D => undef, V => undef }, "Set::Tiny"), bless({ H => undef, L => undef, S => undef }, "Set::Tiny"), bless({ N => undef, P => undef, Q => undef }, "Set::Tiny"), bless({ A => undef }, "Set::Tiny"), bless({ M => undef, R => undef }, "Set::Tiny"), bless({ O => undef }, "Set::Tiny"), bless({ S => undef }, "Set::Tiny"), bless({ N => undef }, "Set::Tiny"), bless({ C => undef }, "Set::Tiny"), ]

    At least the last 6 subsets should not have made it into @filtered.

    It should produce (+-the blessing):

    [ ["S", "N", "H", "A", "C", "J", "D", "E", "Z", "L"], ["S", "Y", "Y", "O", "D", "M", "G", "W", "F", "U"], ["Z", "Z", "P", "K", "G", "H", "V", "A", "J", "C"], ["P", "E", "E", "V", "M", "E", "N", "T", "K", "H"], ["I", "L", "H", "Z", "H", "T", "O", "F", "T", "V"], ["I", "Y", "H", "I", "D", "T", "V", "S", "P"], ["G", "D", "A", "B", "U", "W", "F", "D", "O"], ["L", "J", "B", "P", "U", "U", "N", "H"], ["B", "A", "X", "H", "H", "P", "R", "V"], ["M", "F", "T", "M", "L", "Y", "T", "C"], ["L", "C", "Y", "X", "O", "I", "M", "J"], ["K", "T", "P", "O", "J", "D", "F"], ["R", "T", "S", "M", "D", "J", "V"], ["D", "Y", "D", "X", "S", "H"], ["N", "O", "P", "P", "E"], ["U", "N", "Z", "T", "I"], ["B", "Z", "R", "D", "W"], ["N", "X", "A", "Z", "O"], ["T", "R", "O", "B", "L"], ["X", "V", "T", "E"], ["V", "Q", "R"], ["V", "D", "C"], ["P", "N", "Q"], ]

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
    U, Science is about questioning the status quo. Questioning authority

      You’re right, thanks! I forgot to include the OP’s code for sorting the path arrays according to number of elements. Node updated:

      ... @paths = sort { @$a <=> @$b } @paths; my @sets; push @sets, set(@$_) for @paths; my @filtered; ...

      Now the output for your input data is as follows:

      Update: Since sets contain no duplicate elements, it will be better to sort on number of elements after the arrays have been converted into sets:

      ... my @sets; push @sets, set(@$_) for @paths; @sets = sort { $a->size <=> $b->size } @sets; my @filtered; ...

      Output:

      Cheers,

      Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1169738]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (None)
    As of 2024-04-25 00:55 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      No recent polls found