Boldra has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

We have a tricky java class with an include path list of over 3000 different directories. Naturally the program doesn't need all of these paths, but we don't know which ones are used.

We thought one solution might be to gradually go through the list, trying all combinations of up to 5 paths at once, in an attempt to find the minimal include path for the program. Eg.
$/ = ';'; @paths = <DATA>; foreach my $combination_listref ( &combinations_of( { min_length => 1, + max_length => 5, elements => \@paths }) ) { my @paths = @{$combination_listref}; &test_using_paths(@paths) and say 'SUCCESS WITH ' . scalar (@paths) . ' paths : ' . join +(';', @paths) } __DATA__ D:\opt\tbq\J2E\JC00\exe\jenqulib.jar;D:\opt\tbq\J2E\JC00\exe\jlogunzip +.jar;D:\opt\tbq\J2E\JC00\exe\jperflib.jar;D:\opt\tbq\J2E\JC00\exe\jst +artup.jar;D:\opt\tbq\J2E\JC00\exe\jstartupapi.jar;D:\opt\tbq\J2E\JC00 +\exe\jstartupimpl.jar;D:\opt\tbq\J2E\JC00\exe\servicehttp\tbqmc\frog. +jar;D:\opt\tbq\J2E\JC00\exe\servicehttp\tbqmc\tbqmc.jar;D:\opt\tbq\J2 +E\JC00\exe\servicehttp\tbqmc\tbqmcsoap.jar;D:\opt\tbq\J2E\JC00\exe\se +rvicehttp\tbqmc\tbqmcswing.jar;D:\opt\tbq\J2E\JC00\exe\servicehttp\tb +qmc\soapclient.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\go.jar;D:\opt\tbq\J +2E\JC00\j2ee\admin\launcher\jaas.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\l +ib\admin.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\lib\bytecode.jar;D:\opt\t +bq\J2E\JC00\j2ee\admin\lib\compilation_lib.jar;D:\opt\tbq\J2E\JC00\j2 +ee\admin\lib\com_tbq_pj_jmx.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\lib\co +nnector.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\lib\ejb20.jar;D:\opt\tbq\J +2E\JC00\j2ee\admin\lib\exception.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\l +ib\iq-lib.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\lib\jARM.jar;D:\opt\tbq\ +J2E\JC00\j2ee\admin\lib\jARMSat.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\li +b\jaxrpc-api.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\lib\jcert.jar;D:\opt\ +tbq\J2E\JC00\j2ee\admin\lib\jmonapi.jar;D:\opt\tbq\J2E\JC00\j2ee\admi +n\lib\jms.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\lib\jnet.jar;D:\opt\tbq\ +J2E\JC00\j2ee\admin\lib\jsse.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\lib\j +startupapi.jar;D:\opt\tbq\J2E\JC00\j2ee\admin\lib\jstartupimpl.jar;D: +\opt\tbq\J2E\JC00\j2ee\admin\lib\jta.jar;
So, is there a CPAN module that can create the combinations for us? Or an enthusiastic Monk?


- Boldra

Replies are listed 'Best First'.
Re: permutations and combinations - too many java class paths!
by ELISHEVA (Prior) on Mar 03, 2009 at 15:05 UTC
    That seems the long way around the block: Why not
    1. dump all the jars to get a list of classes and create a map: class to "jars".
    2. Then try to compile the class and note the missing classes.
    3. Look up the jars that contain the missing classes in your class to jar map.
    4. And presto: you have your class path

    If that is still too many jars, you can always run it through an optimization algorithm that tries to find the smallest combination of jars that include all classes, but you should at least have much smaller starting set. All combinations of up to 5 elements selected from a pool of 3000 is a *lot* of combinations to try out.

    Best, beth

Re: permutations and combinations - too many java class paths!
by tilly (Archbishop) on Mar 03, 2009 at 15:32 UTC
    There are over 10**15 different paths with 5 directories from your list. And what happens if you actually need 6 of them? While I could easily produce code to enumerate them, that would be a useless approach.

    Far saner would be to try removing one path at a time until you have identified only the necessary include paths. That would require 3000 steps. I believe that Java searches from the first to the last path for stuff, in which case removing the last path would be the first thing to try.

    If you believe that only a small handful of these 3000 are needed, then you could do a binary search for that handful. The idea here looks something like this untested code:

    { my @untested_blocks; my @needed_paths; sub test_all_paths { @needed_paths = @untested_blocks = (); _test_all_paths(@_); return @needed_paths; } sub _test_all_paths { my @to_test = @_; return unless @to_test; # Mark the first half as not to test. push @untested_blocks, [splice @to_test, 0, int(@to_test/2)]; # Find needed blocks in second half recursively if (not path_works( (map @$_, @untested_blocks), @needed_paths ) { if (1 == @to_test) { push @needed_paths, @to_test; } else { _test_all_paths(@to_test); } } # Find needed paths in the untested block we just added _test_all_paths( @{ pop @untested_blocks } ); } }
    If only a small fraction of your paths are needed, this will find it in a lot fewer steps.
Re: permutations and combinations - too many java class paths!
by almut (Canon) on Mar 03, 2009 at 15:04 UTC

    Why not simply remove paths one by one, instead of trying all 5-out-of-3000 combinations?

    Just succesively remove paths, and every time it no longer works, put that path on the to-keep list...

Re: permutations and combinations - too many java class paths!
by bellaire (Hermit) on Mar 03, 2009 at 15:22 UTC
    Working with permutations can allegedly be done with Algorithm::Permute, but the number of permutations of 5 out of 3000 is gigantically huge. Using ballpark figures, even if you could test 1 million permutations per second, it would still take you over 7000 years to run through them all. And at the end if it turns out you needed 6 paths instead of 5, you've just wasted the last aeon. Bummer.

    So, don't do it that way. :) ELISHEVA's answer is a much better idea.