in reply to Re^7: Tricky chemicals optimization problem
in thread Tricky chemicals optimization problem

Here you go:

#! perl -slw use strict; use List::Util qw[ sum ]; use Data::Dump qw[ pp ]; $Data::Dump::WIDTH = 1000; my %chemsAndFlows; while( <DATA> ) { my( $chem, $mach, $flow ) = split ' '; next if $flow == 0; push @{ $chemsAndFlows{ $chem } }, $flow; } @$_ = sort{ $a <=> $b } @$_ for values %chemsAndFlows; #pp \%chemsAndF +lows; ## Greedy; our $MACHMAX //= 11000; my( $iMach, %machsAssigned ) = 0; while( keys %chemsAndFlows ) { my( $machTotal, $mach ) = ( 0, sprintf "E%03d", ++$iMach); for my $chem ( keys %chemsAndFlows ) { for my $iFlow ( reverse 0 .. $#{ $chemsAndFlows{ $chem } } ) { my $flow = $chemsAndFlows{ $chem }[ $iFlow ]; die "MACHMAX too small for $chem:$flow" unless $flow < $MA +CHMAX; if( ( $machTotal + $flow ) <= $MACHMAX ) { $machTotal += $flow; push @{ $machsAssigned{ $mach }{ $chem } }, $flow; splice @{ $chemsAndFlows{ $chem } }, $iFlow, 1; delete $chemsAndFlows{ $chem } unless @{ $chemsAndFlow +s{ $chem } }; } } } } $_->{ ' total' } = sum( map @$_, values %$_ )for values %machsAssigned +; pp \%machsAssigned; #1st column is unique chemical identifier 2nd column is unique machine + identifier 3rd column is processing seconds __DATA__

Append your posted data at the end after the __DATA__ line.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
In the absence of evidence, opinion is indistinguishable from prejudice.
p

Replies are listed 'Best First'.
Re^9: Tricky chemicals optimization problem
by Anonymous Monk on Jan 11, 2017 at 15:19 UTC
    This works really awesome..... some of the code a bit over my head so I'm trying to break it down and understand all the pieces. Couple questions:

    1) the 'our' declaration - I read about the definition and usage but still don't really understand why you use it here instead of ''my'?

    2) to make sure I understood your code and the process, I set out to solve the problem if we want to minimize the number of nozzles per machine rather than minimize the number of machines. In other words we would want to minimize the average chemical per machine. How would you alter your code above to accomplish this?

    3) I get a bit lost when you get to the 'splice' and 'delete' lines of your code... I follow that you want to use splice on the HoHoA because after you assign the flow you want to remove it from consideration...but what is the delete part serving to do?

    4) do you have a good book recommendation or online exercises recommendation on these types of dynamic programming? The only language I know (barely) is perl...

    thanks again for your help... it's made huge difference in my learning!
      This works really awesome..... some of the code a bit over my head so I'm trying to break it down and understand all the pieces. Couple questions:
      1) the 'our' declaration - I read about the definition and usage but still don't really understand why you use it here instead of ''my'?

      See the -s switch in perlrun.

      In a nutshell: -s allows me to supply -MACHMAX=200000 on the command line to the script and it sets the global variable $MACHMAX to the value supplied.

      Without the our I would get a warning: "Global symbol "$MACHMAX" requires explicit package name ...".

      If I used my $MACHMAX; I would still get a warning: "Name "main::MACHMAX" used only once: possible typo" and it would be a different variable from the one assigned from the command line. I just wouldn't work.

      2) to make sure I understood your code and the process, I set out to solve the problem if we want to minimize the number of nozzles per machine rather than minimize the number of machines. In other words we would want to minimize the average chemical per machine. How would you alter your code above to accomplish this?

      I don't understand how you can "minimixe the number of nozzles"? Don't you need all the different flows to be separate? If not, why are there multiple flows of the same chemical on the same machine in the existing scheme?

      If you don't care how many machines, just have a separate machine for each chemical; and if the total flow required for a given chemical exceeds the machines capacity, add another for that chemical.

      The code I supplied does a pretty good job of putting as many of the flows for a given chemical onto the same machine. If you can combine flows to a single nozzle, simply adding all the flows for a given chemical on a given machine into single nozzle will do a pretty good job of minimizing the number of nozzles.

      The code I posted already puts all the flows for a given chemical on a given machine into a separate array, so combining them would be trivial, assuming that there in no limit on the flow from a nozzle other than the machine limit. (A question I did ask but you didn't answer as far as I've noticed.)

      3) I get a bit lost when you get to the 'splice' and 'delete' lines of your code... I follow that you want to use splice on the HoHoA because after you assign the flow you want to remove it from consideration... but what is the delete part serving to do?

      The delete removes the chemical from the hash once all its flows have been assigned to a machine. Once the hash is empty the while loop terminates.

      4) do you have a good book recommendation or online exercises recommendation on these types of dynamic programming? The only language I know (barely) is perl... thanks again for your help... it's made huge difference in my learning!

      Not personally. I learnt perl via this site -- initially asking, then answering questions and reading other peoples answers. -- and by using the extraordinarily comprehensive documentation and just trying things out.

      But if you do a search here for "books" & "perl" there are numerous existing threads that answer that question.

        Thanks again for the thoughtful replies, I'm learning a bunch each iteration here (hope it's not irritating for you).

        In response to your call outs : I don't understand how you can "minimixe the number of nozzles"? Don't you need all the different flows to be separate? If not, why are there multiple flows of the same chemical on the same machine in the existing scheme? So if a chemical is the same, and its nozzle is on the same machine it can be collapsed into one. The reason my data is showing differently is because it represents real world inefficiencies in the manufacturing. Essentially the case you pointed out is an easy win or is a particular exception (perhaps due to flow rate which we don't know ). We are assuming now that it's always the former case, to simplify the analysis.

         If you don't care how many machines, just have a separate machine for each chemical; and if the total flow required for a given chemical exceeds the machines capacity, add another for that chemical.

        I explained this poorly. What I should.have said is : we want to minimize the numbers of nozzles with precedence over minimizing the number of machines. The relative weighting of each is something I'm been thinking a lot about, but st a high level the minimizing of nozzles has to take precedence over minimizing of machines, that being said you still want to end with the fewest machines possible.

        The code I supplied does a pretty good job of putting as many of the flows for a given chemical onto the same machine. If you can combine flows to a single nozzle, simply adding all the flows for a given chemical on a given machine into single nozzle will do a pretty good job of minimizing the number of nozzles.

        I hear what you are saying but I'm failing to see what the gap might be if we made minimizing the lines higher precedence... I.e., is the fact that your code is doing a good job of minimizing the nozzles a side effect of the operation , or are you purposefully designing it to do that -if so can you point out the line that provides that effect? If you switch all like chemical nozzles across different machines e.g., let's say : Mach A Chem A 45%, Mach B Chem A 30%, Mach C Chem A 40% , where the percents represent the percent of total capacity that flow value would take of a single machines capacity, the result you'd want is Mach A Chem A 85% (A+ C ), and then a second machine Mach B remains at 30% for Chem A because Mach A does not have the capacity left to take on this 30% block. Once you do this you would search for the optimal solution to fill in the remaining 15% capacity in Mach A with a different chemical(s), and 70% on Mach B with different chemical(s). Ideal here would be a second chemical that runs 10% on 7 machines, and thus can be collapsed into one nozzle that perfectly utilizes the 70% capacity remaining on machine B. Does the greedy algorithm approach still work optimally here? Or maybe a gradient descent? I'm really struggling to program in the logic I just described via the Chem A ,Chem B example due to the large numbers of machines and chemicals

        The code I posted already puts all the flows for a given chemical on a given machine into a separate array, so combining them would be trivial, assuming that there in no limit on the flow from a nozzle other than the machine limit. (A question I did ask but you didn't answer as far as I've noticed.) Which array is this in the code you provided?

        Re: flow - the 'flow' values (3rd column of the data) are actually seconds of flow, and the machines total capacity is also represented as seconds of flow. You must think of them as blocks though, as you currently are I.e., you can sum the seconds 'blocks' but not not split them into pieces.

        On books- I definitely will do that and start reading this weekend. Thanks a ton for the help again. You're a wizard with this stuff.