in reply to Seeking general directions for a Windows specific app

Assuming that enqueue operations would speed them up, things only really get interesting when you have 2 operations that can be performed completely in parallel.

for example:
C: -> F: while also doing D: -> G:

Now given a list of drive pairs (from and to), and costs of operation (file size combined with drive speed), what you get looks like a weighted graph. And then you have to pick the sequence of all edges that optimizes something to the effect of:

# warning: untested! use List::Util 'sum'; use Math::Combinatorics 'combine'; # @edges is the sequence of operations # edge_value and has_common_drive are left as an excercise to the impl +ementer my %seen = (); my $fitness = sum( # sum up operations until blocking map { sum( map { edge_value($_) } @edges[ $_->[0] .. $_->[1] - 1 ] ) } # take only the first collision after an operation grep {not $seen{$_->[0]}++} # sort collisions by distance between sort { $a->[1] - $a->[0] <=> $b->[1] - $b->[0] } # find colliding operations grep { has_common_drive( @{$edges}[@$_] ) } # all combinations of operations combine(2, 0..$#edges) );
I'm not inclined to prove it, but your problem seems to be at least NP complete.

And then you have to consider that you would have to re-evaluate your moving strategy whenever a new edge is added to your graph (whenever you enqueue a new operation).

Don't forget that operations on the same file *MUST* occur in FIFO order when at least one of them is a write. This is likely uncommon, but it is a critical error to do operations otherwise.

The only reasonable solution is to apply heuristics and get a "close enough" answer. But what is "close enough". And how hard is it going to be to get there? Too hard and not worth it IMHO.

In short, I don't think that writing a queue for this would be easy if you hope to get anything close to optimal.

Replies are listed 'Best First'.
Re^2: Seeking general directions for a Windows specific app
by Jenda (Abbot) on Aug 14, 2008 at 14:38 UTC

    To know whether you can do your C: -> F: and D: -> G: in parallel you'd have to know these are all different PHYSICAL drives (well, I would not worry about two remote mapped drives being the same). If C: and D: reside on the same harddrive, they can't be performed completely in parallel. In general, I would leave this to the user, if he enqueued the operation, he wants it to start after the previous one(s) complete.

      To be fair, I hadn't even remotely thought of (i) the problem of logical drives being really part of the same physical drives, nor -more importantly- of (ii) plobsing++'s subtleties, assuming naively that a much simpler logic involving (something like) a %is_in_use hash holding arrayrefs of "transfers" (of which the first one to be considered active) would do. But eventually you're right that if I still roll up this enqueing utility/experiment of mine, then I may want to skip all the hassle and just enque everything at the user's request.

      --
      If you can't understand the incipit, then please check the IPB Campaign.