Here is an OO solution that has fixed memory limits. The API is a little strange: a bucket is destroyed once it is considered full (that is, once it has been passed over more than max_skips times in a row), and you should make sure that its contents get dumped somewhere upon destruction...
    package Bin;
    use Carp;
    use strict;

    # Add an element to a bucket, returns the head of the list.
    sub add {
      my $self = shift;
      my $thing = shift;
      my $size = shift;
      if ($size <= ($self->{max_size} - $self->{size})) {
        $self->{size} += $size;
        push @{$self->{things}}, $thing;
        $self->{skipped} = 0;
      }
      elsif ($size > $self->{max_size}) {
        confess("'$thing' larger than a bucket ($size vs $self->{max_size})");
      }
      else {
        $self->{next} ||= new Bin(@$self{'max_size', 'max_skips'});
        if ($self->{max_skips} < ++$self->{skipped}) {
          return $self->{next}->add($thing, $size);
        }
        else {
          $self->{next} = $self->{next}->add($thing, $size);
        }
      }
      return $self;
    }

    sub DESTROY {
      my $self = shift;
      print "Bucket size $self->{size}: @{$self->{things}}\n";
      <STDIN>;
    }

    sub new {
      my $self = bless {}, shift;
      $self->{max_size} = shift || 10_000_000;
      $self->{max_skips} = shift || 50;
      $self->{skipped} = 0;
      $self->{size} = 0;
      if (@_) {
        $self->add(@_);
      }
      $self;
    }

    package main;

    my $bin = new Bin;
    foreach my $cnt (9_800..10_001) {
      foreach (1, 10, 100) {
        $bin = $bin->add("$cnt\_$_", $cnt * $_);
      }
    }
    undef($bin);
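Since the post says a bucket's contents should be dumped somewhere when it is destroyed, here is one possible way to do that without hard-coding a print in DESTROY. This is only a sketch: the package name CallbackBin, the named arguments, and the on_dump callback are all made up for illustration and are not part of the code above. The add() logic is meant to follow the same rules as the original.

```perl
use strict;
use warnings;

# Hypothetical variant of the bucket above. Instead of printing in DESTROY,
# it hands its contents to a caller-supplied callback. 'CallbackBin' and
# 'on_dump' are assumed names, not part of the original post.
package CallbackBin;

sub new {
    my ($class, %args) = @_;
    return bless {
        max_size  => $args{max_size}  || 10_000_000,
        max_skips => $args{max_skips} || 50,
        on_dump   => $args{on_dump}   || sub { },
        skipped   => 0,
        size      => 0,
        things    => [],
    }, $class;
}

# Same contract as the original add(): returns the new head of the list.
sub add {
    my ($self, $thing, $size) = @_;
    if ($size <= $self->{max_size} - $self->{size}) {
        $self->{size} += $size;
        push @{ $self->{things} }, $thing;
        $self->{skipped} = 0;
    }
    elsif ($size > $self->{max_size}) {
        die "'$thing' larger than a bucket ($size vs $self->{max_size})\n";
    }
    else {
        $self->{next} ||= CallbackBin->new(
            map { $_ => $self->{$_} } qw(max_size max_skips on_dump)
        );
        if ($self->{max_skips} < ++$self->{skipped}) {
            # Give up on this bucket; once the caller drops its reference,
            # DESTROY runs and the contents are passed to on_dump.
            return $self->{next}->add($thing, $size);
        }
        $self->{next} = $self->{next}->add($thing, $size);
    }
    return $self;
}

sub DESTROY {
    my $self = shift;
    $self->{on_dump}->($self->{size}, @{ $self->{things} });
}

package main;

my @dumped;
my $bin = CallbackBin->new(
    max_size  => 10,
    max_skips => 1,
    on_dump   => sub { my ($size, @things) = @_; push @dumped, [$size, @things] },
);
$bin = $bin->add("item$_", 6) for 1 .. 4;
undef $bin;    # flush whatever buckets are still alive
```

With a bucket size of 10, no two size-6 items can share a bucket, so the callback should end up being called once per item. In real use the callback would write the bucket's contents to a file or similar rather than collect them in an array.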

In reply to Re (tilly) 3: file chunk algorithm by tilly
in thread file chunk algorithm by thealienz1
