in reply to grabbing random n rows from a file
The fact that your sets are grouped is a great benefit: we can work with one set at a time.
The fact that your sets are of varying length is a great hindrance: work needs to be done to locate the end of each set.
I made the following assumptions: each set is stored contiguously, and a set id can be extracted from each line (the extract_id stub below).
My solution:
use strict;
use warnings;

use List::Util qw( shuffle );

my $j = 90;   # Number of rows to keep per set.

sub extract_id {
    my ($line) = @_;
    ...
    return ...;
}

my @m;        # Lines of the current set.
my $id;
my $last_id;
for (;;) {
    my $line = <DATA>;
    $id = extract_id($line) if defined($line);

    # On end of file or change of id, flush the buffered set.
    if (@m) {
        if (!defined($line) || $id ne $last_id) {
            my $j = $j < @m ? $j : @m;
            print $m[$_] foreach (shuffle(0..$#m))[0..$j-1];
            @m = ();
        }
    }

    last if !defined($line);

    push(@m, $line);
    $last_id = $id;
}
Untested. (Update: Tested. Fixed.)
Memory can be saved by storing file positions in @m instead of the actual lines, but that's not needed based on your "Update 2".
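For what it's worth, a minimal sketch of that idea, assuming a seekable handle (a plain file, not a pipe) and a hypothetical helper name:

```perl
use strict;
use warnings;
use List::Util qw( shuffle );

# Sketch only: remember each line's byte offset via tell(), then
# seek() back and re-read only the sampled lines. Only the offsets
# are held in memory, not the lines themselves.
sub sample_lines_by_offset {
    my ($fh, $j) = @_;

    my @pos;
    while (1) {
        my $offset = tell($fh);
        last if !defined(scalar <$fh>);
        push @pos, $offset;            # where this line starts
    }

    $j = @pos if $j > @pos;
    my @sample;
    for my $offset ( (shuffle(@pos))[0 .. $j - 1] ) {
        seek($fh, $offset, 0) or die "seek failed: $!";
        push @sample, scalar <$fh>;    # re-read just the chosen line
    }
    return @sample;
}
```

The same flush-per-set structure as above applies; only the buffer's contents change from lines to offsets.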
Alternative (replacing the shuffle-and-slice print line):

print splice(@m, rand(@m), 1) while $j--;
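Splicing out a random element on each pass samples without replacement, since picked elements leave the array. A standalone sketch of the technique, with a hypothetical function name (note it consumes its copy of the array):

```perl
use strict;
use warnings;

# Sketch: draw $j items without replacement by repeatedly splicing
# a random element out of the array.
sub sample_by_splice {
    my ($j, @m) = @_;
    $j = @m if $j > @m;
    my @picked;
    push @picked, splice(@m, rand(@m), 1) while $j--;
    return @picked;
}
```

This avoids shuffling the whole set when $j is much smaller than the set size, at the cost of splice's element shifting on each draw.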