Monks,

I have an algorithm question here. Here's the scoop:

I am looping to build a list of records. I need to group certain records together, creating a "parent" record that comes before the group of sub-records. The grouping is done by an "ID" in each record that indicates the "group" it belongs to. The problem is the boundary condition: to catch the last group, I end up cutting and pasting a whole chunk of code after the loop.

Here's some pseudo-code that illustrates what I'm doing:

my @complete_list = ();
my @sub_list      = ();
my $cur_id        = 0;

for my $rec ( @records ) {
    my $this_id = $rec->{ID};
    # lots of other attributes here...

    if ( $this_id != $cur_id ) {
        # we crossed an ID boundary:
        # push a "heading" record on to the list if there
        # is more than one record in the sub list
        if ( @sub_list > 1 ) {
            # maybe lots of code here to build the record,
            # using all those attributes...
            push @complete_list, {
                # ... some record
            };
        }
        # push the sub-list onto the main list and empty it out
        push @complete_list, @sub_list;
        @sub_list = ();
    }

    # accumulate records in the sub_list
    push @sub_list, {
        # ... some record
    };

    # keep track of the "current" (last) ID to see when it changes
    $cur_id = $this_id;
}

####################
# here is the boundary-condition where we need to
# take care of the last sub_list
# cut-and-paste copy of the code inside the
# ID-changed test: "if ( $this_id != $cur_id )" in the loop
# YUCK!
if ( @sub_list > 1 ) {
    # maybe lots of code here to build the record,
    # using all those attributes...
    push @complete_list, {
        # ... some record
    };
}
push @complete_list, @sub_list;

# all done
return \@complete_list;

Hopefully my comments in the code make it clear what I don't like: I have to cut-and-paste the whole chunk of code that does the "grouping" after the loop in order to catch the last "group".

Of course the most obvious solution to avoid cut-and-paste is to put that code into a subroutine. That is probably what I'll do in the end. I'll have to pass it quite a few arguments, since it needs the record, a reference to the whole big list "\@complete_list", a reference to the sub_list, the "previous id" ($cur_id), and maybe some other stuff.
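To make the subroutine idea concrete, here is a minimal sketch that assumes a closure instead of a named sub, so the lists and the previous ID are visible without being passed as arguments. The sample data, the `group_records` wrapper, and the `heading`, `count`, and `name` fields are made up for illustration; the real heading record would be built from all those other attributes.

```perl
use strict;
use warnings;

# Hypothetical sample data: records already sorted by ID.
my @records = (
    { ID => 1, name => 'a' },
    { ID => 2, name => 'b' },
    { ID => 2, name => 'c' },
    { ID => 3, name => 'd' },
);

sub group_records {
    my @records = @_;
    my @complete_list;
    my @sub_list;
    my $cur_id;

    # The closure sees @complete_list, @sub_list and $cur_id directly,
    # so there is no argument list to maintain.
    my $flush = sub {
        # Push a "heading" record only if the group has more than one member.
        if ( @sub_list > 1 ) {
            push @complete_list, {
                heading => 1,                   # made-up marker field
                ID      => $cur_id,
                count   => scalar @sub_list,    # made-up summary field
            };
        }
        # Push the sub-list onto the main list and empty it out.
        push @complete_list, @sub_list;
        @sub_list = ();
    };

    for my $rec (@records) {
        my $this_id = $rec->{ID};

        # We crossed an ID boundary: emit the group collected so far.
        $flush->() if defined $cur_id && $this_id != $cur_id;

        push @sub_list, $rec;
        $cur_id = $this_id;
    }

    # The last group: one call instead of a pasted block.
    $flush->();

    return \@complete_list;
}

my $list = group_records(@records);
for my $rec (@$list) {
    print $rec->{heading} ? "heading for ID $rec->{ID}\n" : "  $rec->{name}\n";
}
```

The boundary case still needs its own call after the loop, but it shrinks to a single line, and the grouping code lives in exactly one place.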

I'm hoping there's a nicer way to code this so that I can handle the boundary condition more elegantly, and not have to factor out the grouping code into a subroutine. For some reason, that way is eluding my mind.

Can someone please hit me over the head with my now-dusty "Introduction to Algorithms" textbook and enlighten me? Please?

--
edan


In reply to Looping and Grouping: there must be a better way! by edan
