fiddler42 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, Holy Ones,

The following code (below) is returning

ARRAY(hex#)

...for all the $records in @sorted_recs. The ST sort seems to be working because the ARRAY references are in different order between @records and @sorted_recs. I just can't retrieve the original contents of the records after the sort completes. Is it the "anonymous array that is a copy of @kv_pairs" (see comments below) that is getting botched?

Any thoughts?

open (SUM2,">$ARGV[0].sum.sorted");
my @records;
{
    local $/ = /^\s+$/;
    # Step through text file from one blank line to the
    # next, grabbing everything in between all at once.
    open IN2, "$ARGV[0].sum" or die "Cannot open data file.\n$!";
    while ( my $record = <IN2> ) {
        my @kv_pairs = split /\n/, $record;
        # Split each record into its <always 9> lines, and store
        # each line as an array element.
        push @records, [@kv_pairs];
        # Create an anonymous array that is a copy of @kv_pairs.
        # Push the anonymous array containing a complete set
        # of keys/values for a given record into the @records array
        # as an array of arrays.
    }
    close IN2;
}
print STDERR 'Record count: ', scalar @records;
# ST sort...
my @sorted_recs =
    map { $_->[0] }
    sort { $a->[1] <=> $b->[1] }
    map { [ $_, (split /=\s+/,$_->[4])[1] ] }
    # The 5th line [4] of each record has a decimal number
    # after the "=". Grab that number and sort all records
    # by those numbers.
    @records;
print SUM2 foreach @sorted_recs;
close IN2;
close SUM2;

Thanks,
-Chris

Re: Lost contents of arrays...
by davido (Cardinal) on Jan 18, 2004 at 17:32 UTC
    This looks a little familiar. *grin*

    First, use local $/ = "\n\n";, not a regex.

    What's going on with the printing is that the part of the script that reads in the datafile and builds up @records is reading in one complete record at a time, and putting its six elements into an anonymous array that gets pushed onto @records. The result is that @records contains array-refs to anon-arrays of records. So you're only printing the top level, which is just the array-refs, when the fact is that you've got a two-dimensional array (a list of lists, an array of arrays). You need to be printing one level deeper.
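    To make the difference between the two levels concrete, here is a minimal sketch. The two short records are made-up sample data, not the poster's file; the point is only that the top level holds references, while the lines live one level down.

```perl
use strict;
use warnings;

# Hypothetical sample data: two records of two lines each.
my @records = ( [ 'a = 1', 'b = 2' ], [ 'c = 3', 'd = 4' ] );

# Using the top-level elements as strings stringifies each reference,
# which is where the ARRAY(0x...) output comes from.
my $top = join '', @records;

# Dereferencing one level deeper reaches the stored lines.
my $deep = join '', map { "$_\n" } map { @$_ } @records;

print "top level: $top\n";
print "one level deeper:\n$deep";
```

    The exact hex addresses vary from run to run; only the second form gets at the record contents.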

    This can be confirmed by adding the following to your script:

    use Data::Dumper;
    # All of your existing code goes here.
    print Dumper \@sorted_recs;

    That will dump the data-structure held in @sorted_recs onto your display.

    The easiest way to deal with this is just to iterate over the top level, and then dive into the lower level to do your printing, like this:

    foreach my $record_ref ( @sorted_recs ) {
        foreach my $rec_line ( @{$record_ref} ) {
            print SUM2 "$rec_line\n";   # split /\n/ stripped the newlines,
                                        # so put one back after each line.
        }
        print SUM2 "\n";    # put that extra newline back, to
                            # delimit the records.
    }

    That dereferences the top level, one element at a time, and then subsequently, prints each line of each individual record.

    I think you might find perlreftut (reference tutorial), perllol (lists of lists), and perlref (references) really informative. Oh, I almost forgot, perldsc (data structure cookbook).


    Dave

Re: Lost contents of arrays...
by Enlil (Parson) on Jan 18, 2004 at 17:32 UTC
    I don't think that local $/ = /^\s+$/; is doing what you think it is. According to perlvar:

    Remember: the value of $/ is a string, not a regex. awk has to be better for something. :-)
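    A small sketch of what that means in practice, using an in-memory filehandle and made-up sample text (read_records is a hypothetical helper, not anything from the original script). Assigning a match to $/ stores the result of the match, not the pattern; a failed match in scalar context yields an empty string, which $/ treats as paragraph mode.

```perl
use strict;
use warnings;

# Hypothetical input: two records separated by a blank line.
my $text = "a = 1\nb = 2\n\nc = 3\nd = 4\n";

# Hypothetical helper: read every record from the in-memory text
# using the given input record separator.
sub read_records {
    my ($separator) = @_;
    local $/ = $separator;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my @recs = <$fh>;
    close $fh;
    return @recs;
}

my @by_string    = read_records("\n\n");  # literal two-newline separator
my @by_paragraph = read_records("");      # paragraph mode: blank lines delimit records

# A pattern match in scalar context yields a true/false value, not the pattern.
my $match_result = do { local $_ = "no blank line here"; /^\s+$/ };

print scalar(@by_string), " records with a two-newline separator\n";
print scalar(@by_paragraph), " records in paragraph mode\n";
print "failed match gives an empty string\n" if $match_result eq '';
```

    Either string form splits this sample into two records; they differ mainly in how runs of more than one blank line are handled.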

    That said, you can switch:

    print SUM2 foreach @sorted_recs;
    to something like:
    foreach my $array_ref ( @sorted_recs ) { print SUM2 foreach ( @$array_ref ); }
    to dereference the array (take a look at perlref, and at tye's References quick reference for more info)

    -enlil

Re: Lost contents of arrays...
by CountZero (Bishop) on Jan 18, 2004 at 17:34 UTC
    It is a bit difficult to test your script as we don't have access to (some of) your data. Can you add it after a __DATA__ mark? I'm too lazy to make it up myself.

    More to the point, I think you are missing some dereferences somewhere.

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      This actually appears to be a followup to Sorting question..., and specifically, he's using an adaptation of the example code I provided in Re: Sorting question....

      An example of the dataset is provided in the parent node of that thread. And a description of my proposed solution is provided within that thread as well.

      The solution will need to be reworked a bit if the assumptions I made (and documented) in my original solution aren't accurate. In particular, I would be concerned with how records are delimited, and the exact order of lines within each record. But with those caveats, the solution should work as long as the user massages it to find the right line within each record.

      If I had the solution to write again, given the ambiguity of the original dataset, I might grep each record to find the element on which the sort criterion is based, instead of indexing into it (which assumes a fixed and known position). Then I would split the results of the grep and build my ST data-structure in that way. It would be slower to set up each ST (because grepping a six-element record is slower than indexing its 4th element), but it would be more robust and less likely to break if records change size in the future.
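      A rough sketch of that grep-based variant, under some assumptions: the records below are abbreviated, made-up versions of the real data, every record is assumed to contain exactly one line matching /skew\s*=/, and the line order is deliberately varied to show that position no longer matters.

```perl
use strict;
use warnings;

# Hypothetical, abbreviated records; the skew line sits at a
# different index in each one.
my @records = (
    [ 'Clock: fci_rx_clk', 'The clock global skew = 0.290', 'Net: fci_rx_clk' ],
    [ 'Clock: aai_tx_clk', 'Net: aai_tx_clk', 'The clock global skew = 0.192' ],
    [ 'Clock: aai_rx_clk', 'The clock global skew = 0.349', 'Net: aai_rx_clk' ],
);

my @sorted_recs =
    map  { $_->[0] }                    # unwrap: keep only the record ref
    sort { $a->[1] <=> $b->[1] }        # compare the extracted skew values
    map  {
        # Find the skew line by content instead of by fixed index.
        # Assumes each record has one such line.
        my ($skew_line) = grep { /skew\s*=/ } @$_;
        [ $_, ( split /=\s+/, $skew_line )[1] ];
    }
    @records;

# Show the resulting order by each record's first line.
print join( ', ', map { $_->[0] } @sorted_recs ), "\n";
```

      A record with no skew line would produce an undefined sort key and warnings here, so real code would probably want to check for that case.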


      Dave

        All,

        Thanks so much for all the help--I now have everything working! I needed some extra hand-holding for this one because a lot of the suggestions provided to me were a little (okay, some a lot :-) outside of my perl vocabulary.

        Here's a final description of the problem, with working code to follow (below).

        Thanks again to all.

        Regards,
        -Chris

        The problem: sort a large file of text formatted like so:

        Clock: fci_rx_clk
        Pin: clkgate_i/peaz_gate7/S_5/Y
        Net: fci_rx_clk
        Operating Condition = worst
        The clock global skew = 0.290
        The longest path delay = 4.328
        The shortest path delay = 4.038
        The longest path delay end pin: co_8port_i\/peaz_i/LOCKUP4/GN
        The shortest path delay end pin: co_8port_i\/peaz_i/fci_i/dec_i\/bmcf_INST\/slice03_INST\/b_reg[5]/CK

        Clock: aai_tx_clk
        Pin: clkgate_i/peaz_gate1/S_5/Y
        Net: aai_tx_clk
        Operating Condition = worst
        The clock global skew = 0.192
        The longest path delay = 3.430
        The shortest path delay = 3.237
        The longest path delay end pin: co_8port_i\/peaz_i/aai_i/aai_test_2_aai_tx_clk_sgb_2_inst/ff1/CK
        The shortest path delay end pin: co_8port_i\/peaz_i/aai_i/aai_test_2_aai_tx_clk_sgb_0_inst/ff1/CK

        Clock: aai_rx_clk
        Pin: clkgate_i/peaz_gate2/S_5/Y
        Net: aai_rx_clk
        Operating Condition = worst
        The clock global skew = 0.349
        The longest path delay = 3.996
        The shortest path delay = 3.647
        The longest path delay end pin: co_8port_i\/peaz_i/LOCKUP/GN
        The shortest path delay end pin: co_8port_i\/peaz_i/aai_i/atm_path_i\/rx_path_i\/rxpram_reg_i\/ar_hec_cnt_reg[3]/CK

        ...by the "skew" value for each clock. The end result should display all the text in the same format--e.g. 9 lines for each clock--only in a new order. Here is the code to accomplish that...

        open (SUM2,">$ARGV[0].sum.sorted");
        my @records;
        {
            local $/ = "\n\n";    # $/ is a string, not a regex
            open IN2, "$ARGV[0].sum" or die "Cannot open data file.\n$!";
            while ( my $record = <IN2> ) {
                my @kv_pairs = split /\n/, $record;
                push @records, [@kv_pairs];
            }
            close IN2;
        }
        print STDERR 'Record count: ', scalar @records;
        my @sorted_recs =
            map { $_->[0] }
            sort { $a->[1] <=> $b->[1] }
            map { [ $_, (split /=\s+/,$_->[4])[1] ] }
            @records;
        foreach my $records ( @sorted_recs ) {
            foreach my $record ( @{$records} ) {
                print SUM2 "$record\n";   # split /\n/ removed the newlines
            }
            print SUM2 "\n";              # blank line between records
        }
        close (IN2);
        close (SUM2);

        I entirely agree with your point of view. I might have found it myself if I knew where this question was coming from.

        CountZero

        "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law