elwoodblues has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I'm trying to sort some data that is in the following format:
epoch_time field_1 field_2

What would be the best way of storing this internally (an array, or a hash keyed on epoch_time?), and how would I best sort it with the most recent first? (I'm a shell programmer and can easily do this with sort, but thought I'd try to learn more Perl and expand my knowledge slightly.)

Update:
Thanks for the replies and wisdom. That's led me to water, now I can drink!

Re: best way to sort
by jethro (Monsignor) on Feb 26, 2010 at 02:11 UTC

    The data structure depends on what you want to do with that data, not what the data looks like.

    Since you want to sort it chronologically, I assume the most frequent operation will be to look for all items between two dates. In that case a simple array would be best (i.e. the solution you would use as a shell programmer too).
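
    A minimal sketch of that kind of range lookup, assuming each record has already been split into an array ref of [epoch_time, field_1, field_2]. The sample values and the between() helper are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample records in the "epoch_time field_1 field_2" format.
my @records = (
    [ 12345, 'f1',     'f2'    ],
    [ 23456, 'F3',     'F4'    ],
    [ 98765, 'f-five', 'f-six' ],
);

# Return the records whose timestamp lies in [$from, $to], inclusive.
# between() is an illustrative helper, not a built-in.
sub between {
    my ($from, $to, @recs) = @_;
    return grep { $_->[0] >= $from && $_->[0] <= $to } @recs;
}

my @hits = between(20000, 99999, @records);
print join(' ', @$_), "\n" for @hits;
```

    grep keeps the original order, so if the array is already sorted the matches come back sorted as well.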

    Sorting is done with the sort() function. You can provide sort() with a function it uses to compare two items (aliasing the items to $a and $b). In your case something like this would work:

    my @sorted= sort { $b<=>$a } @unsorted;

    You can read the Perl documentation of the sort function for more info (with perldoc -f sort).
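
    One caveat with comparing whole lines numerically: Perl takes the leading digits of each line as the number, so the order comes out right, but under "use warnings" it will complain that the strings are not numeric. A sketch that avoids this by splitting each line first and comparing only the timestamp (the sample lines are assumed, in the format from the question):

```perl
use strict;
use warnings;

# Assumed sample input: one "epoch_time field_1 field_2" string per line.
my @unsorted = (
    '23456 F3 F4',
    '98765 f-five f-six',
    '12345 f1 f2',
);

# Turn each line into an array ref: [epoch_time, field_1, field_2].
my @records = map { [ split ' ', $_ ] } @unsorted;

# Putting $b before $a gives descending order, i.e. most recent first.
my @sorted = sort { $b->[0] <=> $a->[0] } @records;

print join(' ', @$_), "\n" for @sorted;
```

    Swapping $a and $b in the comparison gives oldest first instead.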

    Generally a hash is not suitable for storing sorted data.

Re: best way to sort
by ww (Archbishop) on Feb 26, 2010 at 02:40 UTC
    Some resources re sorting:
    1. http://perldoc.perl.org/perlfaq4.html#How-do-I-sort-an-array-by-%28anything%29%3f
    2. Replies in MsExcel like Multi Column Sorting
    3. Super Search
      ...and
    4. "Learning Perl" (AKA "The Llama book") Schwartz, foy & Phoenix, O'Reilly, esp Ch 15

    The preferred data structure will depend on how you want to use the data later -- your notion of a hash with epoch_times as the keys will be straightforward (the time will be converted to a string, but that's not an issue); putting the data into a database (SQLite, for one) will give the power to manipulate and retrieve it in many ways.
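
    A rough sketch of the hash variant, assuming every epoch_time is unique (a repeated timestamp would silently overwrite the earlier record). The %by_time name and the sample lines are only for illustration:

```perl
use strict;
use warnings;

# Assumed sample input in the format from the question.
my @lines = ('23456 F3 F4', '98765 f-five f-six', '12345 f1 f2');

# Key: epoch_time. Value: array ref holding the remaining fields.
# Assumes unique timestamps; duplicates would overwrite each other.
my %by_time;
for my $line (@lines) {
    my ($time, @fields) = split ' ', $line;
    $by_time{$time} = \@fields;
}

# Hash keys are stored as strings, so compare them numerically with <=>.
my @newest_first = sort { $b <=> $a } keys %by_time;

print "$_ @{ $by_time{$_} }\n" for @newest_first;
```

    The hash itself has no order; the sorting happens on the list of keys each time you need it.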

    And you certainly could put it in an array if the format you show is absolutely guaranteed to be invariant:

    1. split offers the option of splitting once, after the first space:
      @array = split /PATTERN/,EXPR,LIMIT
      where epoch_time will be the first, third, fifth (and so on) elements of the array to which you've split
    2. or making three fields:
      split /PATTERN/,EXPR
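
    To make the difference between the two forms concrete, here is a small sketch on a single line of assumed sample data:

```perl
use strict;
use warnings;

my $line = '12345 f1 f2';

# LIMIT of 2: split only at the first space, keep the rest of the line whole.
my ($time, $rest) = split / /, $line, 2;

# No LIMIT: every space-separated field becomes its own element.
my @fields = split / /, $line;

print "time=$time rest='$rest'\n";
print "number of fields: ", scalar(@fields), "\n";
```

    With the LIMIT you get the timestamp plus one string holding both fields; without it you get three separate elements.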

    TIMTOWTDI: could be much more elegant (MUCH more!) but this may be easy to follow:

    #!/usr/bin/perl
    use strict;
    use warnings;
    #825421
    use Data::Dumper;

    my (@array, @tmp, $time, $fields);
    my @lines = <DATA>;

    for my $line(@lines) {
        chomp $line;
        ($time, $fields) = split / /, $line, 2;
        push @array, $time;
        push @array, $fields;
    }

    my @arr2;
    for my $line(@lines) {
        chomp $line;
        my @tmp2 = split / /, $line;
        for my $element2(@tmp2) {
            push @arr2, ($element2 . "\t");
        }
    }

    print "\n first array \n";
    for $_(@array) {
        print "$_ \n";
    }
    print "\n";
    print "\n second array\n";
    for $_(@arr2) {
        print "$_ \n ";
    }
    print "\n";

    __DATA__
    12345 f1 f2
    23456 F3 F4
    98765 f-five f-six

    Output:

    first array
    12345
    f1 f2
    23456
    F3 F4
    98765
    f-five f-six

    second array
    12345
    f1
    f2
    23456
    F3
    F4
    98765
    f-five
    f-six

    Updated code and output after initially (and dumb-ly!) pasting both in a form that didn't even come close to OP's spec.

    Code for sorting left as an exercise for elwoodblues. :-)

Re: best way to sort
by roboticus (Chancellor) on Feb 26, 2010 at 14:35 UTC

    elwoodblues:

    Just a note: It's always good to learn something new, so ++ for that. However, sometimes the sort utility is the *right* tool for the job. If you're sorting huge data files, it's usually better to use sort than perl. However, if you're going to modify the data during/after the sort and rewrite it, then using perl is frequently better.
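
    As an illustration of the "modify during/after the sort" case, here is a sketch that sorts newest first and rewrites the epoch as a readable UTC timestamp on the way out. The reformat_sorted() helper, the date format and the sample lines are all assumptions made for the example:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Assumed sample input in the format from the question.
my @lines = ('12345 f1 f2', '23456 F3 F4', '98765 f-five f-six');

# Sort newest first, then replace the epoch with a UTC timestamp.
# reformat_sorted() is an illustrative helper, not a library function.
sub reformat_sorted {
    my @recs = sort { $b->[0] <=> $a->[0] } map { [ split ' ', $_ ] } @_;
    return map {
        my ($time, @fields) = @$_;
        join ' ', strftime('%Y-%m-%d %H:%M:%S', gmtime($time)), @fields;
    } @recs;
}

my @out = reformat_sorted(@lines);
print "$_\n" for @out;
```

    gmtime is used rather than localtime so the result does not depend on the machine's time zone.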

    ...roboticus