Re: reading a delimited file and selecting values from it

Here's a start. This is actually pretty close to the algorithm that nefigah described (though I wrote it before reading that).

use strict;
use warnings;

use Data::Dumper;

my $last_hour = 23;
my %best_of;

while (<DATA>) {
    my ( $quote, $time ) = m{ \A         # beginning of line
                              ( [^,]+ )  # non-commas
                              \s* , \s*  # comma with optional spaces
                              (          # open capture
                               \d\d?     # hours
                               :
                               \d\d      # minutes
                               :
                               \d\d      # seconds
                              )
                            }xms;
    my ( $hour, $min, $sec ) = split /:/, $time;

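    # A reading at exactly hh:00:00 closes out the previous hour, so
    # file it there; 0:00:00 wraps around to $last_hour.
    if ( '00' eq $sec && '00' eq $min && -1 == --$hour ) {
        $hour = $last_hour;
    }

    # Keep the latest reading seen for this hour. A top-of-the-hour
    # reading (zero seconds past) always replaces what is stored.
    my $seconds_past = $min * 60 + $sec;
    if ( ! $seconds_past
        || $best_of{ $hour }{second} < $seconds_past ) {
        $best_of{ $hour } = { second => $seconds_past,
                              time   => $time,
                              quote  => $quote, };
    }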
}

print Dumper \%best_of;

__DATA__
1.53311 ,1:59:52
1.53311 ,1:59:5220
1.53311 ,1:59:52
1.53311 ,1:59:52hi
1.53311 ,2:00:00
1.53306 ,2:00:03
1.53307 ,2:00:06

Here's the output:

$VAR1 = { 
          '1' => { 
                   'quote' => '1.53311 ',
                   'time' => '2:00:00',
                   'second' => 0
                 },
          '2' => { 
                   'quote' => '1.53307 ',
                   'time' => '2:00:06',
                   'second' => 6
                 }
        };

This pops out a couple of warnings ("Use of uninitialized value in numeric lt (<)") in the last condition. The first time an hour is seen, looking up $best_of{ $hour }{second} autovivifies an empty hash for that hour, so $seconds_past gets compared to an undef.
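If the warnings bother you, one option is to check whether the hour has been seen before doing the comparison. The following is only a sketch of that idea and is not part of the script above; the helper name is_better is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): true if a reading
# $seconds_past seconds into the hour should replace what is stored.
sub is_better {
    my ( $best_of, $hour, $seconds_past ) = @_;

    return 1 if !$seconds_past;               # top of the hour always wins
    return 1 if !exists $best_of->{$hour};    # first reading for this hour
    return $best_of->{$hour}{second} < $seconds_past ? 1 : 0;
}

my %best_of = ( 2 => { second => 3 } );

print is_better( \%best_of, 2, 6 ),    "\n";  # later reading in a known hour
print is_better( \%best_of, 1, 3592 ), "\n";  # hour not seen yet, no warning
print is_better( \%best_of, 2, 1 ),    "\n";  # earlier than what is stored
```

Because exists is tested on the top-level key only, nothing gets autovivified for an hour that has not been stored yet.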

Anyway, what you end up with is a hash with each hour seen as a key. The values are hash refs that contain the data you're interested in.
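If you would rather work with an ordered list than a hash, a sketch like the following could flatten it. The %best_of literal here just copies the Dumper output above as sample data, and the [ hour, time, quote ] row layout is an assumption about what would be convenient later.

```perl
use strict;
use warnings;

# Sample data in the same shape as the Dumper output above.
my %best_of = (
    1 => { quote => '1.53311 ', time => '2:00:00', second => 0 },
    2 => { quote => '1.53307 ', time => '2:00:06', second => 6 },
);

# One [ hour, time, quote ] row per hour, in numeric hour order.
my @rows = map  { [ $_, $best_of{$_}{time}, $best_of{$_}{quote} ] }
           sort { $a <=> $b } keys %best_of;

for my $row (@rows) {
    my ( $hour, $time, $quote ) = @{$row};
    $quote =~ s/\s+\z//;    # the capture keeps the space before the comma
    print "$hour\t$time\t$quote\n";
}
```

Sorting with <=> matters once there are two-digit hours, since a plain string sort would put 10 before 2.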

Re^2: reading a delimited file and selecting values from it by Conal (Beadle) on Mar 12, 2008 at 02:55 UTC
Great, that's fantastic, Kyle. The data will be a lot more useful with the values in some kind of array, because I plan to manipulate it further later. Your code will be invaluable to me as a base structure. Can I just ask how I get the time variable to deal with phantom extra digits, e.g. 1:59:52hi? How do I get it to disregard the extraneous data at the end? Thanks again.
Re^3: reading a delimited file and selecting values from it by kyle (Abbot) on Mar 12, 2008 at 03:03 UTC
The pattern I used already does that, as written. The pattern matches everything you want, up to the extraneous data, and that's where it stops. The problem with it (if you consider this a problem) is that the loop doesn't notice if there's a non-match. If you have some bogus line in the file, it's going to try to use it anyway. This will probably manifest as an undef quote at midnight. That's part of why I said it's a start.
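A minimal sketch of how the loop could notice a non-match and skip the line, assuming that silently skipping bad lines is what you want. The helper name parse_line and the sample lines are made up for illustration, and the quote capture is tightened to [^,\s]+ here so it no longer keeps the trailing space, which is a change from the pattern in the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: in list context, returns ( $quote, $time ) for
# a good line, or an empty list when the line does not match.
sub parse_line {
    my ($line) = @_;
    return $line =~ m{ \A
                       ( [^,\s]+ )              # quote, no trailing spaces
                       \s* , \s*                # comma with optional spaces
                       ( \d\d? : \d\d : \d\d )  # h:mm:ss, the rest is ignored
                     }xms;
}

my @lines = ( '1.53311 ,1:59:52hi', 'not a quote line', '1.53306 ,2:00:03' );

for my $line (@lines) {
    # A list assignment in scalar context gives the number of values
    # returned, so an empty list (no match) triggers the "next".
    my ( $quote, $time ) = parse_line($line)
        or next;
    print "$quote at $time\n";
}
```

The pattern is still unanchored at the end, so 1:59:52hi and 1:59:5220 both come back as 1:59:52, the same as before. Only lines that fail to match at all are dropped.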