capriguy84 has asked for the wisdom of the Perl Monks concerning the following question:

I have the task of parsing huge CSV files with multiple entries for the same second of date/time. I need to process the lines for each second and print the result. I need help with the parsing part of the CSV. Sample below.
NAME 6/1/2011 9:30:00 16.76 469898 Q 0
NAME 6/1/2011 9:30:00 16.75 300 Q 0
NAME 6/1/2011 9:30:00 16.75 150 Q 0
NAME 6/1/2011 9:30:01 16.76 756 Q 0
NAME 6/1/2011 9:30:01 16.76 300 Q 0
NAME 6/1/2011 9:30:02 16.76 100 Q 0
The task is to parse the lines and compute the high/low and first/last values of column 4 for each second.
I am using Text::CSV_XS to parse the CSV, but how do I get the time/date value from the next line to compare with the previous line?
while (<FILE>) {
    $csv->parse($_);
    my @columns = $csv->fields();
    $start_time = str2time($columns[1].' '.$columns[2]);
    next unless ($curr_time != $start_time) {
    $exchange = $column[5];
    push (@arr1, $column[3]);
    push (@arr2, $column[4]);
    push (@arr3, $column[5]);
    $curr_time = str2time($columns[1].' '.$columns[2]);
    print "Avg= high= low= in one second=";

Re: Parse CSV lines and process for each second
by blue_cowdawg (Monsignor) on Sep 07, 2011 at 18:51 UTC
        I need to process the lines for each second and print the result

    Here's a pseudo-code-ish description of how you might approach this:

    my $hash = {};
    while (have_records) {
        my $time = (date_from_record) . "-" . (time_from_record);
        if ( ! defined($hash->{$time}) ) {
            $hash->{$time} = {
                min => SOME_VERY_LARGE_NUMBER,
                max => 0
            };
        }
        $hash->{$time}->{min} = ( (VALUE_FROM_RECORD) < $hash->{$time}->{min}
                                  ? (VALUE_FROM_RECORD)
                                  : $hash->{$time}->{min} );
        $hash->{$time}->{max} = ( (VALUE_FROM_RECORD) > $hash->{$time}->{max}
                                  ? (VALUE_FROM_RECORD)
                                  : $hash->{$time}->{max} );
    }
    # NOTE: change (date_from_record), (have_records) and the other
    # pseudo-values to things that are reasonable.

    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
      I am finding it hard to understand your response. The logic I am trying to apply is:
      1. Parse line1, get time-date1. Push elements to array. Go to next line.
      2. Parse line2, get time-date2. Check if time-date2 == time-date1.
      2.a. YES: Then push the elements to an array
      2.b. NO: Then print values
      The hard part is processing the first line. I am trying to use an if block for that.
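The steps above could be sketched roughly as follows. This is only a minimal sketch under some assumptions: the fields are whitespace-separated as in the sample (with real CSV the fields would come from Text::CSV_XS instead of split), the lines arrive already ordered by time, and the names summarize and group_by_second are made up for illustration. The defined() check is what keeps the first line from needing special treatment, and the flush after the loop handles the last group.

```perl
use strict;
use warnings;

# Hypothetical helper: report first/last/high/low for one group of values.
sub summarize {
    my ($key, $values) = @_;
    my ($high, $low) = ($values->[0], $values->[0]);
    for my $v (@$values) {
        $high = $v if $v > $high;
        $low  = $v if $v < $low;
    }
    return sprintf "%s first=%s last=%s high=%s low=%s",
        $key, $values->[0], $values->[-1], $high, $low;
}

# Hypothetical helper: group consecutive lines sharing a date/time key.
# Assumes whitespace-separated fields and input sorted by time.
sub group_by_second {
    my (@lines) = @_;
    my (@out, $curr_key, @values);
    for my $line (@lines) {
        my (undef, $date, $time, $price) = split ' ', $line;
        my $key = "$date $time";
        # Key changed: report the previous group and start a new one.
        # On the very first line $curr_key is undefined, so nothing is flushed.
        if (defined $curr_key && $key ne $curr_key) {
            push @out, summarize($curr_key, \@values);
            @values = ();
        }
        $curr_key = $key;
        push @values, $price;
    }
    # The final group is never followed by a differing line, so flush it here.
    push @out, summarize($curr_key, \@values) if defined $curr_key;
    return @out;
}

print "$_\n" for group_by_second(
    'NAME 6/1/2011 9:30:00 16.76 469898 Q 0',
    'NAME 6/1/2011 9:30:00 16.75 300 Q 0',
    'NAME 6/1/2011 9:30:01 16.76 756 Q 0',
);
```

Because only one second's worth of values is held at a time, this shape may suit huge files better than collecting everything first.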
            I am finding it hard to understand your response. The logic I am trying to apply is

        You are trying too hard to use pushes into arrays when what you really want to do is set up what I've called in the past a "keyed abacus." Use your date/time information as a key to a hash of hashes, such that each element in the hash is a hash consisting of {min, max} values. Iterate through your data and create a key into your hash from the date and time concatenated. If that key already exists, update the min and max elements of the hash with the new values.

        Other than writing the code out for you longhand, I'm not sure how much better I can explain it.

        #!/usr/bin/perl -w
        use strict;

        my $table = {};
        while (my $line = <DATA>) {
            chomp $line;
            # split ' ' splits on whitespace and ignores leading blanks
            my ($name, $date, $time, $value) = split(' ', $line);
            my $key = $date . "-" . $time;
            # First record for this date/time: create its slot.
            if ( ! defined($table->{$key}) ) {
                $table->{$key} = { name => '', min => 99999999, max => 0 };
            }
            $table->{$key}->{min} = $value if $value < $table->{$key}->{min};
            $table->{$key}->{max} = $value if $value > $table->{$key}->{max};
        }
        foreach my $key (sort keys %$table) {
            my ($date, $time) = split("-", $key);
            printf "Date: %s Time: %s max = %d min = %d\n", $date, $time,
                $table->{$key}->{max}, $table->{$key}->{min};
        }
        exit(0);
        __END__
        fred 9/1/2011 15:00:00 50
        mary 9/1/2011 15:00:00 0
        john 9/1/2011 15:00:00 16

        You should see an output something like:

        Date: 9/1/2011 Time: 15:00:00 max = 50 min = 0
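Since the original question also asks for first/last values, the same keyed idea could be stretched a little, roughly like this. It is a sketch only: the name tally is invented for illustration, the fields are assumed to be whitespace-separated, and the //= operator needs Perl 5.10 or later. Seeding each slot from the first value seen for a key avoids having to pick a sentinel min or max, which matters if values can be negative or very large.

```perl
use strict;
use warnings;

# Hypothetical helper: fold records into a hash keyed by "date-time".
# Assumes whitespace-separated fields: name, date, time, value.
sub tally {
    my (@lines) = @_;
    my %table;
    for my $line (@lines) {
        my (undef, $date, $time, $value) = split ' ', $line;
        my $key = "$date-$time";
        # First record for this key: seed first/high/low from its value.
        $table{$key} //= { first => $value, high => $value, low => $value };
        my $slot = $table{$key};
        $slot->{high} = $value if $value > $slot->{high};
        $slot->{low}  = $value if $value < $slot->{low};
        $slot->{last} = $value;    # overwritten until the final record
    }
    return \%table;
}

my $t = tally(
    'fred 9/1/2011 15:00:00 50',
    'mary 9/1/2011 15:00:00 0',
    'john 9/1/2011 15:00:00 16',
);
# Note: sort here is a plain string sort, not a chronological one.
for my $key (sort keys %$t) {
    printf "%s first=%s last=%s high=%s low=%s\n",
        $key, @{ $t->{$key} }{qw(first last high low)};
}
```

Unlike a line-by-line comparison, this keeps every key in memory, so for very large files the hash size is worth keeping an eye on.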

        By the way...this looks like the sort of assignment I would give to my students for homework when I used to teach Perl.


        Peter L. Berghold -- Unix Professional
        Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg