rruser has asked for the wisdom of the Perl Monks concerning the following question:
Currently we receive 3 files monthly, and I have a script which outputs the dates in yyyy,mm,dd format. I need to create a file containing the record with the longest time spread; as you can see, there can be multiple dates or only 1 date. My output files are in the following format:
file1:
    COMPANY ABCD 764200 E 2013,12,13 2013,12,19
    COMPANY BCDX 156167 L 2013,11,29 2013,12,03
    COMPANY BCYX 165230 L 2013,12,13 2013,12,19
file2:
    COMPANY ABCD 764200 E 2013,12,13 2013,12,19
    COMPANY BCDX 156167 L 2013,12,28 2013,12,31
file3:
    COMPANY ABCD 764200 E 2013,12,13 2013,12,17
    COMPANY BCDX 156167 L 2013,11,30 2013,12,03
    COMPANY BCYX 165230 L 2013,12,13 2013,12,17
    COMPANY BCYX 156095 L 2013,11,30 2013,12,08
What I have so far:
Have extraction script append all 3 files to 1 file
Check for identical entries with script checking for duplicates
Script in progress that gets the difference between the 2 dates and skips any difference less than 5 (4 days and under are free days)
I just don't know how to code for extracting the record with the longest time spread discarding any others.
Here is the script I am working on (your input and suggestions are much appreciated...thanks)
    #!/usr/bin/perl
    #
    use strict;
    use warnings;
    use Date::Calc qw( Delta_Days );

    my @entries = ();    ## hold my entries

    ## read the single combined file named on the command line
    open(my $file, '<', $ARGV[0]) or die $!;
    while (<$file>) {
        s/\s+$//;                    ## strip trailing whitespace before splitting
        my @flds = split ' ', $_;
        ## load my entries
        my $entry;
        $entry->{COMPANY}    = $flds[0];
        $entry->{CAR_PART_1} = $flds[1];
        $entry->{CAR_PART_2} = $flds[2];
        $entry->{LE}         = $flds[3];
        $entry->{BEG_DATE}   = $flds[4];
        $entry->{END_DATE}   = $flds[5];
        ## Push $entry onto @entries
        push(@entries, $entry);
    }
    close $file;

    ## a plain sort of hash references only orders them by address, so keep file order
    foreach my $ent (@entries) {
        my @ymd1 = split ',', $ent->{BEG_DATE};
        my @ymd2 = split ',', $ent->{END_DATE};
        my $diff = Delta_Days(@ymd1, @ymd2);
        if ($diff < 5) {
            next;
        }
        else {
            ## this is where I need logic for grabbing only the record with the most days
            ## print rather than printf: the string is not a format
            print "$ent->{CAR_PART_1} $ent->{CAR_PART_2} $ent->{LE} $ent->{BEG_DATE} $ent->{END_DATE} $diff\n";    ## testing output
        }
    }
    exit;
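One possible way to fill in the missing step, offered as a minimal sketch rather than a definitive answer. It assumes that "the record with the longest time spread" means one record per car, identified by CAR_PART_1 plus CAR_PART_2; that grouping is a guess from the sample data, not something stated above. The helper names `spread_days` and `longest_per_car` are hypothetical, and `timegm` from the core Time::Local module stands in for Date::Calc's `Delta_Days` only so the sketch runs without extra modules.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw( timegm );

## Days between two "yyyy,mm,dd" strings (stand-in for Delta_Days).
sub spread_days {
    my ($beg, $end) = @_;
    my ($t1, $t2) = map {
        my ($y, $m, $d) = split /,/;
        timegm(0, 0, 0, $d, $m - 1, $y);
    } $beg, $end;
    return ($t2 - $t1) / 86_400;
}

## Keep, per car, only the record with the most days. Spreads shorter
## than $min_days are skipped, as in the script above. On a tie the
## first record seen is kept.
sub longest_per_car {
    my ($min_days, @lines) = @_;
    my %best;
    for my $line (@lines) {
        my ($company, $part1, $part2, $le, $beg, $end) = split ' ', $line;
        next unless defined $end;       ## skip blank or short lines
        my $diff = spread_days($beg, $end);
        next if $diff < $min_days;
        my $key = "$part1 $part2";      ## ASSUMPTION: this pair identifies a car
        if (!$best{$key} || $diff > $best{$key}{DIFF}) {
            $best{$key} = {
                LINE => "$part1 $part2 $le $beg $end",
                DIFF => $diff,
            };
        }
    }
    return \%best;
}

## Sample records taken from the three files shown in the question.
my @sample = (
    'COMPANY ABCD 764200 E 2013,12,13 2013,12,19',
    'COMPANY BCDX 156167 L 2013,11,29 2013,12,03',
    'COMPANY ABCD 764200 E 2013,12,13 2013,12,17',
    'COMPANY BCYX 165230 L 2013,12,13 2013,12,17',
    'COMPANY BCYX 156095 L 2013,11,30 2013,12,08',
);

my $best = longest_per_car(5, @sample);
print "$best->{$_}{LINE} $best->{$_}{DIFF}\n" for sort keys %$best;
```

The hash keeps a single "best so far" entry per key, so the combined file only has to be read once and no sorting of records is needed. If the grouping should instead be by company, or over the whole file, only the `$key` line would change.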
Replies are listed 'Best First'.
Re: Working with multiple records with different dates
by LloydRice (Beadle) on Jan 18, 2014 at 00:49 UTC
Re: Working with multiple records with different dates
by kcott (Archbishop) on Jan 18, 2014 at 13:21 UTC
by rruser (Acolyte) on Jan 21, 2014 at 23:01 UTC
by kcott (Archbishop) on Jan 22, 2014 at 11:59 UTC