Currently we receive 3 files monthly, and I have a script that outputs the dates in yyyy,mm,dd format. I need to create a file containing the record with the longest time spread. As you can see, a record can have multiple date ranges or only one. My output files are in the following format:

file1:
COMPANY ABCD 764200 E 2013,12,13 2013,12,19
COMPANY BCDX 156167 L 2013,11,29 2013,12,03
COMPANY BCYX 165230 L 2013,12,13 2013,12,19

file2:
COMPANY ABCD 764200 E 2013,12,13 2013,12,19
COMPANY BCDX 156167 L 2013,12,28 2013,12,31

file3:
COMPANY ABCD 764200 E 2013,12,13 2013,12,17
COMPANY BCDX 156167 L 2013,11,30 2013,12,03
COMPANY BCYX 165230 L 2013,12,13 2013,12,17
COMPANY BCYX 156095 L 2013,11,30 2013,12,08

What I have so far:

Have extraction script append all 3 files to 1 file

Check for identical entries with script checking for duplicates

Script in progress that gets the difference between the 2 dates and skips any difference less than 5 (4 days and under are free days)

I just don't know how to code extracting the record with the longest time spread and discarding the others.

Here is the script I am working on (your input and suggestions are much appreciated...thanks)

#!/usr/bin/perl
use strict;
use warnings;
use Date::Calc qw( Delta_Days );

my @entries = ();    ## hold my entries

open (my $file, '<', $ARGV[0]) or die $!;
while (<$file>) {
    s/\s+$//;                    ## strip trailing whitespace before splitting
    my @flds = split ' ', $_;

    ## load my entries
    my $entry;
    $entry->{COMPANY}    = $flds[0];
    $entry->{CAR_PART_1} = $flds[1];
    $entry->{CAR_PART_2} = $flds[2];
    $entry->{LE}         = $flds[3];
    $entry->{BEG_DATE}   = $flds[4];
    $entry->{END_DATE}   = $flds[5];

    ## Push $entry onto @entries
    push (@entries, $entry);
}
close $file;

## no sort here: sorting hash references only orders them by memory address
foreach my $ent (@entries) {
    my @ymd1 = split ',', $ent->{BEG_DATE};
    my @ymd2 = split ',', $ent->{END_DATE};
    my $diff = Delta_Days(@ymd1, @ymd2);
    if ($diff < 5) {
        next;
    }
    else {
        ## this is where I need logic for grabbing only the record with the most days
        print "$ent->{CAR_PART_1} $ent->{CAR_PART_2} $ent->{LE} $ent->{BEG_DATE} $ent->{END_DATE} $diff\n"; ## testing output
    }
}
exit;
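One possible shape for the missing logic is a hash that remembers the best entry seen so far for each record. This is only a sketch under assumptions the post does not confirm: it treats CAR_PART_1 plus CAR_PART_2 as the key that identifies a record, it keeps the first entry when two spreads tie, and it swaps Date::Calc's Delta_Days for the core Time::Local module so that it runs without CPAN modules. The names spread_days and longest_per_car are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw( timegm );    # core module, used here in place of Date::Calc

# Whole days between two "yyyy,mm,dd" strings (stand-in for Delta_Days).
sub spread_days {
    my ($beg, $end) = @_;
    my ($t1, $t2) = map {
        my ($y, $m, $d) = split /,/, $_;
        timegm(0, 0, 0, $d, $m - 1, $y);
    } $beg, $end;
    return int( ($t2 - $t1) / 86_400 );
}

# For each assumed key (CAR_PART_1 + CAR_PART_2), keep only the entry
# with the largest spread that is at least $min_days.
sub longest_per_car {
    my ($min_days, @lines) = @_;
    my %best;
    for my $line (@lines) {
        my (undef, $part1, $part2, $le, $beg, $end) = split ' ', $line;
        next unless defined $end;               # skip blank or short lines
        my $diff = spread_days($beg, $end);
        next if $diff < $min_days;              # 4 days and under are free
        my $key = "$part1 $part2";
        # Ties keep the entry seen first (an assumption, not from the post).
        next if exists $best{$key} && $best{$key}{DIFF} >= $diff;
        $best{$key} = { LE => $le, BEG_DATE => $beg, END_DATE => $end, DIFF => $diff };
    }
    return \%best;
}

# Sample lines taken from the three files shown in the post.
my @sample = (
    'COMPANY ABCD 764200 E 2013,12,13 2013,12,19',
    'COMPANY BCDX 156167 L 2013,11,29 2013,12,03',
    'COMPANY BCYX 165230 L 2013,12,13 2013,12,19',
    'COMPANY ABCD 764200 E 2013,12,13 2013,12,19',
    'COMPANY BCDX 156167 L 2013,12,28 2013,12,31',
    'COMPANY ABCD 764200 E 2013,12,13 2013,12,17',
    'COMPANY BCDX 156167 L 2013,11,30 2013,12,03',
    'COMPANY BCYX 165230 L 2013,12,13 2013,12,17',
    'COMPANY BCYX 156095 L 2013,11,30 2013,12,08',
);

my $best = longest_per_car(5, @sample);
for my $key (sort keys %$best) {
    my $r = $best->{$key};
    print "$key $r->{LE} $r->{BEG_DATE} $r->{END_DATE} $r->{DIFF}\n";
}
```

With the sample data this should leave one line each for ABCD 764200 (6 days), BCYX 156095 (8 days) and BCYX 165230 (6 days), while BCDX 156167 drops out because none of its spreads reaches 5 days. If the record should be identified differently, for example by CAR_PART_2 alone, only the line that builds $key would need to change.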

In reply to Working with multiple records with different dates by rruser
