Re^2: issue with output of file matching

Thanks so much for the help again. I replaced sub input_data and sub output_data in the original script with this, and I get the following:

ACFX  28523  L           05/18/13  ABCCO    
ACFX  28523  L  05/01/13           ABCCO-C

I am pasting the entire script in case I am doing something wrong. This is a little over my head, but hopefully I will begin to understand it better.

#!/usr/bin/perl
# 
use strict;
use warnings;
use Date::Calc qw( Delta_Days );

my %hashC=();
my %hash=();
input_data(1,'out1.txt');
input_data(2,'out2.txt');
#input_data(1,'pcarry.txt');
#input_data(2,'rcarry.txt');
output_data('final.txt');


#sub input_data {
#  my ($ix,$filename) = @_;
#  open FILE1, "<", $filename or die "$filename : $!\n";
#  while ( <FILE1> ) {
#    chomp $_;
#    my ( $key, $le, $date, $company ) = split ',', $_;
#    my $pk = join "\t",$key,$le,$company;
#    push @{$hash{$pk}[$ix]},fmt_ymd($date);
#  }
#  close FILE1;
#}
sub input_data {
  my ($ix,$filename) = @_;
  open FILE1, "<", $filename or die "$filename : $!\n";
  while ( <FILE1> ) {
    chomp $_;
    my ( $key, $le, $date, $company ) = split ',', $_;    
    my $pk = join "\t",$key,$le,$company;

    # remove -C from key and store
    if ($pk =~ s/-C$//){
    #  print "-C removed $pk\n";
      $hashC{$pk} = '-C';
    }
    
    push @{$hash{$pk}[$ix]},fmt_ymd($date);
  }
  close FILE1;
}

#sub output_data {
#  my $filename = shift;
#  open OUTFILE, ">", $filename or die "$filename : $!\n";

  # primary key
#  for my $pk (sort keys %hash){
#     my ($key,$le,$company) = split "\t",$pk;

sub output_data {
  my $filename = shift;
  open OUTFILE, ">", $filename or die "$filename : $!\n";

  # primary key
  for my $pk (sort keys %hash){
    my ($key,$le,$company) = split "\t",$pk;
    # add -C back if required
    $company .= $hashC{$pk} || '';


    # get multiple dates
#    print "$pk\n";
#   my @dates  = @{$hash{$pk}[1]};
#    my @rdates = @{$hash{$pk}[2]};
    my @dates  = (defined $hash{$pk}[1]) ? @{$hash{$pk}[1]} : ();
    my @rdates = (defined $hash{$pk}[2]) ? @{$hash{$pk}[2]} : ();
    
    # even up number of dates:
    while (@dates < @rdates) {
      push @dates,'1900-01-01';
    }
    while (@rdates < @dates) {
      push @rdates,'1900-01-01';
    }
    
    # print out multiple dates for each key
    for my $date (reverse sort @dates){
    
      # use match sub if more than 1
      if (@rdates > 1){
          @rdates = match($date,@rdates);
      }
      # rdates sorted so best match is first element
      my $rdate = shift @rdates;
      print join ' ',$key,$le,fmt_mdy($date),fmt_mdy($rdate),$company,"\n";
    }
  }
  close OUTFILE;
}

# match dates by calc days diff
# and sorting to get least diff
sub match {
  my ($date,@rdates) = @_;
  my @days=();
  # split date into y,m,d
  my @d1 = split /\D/,$date;
  
  # calc diff and store with date
  for my $rdate (@rdates){
    my @d2 = split /\D/,$rdate;
    push @days,[$rdate,abs Delta_Days(@d1,@d2)];
  }
  
  # sort array by days
  @days = sort {$a->[1] <=> $b->[1]} @days;
  
  # extract dates 
  return map {$_->[0]} @days;
}


# change mm/dd/yy to yyyy-mm-dd
sub fmt_ymd {
  my $mdy = shift;
  $mdy =~ s/ //g;
  my ($m,$d,$y) = split /\D/,$mdy;
  if ($y < 99){ $y += 2000 };
  return sprintf "%04d-%02d-%02d",$y,$m,$d;
}

# change yyyy-mm-dd to mm/dd/yy
sub fmt_mdy {
  my $ymd = shift;
  $ymd =~ s/ //g;
  return ' 'x8 if $ymd eq '1900-01-01';
  my ($y,$m,$d) = split /\D/,$ymd;
  $y -= 2000;
  return sprintf "%02d/%02d/%02d",$m,$d, $y;
}


Re^3: issue with output of file matching by poj (Abbot) on Jun 07, 2013 at 20:30 UTC
I suggest you temporarily comment out this line: `$company .= $hashC{$pk} || '';`. If -C still appears in the output, check the data carefully for trailing spaces. You can also temporarily add the primary key to the output, with separators like this, to reveal spaces or any other reason why the keys don't match up: `print join ' ',$key,$le,fmt_mdy($date),fmt_mdy($rdate),$company,"|$pk|\n";` poj
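To illustrate the trailing-space check suggested above, here is a minimal sketch. It assumes (this is not confirmed in the thread) that the company field may carry padding, in which case `s/-C$//` never matches because the string ends in spaces rather than `-C`. The helper name `normalize_company` is hypothetical and not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): trim whitespace first,
# then strip a trailing -C. Returns the cleaned name and a 0/1 flag that
# records whether -C was present.
sub normalize_company {
    my ($company) = @_;
    $company =~ s/^\s+|\s+$//g;                  # drop leading/trailing whitespace
    my $had_c = ($company =~ s/-C$//) ? 1 : 0;   # now -C really is at the end
    return ($company, $had_c);
}

# Sample values are assumptions modelled on the output shown in the thread.
for my $raw ('ABCCO    ', 'ABCCO-C', 'ABCCO-C  ') {
    my ($name, $had_c) = normalize_company($raw);
    print "[$raw] -> [$name] had_c=$had_c\n";
}
```

If all three sample values reduce to the same name, the padded and unpadded variants would end up under one primary key in `%hash`.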
Re^4: issue with output of file matching by rruser (Acolyte) on Jun 07, 2013 at 21:49 UTC
Thanks for that code. I was able to fix my formatting and spacing issues, so the appended file now has the same field widths. I am still not getting the fields to match up:

ACFX 28523 L 05/18/13 ABCCO
ACFX 28523 L 05/01/13 ABCCO-C
ACFX 28526 L 05/28/13 ABCCO
ACFX 28526 L 05/01/13 ABCCO-C
ACFX 44866 L 05/28/13 ABCCO
ACFX 44866 L 05/01/13 ABCCO-C
ADMX 49266 L 05/03/13 05/16/13 PFGCO
ADMX 63770 L 05/12/13 05/21/13 PFGCO
ADMX 63975 L 05/12/13 05/30/13 PFGCO

The first and second rows need to match (they do perfectly without the -C). I need them to look like:

ACFX 44866 L 05/01/13 05/28/13 ABCCO-C

I made sure file1 and file2 both have the same column widths, so now I think the issue is just that the company names are not exact. Thanks
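As a rough sketch of the merge being asked for, the snippet below trims every field before building the primary key, so that a padded `ABCCO` and an `ABCCO-C` land under the same key. The two sample rows, the trailing padding, and the names `load_rows` and `merged_lines` are assumptions made for illustration; the real files may differ, and this handles only one date per key per file.

```perl
use strict;
use warnings;

# Sample rows modelled on the thread's data (an assumption about the inputs).
my @file1 = ('ACFX,44866,L,05/01/13,ABCCO-C');
my @file2 = ('ACFX,44866,L,05/28/13,ABCCO   ');   # note the trailing spaces

my (%dates, %suffix);

# Hypothetical loader: trims each field, strips -C, stores the date by file index.
sub load_rows {
    my ($ix, @rows) = @_;
    for my $row (@rows) {
        my @f = split /,/, $row;
        s/^\s+|\s+$//g for @f;                    # trim every field
        my ($key, $le, $date, $company) = @f;
        my $had_c = ($company =~ s/-C$//);        # true if -C was removed
        my $pk = join "\t", $key, $le, $company;
        $suffix{$pk} = '-C' if $had_c;
        push @{ $dates{$pk}[$ix] }, $date;
    }
}

# Hypothetical formatter: one line per key, first date from each file.
sub merged_lines {
    my @out;
    for my $pk (sort keys %dates) {
        my ($key, $le, $company) = split /\t/, $pk;
        $company .= $suffix{$pk} || '';           # add -C back if it was seen
        my $d1 = $dates{$pk}[1] ? $dates{$pk}[1][0] : ' ' x 8;
        my $d2 = $dates{$pk}[2] ? $dates{$pk}[2][0] : ' ' x 8;
        push @out, join ' ', $key, $le, $d1, $d2, $company;
    }
    return @out;
}

load_rows(1, @file1);
load_rows(2, @file2);
print "$_\n" for merged_lines();
```

With these two sample rows the sketch prints the single combined line `ACFX 44866 L 05/01/13 05/28/13 ABCCO-C`. Without the trimming step the keys would differ and two separate lines would come out, as in the output above.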
Re^5: issue with output of file matching by poj (Abbot) on Jun 08, 2013 at 12:40 UTC
What output do you get with `$pk` added and the separator changed to | like this? `print join '|',$pk,$key,$le,fmt_mdy($date),fmt_mdy($rdate),$company,"\n";` poj
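For reference, a small sketch of this kind of debug output. It joins the fields with `|` and additionally marks embedded tabs (the primary key is tab-joined in the script), so padding and separators are both visible. The helper name `debug_line` and the sample values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical debug helper (not in the original script): join fields with '|'
# and make tabs visible so stray whitespace stands out.
sub debug_line {
    my @fields = map { defined $_ ? $_ : '<undef>' } @_;
    s/\t/<TAB>/g for @fields;      # the primary key uses tabs as separators
    return join('|', @fields) . "\n";
}

# Sample values are assumptions; the padded company name is deliberate.
my $pk = join "\t", 'ACFX', '28523', 'L', 'ABCCO    ';
print debug_line($pk, 'ACFX', '28523', 'L', 'ABCCO    ');
```

Any spaces that sit directly before a `|` in the printed line are padding that would keep two otherwise identical keys from matching.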
Re^6: issue with output of file matching by rruser (Acolyte) on Jun 11, 2013 at 14:10 UTC
Re^7: issue with output of file matching by poj (Abbot) on Jun 11, 2013 at 15:01 UTC