Mister_Inkster has asked for the wisdom of the Perl Monks concerning the following question:

I am a newbie to Perl. I have a list of records (sorted by invoice number, then by descending time) with 4 columns as follows:
invoice, agent, date, timeInMin.
(timeInMin is just hours*60+minutes) The records look like this:
-------------/cut/--------------------
1056128833340 Robb 2003-06-20 665 **
1056128833340 t028348 2003-06-20 607
1055439973653 T012697 2003-07-22 962
1055439973653 t012697 2003-07-22 948
1055439973653 t806174 2003-07-15 792
1055439973653 T806174 2003-07-15 791
1055439973653 t021191 2003-07-08 786
1055439973653 Robb 2003-06-17 503
1055439973653 Larry 2003-06-16 815 **
1055439973653 t021191 2003-06-12 646
--------------/cut/----------------------

(** indicates the records I want to pull out)

As you can see, there are multiple records for each invoice. I'm trying to figure out how to determine the most RECENT record that has Larry or Robb as the agent, and where the immediately previous record does NOT have Larry or Robb as the agent. I need to do this for every invoice. MrI

Replies are listed 'Best First'.
Re: Searching a multidimensional list
by bbfu (Curate) on Aug 01, 2003 at 23:00 UTC

    BrowserUk has a good point. You need to clarify your requirements. However, assuming you mean "as ordered in the file", the following code should do:

    #!/usr/bin/perl
    use warnings;
    use strict;

    my %Invoices;
    my $last_name = '';
    my %data;

    while(my $line = <DATA>) {
      chomp $line;

      # Split the fields into a hash. Depending on your
      # data, this might be better done with Text::xSV.
      @data{qw(orig inv name date time)} = ($line, split(/\s+/, $line));

      next unless $data{name} eq 'Robb' or $data{name} eq 'Larry';
      next if $last_name eq $data{name};

      # If this field passes the tests, save a *copy* for later use.
      push @{$Invoices{$data{name}}}, {%data};
    } continue {
      # Store the last name seen
      $last_name = $data{name};
    }

    # For each (Larry Robb)
    foreach (keys %Invoices) {
      # Sort the invoices by date then time, in reverse order.
      $Invoices{$_} = [sort {
        $b->{date} cmp $a->{date} or
        $b->{time} <=> $a->{time}
      } @{$Invoices{$_}}];

      # Print out the most recent invoice.
      print "Most recent for $_: $Invoices{$_}[0]{orig}\n";
    }

    __DATA__
    1056128833340 Robb 2003-06-20 665 **
    1056128833340 t028348 2003-06-20 607
    1055439973653 T012697 2003-07-22 962
    1055439973653 t012697 2003-07-22 948
    1055439973653 t806174 2003-07-15 792
    1055439973653 T806174 2003-07-15 791
    1055439973653 t021191 2003-07-08 786
    1055439973653 Robb 2003-06-17 503
    1055439973653 Larry 2003-06-16 815 **
    1055439973653 t021191 2003-06-12 646

    If you mean "most recent" and "immediately previous" as in timestamp, you will need to read the whole file in at once, sort on the timestamp (move the sort in the code above out to work on the entire file; maybe un-reverse the sort-order as well), and then process as shown above.
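    The "read everything, then sort on the timestamp" step could look roughly like the following. This is only a sketch, not tested against the real log: the helper names parse_line and newest_first are made up for illustration, and it assumes the dates are always in YYYY-MM-DD form so that they sort correctly as plain strings.

    ```perl
    use strict;
    use warnings;

    # Parse one "invoice agent date timeInMin" line into a hash reference.
    # (parse_line is a hypothetical helper, not part of the code above.)
    sub parse_line {
        my ($line) = @_;
        my %rec;
        @rec{qw(inv name date time)} = split ' ', $line;
        return \%rec;
    }

    # Return the records ordered newest first: by date, then by minutes.
    sub newest_first {
        my @records = @_;
        return sort {
            $b->{date} cmp $a->{date} or $b->{time} <=> $a->{time}
        } @records;
    }

    # A few lines from the sample data, deliberately out of order.
    my @lines = (
        '1055439973653 t021191 2003-06-12 646',
        '1055439973653 Robb 2003-06-17 503',
        '1055439973653 T012697 2003-07-22 962',
        '1055439973653 Larry 2003-06-16 815',
    );

    my @sorted = newest_first(map { parse_line($_) } @lines);
    print join(' ', @{$_}{qw(inv name date time)}), "\n" for @sorted;
    ```

    Swapping $a and $b in the sort block gives oldest-first order instead, which is the "un-reversed" variant mentioned above.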

    bbfu
    Black flowers blossom
    Fearless on my breath

      Thanks BBFU!
      I implemented your code and it gives the most recent records for Larry and Robb, but I'm afraid I haven't been clear enough. I need to find within each set of records that share a common invoice number,
      1) the most recent record (by timestamp) where the AGENT is either Robb or Larry; AND
      2) the record (with the same invoice number) that immediately preceded it in the invoice chronology does NOT have Robb or Larry as the AGENT.
      The data will support these conditions.
      I hope that is clearer.
      MrI
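      The two conditions above can be sketched as a per-invoice search. This is a minimal sketch rather than code from the thread: latest_handoffs and %is_target are names invented for the example, agent names are matched exactly (case-sensitive), and it assumes that a Robb or Larry record with no earlier record in the same invoice also counts as a match.

      ```perl
      use strict;
      use warnings;

      # Agents of interest (hypothetical lookup table for this sketch).
      my %is_target = map { $_ => 1 } qw(Larry Robb);

      # Given lines of "invoice agent date timeInMin", return a hash that maps
      # each invoice number to its matching line, if it has one.
      sub latest_handoffs {
          my @lines = @_;

          # Group the parsed records by invoice number.
          my %by_invoice;
          for my $line (@lines) {
              my ($inv, $name, $date, $time) = split ' ', $line;
              next unless defined $time;    # skip blank or short lines
              push @{ $by_invoice{$inv} },
                  { line => $line, name => $name, date => $date, time => $time };
          }

          my %found;
          for my $inv (keys %by_invoice) {
              # Newest first, so the record at index $i + 1 immediately
              # precedes the record at index $i in time.
              my @recs = sort {
                  $b->{date} cmp $a->{date} or $b->{time} <=> $a->{time}
              } @{ $by_invoice{$inv} };

              for my $i (0 .. $#recs) {
                  next unless $is_target{ $recs[$i]{name} };
                  my $prev = $recs[ $i + 1 ];
                  # Assumption: a record with no predecessor also qualifies.
                  next if $prev and $is_target{ $prev->{name} };
                  $found{$inv} = $recs[$i]{line};
                  last;    # first hit in newest-first order is the most recent
              }
          }
          return %found;
      }

      my %found = latest_handoffs(
          '1056128833340 Robb 2003-06-20 665',
          '1056128833340 t028348 2003-06-20 607',
          '1055439973653 T012697 2003-07-22 962',
          '1055439973653 t021191 2003-07-08 786',
          '1055439973653 Robb 2003-06-17 503',
          '1055439973653 Larry 2003-06-16 815',
          '1055439973653 t021191 2003-06-12 646',
      );
      print "$found{$_}\n" for sort keys %found;
      ```

      Because each invoice is sorted separately, this version does not depend on the order of the lines in the file.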

        Ok, the following assumes that the file is already sorted by invoice number and then timestamp.

        #!/usr/bin/perl
        use warnings;
        use strict;

        #our $Logfile = 'file.log';
        #open my $fh, $Logfile or die "Can't open $Logfile: $!\n";
        my $fh = \*DATA; # Testing purposes

        # The first element of @records is the "current" line.
        # The second element is the "next" line, if any.
        my @records = get_next_record($fh);

        do {
          # Read the next line
          push @records, get_next_record($fh);

          if(
            ($records[0]{name} eq 'Larry' or $records[0]{name} eq 'Robb') and
            not (
              # No more lines
              $#records and
              # Not the name
              ($records[1]{name} eq 'Larry' or $records[1]{name} eq 'Robb')
            )
          ) {
            print $records[0]{record};

            # Skip to the next set of records, and start over.
            # ('skip' is removed immediately and is only for place-holding.)
            @records = ('skip', find_next_record_set($fh, $records[0]{inv}));
          }

          # Remove the old "current" line, and make the next line current.
          shift @records;
        } while @records;

        exit;

        sub parse_record {
          my $record = shift;
          my %data = ('record' => $record);
          @data{qw(inv name date time)} = split(/\s+/, $record);
          return \%data;
        }

        sub get_next_record {
          my $fh = shift;
          my $rec = <$fh>;
          return defined($rec) ? parse_record($rec) : ();
        }

        sub find_next_record_set {
          my $fh = shift;
          my $rs = shift;
          my $line;
          1 while(defined($line = <$fh>) and $line =~ /^\Q$rs/);
          return defined($line) ? parse_record($line) : ();
        }

        __DATA__
        1056128833340 Robb 2003-06-20 665 **
        1056128833340 t028348 2003-06-20 607
        1055439973653 T012697 2003-07-22 962
        1055439973653 t012697 2003-07-22 948
        1055439973653 t806174 2003-07-15 792
        1055439973653 T806174 2003-07-15 791
        1055439973653 t021191 2003-07-08 786
        1055439973653 Robb 2003-06-17 503
        1055439973653 Larry 2003-06-16 815 **
        1055439973653 t021191 2003-06-12 646

        bbfu

Re: Searching a multidimensional list
by waswas-fng (Curate) on Aug 01, 2003 at 22:41 UTC
    my @inv_return = get_best_inv("Robb");
    print "Name: ", $inv_return[1], " Invoice Number: ", $inv_return[0],
          " Date: ", $inv_return[2], " - ", $inv_return[3], "\n";

    sub get_best_inv {
        my $whatname = shift;

        # Start with values that any real record will beat.
        my ($best_inv, $best_name, $best_date, $best_timestamp) = ('', '', '', -1);

        open(my $in, '<', 't') or die "Can't open t: $!\n";
        while (<$in>) {
            chomp;
            my ($inv, $name, $date, $timestamp) = split " ";
            next unless defined $timestamp;    # skip blank or short lines

            # A later date wins; on the same date, the later time wins.
            if ($name eq $whatname
                and ($date gt $best_date
                     or ($date eq $best_date and $timestamp >= $best_timestamp))) {
                ($best_inv, $best_name, $best_date, $best_timestamp) =
                    ($inv, $name, $date, $timestamp);
            }
        }
        close($in);

        return ($best_inv, $best_name, $best_date, $best_timestamp);
    }
    Not tested (as I don't have Perl on my li'l Pocket PC). Your "immediately previous" statement confuses me, because your sample picks don't follow what I would consider your explanation, so changing the code to provide that functionality is left up to you.

    -Waswas
Re: Searching a multidimensional list
by BrowserUk (Patriarch) on Aug 01, 2003 at 22:23 UTC

    When you say "most recent" and "immediately previous", do you mean as ordered in the file, or as ordered by the timestamp?


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
    If I understand your problem, I can solve it! Of course, the same can be said for you.

      OK, the list of records is already sorted chronologically, so I am looking for the record that is closest to today, where:
      1) the agent is either Robb or Larry; AND
      2) the record directly below it in the list (i.e. "immediately previous") does not have Robb or Larry as the agent.
      I hope that is clearer.
      Thx for the quick responses!
      MrI

        Okay.

        P:\test>perl -ne"$p=tell ARGV; print if /Larry|Robb/ and not scalar <> =~ /Larry|Robb/; seek ARGV, $p, 0; " logfile
        1056128833340 Robb 2003-06-20 665 **
        1055439973653 Larry 2003-06-16 815 **

        Usual caveat: 's instead of "s on *nix.

        Update: Added code to bottle out as soon as we've printed the two records.

        perl -ne"BEGIN{$c=0}$p=tell ARGV; ++$c, print if /Larry|Robb/ and not scalar <> =~ /Larry|Robb/; seek ARGV, $p, 0; exit if $c==2 " logfile

        or golfed a bit

        perl -ne"BEGIN{$c=0}$p=tell ARGV;/Larry|Robb/&<>!~/Larry|Robb/and++$c,print;seek ARGV,$p,0;$c==2&&exit" logfile
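
        For anyone who finds the one-liners hard to follow, the same idea can be written as a short script that reads one line ahead instead of using tell and seek. This is a sketch under a few assumptions, not a drop-in replacement: handoff_lines is a made-up name, it returns every matching line rather than exiting after two, and a matching line at the very end of the file counts as a hit. Like the one-liners, it looks only at file order and does not check invoice numbers.

        ```perl
        use strict;
        use warnings;

        my $agents = qr/Larry|Robb/;

        # Read $fh one line ahead and return each line that names Larry or Robb
        # while the line after it does not.
        sub handoff_lines {
            my ($fh) = @_;
            my @hits;
            my $current = <$fh>;
            while (defined $current) {
                my $next = <$fh>;    # undef at end of file
                if ($current =~ $agents
                    and !(defined $next and $next =~ $agents)) {
                    push @hits, $current;
                }
                $current = $next;
            }
            return @hits;
        }

        # Demo on an in-memory file; a real run would open the log file instead.
        my $log = join '', map { "$_\n" } (
            '1056128833340 Robb 2003-06-20 665',
            '1056128833340 t028348 2003-06-20 607',
            '1055439973653 Robb 2003-06-17 503',
            '1055439973653 Larry 2003-06-16 815',
            '1055439973653 t021191 2003-06-12 646',
        );
        open my $fh, '<', \$log or die "Can't open in-memory file: $!";
        print handoff_lines($fh);
        close $fh;
        ```

        Reading ahead by one line keeps the script streaming, so it works on pipes as well as on seekable files.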
