mmittiga17 has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

Quick question:

I have a CSV file containing account information. Each line has an account number and other information.

I need to parse the file for the last line for each unique account number. Sample data:
account numbers=abc123 and xyz123
20080618,adfadf,adfadf,abc123,ljlkjll,
20080618,jfjfjf,bmbmbm,abc123,zzzdddd,
20080618,343434,959595,abc123,fjadfaf, <==parse
20080618,adfadf,adfadf,xyz123,ljlkjll,
20080618,adxxxf,gggsss,xyz123,ggaaddf,
20080618,dddfff,dfdfdd,xyz123,xxxxxd, <==parse

So what I need is a way to iterate through each line of the file and for each unique account number print out the last line containing the unique account number.

So for account xyz123, I would want the last line for that account:
20080618,adfadf,adfadf,xyz123,ljlkjll,
20080618,adxxxf,gggsss,xyz123,ggaaddf,
20080618,dddfff,dfdfdd,xyz123,xxxxxd, <==parse
Any suggestions or ideas on how to accomplish this will be appreciated.

Replies are listed 'Best First'.
Re: Parse file question
by poolpi (Hermit) on Jun 23, 2008 at 10:14 UTC

    Maybe a hash can help you:

    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    my %buf;

    while ( my $line = <DATA> ) {
        next unless $line;
        chomp $line;
        $line =~ s/,\s*$//;
        my $accnum = (split ',', $line)[3];
        $buf{$accnum} = $line if $accnum;
    }

    print Dumper \%buf;

    __DATA__
    20080618,adfadf,adfadf,abc123,ljlkjll,
    20080618,jfjfjf,bmbmbm,abc123,zzzdddd,
    20080618,343434,959595,abc123,fjadfaf,
    20080618,adfadf,adfadf,xyz123,ljlkjll,
    20080618,adxxxf,gggsss,xyz123,ggaaddf,
    20080618,dddfff,dfdfdd,xyz123,xxxxxd,

    Output:

    $VAR1 = {
              'xyz123' => '20080618,dddfff,dfdfdd,xyz123,xxxxxd',
              'abc123' => '20080618,343434,959595,abc123,fjadfaf'
            };

    hth,
    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
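A note on the hash approach in the reply above: a plain hash does not remember the order in which the accounts appeared. If that order matters, one option is to record each account number the first time it is seen. The following is only a minimal sketch under a few assumptions: the account number is always the fourth field, no field contains an embedded comma, and the sub name `last_line_per_account` is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the last line seen for each account,
# in the order in which the accounts first appear in the input.
sub last_line_per_account {
    my @lines = @_;              # copy, so the caller's data is not modified
    my ( %last, @order );

    for my $line (@lines) {
        chomp $line;
        next unless length $line;

        # Assumption: the account number is the fourth field (index 3).
        my $accnum = ( split /,/, $line )[3];
        next unless defined $accnum && length $accnum;

        push @order, $accnum unless exists $last{$accnum};
        $last{$accnum} = $line;  # later lines overwrite earlier ones
    }

    return map { $last{$_} } @order;
}

my @sample = (
    "20080618,adfadf,adfadf,abc123,ljlkjll,\n",
    "20080618,jfjfjf,bmbmbm,abc123,zzzdddd,\n",
    "20080618,343434,959595,abc123,fjadfaf,\n",
    "20080618,adfadf,adfadf,xyz123,ljlkjll,\n",
    "20080618,adxxxf,gggsss,xyz123,ggaaddf,\n",
    "20080618,dddfff,dfdfdd,xyz123,xxxxxd,\n",
);

print "$_\n" for last_line_per_account(@sample);
```

Unlike the script above, this sketch leaves the trailing comma on each line untouched. It also works when the lines for one account are not consecutive.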
Re: Parse file question
by psini (Deacon) on Jun 23, 2008 at 09:55 UTC

    Read the file, line by line, and parse each line using Text::CSV, so that the line is saved as an array of fields.

    The second part of the problem is (a little) trickier: you can easily detect which line is the first of a group with the same account number, but you want the last, and that is not so easy, because you have no way to say "this is the last line for account 1234" until you have actually read the next line. So you have to keep a buffer of the last two parsed lines: if their account numbers differ, you take the last field of the first line. The only caveat is that you must handle the last line of the file, too.

    Careful with that hash Eugene.
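The buffering idea described in the reply above could be sketched roughly as follows. This is an illustration under stated assumptions, not the poster's own code: it uses a plain `split` instead of Text::CSV to stay dependency-free (so it assumes no quoted fields with embedded commas), it assumes all lines for one account are consecutive, it returns the whole line as the original question asks, and the sub name `last_of_each_group` is invented here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: emits the last line of each run of consecutive
# lines that share the same account number (fourth field).
sub last_of_each_group {
    my ($fh) = @_;
    my ( @out, $prev_line, $prev_acc );

    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;

        my @fields = split /,/, $line;
        my $acc    = $fields[3];
        next unless defined $acc;

        # The account changed, so the buffered line closed the previous group.
        push @out, $prev_line
            if defined $prev_acc && $acc ne $prev_acc;

        ( $prev_line, $prev_acc ) = ( $line, $acc );
    }

    # The caveat from the reply: the final line of the file always ends a group.
    push @out, $prev_line if defined $prev_line;

    return @out;
}

my $data = <<'END_DATA';
20080618,adfadf,adfadf,abc123,ljlkjll,
20080618,jfjfjf,bmbmbm,abc123,zzzdddd,
20080618,343434,959595,abc123,fjadfaf,
20080618,adfadf,adfadf,xyz123,ljlkjll,
20080618,adxxxf,gggsss,xyz123,ggaaddf,
20080618,dddfff,dfdfdd,xyz123,xxxxxd,
END_DATA

# Read the sample through an in-memory filehandle.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print "$_\n" for last_of_each_group($fh);
```

Only one previous line has to be kept, so memory use stays constant however large the file is. If an account's lines are scattered through the file, this sketch would report that account more than once.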

Re: Parse file question
by Anonymous Monk on Jun 23, 2008 at 09:38 UTC
Re: Parse file question
by waldner (Beadle) on Jun 23, 2008 at 11:23 UTC
    Not perfect, but anyway, here it is:
    perl -F, -nae '$ok?(print $last):($ok=1) if ($F[3] ne (split/,/,$last)[3]); $last=$_; END{print $last;}' file
    If you don't mind output ordering, or if the lines for a given ID are not consecutive, then
    perl -F, -nae '$s{$F[3]}=$_; END{print values %s;}' file
    (and it's simpler too).
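For readers unfamiliar with the switches used above: `-n` wraps the code in a `while (<>)` loop, `-a` splits each input line into `@F`, and `-F,` makes the comma the split pattern. Written out as a script, the second one-liner is roughly the following. This is a sketch: the sub wrapper, its name `last_lines_by_id`, the in-memory sample, and the guard against short lines are additions made here so that it runs standalone under `strict` and `warnings`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: rough long-hand equivalent of
#   perl -F, -nae '$s{$F[3]}=$_; END{print values %s;}' file
sub last_lines_by_id {
    my ($fh) = @_;
    my %s;

    while ( my $line = <$fh> ) {      # the loop that -n supplies
        my @F = split /,/, $line;     # the split that -a with -F, supplies
        next unless defined $F[3];    # added guard, not in the one-liner
        $s{ $F[3] } = $line;          # the last line for each ID wins
    }

    return values %s;                 # what the END block prints; order is arbitrary
}

my $data = <<'END_DATA';
20080618,adfadf,adfadf,abc123,ljlkjll,
20080618,343434,959595,abc123,fjadfaf,
20080618,adfadf,adfadf,xyz123,ljlkjll,
20080618,dddfff,dfdfdd,xyz123,xxxxxd,
END_DATA

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print last_lines_by_id($fh);
```

The lines are not chomped, so they can be printed back as they were read.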
Re: Parse file question
by Anonymous Monk on Jun 23, 2008 at 09:35 UTC
    Like the preview page says
    If you think you're going to use <pre> tags — don't! Use <code> tags instead! This applies to data as well as code.