Knoperl has asked for the wisdom of the Perl Monks concerning the following question:

Dear most gracious and benevolent monks,

I beseech your grace and ask that you examine my humble supplication.

I am trying to go through a sorted text file made up of very long lines (80+ characters) of comma-separated fields. Here is a simplified example:

Dog, Table, 12:30, Mercury,abcdef
Dog, Chair, 5:30, Mercury,abcdef
Dog, Table, 12:30, Venus,abcdef
Pig, Door, 3:30, Earth, 123
Pig, Door, 3:30, Earth, 45678943985
Goat, Couch, 8:45, Mars, 9876

I would like all lines that share the same 1st field to be combined into one text file named after that field, so that every distinct 1st field gets its own file. The example above has 3 different 1st fields, so I would have 3 text files like so:

{Dog.txt}
Dog, Table, 12:30, Mercury,abcdef
Dog, Chair, 5:30, Mercury,abcdef
Dog, Table, 12:30, Venus,abcdef

{Pig.txt}
Pig, Door, 3:30, Earth, 123
Pig, Door, 3:30, Earth, 45678943985

{Goat.txt}
Goat, Couch, 8:45, Mars, 9876

Here is the code I have written so far:
    #!/usr/bin/perl -w
    use strict;

    my $sourcefile = $ARGV[0];
    my $currentline;
    open(my $fh, '<', $sourcefile) or die ("Could not open $sourcefile");
    #gets the filename from the cmd line

    while(my $line = <$fh>) #until EOF, do the following
    {
        my $nextline =' ';
        #initialize the next line to empty before starting
        foreach my $line (<$fh>) #for every line go in
        {
            chomp($line);
            my @currentline = split(/,/, $line); #split the fields
            next if (($currentline[0]) eq (@nextline[0]));
            #if 1st field in currentline = 1st field in the 2nd line
            open OUT, ">> $currentline[0].txt"
                or die "Could not open file $currentline[0].txt for output.\n$!";
            #create the filename based upon the first field contents
            #if the file already exists, append to it
            print OUT $currentline;
            #throw everything in that particular line into the file
            $nextline = $currentline;
            #go to the next line
            close OUT;
            #close the file
        }
    }

I do beg for your assistance and fully prostrate myself in gratitude to those who are so wise and charitable.

Replies are listed 'Best First'.
Re: Aggregating Lines from a CSV who lines match a particular field and save those matches based on that matching field name
by ikegami (Patriarch) on Dec 10, 2008 at 05:56 UTC
    Don't open and close the file over and over again. Take advantage of the fact that your file is sorted.
    my $fh;
    my $last;
    while (<>) {
        my ($file) = /^([^,]+),/;
        if (!defined($last) || $file ne $last) {
            open($fh, '>', "$file.txt")
                or die($!);
        }

        print $fh $_;
        $last = $file;
    }
Re: Aggregating Lines from a CSV who lines match a particular field and save those matches based on that matching field name
by NetWallah (Canon) on Dec 10, 2008 at 06:08 UTC
    Here is a one-liner:
    perl -naF, -e '$F[0] eq $prev or close FIL,open FIL,qq|>>|, $F[0] . qq|.txt|;print FIL $_;$prev=$F[0]' mycsv.csv
    Where the input file (sorted text file) is mycsv.csv .

    If you are on Windows, use double-quotes instead of single.

    Update: Changed OPEN to APPEND instead of overwrite, which seems to be what the OP wants.
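For readers unfamiliar with the switches, here is a rough sketch of what the one-liner above amounts to when written out as a script: -n wraps the code in a loop over the input lines, and -a together with -F, splits each line on commas into @F. The subroutine name split_by_first_field is made up for illustration, and the explicit error checks and the defined test on $prev are additions that the one-liner does not have.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads lines from an open input handle and appends
# each one to "<first field>.txt", reopening the output only when the
# first field changes. This relies on the input being sorted (grouped)
# by its first field.
sub split_by_first_field {
    my ($in) = @_;
    my ($out, $prev);
    while (my $line = <$in>) {
        my @F = split /,/, $line;    # what -a with -F, does for each line
        if (!defined $prev || $F[0] ne $prev) {
            close $out if $out;
            open $out, '>>', "$F[0].txt"
                or die "Could not open $F[0].txt: $!";
            $prev = $F[0];
        }
        print {$out} $line;
    }
    close $out if $out;
    return;
}

# When a file name is given on the command line, process that file.
if (@ARGV) {
    open my $in, '<', $ARGV[0] or die "Could not open $ARGV[0]: $!";
    split_by_first_field($in);
    close $in;
}
```

Saved under some name such as split_fields.pl (a placeholder), it would be run as: perl split_fields.pl mycsv.csv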

         ..to maintain is to slowly feel your soul, sanity and sentience ebb away as you become one with the Evil.

      Changed OPEN to APPEND instead of overwrite, which seems to be what the OP wants.

      He was forced to use append because he reopened the file for every line. There's no evidence either way as to what he wants to do.
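To make the point about append concrete, here is a small sketch of what happens when a file is reopened for every line, once with '>' and once with '>>'. The helper write_twice and the scratch file are invented for the demonstration; only the two open modes come from the thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: writes two lines to a scratch file, reopening the
# file for each line in the given mode, and returns the lines that the
# file holds afterwards.
sub write_twice {
    my ($mode) = @_;
    my $dir  = tempdir(CLEANUP => 1);    # scratch directory, removed at exit
    my $path = "$dir/Dog.txt";
    for my $line ("Dog, Table\n", "Dog, Chair\n") {
        open my $out, $mode, $path or die "Could not open $path: $!";
        print {$out} $line;
        close $out or die "Could not close $path: $!";
    }
    open my $in, '<', $path or die "Could not open $path: $!";
    my @lines = <$in>;
    close $in;
    return @lines;
}

for my $mode ('>', '>>') {
    my @lines = write_twice($mode);
    printf "%-2s leaves %d line(s)\n", $mode, scalar @lines;
}
```

With '>' each reopen truncates the file, so only the last line survives; with '>>' both lines are kept. If the file is opened once per group instead, as in the loop that tracks $last, either mode writes the whole group, and the choice only decides whether output left over from an earlier run is kept or discarded.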

        Thank you BOTH so MUCH

        I did want to append, so that each time a line matched on the first field it would be added to that field's file if the file already existed.

        I am deeply awed by both of your mastery!

        NetWallah, it was also quite sophisticated to fit it into one line.

        I wish to thank you with a haiku. Not that I am a good poet but I will try.

        A haiku to Ikegami:

        Come new winter,

        Code perfect Ikegami’s

        Perl is a beauty

        A haiku to Netwallah

        Perl does flow through you

        Short but cool like a melting

        Icicle in spring

        I will redouble my efforts to learn Perl better. Thank you again SOOOOOOOO MUCH!!!!!