morcadiss has asked for the wisdom of the Perl Monks concerning the following question:

I have a short Perl script that reads data from an input file and creates a new output file for each unique identifier (caseid). The script currently does this adequately. However, I can't get the $counter variable to reset for each new file.

I suspect there might be a better way to do this, perhaps a hash or arrays, or a hash of hashes. I'm still trying to figure out the best method, but will settle for something that works.

Sample input file (from test.txt)

CASE ID,ITEM,ENTITY,TYPE,ORDER,TEXT
10001,comments@txt,2,os,1,Monday
10001,comments@txt,7,os,1,Tuesday
10001,comments@txt,8,os,1,Wednesday
10001,comments@txt,10,os,1,

Code is as follows
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;
use Text::Wrap;

$Text::Wrap::columns=100;

my $file = "test.txt";
my @line_list; #list of csid on each line, contains duplicates
my @csid_list; #list of unique csids
#my $oldcsid = 0;

my $csv = Text::CSV->new or die "Cannot use CSV: ".Text::CSV->error_diag ();
open (my $fh, '<', "$file") or die "Cannot open $file: $!";
$csv->getline ($fh); #skip header row
while ( my $row = $csv->getline( $fh ) ) {
    push @line_list, $row->[0];
}
@csid_list = sort {$a <=> $b} uniq(@line_list);

# Adds title to each file
foreach my $csid (@csid_list) {
    open (my $csid_fh, '>>', "$csid.txt");
    print $csid_fh "####################\n\nCalYouth Case ID: $csid\n\n####################";
    close $csid_fh;
}
close $fh;

my $csv2 = Text::CSV->new or die "Cannot use CSV: ".Text::CSV->error_diag ();
open (my $fh2, '<', "$file") or die "Cannot open $file: $!";
$csv2->getline ($fh2);

# Prints notes to correct csid file
my $counter = 1; #counts iterations through loop
while ( my $row = $csv2->getline($fh2)) {
    foreach my $csid (@csid_list) {
        if ($row->[0] == $csid) {
            if ($row->[5]) {
                my $temp_string = wrap("\t","\t","$row->[5]"); #wrap() is from Text::Wrap
                open (my $csid_fh, '>>', "$csid.txt");
                print $csid_fh "\n\n\t------------------------------\n\nNote Entry $counter:\n\n$temp_string\n\n"; #$counter should count each note in a file
                ++$counter; #increments $counter
                close $csid_fh;
            }
            else {
                next;
            }
        }
    }
}

# Sorts out duplicates from an array
sub uniq {
    my %seen;
    grep !$seen{$_}++, @_;
}

#More experimental code. Sigh.
#Reset the counter for new files
#if (-e "$csid.txt"){
#    $counter++;
#    print "$csid.txt exists $counter \n";
#}else{
#    $counter=0;
#}

#Experimental code to handle counter resets. Isn't working.
#if ($oldcsid == 0){
#    $oldcsid = $csid;
#    $counter++;
#}
#elsif ($oldcsid == $csid) {
#    $counter++;
#}
#else{
#    $counter = 1;
#    Endif for ($row->[0] == $csid). I think this will never be false, except on a blank line, which an output file probably won't have, unless data export >1000 chars. However, removing this check broke stuff.
#    $oldcsid = $csid;
#}

Re: Unable to get counter to reset
by roboticus (Chancellor) on Jan 13, 2015 at 17:11 UTC

    morcadiss:

    As you guessed, using a hash or array can help out. Here's how I'd do it with a hash. The hash key will be the filename I want to write to:

    my %Files;

    sub open_file {
        my $file_name = shift;
        die "$file_name is already open" if exists $Files{$file_name};
        open my $FH, '>', $file_name or die "$file_name open error: $!";
        $Files{$file_name} = { COUNT=>0, FH=>$FH };
    }

    sub file_handle {
        my $file_name = shift;
        die "$file_name: Hasn't been opened yet" if ! exists $Files{$file_name};
        return $Files{$file_name}{FH};
    }

    sub file_counter {
        my $file_name = shift;
        die "$file_name: Hasn't been opened yet" if ! exists $Files{$file_name};
        return $Files{$file_name}{COUNT};
    }

    sub write_to_file {
        my $file_name = shift;
        my @stuff_to_print = @_;
        die "$file_name: Hasn't been opened yet" if ! exists $Files{$file_name};
        my $file = $Files{$file_name};
        my $FH = $file->{FH};
        print $FH @stuff_to_print;
        ++$file->{COUNT};
    }

    for my $file ('foo', 'bar', 'baz') {
        open_file($file);
    }

    for my $file ('foo', 'foo', 'baz', 'foo', 'bar') {
        my $count = file_counter($file);
        write_to_file($file, "Count is $count\n");
    }

    Note: untested, yadda yadda...

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Thank you!
Re: Unable to get counter to reset
by nlwhittle (Beadle) on Jan 13, 2015 at 17:05 UTC

    There are a few changes I'd like to suggest (such as breaking this code up into functions), but answering your immediate problem, I would create a hash with each key named with a csid. Then every time you process a line, you can increment the hash entry that matches the csid. You'll end up with the hash having a unique count for each file. To create the hash, you can do this:

    my %line_count;
    for my $csid (@csid_list) {
        $line_count{$csid} = 0;
    }

    Then, just before you use the counter variable in your print statement, you can do this:

    ++$line_count{$csid};

    This will allow you to keep count on each file, even if the csids in the source file are not in sequence.

    --Nick
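
For readers who want to see the per-csid counter idea from the replies in one place, here is a minimal sketch. It is not code from anyone in the thread: the helper name number_notes is hypothetical, the 10002 row is made-up data added only to show interleaved case IDs, and the plain split assumes no field contains a quoted comma (real data should still go through Text::CSV as in the original script). It collects the numbered notes in memory rather than writing files, so the counting logic is easy to check on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): number the notes per case ID.
# Takes CSV lines with the header already removed and returns a hash ref
# mapping each csid to an array ref of numbered note strings.
# Assumption: no quoted commas in the data, so split is sufficient here.
sub number_notes {
    my @lines = @_;
    my %line_count;   # csid => number of notes seen so far
    my %notes;        # csid => [ "Note Entry 1: ...", ... ]

    for my $line (@lines) {
        chomp $line;
        # Column 0 is CASE ID, column 5 is TEXT.
        my ($csid, $text) = (split /,/, $line, 6)[0, 5];
        next unless defined $text && length $text;   # skip rows with no note text

        # Each csid has its own counter, so nothing needs to be reset.
        my $n = ++$line_count{$csid};
        push @{ $notes{$csid} }, "Note Entry $n: $text";
    }
    return \%notes;
}

my @sample = (
    '10001,comments@txt,2,os,1,Monday',
    '10002,comments@txt,3,os,1,Friday',    # made-up row to show interleaving
    '10001,comments@txt,7,os,1,Tuesday',
    '10001,comments@txt,10,os,1,',         # empty TEXT, skipped
);

my $notes = number_notes(@sample);
for my $csid (sort { $a <=> $b } keys %$notes) {
    print "Case ID $csid\n";
    print "  $_\n" for @{ $notes->{$csid} };
}
```

With the sample rows above, case 10001 gets entries 1 and 2 and case 10002 starts again at 1, even though its row sits between the two 10001 rows. Writing each list to "$csid.txt" afterwards would need only one open per file instead of one per note.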