Knoperl has asked for the wisdom of the Perl Monks concerning the following question:

Dear Most Merciful and Benevolent Monks,

I come with utter supplication and freely admit my utter ignorance and sloth. I plead with you to have extreme mercy and patience so as to accept my most basic request.

I have a text file called funtime.dat that contains comma-separated values. Here is a small snippet:

IM,BEN01,D,0
IM,BEN02,D,0
IM,BEN03, ,0
IM,BEN04, ,0
IM,BEN05, ,0
IM,BEN06,C,0
IM,BEN07, ,0
IM,BEN08, ,0
IM,BEN09, ,0
IM,BEN10,D,0

My Perl code is below; it works and produces the output that follows:

#!/usr/bin/perl
open(MYINPUTFILE, "<funtime.dat");
while(<MYINPUTFILE>)
{
    my($line) = $_;
    chomp($line);
    @data = split(/,/, $line);
    if ((@data[2] eq "D")||(@data[2] eq "C"))
    {
        printf ("\n type ");
        print @data[2]." ";
        print @data[1].".txt ";
    }
    else
    {
        print @data[1].".txt ";
    }
}
close(MYINPUTFILE);

My output is

type D BEN01.txt
type D BEN02.txt BEN03.txt BEN04.txt BEN05.txt
type C BEN06.txt BEN07.txt BEN08.txt BEN09.txt
type D BEN10.txt

Actually what I want is it to concatenate the files into the first file name:

type D BEN01.txt > BEN01.txt
type D BEN02.txt BEN03.txt BEN04.txt BEN05.txt > BEN02.txt
type C BEN06.txt BEN07.txt BEN08.txt BEN09.txt > BEN06.txt
type D BEN10.txt > BEN10.txt

Please disregard the C and D, as I use those for debugging purposes. Obviously I would then redirect the output of the Perl program into a .BAT file, which I would execute in MS-DOS.

I believe I should be using some global values instead of local values or something along those lines. That is, when the flag on a line is a "D" or a "C", the text file name gets stored as a value until another "D" or "C" is read; the program then prints out "> textfile.txt", makes a newline, prints out "type", and so on.

If this is too difficult, how about redirecting to the last file listed before the next line that has a "D" or "C", as in the following example:

type D BEN01.txt > BEN01.txt
type D BEN02.txt BEN03.txt BEN04.txt BEN05.txt > BEN05.txt
type C BEN06.txt BEN07.txt BEN08.txt BEN09.txt > BEN09.txt
type D BEN10.txt > BEN10.txt
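
A minimal sketch of this "last file" alternative, offered under some assumptions: the helper names commands_to_last_file and finish_group are hypothetical, the C/D flag is left out of the command (since it is only for debugging), and any lines that appear before the first D or C are treated as a group of their own.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one "type ... > LAST.txt" command per group,
# where each group is redirected to the last file name it contains.
sub commands_to_last_file {
    my @lines = @_;
    my (@commands, @group);

    for my $line (@lines) {
        chomp $line;
        my @data = split /,/, $line;
        next if @data < 3;    # assumption: skip blank or malformed lines

        # A D or C flag starts a new group, so flush the previous one first.
        if ($data[2] eq 'D' || $data[2] eq 'C') {
            push @commands, finish_group(@group) if @group;
            @group = ();
        }
        push @group, "$data[1].txt";
    }

    # Flush the final group once the input is exhausted.
    push @commands, finish_group(@group) if @group;
    return @commands;
}

# Hypothetical helper: formats one group as a single command line.
sub finish_group {
    my @files = @_;
    return "type @files > $files[-1]";
}

my @sample = (
    "IM,BEN01,D,0\n", "IM,BEN02,D,0\n", "IM,BEN03, ,0\n", "IM,BEN04, ,0\n",
);
print "$_\n" for commands_to_last_file(@sample);
```

Because a group's last file is only known once the next D or C line (or the end of input) arrives, the sketch buffers the file names of the current group instead of printing them as it goes.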

You might ask why don't I just concatenate the files in Perl. The thing is I am getting the .dat file and therefore it is an extra step. These .dat files can actually be 1,000,000+ lines so that is why I am using Perl to automate this process.

Thank you so much for your assistance.

Replies are listed 'Best First'.
Re: Global vs. Local Variables Assistance Requested
by ikegami (Patriarch) on Dec 04, 2008 at 22:35 UTC
    Like this?
    #!/usr/bin/perl

    my $head;

    open(my $fh, '<', "funtime.dat")
        or die $!;

    while(my $line = <$fh>) {
        chomp($line);
        my @data = split(/,/, $line);
        if (($data[2] eq "D")||($data[2] eq "C")) {
            print(" > $head.txt\n") if defined($head);
            print("type $data[2]");
            $head = $data[1];
        }
        print(" $data[1].txt");
    }

    print(" > $head.txt\n") if defined($head);
      Dear ikegami,

      That was perfect. Thank you very much!

      How quick is the wonderful PerlMonks in their righteousness!

Re: Global vs. Local Variables Assistance Requested
by GrandFather (Saint) on Dec 04, 2008 at 22:53 UTC

    Yes, you need a variable that is global to the while loop because you need to preserve some state between iterations of the loop. Consider:

    use strict;
    use warnings;

    my $inFileStr = <<STR;
    IM,BEN01,D,0
    IM,BEN02,D,0
    IM,BEN03, ,0
    IM,BEN04, ,0
    IM,BEN05, ,0
    IM,BEN06,C,0
    IM,BEN07, ,0
    IM,BEN08, ,0
    IM,BEN09, ,0
    IM,BEN10,D,0
    STR

    my $first;

    open my $inFile, '<', \$inFileStr or die "Can't open input file: $!\n";

    while (<$inFile>) {
        chomp;
        my @data = split /,/;

        if ($data[2] =~ /^(D|C)$/) {
            print " > $first.txt\n" if defined $first;
            print "type $data[2] ";
            $first = $data[1];
        }

        print "$data[1].txt ";
    }

    close $inFile;
    print " > $first.txt\n" if defined $first;

    Prints:

    type D BEN01.txt > BEN01.txt
    type D BEN02.txt BEN03.txt BEN04.txt BEN05.txt > BEN02.txt
    type C BEN06.txt BEN07.txt BEN08.txt BEN09.txt > BEN06.txt
    type D BEN10.txt > BEN10.txt

    There are a few things that I've tidied up that you should take note of. First off, always use strictures (use strict; use warnings;). Use the three-parameter version of open and check the result (using die is pretty standard for that). Use lexical variables (my $inFile) for file handles. printf is not print; do not confuse them or you will likely have an unhappy life. Regular expressions are clearer (when you get used to them) for non-trivial string matching (see perlretut and perlre).
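
The points above can be illustrated with a short sketch. It makes a few assumptions: the input is read from an in-memory string so that it runs without funtime.dat, and the variable names ($input, @names, $report) are made up for the example.

```perl
use strict;
use warnings;

# Three-argument open with a lexical file handle, checking the result.
# Assumption: an in-memory string stands in for funtime.dat.
my $input = "IM,BEN01,D,0\nIM,BEN02, ,0\n";
open my $in, '<', \$input or die "Can't open input: $!\n";

my @names;
while (my $line = <$in>) {
    chomp $line;
    my @data = split /,/, $line;
    push @names, $data[1];
}
close $in;

# print emits its arguments as they are.
print "files: @names\n";

# printf and sprintf treat their first argument as a format string, so a
# literal percent sign has to be written as %%.
my $report = sprintf("%d files, 100%% read", scalar @names);
print "$report\n";
```

The practical difference is that a stray % in data passed to printf as its first argument is interpreted as a format directive, while print never interprets its arguments.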


    Perl's payment curve coincides with its learning curve.
      Grandfather,

      Please accept my sincere gratitude for your very kind advice and assistance.

      I will be more careful and mindful in the future.

Re: Global vs. Local Variables: try using more memory
by Narveson (Chaplain) on Dec 05, 2008 at 06:28 UTC

    Consider making slightly more use of your program's memory.

    my %command;
    my $destination;

    while(<DATA>) {
        # assign split results to named scalars
        my ($ignored, $file, $flag) = split /,/;

        if ($flag =~ /^[CD]$/) {
            $destination = $file;
            $command{$destination} = "type $flag";
        }
        $command{$destination} .= " $file.txt";
    }

    for my $destination (sort keys %command) {
        # Look, only one print statement!
        print "$command{$destination} > $destination.txt\n";
    }

    __DATA__
    IM,BEN01,D,0
    IM,BEN02,D,0
    IM,BEN03, ,0
    IM,BEN04, ,0
    IM,BEN05, ,0
    IM,BEN06,C,0
    IM,BEN07, ,0
    IM,BEN08, ,0
    IM,BEN09, ,0
    IM,BEN10,D,0
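
One possible variation on the hash approach, sketched under assumptions: sort keys emits the commands in alphabetical order, which matches the input order for the sample data but might not for other file names. The hypothetical helper build_commands below records the order in which destinations first appear, and assumes each destination name occurs only once as a C/D line.

```perl
use strict;
use warnings;

# Hypothetical helper: same idea as the hash approach, but the commands come
# back in input order instead of sorted order.
sub build_commands {
    my @lines = @_;
    my (%command, @order, $destination);

    for my $line (@lines) {
        my (undef, $file, $flag) = split /,/, $line;
        next unless defined $flag;    # assumption: skip malformed lines

        if ($flag =~ /^[CD]$/) {
            $destination = $file;
            push @order, $destination;    # remember first-seen order
            $command{$destination} = "type $flag";
        }

        # Assumption: lines before the first C/D have no destination; skip them.
        next unless defined $destination;
        $command{$destination} .= " $file.txt";
    }

    return map { "$command{$_} > $_.txt" } @order;
}

# File names chosen so that sorted order would differ from input order.
print "$_\n" for build_commands(
    "IM,ZED01,D,0\n", "IM,ZED02, ,0\n", "IM,ABC01,C,0\n",
);
```

The extra array costs one entry per group, which stays small compared with the command strings already held in the hash.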