in reply to extract and export to multiple files

Using strictures is good.

Declaring all your variables in one block up front is very bad. Not checking file open is bad. Not using the three parameter version of open is bad. Not using lexical file handles is bad.

Avoid using explicit counters where you don't need to.

Declare variables in the smallest possible scope. Be aware of the last parameter to split and use it where appropriate. Taking all the above into account the following should serve you better:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $outdir = 'D:/share/out';
    my $inpath = "D:/share/in/content.txt";

    open my $inFile, '<', $inpath or die "Failed to open $inpath: $!";

    while (<$inFile>) {
        chomp;
        my ($title, $content) = split "\t", $_, 2;
        my $outpath = "$outdir/$title$.";

        open my $outFile, '>', $outpath or die "Can't create $outpath: $!";
        print $outFile $content;
        close $outFile;
    }

    close $inFile;

Note in particular the use of the input line number special variable $. (in "$outdir/$title$.") instead of using an explicit counter.
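
For anyone who wants to see the split limit and $. in isolation, here is a minimal sketch. The sample data, the in-memory file handle and the helper name split_record are made up for illustration and are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a record into a title and the rest of the line.
# The limit of 2 means "return at most two fields", so any further tabs
# stay inside the second field.
sub split_record {
    my ($line) = @_;
    return split "\t", $line, 2;
}

my ($title, $content) = split_record("notes.txt\tfirst\tsecond");
print "title:   $title\n";
print "content: $content\n";    # still contains the second tab

# Without a limit the same line is broken into three fields.
my @fields = split "\t", "notes.txt\tfirst\tsecond";
print scalar(@fields), " fields without a limit\n";

# $. holds the number of the line most recently read from a file handle.
# An in-memory file handle stands in for a real input file here.
my $data = "a\tone\nb\ttwo\n";
open my $in, '<', \$data or die "Can't open in-memory file: $!";
while (my $line = <$in>) {
    chomp $line;
    my ($name) = split "\t", $line, 2;
    print "line $.: $name\n";
}
close $in;
```

The limit matters whenever the content part may itself contain tabs: without it everything after the second tab would be silently dropped by the two-variable assignment.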

Update: Fix silly open error - thanks jwkrahn

True laziness is hard work

Re^2: extract and export to multiple files
by jwkrahn (Abbot) on Apr 09, 2010 at 06:16 UTC
    open my $inFile, '>', $inpath or die "Failed to open $inpath: $!";

    Your $inpath is being opened for output.

    open my $inFile, '<', $inpath or die "Failed to open $inpath: $!";
Re^2: extract and export to multiple files
by zzgulu (Novice) on Apr 09, 2010 at 13:53 UTC
    Thank you so much for your great help. So while the output of the split remains in the default variable, $_, what does the 2 do? Also, how can I manipulate the $. variable so that the sequential numbers are added to the $title and not to the file extension? I am getting file.txt1 instead of file1.txt. I tried ".txt" after ".$" but obviously that didn't work. Thanks again for your explanations and help.

      I strongly recommend that you follow the links (did I make them too subtle?) and read the documentation. In fact spending a few hours browsing the documentation that came with Perl and a few more hours reading some of the Perl tutorial and reference books ("Learning Perl" and "Programming Perl" at least) will save you days of beating your head against the desk if you intend to use Perl regularly. There are some good recommendations in the So what is your Perl book "Trilogy" anyway? thread. Visiting the Tutorials section is well worth doing too.

      I followed the pattern you gave (">$filename"."$title"."$count") for forming the file name. The sequence number following the file name is what you would have got in your original code. If you want to put it before the last '.' then you need to edit the file name string. Something like the following in place of the original $outpath assignment ought to do the trick:

      my $outpath = "$outdir/$title";
      $outpath =~ s/(\.[^.]*|)$/$.$1/;

      You will definitely want to read some of the documentation for regular expressions: perlrequick and perlretut at least (there is more).
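
To see what that substitution does with a few different names, here is a small sketch. The helper name numbered_name and the sample file names are invented for the example, and an ordinary variable stands in for $. so it can be run without an input file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: insert a sequence number before the final extension,
# or append it when the name has no extension at all.
sub numbered_name {
    my ($name, $number) = @_;

    # Capture either the last ".ext" or an empty string at the end of the
    # name, then put the number in front of whatever was captured.
    (my $result = $name) =~ s/(\.[^.]*|)$/$number$1/;
    return $result;
}

for my $case (['file.txt', 1], ['report', 2], ['archive.tar.gz', 3]) {
    my ($name, $number) = @$case;
    print "$name -> ", numbered_name($name, $number), "\n";
}
```

One thing to watch: [^.]* also matches a '/', so a path with a dot in a directory name and no extension on the file itself (say "my.dir/file") would get the number inserted into the directory part. Applying the substitution to the title alone, before adding the directory, avoids that.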

      True laziness is hard work
        Thank you so much. Actually I started reading the Llama book, although I am not a frequent user of Perl. However, I anticipate I will be using regular expressions a lot in the future. Speaking of regular expressions, how can I match an ampersand in a character set? I was trying to match "DESIGN & PLAN:" or "DESIGN/PLAN:" or "DESIGN PLAN" with [A-Z &\/]+[:|;] but this does not match the first variation. I have thousands of "titles" that are all in upper case and may consist of one or more words that may end in ":" or ";". It seems that matching special characters in a character set is different: basically everything except "-" (as in A-Z) is treated as a non-special character, but for some reason the "&" in my regex does not match. Thanks again for all the hints.