shilpam has asked for the wisdom of the Perl Monks concerning the following question:

I have a script which reads a tab delimited text file. The text file looks something like:
11811   NAME1   AGE1   LOCATION1   12156   NAME2   AGE2   LOCATION2   17899   NAME3   AGE3   LOCATION3 ... and so on
I don't have a problem reading the file; in fact, I can successfully read it and store it in an array too. My code is:
open(DAT,"C:\\StudyPerl\\Projects\\tab_file.txt") || die "Could not open the file";
my @line;
my @array1;
while(<DAT>){
    @line = split(/\t/,$_);
    $array1[0]=$line[0];
    $array1[1]=$line[1];
    $array1[2]=$line[2];
    $array1[3]=$line[3];
}
foreach $1 (@array1){
    print "The element is: $1\n";
}
My problem is: I want the script to start a new record after every 4 fields.
I mean I want to store: 11811, NAME1, AGE1 & LOCATION1 in array1 and
12156, NAME2, AGE2 & LOCATION2 in array2 and so on.
Also, some fields can be blank, and I need to handle those too. Basically, whenever a field consists of blank space(s), the array should store it as it is.
Can someone help me with that?

Re: Perl script to read a tab delimited text file
by maa (Pilgrim) on Apr 01, 2004 at 11:54 UTC

    I don't see what your problem is... if the records are "fixed width" (i.e. there are 8 fields per line) then just check whether $line[4] is defined, since it should be the ID of the 2nd part of the record... then save that record however you were going to save the previous one.

    You should note, however, that your code is only ever going to save the last record you read, as you have hard-coded the array indices in your while loop. :-)

    Why don't you consider using a hash where the key is the numeric ID and the content is an anon array?

    while(<DAT>){
        @line = split(/\t/,$_);
        $myhash{$line[0]} = [$line[1],$line[2],$line[3]];
    }

    And have you actually tried your split on tab separated nothingness?

    HTH - Mark
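
A minimal sketch of the hash idea above, extended to lines that carry more than one four-field record. The sample line and the names %record_for and @fields are made up for illustration. It also touches on the "tab separated nothingness" question: without a limit of -1, split silently drops trailing empty fields.

```perl
use strict;
use warnings;

# Hypothetical sample line: the second record has an empty AGE and an
# empty trailing LOCATION field.
my $line = "11811\tNAME1\tAGE1\tLOCATION1\t12156\tNAME2\t\t\n";
chomp $line;

# A limit of -1 makes split keep trailing empty fields.
my @fields = split /\t/, $line, -1;

# Walk the fields four at a time: the first is the ID (hash key),
# the next three are stored as an anonymous array.
my %record_for;
for (my $i = 0; $i + 3 <= $#fields; $i += 4) {
    $record_for{ $fields[$i] } = [ @fields[ $i + 1 .. $i + 3 ] ];
}

for my $id (sort keys %record_for) {
    print "$id: ", join('|', @{ $record_for{$id} }), "\n";
}
```

One design caveat: a hash keyed by ID keeps only the last record for any ID that appears twice, and it does not preserve file order.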

Re: Perl script to read a tab delimited text file
by Happy-the-monk (Canon) on Apr 01, 2004 at 11:55 UTC

    I think you might be looking for something similar to this:

    while ( <DAT> ) {
        my @line = split( /\t/, $_ );
        while ( @line ) {
            my @four_records = splice( @line, 0, 4 );   # think "4 x shift( @line )"
            push @array_of_arrays, [ @four_records ];   # make this an array containing an array of four elements in each array slot.
        }
    }

    Cheerio, Sören
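
For anyone who wants to try the splice approach without the original file, here is a self-contained sketch. The in-memory @lines array stands in for the DAT filehandle and its contents are invented; the chomp and the -1 limit on split are additions, made on the assumption that blank and trailing empty fields should be stored as they are.

```perl
use strict;
use warnings;

# Hypothetical input standing in for the lines read from DAT.
# The third record has a single space as its AGE field.
my @lines = (
    "11811\tNAME1\tAGE1\tLOCATION1\t12156\tNAME2\tAGE2\tLOCATION2\n",
    "17899\tNAME3\t \tLOCATION3\n",
);

my @array_of_arrays;
for my $line (@lines) {
    chomp(my $copy = $line);               # drop the newline, keep the original intact
    my @fields = split /\t/, $copy, -1;    # -1 keeps trailing empty fields
    while (@fields) {
        # Remove the first four fields and store them as one record.
        push @array_of_arrays, [ splice @fields, 0, 4 ];
    }
}

for my $i (0 .. $#array_of_arrays) {
    print "Record $i: ", join(', ', map { "[$_]" } @{ $array_of_arrays[$i] }), "\n";
}
```

If a line holds a number of fields that is not a multiple of four, the last record of that line simply comes out shorter; whether that should be an error is left open here.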

      Browsing through the nearly-new uploads to CPAN, I came across Array::Each - iterate over one or more arrays, returning one or more.
      That's more of an FYI; I think using it in this context would just overcomplicate things. I feel it has its uses when things get much more intricate.

      Cheers, Sören

Re: Perl script to read a tab delimited text file
by dragonchild (Archbishop) on Apr 01, 2004 at 12:54 UTC
    When reading xSV data (x-Separated Values) where x is any single character, it is generally unwise to handle it naively using split. Much better is to use one of the many modules designed for this. The best is Text::xSV, by our own tilly.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose
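
A small illustration of why a plain split can go wrong, assuming the data may contain double-quoted fields with the separator inside them (the thread's sample data shows no such fields, so the record below is invented). The quote-aware half uses the core module Text::ParseWords purely as a stand-in; it is not Text::xSV and does not show that module's interface.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Hypothetical record whose second field is quoted and contains a tab.
my $line = qq{11811\t"Smith\tJohn"\t42\tLondon};

# Plain split cuts the quoted field in two.
my @naive = split /\t/, $line, -1;
print scalar(@naive), " fields from plain split\n";

# quotewords treats the quoted text as one field; the 0 asks it to
# strip the surrounding quotes.
my @aware = quotewords("\t", 0, $line);
print scalar(@aware), " fields from quotewords\n";
```

Text::ParseWords also treats single quotes and backslashes specially, which is one more reason to reach for a dedicated xSV module on real data.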

Re: Perl script to read a tab delimited text file
by borisz (Canon) on Apr 01, 2004 at 11:54 UTC
    my @array;
    while ( defined ( $_ = <DAT> ) ) {
        push @array, /\t*((?:[^\t]+\t+){3}(?:[^\t]+))/g;
    }
    Boris
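
A sketch of what the regex above yields, run on an invented sample line instead of DAT. Each match is one string of four tab-joined fields, so a further split is needed to get at the individual fields. Two things to be aware of: without a chomp the newline ends up inside the last field, and because every field must match [^\t]+, this pattern assumes no field is empty.

```perl
use strict;
use warnings;

# Hypothetical sample line with two complete records.
my $line = "11811\tNAME1\tAGE1\tLOCATION1\t12156\tNAME2\tAGE2\tLOCATION2\n";
chomp $line;    # otherwise "\n" becomes part of the last field

# In list context with /g, the match returns every captured chunk.
my @chunks = $line =~ /\t*((?:[^\t]+\t+){3}(?:[^\t]+))/g;

# Break each chunk into its four fields.
my @records = map { [ split /\t/ ] } @chunks;

for my $rec (@records) {
    print join(' | ', @$rec), "\n";
}
```
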
Re: Perl script to read a tab delimited text file
by Crian (Curate) on Apr 01, 2004 at 14:16 UTC
    Just a general hint: I think it's a good idea to print the reason ($!) when opening a file (or any other file operation) fails.

    It gives the user a hint about what is going wrong.

    Also it's helpful to name the file...

    If you use q(...) or '...', you don't need to double the backslash.

    All three tips together result in something like

    my $file = 'C:\StudyPerl\Projects\tab_file.txt';
    open(DAT, $file) or die "Could not open the file '$file': $!";
    hth
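
To see what such a message looks like, here is a small sketch that builds it for a path expected not to exist. The path is made up, and the three-argument open with a lexical filehandle is my own substitution for the bareword DAT form used in this thread; the message is stored rather than passed to die so the script can print it and carry on.

```perl
use strict;
use warnings;

# Hypothetical path, chosen so that the open is expected to fail.
my $file = 'no_such_dir/tab_file.txt';

my $message = '';
if (open my $dat, '<', $file) {
    close $dat;
}
else {
    # $! holds the operating system's reason for the failure.
    $message = "Could not open the file '$file': $!";
    print "$message\n";
}
```

In a real script the else branch would usually just be `or die` with the same string.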