Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I wondered if anyone could help. All I'm trying to do is read in a file of data (containing 6 'paragraphs' of tab-delimited text). I want to split the text so that each 'paragraph' is an element of an array, using
my @file = split (/\n/, $line);
or something similar. My problem is that the input file has lots of whitespace in it and can't be split that way.
The file looks like this:
1 2 3 1 1 6 4 8 5 6 9 0 8 89 5 0 0 8 7 8 4 6 6 3 79 0 588 7 9 4 3 9 2 9 9 23 8 0 2 8 98 0 9 7 8 0 0
As you can see, the irregular spacing makes it hard to split. Can anyone suggest a way of splitting this data into the elements of an array? Thanks.

Replies are listed 'Best First'.
Re: weird files
by BrowserUk (Patriarch) on Nov 29, 2002 at 09:22 UTC

    Take it in steps.

    #! perl -slw
    use strict;

    my $data = do { local $/; <DATA>; };     #! slurp the file into a scalar.
    $data =~ s/([^\n])(\n)([^\n])/$1$3/g;    #! join the paras into lines
    $data =~ s/\n+/\n/g;                     #! reduce multiple \n's to 1
    my @data = split/\n/, $data;             #! and split.

    print for @data;

    __DATA__
    1 2 3 1 1 6 4 8 5 6 9 0 8 89 5 0 0 8 7 8 4 6 6 3 79 0 588 7 9 4 3 9 2 9 9 23 8 0 2 8 98 0 9 7 8 0 0

    Gives

    C:\test>216446
    1 2 3 11 6 4 8 5 6 9 08 89 5 00 8 7 8 4 66 3 79 0 588 7 9 43 9 2 9 9 23 8 02 8 98 0 9 7 8 00
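Note that in the output above, the last field of one line and the first field of the next have run together (for example `1` and `1` became `11`), because the substitution deletes the newline without leaving a separator. A minimal sketch of a variant that keeps the fields apart, assuming a tab is an acceptable separator and using made-up sample data rather than the poster's file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input: two paragraphs separated by a blank line.
my $data = "1 2 3 1\n1 6 4 8\n\n8 89 5 0\n0 8 7 8\n";

# Replace a lone newline inside a paragraph with a tab, so the last
# field of one line and the first field of the next stay separate.
# Lookarounds are used so the neighbouring characters are not consumed.
$data =~ s/(?<=[^\n])\n(?=[^\n])/\t/g;

# Split on runs of newlines: one array element per paragraph.
my @paras = split /\n+/, $data;

print "$_\n" for @paras;
```

The lookbehind and lookahead also let the pattern match around one-character lines, which a consuming pattern can skip.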

    Okay you lot, get your wings on the left, halos on the right. It's one size fits all, and "No!", you can't have a different color.
    Pick up your cloud down the end and "Yes" if you get allocated a grey one they are a bit damp under foot, but someone has to get them.
    Get used to the wings fast cos it's an 8 hour day...unless the Governor calls for a cyclone or hurricane, in which case 16 hour shifts are mandatory.
    Just be grateful that you arrived just as the tornado season finished. Them buggers are real work.

      Thanks very much. I spent all day yesterday trying to figure this one out! Much appreciated ;-)
Re: weird files
by jmcnamara (Monsignor) on Nov 29, 2002 at 09:24 UTC

    You could use the -00 command line switch (see -0 in perlrun) to read the file in paragraph mode:

    #!/usr/bin/perl -w -00
    use strict;

    open TABFILE, "tabfile" or die "Error message here: $!";
    my @array = <TABFILE>;

    print scalar @array; # prints 6 for the above file

    You can get the same effect from within a program by setting $/ = "":

    #!/usr/bin/perl -w
    use strict;

    open TABFILE, "tabfile" or die "Error message here: $!";

    my @array;    # declared outside the block so it is still in scope for the print
    {
        local $/ = "";
        @array = <TABFILE>;
    }

    print scalar @array;

    See $/ in perlvar for an explanation of this.
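For anyone who wants to try paragraph mode without a file on disk, here is a minimal self-contained sketch. The sample text, the in-memory filehandle and the variable names are assumptions made for illustration; a real script would open "tabfile" as above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input with an irregular number of blank lines
# between paragraphs.
my $text = "1\t2\t3\n4\t5\n\n\n6\t7\n\n8\t9\n0\n";

# Open the string as an in-memory file.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @paras;
{
    local $/ = "";      # paragraph mode, restored when the block exits
    @paras = <$fh>;
    chomp @paras;       # in paragraph mode chomp strips all trailing newlines
}
close $fh;

printf "%d paragraphs\n", scalar @paras;
```

Keeping the chomp inside the block matters, because chomp removes whatever $/ is currently set to.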

    --
    John.

Re: weird files
by danger (Priest) on Nov 29, 2002 at 09:17 UTC

    Set the $/ variable to "paragraph" mode and then just read the file to your array:

    $/ = "";
    my @file = <FILE>;
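One caveat: paragraph mode only treats truly empty lines as separators. If the "blank" lines in the file contain stray tabs or spaces (an assumption about the poster's file, which the thread does not confirm), the paragraphs will not be separated. A hedged sketch of an alternative that reads everything and splits on blank lines that may contain whitespace, using made-up sample data:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input: the first "blank" line holds a space and a
# tab, the second is truly empty. A real script would slurp the file.
my $text = "1\t2\n3\t4\n \t\n5\t6\n\n7\t8\n";

# Drop trailing whitespace so the last paragraph has no dangling newline.
$text =~ s/\s+\z//;

# A separator is a newline followed by one or more lines that contain
# nothing but spaces or tabs. The grep discards an empty leading field.
my @paras = grep { /\S/ } split /\n(?:[ \t]*\n)+/, $text;

print scalar(@paras), " paragraphs\n";
```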
Re: weird files
by gjb (Vicar) on Nov 29, 2002 at 09:17 UTC

    If I get your question right, the following code should do what you want. I'm sure others will suggest more concise solutions though ;-)

    #!perl
    use strict;
    use warnings;

    my @data;
    my @buffer;
    while (<DATA>) {
        chomp($_);
        s/^\s*(.+?)\s*$/$1/;
        if (/^\s*$/ && scalar(@buffer) > 0) {
            push(@data, [@buffer]);
            @buffer = ();
        } else {
            push(@buffer, split(/\s+/, $_));
        }
    }
    if (scalar(@buffer) > 0) {
        push(@data, [@buffer]);
        @buffer = ();
    }

    foreach my $array (@data) {
        print join(", ", @$array), "\n";
    }

    __DATA__
    1 2 3 1 1 6 4 8 5 6 9 0 8 89 5 0 0 8 7 8 4 6 6 3 79 0 588 7 9 4 3 9 2 9 9 23 8 0 2 8 98 0 9 7 8 0 0

    Hope this helps, -gjb-
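A more concise way to build the same array of arrays is to combine paragraph mode with split. This is only a sketch under assumptions: the sample text and the in-memory filehandle are made up for illustration, and it relies on the blank lines being truly empty.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input: two paragraphs with irregular spacing.
my $text = "1 2 3 1\n1 6\t4\n\n8  89 5\n";

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @data;
{
    local $/ = "";                          # paragraph mode
    # split ' ' splits $_ on any run of whitespace (including newlines)
    # and ignores leading whitespace, so each paragraph becomes one
    # flat list of fields.
    @data = map { [ split ' ' ] } <$fh>;
}
close $fh;

print join(", ", @$_), "\n" for @data;
```

Each element of @data is a reference to an array holding the fields of one paragraph, as in the longer version above.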

Re: weird files
by Chief of Chaos (Friar) on Nov 29, 2002 at 09:20 UTC
    Hi,
    maybe you could do it like this:
    while (<INFILE>) {
        $line = $_;
        chomp($line);
        next if ($line =~ /^\s*$/);
        push @file, split(/[\t\s]+/, $line);
    }
    Changes: misunderstood the question; added the split.