Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I want to split a long file and put each section produced into an array. The problem is that the file starts with the pattern I am using to split it, so the first element of the resulting array is empty. Is there any way to do the split without getting this empty element? Thanks.

Re: split post match
by cLive ;-) (Prior) on Jun 21, 2001 at 02:31 UTC
    There are several. Here are a couple off the top of my head...
    # 1
    my @array = split /pattern/, $file_content;
    shift @array;

    # 2
    my ($dummy_var, @array) = split /pattern/, $file_content;
    cLive ;-)
      Just a thought: instead of using a dummy variable, you would be better off saying
      my (undef,@array) = split /pattern/, $file_content;


      -Lee

      "To be civilized is to deny one's nature."
(Ovid) Re: split post match
by Ovid (Cardinal) on Jun 21, 2001 at 02:40 UTC
    Use grep to keep only the elements that are not empty:
    use strict;
    use Data::Dumper;

    my $string = '=123=123=435';
    my @array  = grep { $_ ne '' } split '=', $string;
    print Dumper \@array;

    You may need to adjust the grep for your needs.

    Oh, and in case anyone considers the golfish grep{$_}, remember that this disallows all false values, such as 0 and 0.0.

    I am wondering about the first two answers to your post. They match your problem spec more closely than my solution does. However, mine appears more robust (what happens if the data doesn't start with the pattern?). Bad data is always to be expected. I think my solution would be slower, but is the robustness enough to offset that? It depends on the speed and on your needs.

    Cheers,
    Ovid

    Update:

    Oops. Abigail is right: '0.0' is true. Good point about the robustness, too. Ugh.

    Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

      Robustness is in the eye of the beholder. While you don't throw away the first element if the file doesn't start with the pattern, you do throw away information if the pattern appears twice in succession. And while it's given that the file starts with the pattern, it isn't given that the pattern cannot appear twice in succession. Hence your claim of "more robust" is at best dubious.
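
      To make the trade-off concrete, here is a rough sketch comparing the two approaches. The sample strings and the '=' delimiter are made up for illustration (borrowed from the example above); substitute your own pattern and data.

      ```perl
      use strict;
      use warnings;

      # Hypothetical sample inputs; '=' stands in for the real pattern.
      my %input = (
          leading => '=a=b',     # starts with the delimiter, as in the question
          bare    => 'a=b',      # does not start with the delimiter
          doubled => '=a==b',    # delimiter twice in succession
      );

      my (%shifted, %grepped);
      for my $name (keys %input) {
          # Approach 1: split, then drop the first field unconditionally.
          my @s = split /=/, $input{$name};
          shift @s;
          $shifted{$name} = \@s;

          # Approach 2: split, then drop every empty field.
          $grepped{$name} = [ grep { $_ ne '' } split /=/, $input{$name} ];
      }

      # Print both results side by side for each input.
      for my $name (sort keys %input) {
          printf "%-8s shift: (%s)  grep: (%s)\n", $name,
              join(',', map { "'$_'" } @{ $shifted{$name} }),
              join(',', map { "'$_'" } @{ $grepped{$name} });
      }
      ```

      With shift, the "bare" input loses its first real element; with grep, the "doubled" input loses the empty element that sat between the two delimiters.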

      About your remark on grep {$_}, note that the result of split is an array of strings, and "0.0" is true, so the grep {$_} will not filter out the 0.0.
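
      A quick way to check this for yourself is a small sketch like the following (the input string is invented for the demonstration). Among strings, only '' and '0' are false in Perl.

      ```perl
      use strict;
      use warnings;

      # split always returns strings: ('', '0', '0.0', '00', 'abc')
      my @fields = split /=/, '=0=0.0=00=abc';

      my @truthy   = grep { $_ }       @fields;   # drops '' and '0'
      my @nonempty = grep { $_ ne '' } @fields;   # drops '' only

      print 'grep {$_}:       ', "@truthy\n";
      print 'grep {$_ ne ""}: ', "@nonempty\n";
      ```

      So the golfed form keeps "0.0" but does lose a field that is exactly "0", while the explicit comparison against the empty string keeps both.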

      -- Abigail

Re: split post match
by Aighearach (Initiate) on Jun 21, 2001 at 02:37 UTC

    Perhaps if you post some code it would be easier to see what you are doing wrong.

    I am guessing the correct version might look something like this:

    #!/usr/bin/perl
    use strict;
    use warnings;

    open( my $file, "/tmp/bighurkinfile" ) or die "failed to open bighurkinfile: $!";
    chomp( my $pattern = <$file> ); # assumes pattern on first line by itself

    my @records;
    {
        local $/ = undef;
        @records = split /$pattern/, <$file>; # probably should do sanity check on the pattern first...
    }

    Is it anything like this?
    --
    Snazzy tagline here