JPaul has asked for the wisdom of the Perl Monks concerning the following question:

Greetings all,
I'm writing a code parser of very simple proportions (thusly why I'm not using any CPAN modules). My solution works perfectly, but its very long winded and I'm sure there's a tidier (read: Short but preferably not obfu) method of doing it.
The code is separated into sections, each headered with something like ";section1". The object of the code is to place each section (minus its header) into individual arrays. Thusly:
my $section = 0; my @sectone; my @secttwo; while(<TEXTFILE>) { if (/;section1/) { $section = 1; } elsif (/;section2/) { $section = 2; } else { if ($section == 1) { push(@sectionone, $_); } elsif ($section == 2) { push(@sectiontwo, $_); } else { print "No section defined for: $_\n"; } } }
There has to be a nicer way! :)

JP,
-- Alexander Widdlemouse undid his bellybutton and his bum dropped off --

Replies are listed 'Best First'.
Re: Tidier and more efficient parsing code
by Fletch (Bishop) on Jan 24, 2002 at 08:58 UTC

    One of MJD's red flags: whenever you've got variables named $fooone and $footwo you probably want an array. Likewise here, you've got multiple arrays named with indicies which means you probably want a hash.

    my $cur_section = 0; my %sections; while( <TEXTFILE> ) { if( /^;section(\d+)/ ) { $cur_section = $1; next; } unless( $cur_section ) { warn "No section defined for `$_'\n"; next } push @{ $sections{ $cur_section } }, $_ }
      Technically, the arrays are called @predecl, @precode, @header, @main and @trailer.
      But I get the point :)

      JP,
      -- Alexander Widdlemouse undid his bellybutton and his bum dropped off --

(crazyinsomniac) Re: Tidier and more efficient parsing code
by crazyinsomniac (Prior) on Jan 24, 2002 at 09:05 UTC
    Incantation one:
    my $section = 0; my @sectone; my @secttwo; while(<TEXTFILE>) { if (/;section(1)/) { push(@sectionone, $_); } elsif (/;section(2)/) { push(@sectiontwo, $_); } else { print "No section defined for: $_\n"; } }
    Incantation two:
    my $section = 0; my @sectone; my @secttwo; while(<TEXTFILE>) { if (/;section([12])/) { push( $1 == 1? @sectionone : @sectiontwo, $_); } else { print "No section defined for: $_\n"; } }
    Incantation three:
    my $section = 0; my %sections = (1 =>[],2=>[]); while(<TEXTFILE>) { if (/;section([12])/) { push @{$sections{$1}},$_; } else { print "No section defined for: $_\n"; } }
    update: Incantation four (kinda hesitated on this one):
    my $section = 0; my %sections = (1 =>[],2=>[]); while(<TEXTFILE>) { push @{$sections{$1}},$_ and next if /;section([12])/; warn "No section defined for: $_\n"; # cause warn gives you $. in <TEXTFILE> }
    incantation huh?:
    I dunno... I guess none of them work ... until now maybe

    Incantation two:

    my $section = 0; my @sectone; my @secttwo; while(<TEXTFILE>) { if (/;section([12])/) { $section = ( $1 == 1 ? @sectionone : @sectiontwo); } else { push @{$section},$_ and next if $section; warn "No section defined for: $_\n"; } }
    Incantation three:
    my $section = 0; my %sections = (1 =>[],2=>[]); while(<TEXTFILE>) { if (/;section([12])/) { $section = $1 and next } push @{$sections{$section|| warn "No section defined for: $_\n" and next()}} ,$_; }
    Incantation four:
    my $section = 0; my %sections = (1 =>[],2=>[]); while(<TEXTFILE>) { $sections=$1 and next if /;section([12])/; warn "No section defined for: $_\n" unless $sections; push @{$sections{$sections}},$_ }
    Interesting aint isn't it ... no matter how relatively simple a task, it's easily fudged

    update:
    I lobster, but I never flounder

    my $section = 0; my %sections = (1 =>[],2=>[]); while(<TEXTFILE>) { push @{$sections{( /;section([12])/ and $section = $1 and next() ) or ( $section || warn "No section defined for: $_\n" and next() )}} ,$_; }
    ** note, I tested none of these, but they should work ;)

     
    ______crazyinsomniac_____________________________
    Of all the things I've lost, I miss my mind the most.
    perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"

      Im confused dude. Does that fourth one actually work?

      I just ran it and it seems not to... As far as I can tell the push will only happen when there is a ;section on the line.

      Sorry.

      Yves / DeMerphq
      --
      When to use Prototypes?

Re: Tidier and more efficient parsing code
by demerphq (Chancellor) on Jan 24, 2002 at 15:06 UTC
    Similar to crazyinsomniacs 4th version (Which im not so sure about... But thats a different matter)
    my $section = 0; my @sections=(undef,[],[]); while(<TEXTFILE>) { $section = $1 and next if /;section([12])/; die "Bad section $section" unless $section; push @{$sections[$section]},$_; }
    Update: The 'and' in the code was added. It used to be incorrectly '&&'

    Yves / DeMerphq
    --
    When to use Prototypes?

Re: Tidier and more efficient parsing code
by flocto (Pilgrim) on Jan 25, 2002 at 18:41 UTC
    Try this solution if it works for you :)
    my $data = []; my $section = 0; while (<TEXT>) { if (/^section(\d)/) { $section = $1; } else { next unless $section; push (@{$data->[($section - 1)]}, $_); } } #my @section_one = @{$data->[0]}; # if absolutely #my @section_two = @{$data->[1]}; # neccessary only

    Sure, there are easier ways to do this, but I thought it was quite easy to read like this..
    -octo-
Re: Tidier and more efficient parsing code
by lirm (Novice) on Jan 26, 2002 at 00:05 UTC
    Will this work?

    It's kinda verbose, but I couldn't figure out how to get rid
    of the first element during split.

    { undef $/; @temp = split /;/, <DATA>; # slurp file shift @temp; # Get rid of everything before first semicolon %sections = map { my($sect, @param) = split /[\n\r\f]+/;($sect, \@ +param) } @temp; } # Just to test it foreach (keys %sections) { print "$_ => [@{$sections{$_}}]\n"; } __DATA__ ;section1 foo bar ;section2 bar baz
      well, to get rid of the first element of split, change the block to look like so:

      { local $/; undef $/; (undef, @temp) = split /;/, <DATA>; # slurp file %sections = map { my($sect, @param) = split /[\n\r\f]+/;($sect, \@ +param) } @temp; }
      and don't forget to localize $/ !

      ~Particle