ginju75 has asked for the wisdom of the Perl Monks concerning the following question:

hello wise Monks,

I am trying to read lines from a file which looks like this:

Description:
yada
yada
yada
yada
Data:
yada
yada
yada
yada

I need to put the lines between "Description:" and "Data:" in one Array and the lines after "Data:" and before EOF into a Second Array.

I use regular expressions to do this, but this does not seem to be efficient enough since I have to read the file into a string and then split it up as in...

open(R, filename);
while(<R>) {
    $temp .= $_;
}
close(R);
if($temp =~ /Description:(.*)Data:(.*)/s) {
    $result1 = $1;
    push (@array1, $result1);
    $result2 = $2;
    push (@array2, $result2);
}

Instead I want to parse the file line by line and try to put the lines from the file into the arrays without first reading the whole file into a scalar...

Like maybe using some flags (someone suggested), but how?

Please note that the file will be in the format shown above, but new tags may be added, in which case the number of arrays to fill will increase.

Thank you all for your time and help...

A learning Monk...

Replies are listed 'Best First'.
Re: How to read lines from a file which is....
by Juerd (Abbot) on Jan 04, 2002 at 02:53 UTC
    Maybe references are neater:
    #!/usr/bin/perl -w
    use strict;

    my ($desc, $data, $ref);
    while (<DATA>){
        if (/^Description:$/){
            $ref = \$desc;
            next;
        }elsif (/^Data:$/){
            $ref = \$data;
            next;
        }
        $$ref .= $_;
    }
    print "Desc:\n$desc\n---\nData:\n$data\n";

    __DATA__
    Description:
    yada_d1
    yada_d1
    yada_d1
    yada_d1
    Data:
    yada_d2
    yada_d2
    yada_d2
    yada_d2


    This would probably be a good one for the .. flip-flop operator, but I have yet to learn how it works :)

    2;0 juerd@ouranos:~$ perl -e'undef christmas' Segmentation fault 2;139 juerd@ouranos:~$

      The flip flop operator isn't that hard to use. Not in its simple form anyway. ;) Its ideal use is soaking up data from between two identifiable strings, e.g.
      while(<>) {
          if(/string1/ .. /string2/) {
              print;
          }
      }
      This will print:
      string1
      .....
      string2
      Note that string1 and string2 remain in the output. shift and pop can help you get rid of them if you don't like them. (Or use splice).

      There are far nicer solutions to this problem than using a flip-flop operator, however for demonstrational purposes, this is how I'd solve this using the flip flop. Note that I've added in a third tag to make it more interesting.

      #!/usr/bin/perl -w
      use strict;

      my (@desc, @data, @new);

      while(<>) {
          chomp;
          if(($_ eq "Description:") .. ($_ eq "Data:")) {
              push @desc, $_;
              next unless $_ eq "Data:";
          }

          # if you have another tag
          if(($_ eq "Data:") .. ($_ eq "NewTag:")) {
              push @data, $_;
              next unless $_ eq "NewTag:";
          }

          # and so on until the last tag
          push @new, $_;
      }

      shift @desc;          # get rid of "Description:" from front.
      pop @desc if @data;   # get rid of "Data:" as last element.
      shift @data;          # get rid of "Data:" from front.
      pop @data if @new;    # get rid of "NewTag:" as last element.
      shift @new;           # get rid of "NewTag:" from front.

      print "desc: @desc\n";
      print "data: @data\n";
      print "new: @new\n";

      The reason this works (for those people who didn't do or hated electrical or computer engineering) is that the flip flop operator flips to true once the first condition is fulfilled and stays true until the second condition is true where it then flops to false.

      So you can have as much gunk above "Description:" as you like and the .. operator will not change. Once "Description:" is seen the .. operator becomes true and the if condition is therefore also true. Once "Data:" is seen the .. operator becomes false (but after the if condition has evaluated to true). If "Description:" should be seen again, then the .. operator would once again flip to true and the if condition would be active.
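
      To watch the operator flip and flop, here is a minimal sketch that records whether the range was on for each line it sees. The helper name flip_flop_trace and the sample lines are made up for illustration; only the two markers come from this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# flip_flop_trace is a hypothetical helper, not part of the thread's code.
# It returns one [line, 0|1] pair per input line, where 1 means the
# "Description:" .. "Data:" range was on when that line was tested.
# Caveat (per perlop): each .. operator keeps its own state, even across
# calls to the sub that contains it, so the sample input below ends with
# the range closed.
sub flip_flop_trace {
    my @lines = @_;
    my @trace;
    for my $line (@lines) {
        # Scalar context makes .. the flip-flop rather than a list range.
        my $on = ($line eq 'Description:') .. ($line eq 'Data:');
        push @trace, [ $line, $on ? 1 : 0 ];
    }
    return @trace;
}

my @sample = ('gunk', 'Description:', 'yada', 'Data:', 'after');
for my $pair (flip_flop_trace(@sample)) {
    printf "%-14s %s\n", $pair->[0], $pair->[1] ? 'on' : 'off';
}
```

      Both marker lines show up as on, which is why the longer example above needs shift and pop to drop them again.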

      There are some interesting problems that come up with using the flip flop operator. For starters, if you had another condition in this list:

      if(($_ eq "NewTag:") .. ($_ eq "Somethingelse:")) { ... }
      then it would be possible for two conditions to be active at one time even though only the highest on the list would get any data. This could occur if your file looked like this:
      NewTag:          # sets the third flip flop to true
      ...
      Description:     # sets the first flip flop to true
      ...
      Somethingelse:   # is a data line for Description
      ....
      Data:            # now the first flip flop will be
      ....             # false and the second true
      NewTag:          # sets second flip flop to false
      .....
      .....            # these get given to the third
      .....            # option finally.
      This of course means that if your data file has missing sets, weird things will happen. In fact, if a necessary condition does not occur you'll either get too much data or not enough. For example, processing a log file with something like:
      if(/$date1/../$date2/) { }
      should never even be attempted unless you are 300% sure that both $date1 and $date2 will be in that log file and every single other log file you might ever deal with.

      I got bitten by that one.
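
      One way to guard against a missing end marker is to track the state by hand, so the code can report that the range never closed. This is only a sketch under assumptions: extract_range, START and END are hypothetical names, and unlike a real flip flop it stops after the first range instead of re-arming.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# extract_range is a hypothetical helper, not part of the thread's code.
# It collects the lines from $start through $end (inclusive), much like
#   if (/$start/ .. /$end/) { ... }
# would, but uses an explicit flag so the caller can tell whether the
# end marker was ever seen. It stops after the first complete range.
sub extract_range {
    my ($start, $end, @lines) = @_;
    my ($inside, $closed) = (0, 0);
    my @found;
    for my $line (@lines) {
        $inside = 1 if !$inside && $line eq $start;
        next unless $inside;
        push @found, $line;
        if ($line eq $end) {
            $closed = 1;
            last;
        }
    }
    return (\@found, $closed);
}

# A "log" whose end marker is missing: everything after START is swallowed.
my ($lines, $saw_end) = extract_range('START', 'END', 'a', 'START', 'b', 'c');
print "collected: @$lines\n";
print $saw_end ? "range closed\n" : "warning: END never seen\n";
```

      With the flag returned alongside the data, the caller can decide whether to trust what was collected rather than silently getting too much.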

      Jacinta

      ++ for your solution... for exact matching eq is faster than m// :)

      Greetz
      Beatnik
      ... Quidquid perl dictum sit, altum viditur.
Re: How to read lines from a file which is....
by dvergin (Monsignor) on Jan 04, 2002 at 04:55 UTC
    There are some wonderfully inventive solutions already posted. But I would offer something a bit simpler...
    #!/usr/bin/perl -w
    use strict;

    my (@desc, @data);
    while (<DATA>) {
        next if /Description:/;
        last if /Data:/;
        push @desc, $_;
    }
    @data = <DATA>;

    # DONE. Check results:
    print "desc array\n", @desc;
    print "data array\n", @data;

    __DATA__
    Description:
    yada_d1
    yada_d1
    yada_d1
    yada_d1
    Data:
    yada_d2
    yada_d2
    yada_d2
    yada_d2
    Update: Or, more concisely (but less readable at-a-glance)...
    my (@desc, @data);
    my $line = <DATA>;   # Throw away first line
    push @desc, $line while ($line = <DATA>) !~ /Data:/;
    @data = <DATA>;

    ------------------------------------------------------------
    "Perl is a mess and that's good because the
    problem space is also a mess.
    " - Larry Wall

      i like that! ++
Re: How to read lines from a file which is....
by Anonymous Monk on Jan 04, 2002 at 03:50 UTC
    Juerd's solution is a good one.
    But if you want an easier-to-understand
    solution, you could use good old-fashioned
    flag variables

    #!/usr/bin/perl -w
    use strict;

    my (@desc, @data);
    my $pushOnToDescFlag = 0;
    my $pushOnToDataFlag = 0;

    while (<DATA>){
        if (/^Description:$/){
            $pushOnToDescFlag = 1;
            next;
        }
        if (/^Data:$/){
            $pushOnToDescFlag = 0;
            $pushOnToDataFlag = 1;
            next;
        }
        if ( $pushOnToDescFlag == 1 ) {
            push( @desc,$_);
        }
        if ( $pushOnToDataFlag == 1 ) {
            push( @data,$_);
        }
    }

    foreach my $d (@desc) {
        print "Desc = $d";
    }
    foreach my $d (@data) {
        print "Data = $d";
    }

    __DATA__
    Description:
    yada_d1
    yada_d1
    yada_d1
    yada_d1
    Data:
    yada_d2
    yada_d2
    yada_d2
    yada_d2

    ... I believe to be a matter of style
    and what is more important to you
    Streamline code vs. readability
    IMHO

Re: How to read lines from a file which is....
by dmmiller2k (Chaplain) on Jan 04, 2002 at 03:45 UTC

    How about something like this:

    my @regex = qw( Description Data );   # regexen to match
    my %rhash = ();
    my $which = 0;                        # which regexp to look for

    while (<>) {
        /^$regex[$which]:/ && do {        # found one of them
            if ($which == 0) {            # if it was the first one ..
                # process Description and Data arrays (if any)
                process_arrays( @rhash{ @regex } );
                # reinitialize by associating a fresh anon array with each regex
                @rhash{ @regex } = map { [] } @regex;
            }
            $which = 0 if (++$which >= @regex);
            next;
        };
        # a plain line belongs to the tag matched most recently, i.e. the
        # one just before the regex we are now waiting for
        push @{ $rhash{ $regex[$which - 1] } }, $_;
    }

    # Since the last Data block won't be terminated with a
    # Description line, need to cleanup here
    process_arrays( @rhash{ @regex } );

    # ...

    sub process_arrays {
        my ( $arrayref1, $arrayref2 ) = @_;

        # return immediately unless both arrays are non-empty
        return unless $arrayref1 && @$arrayref1 && $arrayref2 && @$arrayref2;

        # ...
    }

    Update: Oops! my first version was plagued with Off-By-One bugs. This should work (untested).

    dmm

    You can give a man a fish and feed him for a day ...
    Or, you can
    teach him to fish and feed him for a lifetime
Re: How to read lines from a file which is....
by belg4mit (Prior) on Jan 04, 2002 at 05:05 UTC
    Sounds like a job for Inline::Files.

    --
    perl -pe "s/\b;([st])/'\1/mg"

Re: How to read lines from a file which is....
by seattlejohn (Deacon) on Jan 04, 2002 at 11:08 UTC
    You say that the number of tags might increase, so I would probably plan for that from the start and write something generalized like this:

    #!/usr/bin/perl -w
    use strict;

    die "Usage: save-into-arrays.pl input-file.txt\n" unless @ARGV == 1;

    my @description;   # where we'll store contents of Description section
    my @data;          # where we'll store contents of Data section
    my %sections = ('Description:' => \@description,
                    'Data:'        => \@data);

    open(INPUT_FILE, "<$ARGV[0]") or die "Unable to open $ARGV[0]: $!\n";

    my $array_ref;     # keeps reference to array currently in use
    while (my $line = <INPUT_FILE>) {
        chomp $line;
        if (defined($sections{$line})) {            # is this a section head?
            $array_ref = $sections{$line};          # yes, so point to the array for this section
        } else {
            push @$array_ref, $line if $array_ref;  # no, save this line in the active array
        }
    }
    close(INPUT_FILE);

    Basically what this does is create a hash that pairs each possible section head with an array where you'll store the data from that section. When a section head is encountered, $array_ref gets updated to point to the right array for subsequent lines. This way when you have new section tags, you can just create a new array and a new entry for it in the hash -- no need to alter the underlying logic with more if statements, etc.
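
    If the list of tags keeps growing, the same idea can be pushed one step further by letting the hash own the arrays, so adding a tag means adding one string. A rough sketch, assuming the lines are already chomped; parse_sections and the NewTag: tag are hypothetical names used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_sections is a hypothetical helper, not part of the thread's code.
# $tags is an array ref of section heads such as 'Description:'.
# Returns a hash ref mapping each section head to an array ref of the
# lines that followed it. Lines seen before any section head are dropped.
sub parse_sections {
    my ($tags, @lines) = @_;
    my %section = map { $_ => [] } @$tags;   # one fresh array per tag
    my $current;                             # array currently being filled
    for my $line (@lines) {
        if (exists $section{$line}) {
            $current = $section{$line};      # section head: switch arrays
        } elsif ($current) {
            push @$current, $line;           # ordinary line: store it
        }
    }
    return \%section;
}

my @tags  = ('Description:', 'Data:', 'NewTag:');
my @input = ('Description:', 'yada_d1', 'yada_d1',
             'Data:',        'yada_d2',
             'NewTag:',      'yada_d3');

my $parsed = parse_sections(\@tags, @input);
print "$_ @{ $parsed->{$_} }\n" for @tags;
```

    A tag that never appears in the input simply ends up with an empty array, so callers can loop over the tag list without extra checks.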

    Cheers...