How to read lines from a file which is....

ginju75 has asked for the wisdom of the Perl Monks concerning the following question:

Re: How to read lines from a file which is.... by Juerd (Abbot) on Jan 04, 2002 at 02:53 UTC
Maybe references are neater:

```perl
#!/usr/bin/perl -w
use strict;

my ($desc, $data, $ref);
while (<DATA>) {
    if (/^Description:$/) {
        $ref = \$desc;
        next;
    } elsif (/^Data:$/) {
        $ref = \$data;
        next;
    }
    $$ref .= $_;
}
print "Desc:\n$desc\n---\nData:\n$data\n";

__DATA__
Description:
yada_d1
yada_d1
yada_d1
yada_d1
Data:
yada_d2
yada_d2
yada_d2
yada_d2
```

This would probably be a good one for the `..` flip-flop operator, but I have yet to learn how it works :)

`2;0 juerd@ouranos:~$ perl -e'undef christmas' Segmentation fault 2;139 juerd@ouranos:~$`
Flip - flops by jarich (Curate) on Jan 04, 2002 at 13:11 UTC
The flip flop operator isn't that hard to use. Not in its simple form anyway. ;) Its ideal use is soaking up data from between two identifiable strings, e.g.:

```perl
while (<>) {
    if (/string1/ .. /string2/) {
        print;
    }
}
```

This will print:

```
string1
.....
string2
```

Note that string1 and string2 remain in the output. shift and pop can help you get rid of them if you don't like them (or use splice).

There are far nicer solutions to this problem than using a flip-flop operator; however, for demonstration purposes, this is how I'd solve it using the flip flop. Note that I've added a third tag to make it more interesting.

```perl
#!/usr/bin/perl -w
use strict;

my (@desc, @data, @new);

while (<>) {
    chomp;
    if (($_ eq "Description:") .. ($_ eq "Data:")) {
        push @desc, $_;
        next unless $_ eq "Data:";
    }
    # if you have another tag
    if (($_ eq "Data:") .. ($_ eq "NewTag:")) {
        push @data, $_;
        next unless $_ eq "NewTag:";
    }
    # and so on until the last tag
    push @new, $_;
}

shift @desc;          # get rid of "Description:" from front.
pop @desc if @data;   # get rid of "Data:" as last element.
shift @data;          # get rid of "Data:" from front.
pop @data if @new;    # get rid of "NewTag:" as last element.
shift @new;           # get rid of "NewTag:" from front.

print "desc: @desc\n";
print "data: @data\n";
print "new: @new\n";
```

The reason this works (for those people who didn't do or hated electrical or computer engineering) is that the flip flop operator flips to true once the first condition is fulfilled and stays true until the second condition is true, where it then flops to false. So you can have as much gunk above "Description:" as you like and the .. operator will not change. Once "Description:" is seen the .. operator becomes true and the if condition is therefore also true. Once "Data:" is seen the .. operator becomes false (but only after the if condition has evaluated to true for that line). If "Description:" should be seen again, then the .. operator would once again flip to true and the if condition would be active.

There are some interesting problems that come up with using the flip flop operator. For starters, if you had another condition in this list:

```perl
if (($_ eq "NewTag:") .. ($_ eq "Somethingelse:")) {
    ...
}
```

then it would be possible for two conditions to be active at one time even though only the highest on the list would get any data. This could occur if your file looked like this:

```
NewTag:          # sets the third flip flop to true
...
Description:     # sets the first flip flop to true
...
Somethingelse:   # is a data line for Description
....
Data:            # now the first flip flop will be
....             # false and the second true
NewTag:          # sets second flip flop to false
.....
.....            # these get given to the third
.....            # option finally.
```

This of course means that if your data file has missing sets, weird things will happen. In fact, if a necessary condition does not occur you'll either get too much data or not enough. For example, processing a log file with something like:

```perl
if (/$date1/ .. /$date2/) { }
```

should never even be attempted unless you are 300% sure that both $date1 and $date2 will be in that log file and every single other log file you might ever deal with. I got bitten by that one.

Jacinta
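The "missing end marker" pitfall described above can be seen in a small sketch. The marker names `BEGIN` and `END` and the sample arrays are made up for illustration; they are not from the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample input: one list has both markers, the other lacks END.
my @complete = ("junk", "BEGIN", "a", "b", "END", "more junk");
my @missing  = ("junk", "BEGIN", "a", "b", "more junk");

# Each textual occurrence of ".." keeps its own state, so the two
# loops below use two independent flip-flops.
my @from_complete;
for (@complete) {
    push @from_complete, $_ if /^BEGIN$/ .. /^END$/;
}

my @from_missing;
for (@missing) {
    # END never shows up, so the flip-flop stays true to the last line.
    push @from_missing, $_ if /^BEGIN$/ .. /^END$/;
}

print "complete: @from_complete\n";
print "missing:  @from_missing\n";
```

With the end marker present, the range stops at `END`; without it, everything after `BEGIN` is swallowed, including the trailing junk. Because the operator keeps its state between evaluations, a flip-flop left "on" like this also stays on if the same code is reached again later.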
Re: Re: How to read lines from a file which is.... by Beatnik (Parson) on Jan 04, 2002 at 04:23 UTC
++ for your solution... for exact matching, `eq` is faster than `m//` :)

Greetz
Beatnik
... Quidquid perl dictum sit, altum viditur.
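One way to check the speed claim on your own machine is the core Benchmark module. This is only a sketch: the helper names `header_by_regex` and `header_by_eq` are invented here, and the numbers will vary with Perl version and hardware.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# An unchomped line, as it would come from <FILE>.
my $line = "Description:\n";

# Hypothetical helpers, named for this sketch only.
sub header_by_regex { return $_[0] =~ /^Description:$/ ? 1 : 0 }
sub header_by_eq    { return $_[0] eq "Description:\n" ? 1 : 0 }

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    regex => sub { header_by_regex($line) },
    eq    => sub { header_by_eq($line) },
});
```

Note that the two tests are not strictly equivalent: `/^Description:$/` also accepts a line without the trailing newline, while the `eq` comparison needs either the `"\n"` in the string or a `chomp` beforehand.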
Re: How to read lines from a file which is.... by dvergin (Monsignor) on Jan 04, 2002 at 04:55 UTC
There are some wonderfully inventive solutions already posted. But I would offer something a bit simpler...

```perl
#!/usr/bin/perl -w
use strict;

my (@desc, @data);
while (<DATA>) {
    next if /Description:/;
    last if /Data:/;
    push @desc, $_;
}
@data = <DATA>;

# DONE. Check results:
print "desc array\n", @desc;
print "data array\n", @data;

__DATA__
Description:
yada_d1
yada_d1
yada_d1
yada_d1
Data:
yada_d2
yada_d2
yada_d2
yada_d2
```

Update: Or, more concisely (but less readable at a glance)...

```perl
my (@desc, @data);
my $line = <DATA>;    # Throw away first line
push @desc, $line while ($line = <DATA>) !~ /Data:/;
@data = <DATA>;
```

------------------------------------------------------------
"Perl is a mess and that's good because the problem space is also a mess." - Larry Wall
Re: Re: How to read lines from a file which is.... by rbc (Curate) on Jan 04, 2002 at 05:04 UTC
I like that! ++
Re: How to read lines from a file which is.... by Anonymous Monk on Jan 04, 2002 at 03:50 UTC
Juerd's solution is a good one. But if you want an easier-to-understand solution, you could use good old-fashioned flag variables:

```perl
#!/usr/bin/perl -w
use strict;

my (@desc, @data);
my $pushOnToDescFlag = 0;
my $pushOnToDataFlag = 0;

while (<DATA>) {
    if (/^Description:$/) {
        $pushOnToDescFlag = 1;
        next;
    }
    if (/^Data:$/) {
        $pushOnToDescFlag = 0;
        $pushOnToDataFlag = 1;
        next;
    }
    if ($pushOnToDescFlag == 1) {
        push(@desc, $_);
    }
    if ($pushOnToDataFlag == 1) {
        push(@data, $_);
    }
}

foreach my $d (@desc) {
    print "Desc = $d";
}
foreach my $d (@data) {
    print "Data = $d";
}

__DATA__
Description:
yada_d1
yada_d1
yada_d1
yada_d1
Data:
yada_d2
yada_d2
yada_d2
yada_d2
```

... I believe to be a matter of style and of what is more important to you: streamlined code vs. readability, IMHO.
Re: How to read lines from a file which is.... by dmmiller2k (Chaplain) on Jan 04, 2002 at 03:45 UTC
How about something like this:

```perl
my @regex = qw( Description Data );    # regexen to match
my %rhash = ();
my $which = 0;                          # which regexp to look for

while (<>) {
    /^$regex[$which]:/ && do {          # found one of them
        if ($which == 0) {              # if it was the first one ..
            # process Description and Data arrays (if any)
            process_arrays( @rhash{ @regex } );

            # reinitialize by associating a fresh anon array with each regex
            # (( [] ) x @regex would make every key share one array)
            @rhash{ @regex } = map { [] } @regex;
        }
        $which = 0 if (++$which >= @regex);
        next;
    };

    # The line belongs to the most recently matched tag, i.e. the one
    # before $which (index -1 wraps around to the last tag). Lines seen
    # before the first Description are skipped.
    push @{ $rhash{ $regex[$which - 1] } }, $_ if %rhash;
}

# Since the last Data block won't be terminated with a
# Description line, need to clean up here
process_arrays( @rhash{ @regex } );

# ...

sub process_arrays {
    my ( $arrayref1, $arrayref2 ) = @_;

    # return immediately unless both arrays have content
    return unless $arrayref1 && @$arrayref1 && $arrayref2 && @$arrayref2;

    # ...
}
```

Update: Oops! My first version was plagued with off-by-one bugs. This should work (untested).

dmm

You can give a man a fish and feed him for a day ...
Or, you can teach him to fish and feed him for a lifetime
Re: How to read lines from a file which is.... by belg4mit (Prior) on Jan 04, 2002 at 05:05 UTC
Sounds like a job for Inline::Files.

`-- perl -pe "s/\b;([st])/'\1/mg"`
Re: How to read lines from a file which is.... by seattlejohn (Deacon) on Jan 04, 2002 at 11:08 UTC
You say that the number of tags might increase, so I would probably plan for that from the start and write something generalized like this:

```perl
#!/usr/bin/perl -w
use strict;

die "Usage: save-into-arrays.pl input-file.txt\n" unless @ARGV == 1;

my @description;    # where we'll store contents of Description section
my @data;           # where we'll store contents of Data section
my %sections = ('Description:' => \@description,
                'Data:'        => \@data);

open(INPUT_FILE, '<', $ARGV[0]) or die "Unable to open $ARGV[0]: $!\n";

my $array_ref;      # keeps reference to array currently in use
while (my $line = <INPUT_FILE>) {
    chomp $line;
    if (defined($sections{$line})) {              # is this a section head?
        $array_ref = $sections{$line};            # yes, so point to the array for this section
    } else {
        push @$array_ref, $line if $array_ref;    # no, save this line in the active array
    }
}
close(INPUT_FILE);
```

Basically what this does is create a hash that pairs each possible section head with an array where you'll store the data from that section. When a section head is encountered, $array_ref gets updated to point to the right array for subsequent lines.

This way when you have new section tags, you can just create a new array and a new entry for it in the hash -- no need to alter the underlying logic with more if statements, etc.

Cheers...
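To illustrate the "just add an entry" point, here is a minimal sketch of the same idea wrapped in a sub. The sub name `split_sections`, the third tag `NewTag:`, and the sample input are assumptions made for the example, not part of the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes an array ref of section heads plus the
# input lines, and returns a hash ref mapping each head to its lines.
sub split_sections {
    my ($headers, @lines) = @_;
    my %sections = map { $_ => [] } @$headers;    # one fresh array per head
    my $current;                                  # array for the active section
    for my $line (@lines) {
        chomp $line;
        if (exists $sections{$line}) {
            $current = $sections{$line};          # section head: switch arrays
        } elsif ($current) {
            push @$current, $line;                # ordinary line: store it
        }
    }
    return \%sections;
}

# Made-up input with a third section; only the header list had to grow.
my @input = map { "$_\n" }
    ('Description:', 'yada_d1', 'Data:', 'yada_d2', 'yada_d2',
     'NewTag:', 'yada_d3');

my $result = split_sections(['Description:', 'Data:', 'NewTag:'], @input);
print "$_ @{ $result->{$_} }\n" for sort keys %$result;
```

Lines that appear before the first known section head are dropped, matching the `if $array_ref` guard in the code above.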