in reply to RP: Finding the second line an item appears on
One very useful trick in Perl is to set the input record separator so that, instead of reading line by line, you read RECORD by RECORD, where each record may span many lines. Once you have discrete records you can generally manipulate them very easily. Using a data set like the one you showed yesterday, and setting the input record separator to TWO newlines (seen only between tables), we get:
# set input record separator to end of table string
$/ = "\n\n";
my %hash;

# now we are reading data a table at a time
while (<DATA>) {
    my ( undef, $header, $data ) = split /\+\-+\+\s*/, $_;
    $header =~ s/\s*\|\s*//g;
    my @data = $data =~ m/\|([^\|]+)\|/g;
    push @{$hash{$header}}, @data;
}

use Data::Dumper;
print Dumper \%hash;

__DATA__
+---------+
| formula |
+---------+
| dat1 |
| dat2 |
| dat3 |
+---------+

+---------+
| formula |
+---------+
| dat4 |
| dat5 |
| dat6 |
+---------+

+---------+
| flubber |
+---------+
| dat11 |
| dat22 |
| dat33 |
+---------+

+---------+
| dubber |
+---------+
| dat111 |
| dat222 |
| dat333 |
+---------+

+---------+
| dubber |
+---------+
| dat1111 |
| dat2222 |
| dat3333 |
+---------+

__END__

$VAR1 = {
          '' => [],
          'flubber' => [
                         ' dat11 ',
                         ' dat22 ',
                         ' dat33 '
                       ],
          'formula' => [
                         ' dat1 ',
                         ' dat2 ',
                         ' dat3 ',
                         ' dat4 ',
                         ' dat5 ',
                         ' dat6 '
                       ],
          'dubber' => [
                        ' dat111 ',
                        ' dat222 ',
                        ' dat333 ',
                        ' dat1111 ',
                        ' dat2222 ',
                        ' dat3333 '
                      ]
        };
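If the padding around the values or the stray '' key bothers you, here is a rough sketch of a variation on the same record-at-a-time idea. It assumes the tables are separated by one or more blank lines and uses Perl's paragraph mode (`$/ = ""`) rather than a literal "\n\n". The sub name `parse_tables` and the shortened sample data are made up for illustration only.

```perl
use strict;
use warnings;

# Parse ASCII tables separated by blank lines into { header => [values] }.
# parse_tables is a hypothetical helper name, not something from the post above.
sub parse_tables {
    my ($fh) = @_;
    local $/ = "";    # paragraph mode: one or more blank lines end a record
    my %tables;
    while ( my $record = <$fh> ) {
        # keep only the "| value |" lines, trimmed; the +---+ borders are dropped
        my @cells = map { /^\|\s*(.*?)\s*\|\s*$/ ? $1 : () } split /\n/, $record;
        next unless @cells;    # skip any record that is not a table
        my $header = shift @cells;    # first cell of each table is its header
        push @{ $tables{$header} }, @cells;
    }
    return \%tables;
}

# Shortened sample data (an assumption, cut down from the data set above)
my $text = <<'END';
+---------+
| formula |
+---------+
| dat1 |
| dat2 |
+---------+

+---------+
| formula |
+---------+
| dat4 |
+---------+

+---------+
| dubber |
+---------+
| dat111 |
+---------+
END

open my $fh, '<', \$text or die "Cannot open in-memory filehandle: $!";
my $tables = parse_tables($fh);
close $fh;

print "$_: @{ $tables->{$_} }\n" for sort keys %$tables;
# prints:
# dubber: dat111
# formula: dat1 dat2 dat4
```

Using `local $/` inside the sub keeps the changed separator from leaking into the rest of the program, which matters as soon as other code reads files line by line.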
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print