in reply to Re^2: parse file per customized separator / block / metadata
in thread parse file per customized separator / block / metadata

It would really help if a line of = signs separated data3 and header4. It seems that File::Stream doesn't handle lookaheads very well, so the code below emulates one by hand. Nevertheless, here's an example that may help:
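use File::Stream;

# A header line starts with word characters followed by '=' or ':'.
my $lookahead_regex = qr/\w+[=:]/;

# The separator is an optional line of '=' signs followed by the next header.
my ($handler, $stream) = File::Stream->new(
    *DATA,
    separator => qr/\n=*\n$lookahead_regex/,
);

my $lookahead = "";
while (my $block = <$stream>) {
    # The separator swallowed the start of the next header: cut it off here
    # and prepend it to the following block instead.
    $block =~ s/($lookahead_regex)$//;
    $block = $lookahead . $block;
    $lookahead = $1;
    print $block;
    print "-" x 60, "\n";
}

__END__
header1=val1
header1b=val1b
data1
==================
header2: val2
header2b: val2b
data2
===============
header3: val3
header3b: val3b
data3

header4= val4
header4b= val4b
data4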

Output:
header1=val1
header1b=val1b
data1
==================
------------------------------------------------------------
header2: val2
header2b: val2b
data2
===============
------------------------------------------------------------
header3: val3
header3b: val3b
data3

------------------------------------------------------------
header4= val4
header4b= val4b
data4
------------------------------------------------------------
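For comparison, here is a minimal sketch of the same split in core Perl, where a real lookahead is available. It is not part of the original answer and rests on assumptions: the whole input fits in memory (so it is not a streaming solution like File::Stream), and the sub name split_blocks is purely illustrative. Unlike the code above, the line of = signs is dropped rather than kept at the end of the block.

```perl
use strict;
use warnings;

# Assumption: the input is small enough to hold in memory as one string.
my $text = <<'END_DATA';
header1=val1
header1b=val1b
data1
==================
header2: val2
header2b: val2b
data2
===============
header3: val3
header3b: val3b
data3

header4= val4
header4b= val4b
data4
END_DATA

# Hypothetical helper: split on an optional line of '=' signs, but only
# where the next line starts with a header (word characters, then '=' or ':').
# The lookahead (?=...) checks for the header without consuming it.
sub split_blocks {
    my ($input) = @_;
    return split /\n=*\n(?=\w+[=:])/, $input;
}

my @blocks = split_blocks($text);

for my $block (@blocks) {
    $block =~ s/\n+\z//;    # drop trailing newlines (only the last block has one)
    print $block, "\n", "-" x 60, "\n";
}
```

Because the lookahead leaves the header in place, there is no need to carry it over to the next block by hand.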

Re^4: parse file per customized separator / block / metadata
by raiten (Acolyte) on Mar 28, 2010 at 14:27 UTC

    Sorry, it's working great. The data file needed to be dos2unix-ed. Many thanks for the code, a nearly perfect shot :-)

    I still need to find out whether anything can be optimized to handle multiple big files, or whether to move to multithreading.
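The dos2unix fix mentioned above is consistent with how the separator is written: with DOS line endings, a \r sits between the = signs and the \n, so /\n=*\n/ cannot match. The sketch below illustrates this on a made-up sample string; the CRLF-tolerant pattern and the variable names are assumptions for illustration, not something tested against File::Stream itself.

```perl
use strict;
use warnings;

my $header   = qr/\w+[=:]/;
my $lf_only  = qr/\n=*\n$header/;          # separator used in the parent post
my $tolerant = qr/\r?\n=*\r?\n$header/;    # hypothetical CRLF-tolerant variant

# Made-up sample: the same text with Unix and with DOS line endings.
my $unix = "data1\n=====\nheader2: val2\n";
(my $dos = $unix) =~ s/\n/\r\n/g;

# Alternative to a tolerant regex: normalize line endings before parsing,
# which is what running dos2unix on the file does.
(my $normalized = $dos) =~ s/\r\n/\n/g;

my %result = (
    lf_only_unix       => ($unix       =~ $lf_only  ? 1 : 0),
    lf_only_dos        => ($dos        =~ $lf_only  ? 1 : 0),
    tolerant_dos       => ($dos        =~ $tolerant ? 1 : 0),
    lf_only_normalized => ($normalized =~ $lf_only  ? 1 : 0),
);

printf "%-20s %s\n", $_, $result{$_} ? "matches" : "no match"
    for sort keys %result;
```

When reading a file directly, opening it with the :crlf I/O layer (open my $fh, '<:crlf', $file) is another way to get the same normalization.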

Re^4: parse file per customized separator / block / metadata
by raiten (Acolyte) on Mar 25, 2010 at 22:02 UTC

    Thanks a lot for this code and sorry for the delayed feedback.

    I ran some tests today and the code covers most of my needs. The only point that fails is matching a block on /^[=]+$/ (note that this regexp is not accepted for block matching). Both my $lookahead_regex = qr/\w+[=:]/; and my $lookahead_regex = qr/[=:][=:][=:]+/; fail.

    I can't manage to match the block separator as both 'headerX:' (works) AND '=========[=]+' (doesn't work for now).

    Any advice? I'll keep working on it over the next few days.

    Thanks a lot

      Which test case(s) failed? Please post them.