Re: Best way to parse multiline data?

If your files are small enough to hold in memory, then slurping the file and matching it with a regex that uses /s and a lookahead is the easiest way:

#! perl -slw
use strict;

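# Slurp all of DATA into one string by undefining the input record separator.
my $data = do{ local $/; <DATA> };

# Capture the header, then lazily capture the body up to a newline that is
# followed by the next "header:" or by the end of the data.
print "'$1'$2'\n" while $data =~ m[([a-z-]+):(.*?)\n(?=[a-z-]+:|$)]sg;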

__DATA__
aut-num:       AS19710
as-name:       ASN
descr:         S4R
admin-c:       SNE1
tech-c:        SNE1
import:        from AS3356 63.215.71.1 at 63.215.71.2 action pref=20; med=50;
               from AS3356 63.215.86.133 at 63.215.86.134 action pref=50; med=150;
               accept ANY
import:        from AS3847 action pref=10;
               accept ANY
export:        to AS3847
               announce AS19710
export:        to AS3356
               announce AS19710
notify:        nwcontact@email
mnt-by:        S4R
changed:       andy@email 20010502
source:        LEV

At each iteration of the while loop, $1 will be the section header, and $2 the body of the section with all the whitespace intact. You can further process $2 to remove or reduce the whitespace as required.

P:\test>448390
'aut-num'       AS19710'

'as-name'       ASN'

'descr'         S4R'

'admin-c'       SNE1'

'tech-c'        SNE1'

'import'        from AS3356 63.215.71.1 at 63.215.71.2 action pref=20; med=50;
               from AS3356 63.215.86.133 at 63.215.86.134 action pref=50; med=150;
               accept ANY'

'import'        from AS3847 action pref=10;
               accept ANY'

'export'        to AS3847
               announce AS19710'

'export'        to AS3356
               announce AS19710'

'notify'        nwcontact@email'

'mnt-by'        S4R'

'changed'       andy@email 20010502'

'source'        LEV'
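
As for the further processing of $2 mentioned above, here is a minimal sketch of one way to do it. The helper name tidy_body and the choice to collapse every run of whitespace (newlines included) to a single space are my assumptions, not part of the original code; adjust to taste if you want to keep the line breaks.

```perl
#! perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): trims a section body and
# collapses each run of whitespace, newlines included, to a single space.
sub tidy_body {
    my( $body ) = @_;
    $body =~ s/^\s+//;      # drop the padding that follows the colon
    $body =~ s/\s+\z//;     # drop any trailing whitespace
    $body =~ s/\s+/ /g;     # fold continuation lines into one line
    return $body;
}

# A small sample in the same shape as the data above.
my $data = "import:        from AS3847 action pref=10;\n"
         . "               accept ANY\n"
         . "source:        LEV\n";

while( $data =~ m[([a-z-]+):(.*?)\n(?=[a-z-]+:|$)]sg ) {
    my( $header, $body ) = ( $1, $2 );   # copy the captures before reusing regexes
    print "$header => '", tidy_body( $body ), "'\n";
}
```

Copying $1 and $2 into lexicals first keeps the captures safe from any later pattern match in the loop body.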
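
If the files are too large to slurp comfortably, the same sections can be collected one line at a time instead. This is only a sketch under a few assumptions: the helper name each_section and its callback interface are made up for illustration, a header is taken to be any line that starts with [a-z-]+ followed by a colon, and every other line is treated as a continuation of the current section. An in-memory filehandle stands in for the real file here.

```perl
#! perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): reads a filehandle line by
# line and calls $callback->( $header, $body ) whenever a section is complete,
# so only one section is held in memory at a time.
sub each_section {
    my( $fh, $callback ) = @_;
    my( $header, $body );
    while( my $line = <$fh> ) {
        chomp $line;
        if( $line =~ m/^([a-z-]+):(.*)$/ ) {
            # A new "header:" line closes the previous section, if there is one.
            $callback->( $header, $body ) if defined $header;
            ( $header, $body ) = ( $1, $2 );
        }
        elsif( defined $header ) {
            # Any other line continues the current section.
            $body .= "\n$line";
        }
    }
    # Flush the final section at end of file.
    $callback->( $header, $body ) if defined $header;
    return;
}

# Sample input; open a real file in place of the scalar reference.
my $text = "import:        from AS3847 action pref=10;\n"
         . "               accept ANY\n"
         . "source:        LEV\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @sections;
each_section( $fh, sub { push @sections, [ @_ ] } );
close $fh;

print "'$_->[0]'$_->[1]'\n" for @sections;
```

The callback receives the header and the body with its whitespace intact, as $1 and $2 do in the slurping version, so the same post-processing applies.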

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

Lingua non convalesco, consenesco et abolesco.

Rule 1 has a caveat! -- Who broke the cabal?

Replies are listed 'Best First'.
Re^2: Best way to parse multiline data? by jalewis2 (Monk) on Apr 16, 2005 at 02:43 UTC
The files can be large, up to 600MB in one case. I hadn't considered slurping. Thanks for the ideas!