Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I need a little help here.

How can I parse this bit of multi-line data?

object-group service DM_SERVICE_7
 service-object tcp eq 1433
 service-object tcp eq 49160
 service-object tcp eq 8086
object-group network Employees
 description Employees
 network-object 10.10.12.0 255.255.255.0
 network-object 10.11.12.0 255.255.255.0

It should parse into a hash like this:

'DM_SERVICE_7' => {
    'type'             => 'service',
    'array_of_entries' => ["tcp eq 1433", "tcp eq 49160", "tcp eq 8086"]
},
'Employees' => {
    'type'             => 'network',
    'array_of_entries' => ["10.10.12.0 255.255.255.0", "10.11.12.0 255.255.255.0"]
}

I have got this far, but I am not sure how to proceed:

undef $/;
while (<>) {
    @array = split /object-group/;    # array of all interesting data
}

Replies are listed 'Best First'.
Re: Help on multiline regex
by ikegami (Patriarch) on Jun 01, 2011 at 17:50 UTC
    my %objects;
    my $object;
    while (<>) {
        chomp;
        if ( my ($object_type, $object_id) = /^\s*object-group\s+(\S+)\s+(\S+)/ ) {
            $objects{$object_id} = $object = {
                type    => $object_type,
                entries => [],
            };
        }
        elsif (my ($entry) = /^\s*\S+-object\s+(.*)/) {
            push @{ $object->{entries} }, $entry;
        }
    }
      Better:

      elsif (my ($entry) = /^\s*${object_type}-object\s+(.*)/) {

      Cheers Rolf

        That should be
        elsif (my ($entry) = /^\s*\Q$object->{type}\E-object\s+(.*)/) {
        but I fixed it with something simpler.
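A small sketch of why the `\Q...\E` in the corrected line matters: without it, any regex metacharacter in the interpolated type is treated as pattern syntax rather than literal text. The type `a.b` and the sample line below are hypothetical values chosen only to make the difference visible; they do not come from the original data. (Separately, in the `elsif` branch `$object_type` is undefined, because the failed match in the `if` condition assigned nothing to it, which is why the correction reads the type from `$object->{type}`.)

```perl
use strict;
use warnings;

# Hypothetical type containing a regex metacharacter, for illustration only.
my $type = 'a.b';
my $line = ' axb-object 1';

# Unquoted: the "." in $type matches any character, so "axb" is accepted.
my $unquoted = $line =~ /^\s*$type-object\s+(.*)/ ? 1 : 0;

# Quoted: \Q...\E escapes the ".", so only a literal "a.b" would match.
my $quoted = $line =~ /^\s*\Q$type\E-object\s+(.*)/ ? 1 : 0;

print "unquoted match: $unquoted, quoted match: $quoted\n";
```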
Re: Help on multiline regex
by wind (Priest) on Jun 01, 2011 at 20:38 UTC
    use strict;
    use warnings;

    my %hash;
    my $key;

    while (<DATA>) {
        if (/object-group\s+(\w+)\s+(.*)/) {
            $key = $2;
            $hash{$key}{type} = $1;
        }
        elsif (/\Q$hash{$key}{type}\E-object\s+(.*)/) {
            push @{$hash{$key}{array_of_entries}}, $1;
        }
    }

    use Data::Dumper;
    print Dumper(\%hash);

    __DATA__
    object-group service DM_SERVICE_7
     service-object tcp eq 1433
     service-object tcp eq 49160
     service-object tcp eq 8086
    object-group network Employees
     description Employees
     network-object 10.10.12.0 255.255.255.0
     network-object 10.11.12.0 255.255.255.0
      Thanks for the complete code!
Re: Help on multiline regex
by LanX (Saint) on Jun 01, 2011 at 17:29 UTC
    I don't think your mapping of input to output is unambiguous.

    I would loop over the input and try to match the header lines and field elements of the object-groups.

    Then either call specialized subroutines that parse the indented lines following a recognized group, or raise a warning for an unknown group.

    That way your data is also checked for correctness.

    Sorry, no code.

    Cheers Rolf
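The approach described in this reply could be sketched roughly as follows. This is not LanX's code, only one possible reading of it: the names `%entry_parser` and `parse_groups` are hypothetical, and the choice to silently skip indented lines that are not entries (such as `description`) is an assumption based on the desired output in the question.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical per-type parsers: each returns the entry text,
# or undef when the line is not an entry of that group type.
my %entry_parser = (
    service => sub { $_[0] =~ /^\s+service-object\s+(.+?)\s*$/ ? $1 : undef },
    network => sub { $_[0] =~ /^\s+network-object\s+(.+?)\s*$/ ? $1 : undef },
);

sub parse_groups {
    my (@lines) = @_;
    my (%groups, $current);

    for my $line (@lines) {
        if ($line =~ /^object-group\s+(\S+)\s+(\S+)/) {
            my ($type, $name) = ($1, $2);
            if ($entry_parser{$type}) {
                # Known group type: start collecting its entries.
                $current = $groups{$name} =
                    { type => $type, array_of_entries => [] };
            }
            else {
                warn "unknown group type '$type' for '$name'\n";
                undef $current;    # ignore the body of an unknown group
            }
        }
        elsif ($current && $line =~ /^\s+\S/) {
            # Indented line inside a known group: let its parser decide.
            my $entry = $entry_parser{ $current->{type} }->($line);
            push @{ $current->{array_of_entries} }, $entry if defined $entry;
        }
    }
    return \%groups;
}

my @config = split /\n/, <<'END';
object-group service DM_SERVICE_7
 service-object tcp eq 1433
 service-object tcp eq 49160
 service-object tcp eq 8086
object-group network Employees
 description Employees
 network-object 10.10.12.0 255.255.255.0
 network-object 10.11.12.0 255.255.255.0
END

print Dumper(parse_groups(@config));
```

Keeping one parser per group type means a new type needs only a new entry in the dispatch table, and anything unrecognized is reported instead of being silently accepted.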

Re: Help on multiline regex
by Cristoforo (Curate) on Jun 02, 2011 at 00:56 UTC
    Setting the input record separator ($/, $INPUT_RECORD_SEPARATOR) to 'object-group':

    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.012;
    use Data::Dumper;

    my %data;
    {
        local $/ = 'object-group';
        while (<DATA>) {
            chomp;
            next unless $_;
            my ($type, $key) = /\A (\S+) (\S+)$/m;
            my @data = /^ $type-object (.+)$/mg;
            $data{ $key } = {type => $type, array_of_entries => \@data};
        }
    }
    print Dumper \%data;

    __DATA__
    object-group service DM_SERVICE_7
     service-object tcp eq 1433
     service-object tcp eq 49160
     service-object tcp eq 8086
    object-group network Employees
     description Employees
     network-object 10.10.12.0 255.255.255.0
     network-object 10.11.12.0 255.255.255.0
    Update: edited one line of code:

    my ($type, $key) = /\A (\S+) (\S+)$/m;
      I like this method!