I am trying to parse a text file, which has a certain structure, i would say its recursive; I tried to parse it using an iterative approach, going over each line and trying to figure out what action to take based on the current line and some state variables that I maintain/update as my program crunches over the lines. The problem I have is that I cannot figure out the correct logic to parse the input to the output format I desire. Could someone help me figure out the logic ( an algorithm, perhaps ) for parsing this structured text file ? I am not necessarily asking for code, just maybe a text description or ideas how to do it. I will attach my code ( which is not producing a correct output ), and try to explain it a few steps how it operates.
max_reformation_duration=300 cfs_online_timeout_changes=200 max_configured_packages=100 node:nodeA|name=nodeA node:nodeA|status=up node:nodeA|interface:lan900|name=lan900 node:nodeA|interface:lan900|status=up node:nodeA|interface:lan900|ip_address:IP_A|name=IP_A node:nodeA|interface:lan900|ip_address:IP_A|netmask=255.255.255.0 node:nodeA|interface:lan901|name=lan901 node:nodeA|interface:lan901|status=up node:nodeA|interface:lan901|ip_address:IP_C|name=IP_C node:nodeA|interface:lan901|ip_address:IP_C|netmask=255.255.255.0 node:nodeA|subnet:SUBNET_A|name=SUBNET_A node:nodeA|subnet:SUBNET_A|status=up node:nodeA|subnet:SUBNET_B|name=SUBNET_B node:nodeA|subnet:SUBNET_B|status=up node:nodeB|name=nodeB node:nodeB|status=up node:nodeB|interface:lan900|name=lan900 node:nodeB|interface:lan900|status=up node:nodeB|interface:lan900|ip_address:IP_B|name=IP_B node:nodeB|interface:lan900|ip_address:IP_B|netmask=255.255.255.0 node:nodeB|interface:lan901|name=lan901 node:nodeB|interface:lan901|status=up node:nodeB|interface:lan901|ip_address:IP_D|name=IP_D node:nodeB|interface:lan901|ip_address:IP_D|netmask=255.255.255.0 node:nodeB|subnet:SUBNET_A|name=SUBNET_A node:nodeB|subnet:SUBNET_A|status=up node:nodeB|subnet:SUBNET_B|name=SUBNET_B node:nodeB|subnet:SUBNET_B|status=up
%config = { max_reformation_duration => 300, cfs_online_timeout_changes => 200, max_configured_packages => 100, node => { nodeA => { name => nodeA, status => up, interface => { lan900 => { name => lan900, status => up, ip_adress => { IP_A => { name => IP_A, netmask => 255.255.255.0, }, }, }, lan901 => { name => lan901, status => up, ip_adress => { IP_C => { name => IP_C, netmask => 255.255.255.0, }, }, }, }, subnet => { SUBNET_A => { name => SUBNET_A, status => up, }, SUBNET_B => { name => SUBNET_B, status => up, }, }, }, nodeB => { name => nodeB, status => up, interface => { lan900 => { name => lan900, status => up, ip_adress => { IP_A => { name => IP_B, netmask => 255.255.255.0, }, }, }, lan901 => { name => lan901, status => up, ip_adress => { IP_C => { name => IP_D, netmask => 255.255.255.0, }, }, }, }, subnet => { SUBNET_A => { name => SUBNET_A, status => up, }, SUBNET_B => { name => SUBNET_B, status => up, }, }, }, }, }
#!/usr/bin/perl use strict; use warnings; use Data::Dumper; $Data::Dumper::Indent = 1; $Data::Dumper::Sortkeys = 1; sub read_input { open(my $io, '<', 'input.txt') or die $!; return <$io>; } my @lines = read_input(); my %config; my $hashref = \%config; # stack of hash references my @ref; # count of | characters in a respective line my $level = 0; # count of | characters in the previous line my $prev_level = 0; my $key; my $fkey; for (@lines) { chomp; my @f = split(/\|/); $level = @f - 1; if ($level == 0) { my @f = split(/=/); $hashref->{ $f[0] } = $f[1]; } else { my @q = split(/:/, $f[$level - 1]); if ($key && $level <= $prev_level) { if (/\Q$key\E/) { my @f = split(/=/, $f[$level]); $hashref->{ $f[0] } = $f[1]; $prev_level = $level; } elsif (/\Q$fkey\E/) { $hashref = $ref[-1]; $hashref->{$q[1]} = {}; $hashref = $hashref->{$q[1]}; $key = join(':', @q); $prev_level = $level; redo; } else { pop @ref; if ($level < $prev_level) { my $diff = $prev_level - $level; while ($diff--) { pop @ref; pop @ref; } } $hashref = pop @ref; $key = ''; $prev_level = $level; redo; } } else { if (exists $hashref->{$q[0]}) { push @ref, $hashref; $hashref->{$q[0]}->{$q[1]} = {}; $hashref = $hashref->{$q[0]}->{$q[1]}; } else { push @ref, $hashref; $hashref->{$q[0]} = {}; $hashref = $hashref->{$q[0]}; push @ref, $hashref; $hashref->{$q[1]} = {}; $hashref = $hashref->{$q[1]}; } $key = join(':', @q); $fkey = $q[0]; $prev_level = $level; redo; } } } print Dumper( \%config );
$VAR1 = { 'cfs_online_timeout_changes' => '200', 'max_configured_packages' => '100', 'max_reformation_duration' => '300', 'node' => { 'nodeA' => { 'interface' => { 'lan900' => { 'ip_address' => { 'IP_A' => { 'name' => 'IP_A', 'netmask' => '255.255.255.0' } }, 'name' => 'lan900', 'status' => 'up' }, 'lan901' => { 'ip_address' => { 'IP_C' => { 'name' => 'IP_C', 'netmask' => '255.255.255.0' } }, 'name' => 'lan901', 'status' => 'up' } }, 'name' => 'nodeA', 'status' => 'up' }, 'subnet' => { 'SUBNET_A' => { 'name' => 'SUBNET_A', 'status' => 'up' }, 'SUBNET_B' => { 'name' => 'SUBNET_B', 'status' => 'up' } } } };
In reply to Parsing structured text, problem with parsing logic by hperange
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |