in reply to Re^3: Need help with complex hash of hashes.
in thread Need help with complex hash of hashes.

So my original code contained the following.

my $twig = 'XML::Twig'->new; $twig->xparse($input_xml);

And POJ's code contains the following.

my $twig = 'XML::Twig'->new( twig_handlers => { result => \&record }) +; $twig->parse($input_xml);

POJ's code when I do 10000 records is blowing up in the Twig module as follows.

DB<2> c Exiting subroutine via next at vance.pl line 91. at vance.pl line 91. main::record(XML::Twig=HASH(0x6003ead30), XML::Twig::Elt=HASH(0x60 +2c20e90)) called at /usr/lib/perl5/site_perl/5.22/XML/Twig.pm line 23 +48 XML::Twig::_twig_end(XML::Parser::Expat=HASH(0x6003fafe0), "result +") called at /usr/lib/perl5/site_perl/5.22/x86_64-cygwin-threads/XML/ +Parser/Expat.pm line 474 XML::Parser::Expat::parse(XML::Parser::Expat=HASH(0x6003fafe0), "< +?xml version=\"1.0\" encoding=\"UTF-8\"?><response><result><chec"...) + called at /usr/lib/perl5/site_perl/5.22/x86_64-cygwin-threads/XML/Pa +rser.pm line 187 eval {...} called at /usr/lib/perl5/site_perl/5.22/x86_64-cygwin-t +hreads/XML/Parser.pm line 186 XML::Parser::parse(XML::Twig=HASH(0x6003ead30), "<?xml version=\"1 +.0\" encoding=\"UTF-8\"?><response><result><chec"...) called at /usr/ +lib/perl5/site_perl/5.22/XML/Twig.pm line 767 eval {...} called at /usr/lib/perl5/site_perl/5.22/XML/Twig.pm lin +e 767 XML::Twig::parse(XML::Twig=HASH(0x6003ead30), "<?xml version=\"1.0 +\" encoding=\"UTF-8\"?><response><result><chec"...) called at vance.p +l line 33 Exiting subroutine via next at vance.pl line 91.

It is deep in the Twig module when I single step, but I think its tied to some sort of overflow condition. Any ideas as to how to fix this? POJ, why did you change those lines? Here is the current code.

use strict; use warnings; use XML::Twig; use Text::CSV; use Data::Dump 'pp'; use Data::Dumper; use REST::Client; use MIME::Base64; use Text::Unidecode; use utf8; my @header = (); my @csv_rows = (); my $order_by = 'u_ci_id'; my %seen = (); my %arx_data = (); # parse xml in chunks my $twig = 'XML::Twig'->new( twig_handlers => { result => \&record }) +; my $client = REST::Client->new(host => 'https://staplessb.service-now. +com'); my $encoded_auth = encode_base64("1774580:Natalie1", ''); $client->GET("/api/now/table/cmdb_ci?sysparm_limit=10000", {'Authorization' => "Basic $encoded_auth", 'Accept' => 'application/xml'}); my $input_xml = $client->responseContent(); $twig->parse($input_xml); # data dump pp \%arx_data; #pp \@csv_rows; #pp \@header; # sort and dump csv my @sorted = sort { $b->[0] cmp $a->[0] } @csv_rows; unshift @sorted,[@header]; # add header my $outfile = 'xmldump.csv'; my $csv = Text::CSV->new ( { binary => 1 } ) or die "Cannot use CSV: ".Text::CSV->error_diag (); $csv->eol("\n"); if ($outfile){ open my $fh, ">:encoding(utf8)", $outfile or die "$outfile: $!"; for (@sorted){ shift @$_; # remove sort field $csv->print ($fh, $_); } close $fh or die "new.csv: $!"; } else { for (@sorted){ shift @$_; $csv->print (\*STDOUT, $_); } } # parse 1 record sub record { my ($e,$t) = @_; # csv records unless (@header){ my @info_tags = $t->children; @header = map{ $_->name } @info_tags; unshift @header,$order_by; # add sort field } # extract data in same order as header row my @data = map{ $t->field($_) } @header; push @csv_rows,clean(@data); #returns array ref # build hash structure my $app = $t->field('u_application_id'); my $u_ci = $t->field('u_ci_id'); my $class = $t->field('sys_class_name'); my $name = $t->field('name'); my $ip = $t->field('ip_address'); my $fqdn = $t->field('fqdn'); my $dns = $t->field('dns_domain'); my $mac = $t->field('mac'); if ( $class eq 'cmdb_ci_appl'){ $arx_data{$u_ci}{'name'} = $name; } else { next if $seen{$app}{$ip}++; push @{$arx_data{$app}{'members'}},$ip; } } sub clean { my @f = @_; for (@f){ s/\r|\n//g; # Cleanup Carraige Returns s/, / /g; # Cleanup Comma Space s/,/ /g; # Cleanup Comma s/"//g; # Cleanup Parentheses s/^\s+|\s+$//g; # Trim spaces s/([^[:ascii:]]+)/unidecode($1)/ge; } return \@f; }

Replies are listed 'Best First'.
Re^5: Need help with complex hash of hashes.
by poj (Abbot) on Jan 28, 2016 at 20:04 UTC

    Try changing last lines of record subroutine to this

    . . if ( $class eq 'cmdb_ci_appl'){ $arx_data{$u_ci}{'name'} = $name; } elsif (! $seen{$app}{$ip}++) { push @{$arx_data{$app}{'members'}},$ip; } $t->purge ; # add this line to free memory }
    poj

      Thanks, that helped. Also you were right to purge memory. The line below really cleaned it up.

      $t->purge ; # add this line to free memory

      OK, I figured out how the record subroutine was blowing up inside the Twig module. It was the "next if" line that hung it.

      if ( $class eq 'cmdb_ci_svr' ){ next if $seen{$app}{$ip}++; push @{$arx_data{$app}{'members'}},$ip; }
      So I had to change it to this, seems I am ending up with CI's with members but no name. I am working to resolve that.
      if ( $class eq 'cmdb_ci_appl'){ push @{$arx_data{$u_ci}{'name'}}, $name; } elsif ( $app ne "" && $ip ) { if ( exists $arx_data{$u_ci}{'name'} ) { push @{$arx_data{$app}{'members'}},$ip; } else { push @{$arx_data{$u_ci}{'name'}}, $null; push @{$arx_data{$app}{'members'}},$ip; } }

      Now what I want to add is one of the sections that cleans up the data first. This validates the IP address and the name that should be its hostname from DNS records. Now I am trying to add this to your codebase which uses $ip instead of $data$ip_address_pos, but I was using the positional value to splice the corrected data. This is to create the CSV file with the corrected data, and I will also use it to fix ip addresses in the member arrays too. Any suggestion?

      if ( $data[$ip_address_pos] eq "" && $data[$name_pos] ne "" && is_hostname($data[$name_pos])) { my $address = inet_aton($data[$name_pos]) || "Error: Can't resolve."; if ( $address ne "" && $address ne "Error: Can't resolve." ) { $address = inet_ntoa($address); splice @data, ($ip_address_pos), 1, $address; my $cmdb_name = $data[$name_pos]; my $cmdb_lc_name = lc $cmdb_name; splice @data, ($name_pos), 1, $cmdb_lc_name; } }