Re^3: Need help with complex hash of hashes.

I need the positions

I don't understand this. With XML the order of the tags shouldn't matter. Anyway, here is my best guess at what you are trying to do, create a CSV of all the data and a hash structure of some of it.

#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
use Text::CSV;
use Data::Dump 'pp';

my @header   = ();
my @csv_rows = ();
my $order_by = 'u_ci_id';

my %seen = ();
my %arx_data = ();

# parse xml in chunks
my $twig = 'XML::Twig'->new( 
  twig_handlers =>  { result => \&record }
);
my $input_xml = get_xml(); # change as required
$twig->parse($input_xml);

# data dump
pp \%arx_data;
#pp \@csv_rows;
#pp \@header;

# sort and dump csv
my @sorted = sort { $b->[0] cmp $a->[0] } @csv_rows;
unshift @sorted,[@header]; # add header

my $outfile = 'xmldump.csv';
my $csv = Text::CSV->new ( { binary => 1 } )  
    or die "Cannot use CSV: ".Text::CSV->error_diag ();
$csv->eol("\n"); 

if ($outfile){        
  open my $fh, ">:encoding(utf8)", $outfile or die "$outfile: $!";
  for (@sorted){
    shift @$_; # remove sort field
    $csv->print ($fh, $_);
  }
  close $fh or die "new.csv: $!"; 
} else {
  for (@sorted){
    shift @$_;
    $csv->print (\*STDOUT, $_);
  }
}  

# parse 1 record
sub record {
  my ($e,$t) = @_;
  
  # csv records
  unless (@header){
    my @info_tags = $t->children;
    @header = map{ $_->name } @info_tags;
    unshift @header,$order_by; # add sort field
  }
  
  # extract data in same order as header row
  my @data = map{ $t->field($_) } @header;
  push @csv_rows,clean(@data); #returns array ref
 
  # build hash structure
  my $app   = $t->field('u_application_id');
  my $u_ci  = $t->field('u_ci_id');
  my $class = $t->field('sys_class_name');
  my $name  = $t->field('name');
  my $ip    = $t->field('ip_address');
  
  if ( $class eq 'cmdb_ci_appl'){
    $arx_data{$u_ci}{'name'} = $name;  
  } 
  
  if ( $class eq 'cmdb_ci_svr' ){
    next if $seen{$app}{$ip}++;
    push @{$arx_data{$app}{'members'}},$ip;
  }
  
}

sub clean {
  my @f = @_;
  for (@f){
    s/\r|\n//g;  # Cleanup Carraige Returns
    s/, / /g;    # Cleanup Comma Space
    s/,/ /g;     # Cleanup Comma
    s/"//g;      # Cleanup Parentheses
    s/^\s+|\s+$//g;    # Trim spaces
    s/([^[:ascii:]]+)/unidecode($1)/ge;
  }
  return \@f;
}
[download]

# get xml
sub get_xml {
  return '<?xml version="1.0" encoding="UTF-8"?>
<response>
 <result>
  <fqdn>fqdnB</fqdn>
  <ip_address>ip1</ip_address>
  <u_display_name>display1</u_display_name>
  <name>name1, with comma "and" quote</name>
  <mac_address>mac5</mac_address>
  <u_ci_id>CI10334984</u_ci_id>
  <sys_class_name>cmdb_ci_appl</sys_class_name>
  <u_application_id></u_application_id>
 </result>
 <result>
  <fqdn>fqdnA</fqdn>
  <ip_address>ip2</ip_address>
  <u_display_name>display2</u_display_name>
  <name>name2</name>
  <mac_address>mac, 4</mac_address>
  <u_ci_id>CI56789</u_ci_id>
  <sys_class_name>cmdb_ci_svr</sys_class_name>
  <u_application_id>CI10334984</u_application_id>
 </result> 
 <result>
  <fqdn>fqdnC</fqdn>
  <ip_address>ip3</ip_address>
  <u_display_name>display3</u_display_name>
  <name>name2</name>
  <mac_address>mac5</mac_address>
  <u_ci_id>CI56789</u_ci_id>
  <sys_class_name>cmdb_ci_svr</sys_class_name>
  <u_application_id>CI10334984</u_application_id>
</result>
</response>';
}
[download]

poj

Comment on Re^3: Need help with complex hash of hashes. Select or Download Code

Replies are listed 'Best First'.
Re^4: Need help with complex hash of hashes. by vlturner (Sexton) on Jan 27, 2016 at 16:23 UTC
I need to use that final hash to build out this CSV file. Now your code is great because it properly dumps the XML.csv, and now while I have the data in the big hash build out this secondary CSV file too. You can see where the list of member IP's need to go separated bu semicolons. member1;member2;member3 `#,,,,,,,, # Console Version: ,9.6.0,,,,,,, # Version: ,12.0.0,,,,,,, # Timestamp: ,2015-11-03 13:25:53 EST,,,,,,, #,,,,,,,, # Source: ,host1,,,,,,, # StartTime: ,2015-11-03 09:12:00 EST,,,,,,, # EndTime: ,2015-11-03 10:12:00 EST,,,,,,, # Data Type: ,Business Groups,,,,,,, #,,,,,,,, Status,Name,Description,Members,Id,InboundUtilizationBandwidth,Outboun +dUtilizationBandwidth,Appliances,CollectOnDirector W,Application1,,10.10.10.1;10.10.10.2;10.4.23.34,68,0,0,,` [download]	[reply] [d/l]
Re^4: Need help with complex hash of hashes. by vlturner (Sexton) on Jan 28, 2016 at 18:35 UTC
So my original code contained the following. `my $twig = 'XML::Twig'->new; $twig->xparse($input_xml);` [download] And POJ's code contains the following. `my $twig = 'XML::Twig'->new( twig_handlers => { result => \&record }) +; $twig->parse($input_xml);` [download] POJ's code when I do 10000 records is blowing up in the Twig module as follows. DB<2> c Exiting subroutine via next at vance.pl line 91. at vance.pl line 91. main::record(XML::Twig=HASH(0x6003ead30), XML::Twig::Elt=HASH(0x60 +2c20e90)) called at /usr/lib/perl5/site_perl/5.22/XML/Twig.pm line 23 +48 XML::Twig::_twig_end(XML::Parser::Expat=HASH(0x6003fafe0), "result +") called at /usr/lib/perl5/site_perl/5.22/x86_64-cygwin-threads/XML/ +Parser/Expat.pm line 474 XML::Parser::Expat::parse(XML::Parser::Expat=HASH(0x6003fafe0), "< +?xml version=\"1.0\" encoding=\"UTF-8\"?><response><result><chec"...) + called at /usr/lib/perl5/site_perl/5.22/x86_64-cygwin-threads/XML/Pa +rser.pm line 187 eval {...} called at /usr/lib/perl5/site_perl/5.22/x86_64-cygwin-t +hreads/XML/Parser.pm line 186 XML::Parser::parse(XML::Twig=HASH(0x6003ead30), "<?xml version=\"1 +.0\" encoding=\"UTF-8\"?><response><result><chec"...) called at /usr/ +lib/perl5/site_perl/5.22/XML/Twig.pm line 767 eval {...} called at /usr/lib/perl5/site_perl/5.22/XML/Twig.pm lin +e 767 XML::Twig::parse(XML::Twig=HASH(0x6003ead30), "<?xml version=\"1.0 +\" encoding=\"UTF-8\"?><response><result><chec"...) called at vance.p +l line 33 Exiting subroutine via next at vance.pl line 91. [download] It is deep in the Twig module when I single step, but I think its tied to some sort of overflow condition. Any ideas as to how to fix this? POJ, why did you change those lines? Here is the current code. use strict; use warnings; use XML::Twig; use Text::CSV; use Data::Dump 'pp'; use Data::Dumper; use REST::Client; use MIME::Base64; use Text::Unidecode; use utf8; my @header = (); my @csv_rows = (); my $order_by = 'u_ci_id'; my %seen = (); my %arx_data = (); # parse xml in chunks my $twig = 'XML::Twig'->new( twig_handlers => { result => \&record }) +; my $client = REST::Client->new(host => 'https://staplessb.service-now. +com'); my $encoded_auth = encode_base64("1774580:Natalie1", ''); $client->GET("/api/now/table/cmdb_ci?sysparm_limit=10000", {'Authorization' => "Basic $encoded_auth", 'Accept' => 'application/xml'}); my $input_xml = $client->responseContent(); $twig->parse($input_xml); # data dump pp \%arx_data; #pp \@csv_rows; #pp \@header; # sort and dump csv my @sorted = sort { $b->[0] cmp $a->[0] } @csv_rows; unshift @sorted,[@header]; # add header my $outfile = 'xmldump.csv'; my $csv = Text::CSV->new ( { binary => 1 } ) or die "Cannot use CSV: ".Text::CSV->error_diag (); $csv->eol("\n"); if ($outfile){ open my $fh, ">:encoding(utf8)", $outfile or die "$outfile: $!"; for (@sorted){ shift @$_; # remove sort field $csv->print ($fh, $_); } close $fh or die "new.csv: $!"; } else { for (@sorted){ shift @$_; $csv->print (\*STDOUT, $_); } } # parse 1 record sub record { my ($e,$t) = @_; # csv records unless (@header){ my @info_tags = $t->children; @header = map{ $_->name } @info_tags; unshift @header,$order_by; # add sort field } # extract data in same order as header row my @data = map{ $t->field($_) } @header; push @csv_rows,clean(@data); #returns array ref # build hash structure my $app = $t->field('u_application_id'); my $u_ci = $t->field('u_ci_id'); my $class = $t->field('sys_class_name'); my $name = $t->field('name'); my $ip = $t->field('ip_address'); my $fqdn = $t->field('fqdn'); my $dns = $t->field('dns_domain'); my $mac = $t->field('mac'); if ( $class eq 'cmdb_ci_appl'){ $arx_data{$u_ci}{'name'} = $name; } else { next if $seen{$app}{$ip}++; push @{$arx_data{$app}{'members'}},$ip; } } sub clean { my @f = @_; for (@f){ s/\r\|\n//g; # Cleanup Carraige Returns s/, / /g; # Cleanup Comma Space s/,/ /g; # Cleanup Comma s/"//g; # Cleanup Parentheses s/^\s+\|\s+$//g; # Trim spaces s/([^[:ascii:]]+)/unidecode($1)/ge; } return \@f; } [download]	[reply] [d/l] [select]
Re^5: Need help with complex hash of hashes. by poj (Abbot) on Jan 28, 2016 at 20:04 UTC
Try changing last lines of `record` subroutine to this `. . if ( $class eq 'cmdb_ci_appl'){ $arx_data{$u_ci}{'name'} = $name; } elsif (! $seen{$app}{$ip}++) { push @{$arx_data{$app}{'members'}},$ip; } $t->purge ; # add this line to free memory }` [download] poj	[reply] [d/l] [select]
Re^6: Need help with complex hash of hashes. by vlturner (Sexton) on Feb 02, 2016 at 01:45 UTC
Thanks, that helped. Also you were right to purge memory. The line below really cleaned it up. `$t->purge ; # add this line to free memory` [download]	[reply] [d/l]
Re^6: Need help with complex hash of hashes. by vlturner (Sexton) on Jan 29, 2016 at 23:36 UTC
OK, I figured out how the record subroutine was blowing up inside the Twig module. It was the "next if" line that hung it. `if ( $class eq 'cmdb_ci_svr' ){ next if $seen{$app}{$ip}++; push @{$arx_data{$app}{'members'}},$ip; }` [download] So I had to change it to this, seems I am ending up with CI's with members but no name. I am working to resolve that. `if ( $class eq 'cmdb_ci_appl'){ push @{$arx_data{$u_ci}{'name'}}, $name; } elsif ( $app ne "" && $ip ) { if ( exists $arx_data{$u_ci}{'name'} ) { push @{$arx_data{$app}{'members'}},$ip; } else { push @{$arx_data{$u_ci}{'name'}}, $null; push @{$arx_data{$app}{'members'}},$ip; } }` [download] Now what I want to add is one of the sections that cleans up the data first. This validates the IP address and the name that should be its hostname from DNS records. Now I am trying to add this to your codebase which uses $ip instead of $data$ip_address_pos, but I was using the positional value to splice the corrected data. This is to create the CSV file with the corrected data, and I will also use it to fix ip addresses in the member arrays too. Any suggestion? `if ( $data[$ip_address_pos] eq "" && $data[$name_pos] ne "" && is_hostname($data[$name_pos])) { my $address = inet_aton($data[$name_pos]) \|\| "Error: Can't resolve."; if ( $address ne "" && $address ne "Error: Can't resolve." ) { $address = inet_ntoa($address); splice @data, ($ip_address_pos), 1, $address; my $cmdb_name = $data[$name_pos]; my $cmdb_lc_name = lc $cmdb_name; splice @data, ($name_pos), 1, $cmdb_lc_name; } }` [download]	[reply] [d/l] [select]