Morning Monks!

I have been trying to work out the differences between two XML files for a little while now. I have gotten pretty far, but I am stuck at one point. I am comparing the files to make sure that every host-alias that exists in file1 also exists in file2. The part I am stuck on is the pairing: if host-alias www.foo.com exists under host id "bobjones" in file1, then www.foo.com must exist under the same host id in file2.

More domains can exist in file2 than file1, but all domains that are in file1 must be in file2, and they must be under the same hostid.
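The rule above can be sketched as a set difference on (hostid, alias) pairs. This is only a minimal illustration, not a finished solution: the hashes %file1 and %file2, their sample contents, and the helper missing_pairs are hypothetical names made up for the example, and the data is hard-coded rather than read from the XML files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical in-memory data: hostid => { alias => 1 }.
# In a real script these would be filled while reading each file.
my %file1 = (
    bobjones => { 'www.foo.com' => 1, 'www.bar.com' => 1 },
);
my %file2 = (
    bobjones => { 'www.foo.com' => 1 },
    alice    => { 'www.bar.com' => 1, 'www.extra.com' => 1 },
);

# Return "hostid alias" strings for every pair present in $want
# but absent under that same hostid in $have.
sub missing_pairs {
    my ($want, $have) = @_;
    my @missing;
    for my $host (sort keys %$want) {
        for my $alias (sort keys %{ $want->{$host} }) {
            # Test the host first so the lookup does not autovivify it.
            push @missing, "$host $alias"
                unless exists $have->{$host} && exists $have->{$host}{$alias};
        }
    }
    return @missing;
}

print "$_\n" for missing_pairs(\%file1, \%file2);
```

With this sample data the script prints "bobjones www.bar.com": the alias does exist in the second set, but under a different hostid, so it still counts as missing. Extra entries that appear only in the second set are ignored.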

Here is a sample of the input XML:
<host id="bobjones" root-directory=".">
    <host-alias>www.foo.com</host-alias>
    <host-alias>www.bar.com</host-alias>
    <host-alias>www.dj.com</host-alias>
</host>
And below is the code that I have. It's not finished, but it's what I have thus far:
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;
use Pod::Usage;

my %alias_hash;
my %host_hash;
my %host_contents;
my %seen;
my $host_id;
my $dbh;
my $file1;
my $file2;

GetOptions(
    'h|help' => sub {
        pod2usage( { -verbose => 1, -input => \*DATA, } );
        exit;
    },
    'm|man' => sub {
        pod2usage( { -verbose => 2, -input => \*DATA, } );
        exit;
    },
    'f1|file1=s' => \$file1,
    'f2|file2=s' => \$file2,
);

pod2usage( -verbose => 1 ) unless $file1 and $file2;

# Pass 1: mark every host id and alias found in file1 with -1.
open(my $file1_handle, '<', $file1) or die "Could not open $file1 ($!)\n";
while (my $line = <$file1_handle>) {
    chomp $line;
    if ($line =~ /host id="(.*?)"/) {
        $host_id = $1;
        $host_hash{$host_id} = -1;
    }
    if ($line =~ m{<host-alias>(.*?)</host-alias>}) {
        $alias_hash{$host_id}{$1} = -1;
    }
}
close $file1_handle;

# Pass 2: increment for everything found in file2, so entries
# present in both files no longer hold -1.
open(my $file2_handle, '<', $file2) or die "Could not open $file2 ($!)\n";
while (my $line = <$file2_handle>) {
    chomp $line;
    if ($line =~ /host id="(.*?)"/) {
        $host_id = $1;
        $host_hash{$host_id}++;
    }
    if ($line =~ m{<host-alias>(.*?)</host-alias>}) {
        $alias_hash{$host_id}{$1}++;
    }
}
close $file2_handle;

# Anything still at -1 was in file1 but never seen in file2.
for my $k1 ( keys %host_hash ) {
    if ($host_hash{$k1} == -1) {
        print "$k1\n";
    }
}

for my $k1 ( keys %alias_hash ) {
    for my $k2 ( keys %{ $alias_hash{$k1} } ) {
        if ($alias_hash{$k1}{$k2} == -1) {
            print "$k2\n";
        }
    }
}

In reply to Difficulty Mapping Data by walkingthecow
