in reply to XML File Creation in Perl

G'day documents9900,

This is fairly straightforward in XML::Simple; however, be aware that for more complex work, this is often not the best choice.

#!/usr/bin/env perl use strict; use warnings; use Inline::Files; use XML::Simple qw{:strict}; my %xml_hash = (Data => {}); my $xml_data = $xml_hash{Data}; my (%db1, %db2); while (<DB1>) { my ($root, $table, $key) = split; push @{$db1{$root}{$table}}, $key; } while (<DB2>) { my ($root, $table, $key) = split; push @{$db2{$root}{$table}}, $key; } while (<DIFF>) { my ($root, $table, $key, $col, $old, $new) = split; $xml_data->{$root}{$table}{NEW1}{KEY} = $db1{$root}{$table} if exists $db1{$root}{$table}; $xml_data->{$root}{$table}{NEW2}{KEY} = $db2{$root}{$table} if exists $db2{$root}{$table}; $xml_data->{$root}{$table}{MODIFIED}{KEY}{$key}{$col}{oldvalue} = +[$old]; $xml_data->{$root}{$table}{MODIFIED}{KEY}{$key}{$col}{newvalue} = +[$new]; } print XMLout(\%xml_hash, KeepRoot => 1, KeyAttr => {KEY => 'name'}); __DIFF__ Root1 TBLA KEY1 COLA A B Root1 TBLA KEY1 COLB D E Root1 TBLA KEY3 COLX M N Root2 TBLB KEY4 COLX M N Root2 TBLB KEY4 COLD A B Root3 TBLC KEY5 COLD A B __DB1__ Root1 TBLA KEY6 Root2 TBLB KEY7 Root3 TBLC KEY8 __DB2__ Root1 TBLA KEY9 Root1 TBLA KEY10 Root3 TBLC KEY11

Output:

$ pm_xml_db_diff.pl <Data> <Root1> <TBLA> <MODIFIED> <KEY name="KEY1"> <COLA> <newvalue>B</newvalue> <oldvalue>A</oldvalue> </COLA> <COLB> <newvalue>E</newvalue> <oldvalue>D</oldvalue> </COLB> </KEY> <KEY name="KEY3"> <COLX> <newvalue>N</newvalue> <oldvalue>M</oldvalue> </COLX> </KEY> </MODIFIED> <NEW1> <KEY>KEY6</KEY> </NEW1> <NEW2> <KEY>KEY9</KEY> <KEY>KEY10</KEY> </NEW2> </TBLA> </Root1> <Root2> <TBLB> <MODIFIED> <KEY name="KEY4"> <COLD> <newvalue>B</newvalue> <oldvalue>A</oldvalue> </COLD> <COLX> <newvalue>N</newvalue> <oldvalue>M</oldvalue> </COLX> </KEY> </MODIFIED> <NEW1> <KEY>KEY7</KEY> </NEW1> </TBLB> </Root2> <Root3> <TBLC> <MODIFIED> <KEY name="KEY5"> <COLD> <newvalue>B</newvalue> <oldvalue>A</oldvalue> </COLD> </KEY> </MODIFIED> <NEW1> <KEY>KEY8</KEY> </NEW1> <NEW2> <KEY>KEY11</KEY> </NEW2> </TBLC> </Root3> </Data>

-- Ken

Replies are listed 'Best First'.
Re^2: XML File Creation in Perl
by documents9900 (Initiate) on Apr 16, 2013 at 17:09 UTC
    Thanks Ken for your help. This is working fine. However while testing i found that while populating $xml_data for NEW1 and NEW2, if clause is there which checks that if the combination of $root,$table exists in db1 (similarly db2) There are cases in which the combination (root,table) exists only in DB1 or DB2 or DIFF file. So in those cases the data is not coming as expected. I am thinking of populating data directly into xml_data from each DB1/DB2/DIFF loop. What do you suggest on this
      "I am thinking of populating data directly into xml_data from each DB1/DB2/DIFF loop."

      I suspect that's probably the best option. You'll get some duplicate assignments but I don't imagine that will be a problem. Getting around that would likely involve setting and testing flags that indicate whether a particular assignment has occurred. Benchmark if you feel it's important.

      I see you've added some realistic data. Below, you'll see I've added some more to simulate these new scenarios you've described ("... only in DB1 or DB2 ..."). Where you have data like "A B C", I've plugged the spaces with underscores (i.e. "A_B_C"); this was just so I didn't have to rework how I was handling the data with Inline::Files and split: I'm assuming you're already getting your resultsets as array data. One thing I noticed was that I'm getting <KEY>A_B_C</KEY> while your output is showing <KEY>'A B C'</KEY>. The start and end tags act as delimiters so additional quotes aren't generally necessary in XML; they could potentially cause issues like the data being converted to &apos;A B C&apos;: my instinct would be to remove the quotes before populating the XML — you may have a valid reason for leaving them in.

      Here's the updated code (includes handling the <null> field):

      #!/usr/bin/env perl use strict; use warnings; use Inline::Files; use XML::Simple qw{:strict}; my %xml_hash = (Data => {}); my $xml_data = $xml_hash{Data}; my (%db1, %db2); while (<DB1>) { my ($root, $table, $key) = split; push @{$db1{$root}{$table}}, $key; $xml_data->{$root}{$table}{NEW1}{KEY} = [$key]; } while (<DB2>) { my ($root, $table, $key) = split; push @{$db2{$root}{$table}}, $key; $xml_data->{$root}{$table}{NEW2}{KEY} = [$key]; } while (<DIFF>) { my ($root, $table, $key, $col, $old, $new) = map { $_ eq '<null>' ? '' : $_ } split; $xml_data->{$root}{$table}{NEW1}{KEY} = $db1{$root}{$table} if exists $db1{$root}{$table}; $xml_data->{$root}{$table}{NEW2}{KEY} = $db2{$root}{$table} if exists $db2{$root}{$table}; $xml_data->{$root}{$table}{MODIFIED}{KEY}{$key}{$col}{oldvalue} = +[$old]; $xml_data->{$root}{$table}{MODIFIED}{KEY}{$key}{$col}{newvalue} = +[$new]; } print XMLout(\%xml_hash, KeepRoot => 1, KeyAttr => {KEY => 'name'}); __DIFF__ EMPLOYEE EMPLOYEE XYZ TITLE <null> Mr EMPLOYEE EMPDETAILS DEF CITY California New York CUSTOMER CUSTOMER ABC CAPTION Regular Premium __DB1__ EMPLOYEE EMPLOYEE NEW_EMPLOYEE_1 EMPLOYEE EMPLOYEE NEW_EMPLOYEE_9 EMPLOYEE EMPDETAILS NEW_EMPLOYEE1-DETAILS EMPLOYEE EMPDETAILS NEW_EMPLOYEE9-DETAILS EMPLOYEE EMPDETAILS NEW_EMPLOYEE16-DETAILS IN_DB1_ONLY IN_DB1_ONLY IN_DB1_ONLY IN_DB1+DB2 IN_DB1+DB2 IN_DB1+DB2 __DB2__ EMPLOYEE EMPLOYEE NEW_EMPLOYEE_6 EMPLOYEE EMPDETAILS NEW_EMPLOYEE6-DETAILS CUSTOMER CUSTOMER NEW_CUSTOMER IN_DB2_ONLY IN_DB2_ONLY IN_DB2_ONLY IN_DB1+DB2 IN_DB1+DB2 IN_DB1+DB2

      Output:

      $ pm_xml_db_diff2.pl <Data> <CUSTOMER> <CUSTOMER> <MODIFIED> <KEY name="ABC"> <CAPTION> <newvalue>Premium</newvalue> <oldvalue>Regular</oldvalue> </CAPTION> </KEY> </MODIFIED> <NEW2> <KEY>NEW_CUSTOMER</KEY> </NEW2> </CUSTOMER> </CUSTOMER> <EMPLOYEE> <EMPDETAILS> <MODIFIED> <KEY name="DEF"> <CITY> <newvalue>New</newvalue> <oldvalue>California</oldvalue> </CITY> </KEY> </MODIFIED> <NEW1> <KEY>NEW_EMPLOYEE1-DETAILS</KEY> <KEY>NEW_EMPLOYEE9-DETAILS</KEY> <KEY>NEW_EMPLOYEE16-DETAILS</KEY> </NEW1> <NEW2> <KEY>NEW_EMPLOYEE6-DETAILS</KEY> </NEW2> </EMPDETAILS> <EMPLOYEE> <MODIFIED> <KEY name="XYZ"> <TITLE> <newvalue>Mr</newvalue> <oldvalue></oldvalue> </TITLE> </KEY> </MODIFIED> <NEW1> <KEY>NEW_EMPLOYEE_1</KEY> <KEY>NEW_EMPLOYEE_9</KEY> </NEW1> <NEW2> <KEY>NEW_EMPLOYEE_6</KEY> </NEW2> </EMPLOYEE> </EMPLOYEE> <IN_DB1+DB2> <IN_DB1+DB2> <NEW1> <KEY>IN_DB1+DB2</KEY> </NEW1> <NEW2> <KEY>IN_DB1+DB2</KEY> </NEW2> </IN_DB1+DB2> </IN_DB1+DB2> <IN_DB1_ONLY> <IN_DB1_ONLY> <NEW1> <KEY>IN_DB1_ONLY</KEY> </NEW1> </IN_DB1_ONLY> </IN_DB1_ONLY> <IN_DB2_ONLY> <IN_DB2_ONLY> <NEW2> <KEY>IN_DB2_ONLY</KEY> </NEW2> </IN_DB2_ONLY> </IN_DB2_ONLY> </Data>

      -- Ken

        I did some correction in the code
        while (<DB1>) { chomp; my ($root, $table, $header,$key) = split /\t/; $xml_data->{$root}{$table}{NEW1}{$key} = [$key]; } while (<DB2>) { chomp; my ($root, $table, $header,$key) = split /\t/; $xml_data->{$root}{$table}{NEW2}{$key} = [$key]; } while (<DIFF>) { chomp; my ($root, $table, $key, $col, $old, $new) = split /\t/; $xml_data->{$root}{$table}{MODIFIED}{KEY}{$key}{$col}{oldvalue} = [$ol +d]; $xml_data->{$root}{$table}{MODIFIED}{KEY}{$key}{$col}{newvalue} = [$ne +w]; }
        Changes which i did

        1) Removed the db1 and db2 code as it is not required.

        2. Old code

        $xml_data->{$root}{$table}{NEW2} = [$key];
        New code
        $xml_data->{$root}{$table}{NEW2}{$key} = [$key];
        This was required as if there are multiple rows were there for a table which were overwritten with last value as the root, table combination already exists. So from the result set it displayed only the last row in the xml. Now the last problem is
        $xml_data->{$root}{$table}{NEW2}{$key} = [$key];
        the data is displayed as (Suppose Key value is ABC)
        <ABC>ABC</ABC>
        but it should be
        <KEY>ABC</KEY>
        This is coming because i added the key value for key only meaning code is working as expected, but i want KEY to be displayed instead of value