http://qs1969.pair.com?node_id=11116742


in reply to removing perldata , hashref from XML file

I assume this is a followup to your previous post "Run a perl code on all input files and have the results in different output files", so what you've shown here is the Perl data structure returned by TAP3::Tap3edit's $tap3->structure, run through XML::Dumper. That module allows me to parse the XML back into the data structure I show below (anonymized, in case this happens to be real data; Update: SaraMirabi confirmed in the CB that it is fake data). It would have been better if you had shown us the Perl data structure to begin with, using Data::Dumper or Data::Dump.

I think it would be best if you didn't rely on munging the output of XML::Dumper, but instead build the XML output from the Perl data structure yourself, because that gives you full control over the produced XML. Personally, although it's a bit more verbose, I prefer to write XML with XML::LibXML and some helper functions. The following is an example of how one might go about that, the idea being that you can adapt this however you need.

Update: Significantly simplified toxml and made it more flexible (output is unchanged).

#!/usr/bin/env perl
use warnings;
use strict;
use open qw/:std :utf8/;
use XML::LibXML;

my $data = {
  employee => [
    {
      "************" => "M",
      age => { dob => "01-04-1993" },
      department => { departmentname => "Operations", title => "Manager" },
      location => { town => { county => "Somewhere", name => "Someplace" } },
      name => { forename => "John", surname => "Doe" },
    },
    {
      "************" => "M",
      age => { dob => "12-12-1979" },
      department => { departmentname => "Internet", title => "Developer" },
      location => { town => { county => "Somewhere", name => "Othertown" } },
      name => { forename => "Jane", surname => "Doe" },
    }
  ]
};

my $doc = XML::LibXML::Document->createDocument('1.0', 'UTF-8');
toxml($doc, 'data', $data);
print $doc->toString(1);

sub toxml {
    my ($parent, $name, $data) = @_;
    my @args = $name=~/\A\w+\z/ ? ($name) : ('value', name=>$name);
    if ( ref $data eq 'HASH' ) {
        my $el = newel($parent, @args);
        toxml($el, $_, $data->{$_}) for sort keys %$data;
    }
    elsif ( ref $data eq 'ARRAY' ) {
        toxml(ref eq 'ARRAY' ? newel($parent, @args) : $parent, $name, $_)
            for @$data;
    }
    elsif ( ref $data ) { die "Can't handle $data (yet)" }
    else { newel($parent, @args)->appendText($data) }
}

sub newel {
    my ($parent, $name, %attrs) = @_;
    my $el = $parent->ownerDocument->createElement($name);
    $el->setAttribute( $_ => $attrs{$_} ) for keys %attrs;
    if ( $parent->nodeType==XML_DOCUMENT_NODE )
        { $parent->setDocumentElement($el) }
    else
        { $parent->appendChild($el) }
    return $el;
}

__END__

<?xml version="1.0" encoding="UTF-8"?>
<data>
  <employee>
    <value name="************">M</value>
    <age>
      <dob>01-04-1993</dob>
    </age>
    <department>
      <departmentname>Operations</departmentname>
      <title>Manager</title>
    </department>
    <location>
      <town>
        <county>Somewhere</county>
        <name>Someplace</name>
      </town>
    </location>
    <name>
      <forename>John</forename>
      <surname>Doe</surname>
    </name>
  </employee>
  <employee>
    <value name="************">M</value>
    <age>
      <dob>12-12-1979</dob>
    </age>
    <department>
      <departmentname>Internet</departmentname>
      <title>Developer</title>
    </department>
    <location>
      <town>
        <county>Somewhere</county>
        <name>Othertown</name>
      </town>
    </location>
    <name>
      <forename>Jane</forename>
      <surname>Doe</surname>
    </name>
  </employee>
</data>
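To make the handling of hash keys in toxml a little more concrete, here is a minimal sketch that isolates just the name test. The helper name element_args is hypothetical (it does not exist in the code above); it only wraps the same regex so you can see which keys become element names and which end up as a <value name="..."> element. It uses core Perl only.

```perl
#!/usr/bin/env perl
use warnings;
use strict;

# Hypothetical helper: the same test toxml applies to each hash key.
# Keys made only of word characters are used as the element name;
# anything else is stored in the "name" attribute of a <value> element.
sub element_args {
    my ($name) = @_;
    return $name =~ /\A\w+\z/ ? ($name) : ('value', name => $name);
}

my @tags;
for my $key ('age', '************', 'first name') {
    my ($element, %attrs) = element_args($key);
    my $attr_text = join '', map { qq{ $_="$attrs{$_}"} } sort keys %attrs;
    push @tags, "<$element$attr_text>";
}
print "$_\n" for @tags;
```

Note that \w also matches digits, so a key such as "1st" would pass this test even though it is not a valid XML element name; if your data can contain such keys, you may want a stricter pattern.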

Re^2: removing perldata , hashref from XML file
by SaraMirabi (Novice) on May 13, 2020 at 10:45 UTC

    Many thanks for your quick response. My exact problem is about introducing each tag in perl code, I mean:

    my $data = {
      employee => [
        {
          "************" => "M",
          age => { dob => "10-02-1917" },
          department => { departmentname => "Operations", title => "Manager" },
          location => { town => { county => "East Ay1", name => "Auchinleck" } },
          name => { forename => "John", surname => "Doe" },

    My XML file is more than 2000 lines long with many different tags, and it is also encoded: I decode it and write it out as an XML file, so introducing each tag in my perl code is not possible.

      You don't have to introduce the structure. Just use the return value of $tap3->structure.

      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

        Hi,

        Many thanks for answering. Would you mind showing me where in the code I should add the return value?

        #!/usr/bin/perl -w
        use strict;
        use warnings;
        use Data::Dumper;
        use XML::Dumper;
        use TAP3::Tap3edit;
        $Data::Dumper::Indent=1;
        $Data::Dumper::Useqq=1;
        my $dump = new XML::Dumper;
        use File::Basename;
        my $perl='';
        my $xml='';
        my $tap3 = TAP3::Tap3edit->new();
        foreach my $file(glob 'X*') {
            my $files= basename($file);
            my $filename=$files.".xml\n";
            print $filename;
            $tap3->decode($files) || die $tap3->error;
            $perl = $tap3->structure;
            $dump->pl2xml($perl, $filename);
        }

      My exact problem is about introducing each tag in perl code ... introducing each tag in my perl code is not possible.

      Sorry, I don't understand what you mean here. Please see "Short, Self-Contained, Correct Example" and "I know what I mean. Why don't you?". As I said, you can adapt the code I've posted (which I just updated) however you need, but you also haven't shown your expected output.

      Update: Ah, I see choroba may have interpreted your issue better than I have. Yes, my $data is meant to be the same thing as the $perl variable from your previous node.

        Yes, you are right.
Re^2: removing perldata , hashref from XML file (updated)
by SaraMirabi (Novice) on May 18, 2020 at 07:52 UTC
    Dear Haukex,

    I tried to substitute my $data into your Perl code and got the error below:

    perl DECODEi.pl
    Can't handle Math::BigInt=HASH(0x22bcc10) (yet) at DECODEi.pl line 90407.
      Can't handle Math::BigInt=HASH(0x22bcc10) (yet) at DECODEi.pl line 90407.

      This error message is surprising, because Math::BigInt overloads stringification, so the message should not be showing the object as Math::BigInt=HASH(...). You'd have to show an SSCCE that reproduces this issue for it to be investigated further.*

      Given this, I don't know if the following will work in your case, but my code can be extended to support objects that overload stringification by adding the following to the top of the file, just under the other use statements:

      use Scalar::Util qw/blessed/; use overload ();

      And this just before the line "elsif ( ref $data ) { die "Can't handle $data (yet)" }":

      elsif ( blessed($data) && overload::Overloaded($data) && overload::Method($data,'""') ) { newel($parent, @args)->appendText("$data") }
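As a rough illustration of what that extra condition accepts and rejects, the sketch below runs the same three checks in isolation. The classes My::Stringy and My::Plain and the function stringifies are hypothetical stand-ins invented for this example; only Scalar::Util and overload (both core modules) are used.

```perl
#!/usr/bin/env perl
use warnings;
use strict;
use Scalar::Util qw/blessed/;
use overload ();

# Hypothetical stand-in for a class such as Math::BigInt that
# overloads stringification.
{
    package My::Stringy;
    use overload '""' => sub { $_[0]{value} }, fallback => 1;
    sub new { my ($class, $value) = @_; return bless { value => $value }, $class }
}

# Hypothetical stand-in for a class without any overloading.
{
    package My::Plain;
    sub new { return bless {}, shift }
}

# Same condition as the elsif above: an object whose class
# overloads the "" (stringification) operator.
sub stringifies {
    my ($data) = @_;
    return !!( blessed($data)
        && overload::Overloaded($data)
        && overload::Method($data, '""') );
}

my @results = map { stringifies($_) ? 1 : 0 }
    My::Stringy->new(42), My::Plain->new, { plain => 'hash' }, 'text';
print "@results\n";

my $as_text = "" . My::Stringy->new(42);
print "Value: $as_text\n";
```

Only the first item passes the check; the plain object, the unblessed hash reference and the plain string are all rejected, so in toxml they would fall through to the remaining branches.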

      * Update: It can be explained if you're reading a dump of a Perl data structure into a script that doesn't load the corresponding module. In this case, my code above won't work either unless you load the module.
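The situation described in that update can be sketched without Math::BigInt at all. In the example below, the class name My::Unloaded::Number is hypothetical, and the hash contents are made up; the point is only that an object blessed into a class whose module was never loaded has no overloading available, so it stringifies in the Class=HASH(0x...) form seen in the error message.

```perl
#!/usr/bin/env perl
use warnings;
use strict;
use Scalar::Util qw/blessed/;
use overload ();

# Hypothetical class name: no module defining it is ever loaded here,
# which mimics reading a dumped object back in without loading its class.
my $obj = bless { sign => '+', value => [42] }, 'My::Unloaded::Number';

my $text       = "$obj";
my $overloaded = overload::Overloaded($obj) ? 1 : 0;

print "stringifies as: $text\n";
print "blessed into:   ", blessed($obj), "\n";
print "overloaded:     $overloaded\n";
```

If the real module is loaded before the data is read (for example with use Math::BigInt; at the top of the script), the overloading should be available and the check shown earlier can succeed.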