in reply to sort order of imported xml data?
What about using a different module?
With XML::Rules you can process XML files of this structure even when they are hundreds of megabytes large, because the script 1) stores only what it needs from the subtags of <trans-unit> and 2) forgets all the data of a <trans-unit> tag as soon as it is done with it.

```perl
#!/bin/perl
use strict;
use warnings;
use XML::Rules;

my $parser = XML::Rules->new(
    start_rules => [
        file => sub {
            my ($tag_name, $attrs, $context, $parent_data, $parser) = @_;
            # remember the tweaked "original" attribute in the parser's pad
            $parser->{pad}{file} = $attrs->{original};
            $parser->{pad}{file} =~ s{game/stringtable/}{}i;
            return 1;    # a false value would make the parser skip <file> and its children
        },
    ],
    rules => [
        source   => 'content',    # keep only the text content
        target   => 'content',
        _default => '',           # ignore every other tag
        'trans-unit' => sub {
            my ($tag_name, $attrs, $context, $parent_data, $parser) = @_;
            print EXTR qq{"$parser->{pad}{file}","$attrs->{id}","$attrs->{source}","$attrs->{target}"\n};
            return;               # nothing is kept once the line is printed
        },
    ],
);

open(EXTR, ">meep.csv") or die $!;
$parser->parsefile("meep.xlf");
```
Here is what the script does. When the parser reads the opening tag of <file>, the script tweaks the original attribute and remembers it in the "pad" - an attribute of the parser object meant to hold script-specific data. Whenever the parser finishes a <source> or <target> tag, it keeps just the content and makes it available in the attribute hash of the parent tag. Whenever it finishes a <trans-unit> tag, it prints the remembered file name, the id attribute and the contents of the <source> and <target> subtags, and then forgets the data of that tag.
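To make the "content ends up in the parent's attribute hash" step easier to see, here is a minimal sketch that parses a small in-memory document instead of a file. The XML snippet and the @rows array are made up for illustration and are not from the original question; it assumes XML::Rules is installed.

```perl
#!/bin/perl
use strict;
use warnings;
use XML::Rules;

# Hypothetical input, invented for this illustration only.
my $xml = <<'XML';
<xliff version="1.2">
  <file original="game/stringtable/ui.xml">
    <body>
      <trans-unit id="42">
        <source>Hello</source>
        <target>Hallo</target>
        <note>not needed</note>
      </trans-unit>
    </body>
  </file>
</xliff>
XML

my @rows;    # collected instead of printed, so the result is easy to inspect

my $parser = XML::Rules->new(
    rules => [
        source   => 'content',    # <source> text becomes $attrs->{source} of the parent
        target   => 'content',    # <target> text becomes $attrs->{target} of the parent
        _default => '',           # <note> and everything else is ignored
        'trans-unit' => sub {
            my ($tag_name, $attrs) = @_;
            # by now the hash holds the real "id" attribute plus the two subtag contents
            push @rows, join ',', map { $attrs->{$_} } qw(id source target);
            return;
        },
    ],
);

$parser->parse($xml);
print "$_\n" for @rows;
```

With this input the script should print a single line, 42,Hello,Hallo; the <note> tag never reaches the <trans-unit> handler.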
Here is a version without using the pad and using a lexical filehandle:
```perl
#!/bin/perl
use strict;
use warnings;
use XML::Rules;

my $parser = XML::Rules->new(
    start_rules => [
        file => sub {
            my ($tag_name, $attrs) = @_;
            $attrs->{original} =~ s{game/stringtable/}{}i;
            return 1;
        },
    ],
    rules => [
        'trans-unit' => sub {
            my ($tag_name, $attrs, $context, $parent_data, $parser) = @_;
            my $file = $parent_data->[-2]{original};
            print {$parser->{parameters}{FH}} qq{"$file","$attrs->{id}","$attrs->{source}","$attrs->{target}"\n};
            # or
            # my $FH = $parser->{parameters}{FH};
            # print $FH qq{"$file","$attrs->{id}","$attrs->{source}","$attrs->{target}"\n};
            return;
        },
        source   => 'content',
        target   => 'content',
        _default => '',
    ],
);

open(my $EXTR, ">meep.csv") or die $!;
$parser->parsefile("meep.xlf", {FH => $EXTR});
close $EXTR;
```
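The $parent_data->[-2] index in the second version may look arbitrary, so here is a small sketch of where it comes from. It assumes the usual xliff > file > body > trans-unit nesting; the XML snippet and the variables @path and $original are hypothetical and exist only for this illustration.

```perl
#!/bin/perl
use strict;
use warnings;
use XML::Rules;

# Hypothetical input, invented for this illustration only.
my $xml = <<'XML';
<xliff version="1.2">
  <file original="game/stringtable/ui.xml">
    <body>
      <trans-unit id="42">
        <source>Hello</source>
      </trans-unit>
    </body>
  </file>
</xliff>
XML

my (@path, $original);

my $parser = XML::Rules->new(
    rules => [
        _default => '',
        'trans-unit' => sub {
            my ($tag_name, $attrs, $context, $parent_data) = @_;
            @path     = @$context;                      # copy: names of the enclosing tags
            $original = $parent_data->[-2]{original};   # -1 is <body>, -2 is <file>
            return;
        },
    ],
);

$parser->parse($xml);
print join('/', @path), "\n";
print "$original\n";
```

If the <trans-unit> tags sat directly inside <file>, without a <body> in between, the index would have to be -1 instead.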