
Thank you Anonymous Monk for the code snippet.

Your help is much appreciated. The code works fine on the command line.

However I've been getting this nasty error message.

Wide character in print at /usr/lib/perl5/site_perl/5.14/XML/Twig.pm line 8403.
Wide character in print at /usr/lib/perl5/site_perl/5.14/XML/Twig.pm line 8403.
Wide character in print at /usr/lib/perl5/site_perl/5.14/XML/Twig.pm line 8403.

I think this is due to the encoding.

Is there any chance I could force UTF-8 output, along the lines of the following?

use warnings;
use strict;
use XML::Twig;
use File::Slurp;
use utf8;
use File::Slurp qw(read_file write_file);

my $infile = shift;
my $filename = $ARGV[2];

XML::Twig->new(
    keep_spaces              => 1,
    twig_print_outside_roots => 1,
    twig_roots               => {
        tu => sub {
            my ($twig, $elt) = @_;
            $elt->set_att('creationid','Simon Simonsen');
            $elt->print;
        }
    },
)->parsefile( $infile );

write_file $filename, {binmode => ':utf8'}, $infile;

Thanks in advance for any encouraging comments.

Kind regards and many thanks

C

Re^3: help with regular expression required
by Anonymous Monk on Aug 14, 2014 at 21:22 UTC

    Since XML::Twig is writing to STDOUT, I'm not sure what your write_file is supposed to be doing...

    Encodings are covered in the XML::Twig docs. Here's some code that works for me:

    use warnings;
    use strict;
    use XML::Twig;

    open my $ofh, '>:utf8', '1097276_out.xml' or die $!;

    XML::Twig->new(
        keep_spaces              => 1,
        twig_print_outside_roots => $ofh,
        twig_roots               => {
            tu => sub {
                my ($twig, $elt) = @_;
                $elt->set_att('creationid',"Simon Sim\xF6nsen");
                $elt->print($ofh);
            }
        },
    )->parsefile('1097276.xml');

    close $ofh;

    Also have a look at the aptly-named output_encoding option in XML::Twig.
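
    For background: Perl emits "Wide character in print" when a string containing a character above 0xFF reaches a filehandle that has no encoding layer. Below is a minimal sketch, independent of XML::Twig, of what such a layer does. The helper name utf8_bytes_for is made up for illustration, and the in-memory filehandle merely stands in for a real output file.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Twig): print a string through an
# in-memory filehandle that has a UTF-8 encoding layer, and return the
# raw bytes that ended up in the buffer.
sub utf8_bytes_for {
    my ($text) = @_;
    my $buffer = '';
    open my $fh, '>:encoding(UTF-8)', \$buffer
        or die "Cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh or die "Cannot close in-memory handle: $!";
    return $buffer;
}

# U+263A is above 0xFF, so printing it to a handle without an encoding
# layer would trigger the "Wide character" warning. With the layer in
# place it is encoded to bytes silently.
my $bytes = utf8_bytes_for("Simon Sim\x{F6}nsen \x{263A}");
printf "%d bytes written\n", length $bytes;
```

    When the output really has to go to STDOUT, as in the original script, calling binmode(STDOUT, ':encoding(UTF-8)') before parsing should apply the same kind of layer there.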

      Dear Monks,

      Thank you so much for your help

      The code worked fine, but curiosity got the better of me and I tried to find a way to specify both the input file and the output file on the command line.

      This was the point when I realised that I have not yet properly understood how functions and parameters work, especially how to pass arguments to a function. I guess it is back to the books for me.

      While reading the code, I was thinking that parsefile was the crucial element and that this is where my infile should go. However, I could not establish the connection between parsefile and the $ofh opened in the first lines.

      However when I tried the new code I always got the following message:

      .pm line 773. Couldn't open : No such file or directory at replace_tmx_ori.pl line 19. at replace_tmx_ori.pl line 19.

      I figure the program does not know which file is passed for input.

      Is it possible to specify the input and output files in the code? Please find my attempt below.

      Comments and explanations are appreciated.

      my $infile = $ARGV[0];
      my $outfile = $ARRGV[2];

      open my $ofh, '>:utf8', $outfile or die $!;

      XML::Twig->new(
          keep_spaces              => 1,
          twig_print_outside_roots => $ofh,
          twig_roots               => {
              tu => sub {
                  my ($twig, $elt) = @_;
                  $elt->set_att('creationid',"Simon Simonsen");
                  $elt->print($ofh);
              }
          },
      )->parsefile( $infile );

      close $ofh;

      Thanks a mil in advance for helping us noobs out here

      kind regards C.
        my $outfile = $ARRGV[2];

        Perhaps you meant $ARGV[1]?

        I can only assume you removed the use warnings; use strict; from the top of the script; otherwise Perl would have told you about the typo (under strict, the undeclared @ARRGV is a compile-time error). So don't do that!
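
        As a rough sketch of the command-line handling being asked about: the first argument is $ARGV[0] and the second is $ARGV[1], and unpacking @ARGV as a list avoids typing the indices at all. The helper name parse_args and the usage message are made up for illustration and are not taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the input and output file names out of an
# argument list, dying with a usage message unless exactly two are given.
sub parse_args {
    my @args = @_;
    die "Usage: $0 INFILE OUTFILE\n" unless @args == 2;
    my ($infile, $outfile) = @args;    # same as $ARGV[0] and $ARGV[1]
    return ($infile, $outfile);
}

# Only act when arguments were actually supplied, so the sketch also
# loads cleanly when run without any.
if (@ARGV) {
    my ($infile, $outfile) = parse_args(@ARGV);
    print "reading from $infile, writing to $outfile\n";
}
```

        In the script above, $infile would then go to parsefile and $outfile to the open call. As far as I can tell, parsefile only reads the input; the output ends up in $ofh because that handle is passed to twig_print_outside_roots and to $elt->print.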