Maintaining the translation of text documents can be hard, especially keeping it up to date and noticing what has changed in the original.

So I wrote a pair of scripts to help me with this task. The first takes a text file and generates an XML file ready for the translator. The second takes the XML file as input, finds the original file through the filename recorded in the XML, and generates the translated file. This is useful because the output always follows the current original: a paragraph whose original text has changed no longer matches its stored translation, so it is emitted untranslated. After all, it's better to have a paragraph untranslated than to have it outdated.
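To make the intermediate format concrete, here is roughly what the first script should produce for a hypothetical two-paragraph file named README.txt (the file name and its text are made up for illustration, and the exact whitespace may differ). The translator edits only the content of each trans element and leaves orig untouched, because the second script uses orig as the lookup key.

```xml
<translate filename="README.txt">
<p>
<orig>
Good morning.
</orig>
<trans>
Good morning.
</trans>
</p>
<p>
<orig>
This is the second paragraph.
</orig>
<trans>
This is the second paragraph.
</trans>
</p>
</translate>
```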

The first script

#!/usr/bin/perl
use strict;
use warnings;
use XML::Writer;

my $filename = $ARGV[0];
open my $fh, '<', $filename or die $!;

# DATA_MODE puts each element on its own line; output goes to STDOUT.
my $writer = XML::Writer->new(DATA_MODE => 1, DATA_INDENT => 0);
$writer->startTag("translate", filename => $filename);

# Collect lines into paragraphs; a blank line ends the current paragraph.
my $this_para = undef;
while (my $line = <$fh>) {
    if ($line eq $/) {
        paragraph($this_para);
        $this_para = undef;
        next;
    }
    else {
        $this_para .= $line;
    }
}
if ($this_para) {
    paragraph($this_para);
}
$writer->endTag("translate");
$writer->end();

# Emit one <p> holding the original text and a copy for the translator to edit.
sub paragraph {
    my $p = shift;
    return unless defined $p;    # consecutive blank lines: nothing to translate
    $writer->startTag("p");
    $writer->startTag("orig");
    print $/;
    $writer->characters($p);
    $writer->endTag("orig");
    $writer->startTag("trans");
    print $/;
    $writer->characters($p);
    $writer->endTag("trans");
    $writer->endTag("p");
}

The second script

#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# State shared between the main program and the parser callbacks.
local %Traduzir::ThisP;       # orig/trans text of the <p> being parsed
local %Traduzir::Data;        # original paragraph => translated paragraph
local $Traduzir::Original;    # path of the original file, from the XML
local $Traduzir::State;       # name of the element currently being read

my $parser = XML::Parser->new(Style => 'Traduzir::Parser');
$parser->parsefile($ARGV[0]);

open my $fh, '<', $Traduzir::Original or die $!;

# Re-read the original file paragraph by paragraph, as the first script does.
my $this_para = undef;
while (my $line = <$fh>) {
    if ($line eq $/) {
        paragraph($this_para);
        $this_para = undef;
        next;
    }
    else {
        $this_para .= $line;
    }
}
if ($this_para) {
    paragraph($this_para);
}

# Print the translation if the paragraph is unchanged, else the original text.
sub paragraph {
    my $p = shift;
    if (not defined $p) {
        print $/;
        return;
    }
    if (exists $Traduzir::Data{$p}) {
        print $Traduzir::Data{$p}.$/;
    }
    else {
        print $p.$/;
    }
}

package Traduzir::Parser;

sub Start {
    my ($expat, $elem, %attrs) = @_;
    $Traduzir::State = $elem;
    if ($elem eq 'translate') {
        $Traduzir::Original = $attrs{filename};
        %Traduzir::Data = ();
    }
    elsif ($elem eq 'p') {
        %Traduzir::ThisP = ();
    }
}

sub End {
    my ($expat, $elem) = @_;
    if ($elem eq 'p') {
        $Traduzir::Data{$Traduzir::ThisP{orig}} = $Traduzir::ThisP{trans};
        %Traduzir::ThisP = ();
    }
    elsif ($elem eq 'orig') {
        # Drop the newline the first script prints after the opening tag.
        $Traduzir::ThisP{orig} =~ s/^\n//;
        $Traduzir::State = '';
    }
    elsif ($elem eq 'trans') {
        $Traduzir::ThisP{trans} =~ s/^\n//;
        $Traduzir::State = '';
    }
}

sub Char {
    my ($expat, $str) = @_;
    if ($Traduzir::State eq 'orig') {
        $Traduzir::ThisP{orig} .= $str;
    }
    elsif ($Traduzir::State eq 'trans') {
        $Traduzir::ThisP{trans} .= $str;
    }
}
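The lookup in the second script is what keeps the output current. Below is a minimal, self-contained sketch of just that step, without any XML. The %translated table and the render_paragraph name are hypothetical, invented for this illustration; in the real script the table is filled by the parser callbacks.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical translation table: original paragraph => translated paragraph.
my %translated = (
    "Good morning.\n" => "Bom dia.\n",
);

# Return the stored translation when the paragraph text is unchanged,
# otherwise return the paragraph itself so the new text is not lost.
sub render_paragraph {
    my ($para) = @_;
    return exists $translated{$para} ? $translated{$para} : $para;
}

# Unchanged paragraph: the exact text is a key, so the translation is used.
print render_paragraph("Good morning.\n");

# Edited paragraph: no exact match, so the current original passes through.
print render_paragraph("Good morning, everyone.\n");
```

Since the match is on the exact paragraph text, even a whitespace change in the original sends that paragraph back to the translator, which is the intended behaviour here.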
daniel

In reply to Maintaining text translations by ruoso
