in reply to chunking up texts correctly for online translation

Hello Aldebaran,

haukex provided a solution to your question.

I have a suggestion regarding user interaction: Getopt::Long can parse parameters passed to a script on the command line, i.e. the well-known pattern translate.pl --from-lang XYZ --to-lang ZYX --config ABC --outfile 123, or even translate.pl --help to show the available languages.

This is straightforward to implement and will help you abstract and automate even more, for example by creating higher-level bash scripts, e.g. translate.bash "hello there" or translate.bash < /dev/telephone1 > /dev/telephone2

Here is an example:

use Getopt::Long;

my $outfile    = undef;
my $configfile = undef;
my $infile     = undef;

if( ! Getopt::Long::GetOptions(
    "outfile=s",    \$outfile,
    "infile=s",     \$infile,
    "configfile=s", \$configfile,
    "help", sub {
        print "Usage : $0 --configfile C [--outfile O] [--infile I] [--help]\n";
        exit 0;
    },
) ){ die "error, commandline" }

die "configfile is needed (via --configfile)" unless defined $configfile;

# read input from STDIN by default, unless an infile is provided
my $inFH;
if( defined $infile ){
    open $inFH, '<', $infile or die "opening input file $infile, $!";
} else {
    $inFH = \*STDIN;    # a reference to the STDIN handle, not a line read from it
}

my $instr;
{ local $/ = undef; $instr = <$inFH> }    # slurp the whole input at once
if( defined $infile ){ close $inFH }

# do similar for outfile and STDOUT
...
# and call your module translate, input text is in $instr
...
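The "do similar for outfile and STDOUT" step could look roughly like the sketch below. The helper name open_output is hypothetical (it is not part of the code above), and I am assuming the output should simply be overwritten if the file already exists.

```perl
use strict;
use warnings;

# Hypothetical helper: return a filehandle for writing.
# Falls back to STDOUT when no --outfile was given (i.e. $outfile is undef).
sub open_output {
    my ($outfile) = @_;
    return \*STDOUT unless defined $outfile;
    open my $fh, '>', $outfile
        or die "opening output file $outfile, $!";
    return $fh;
}

# undef stands in for "no --outfile on the command line"
my $outFH = open_output(undef);
print {$outFH} "translated text would go here\n";
```

When an output file was actually opened, close it after writing, just as with the input file; leave STDOUT alone.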

Re^2: chunking up texts correctly for online translation
by Aldebaran (Curate) on Jul 02, 2019 at 21:39 UTC

    Struggling to get basic functionality here. I'm removing the option to use STDIN for input, as I have a translate shell already that covers this functionality for me. Parts are working, for example, I die if no value for config is supplied. I can't seem to get to our favorite example text of late:

    $ ./1.get_opt.pl --configfile C --outfile /home/bob/Documents/meditations/Algorithm-Markov-Multiorder-Learner-master/output/1.txt --infile /home/bob/Documents/meditations/Algorithm-Markov-Multiorder-Learner-master/data/2.short.shelley.txt
    Use of uninitialized value $inFH in <HANDLE> at ./1.get_opt.pl line 34.
    readline() on unopened filehandle at ./1.get_opt.pl line 34.
    $ cat 1.get_opt.pl
    #!/usr/bin/perl -w
    use 5.011;
    use Getopt::Long;

    my $outfile    = undef;
    my $configfile = undef;
    my $infile     = undef;

    if (
        !Getopt::Long::GetOptions(
            "outfile=s",    \$outfile,
            "infile=s",     \$infile,
            "configfile=s", \$configfile,
            "help",
            sub {
                print "Usage : $0 --configfile C [--outfile O] [--infile I] [--help]\n";
                exit 0;
            },
        )
      )
    {
        die "error, commandline";
    }

    die "configfile is needed (via --configfile)" unless defined $configfile;

    my $inFH;
    if ( defined($infile) ) {
        open my $inFH, '<', $infile or die "opening input file $infile, $!";
    }
    my $instr;
    { local $/ = undef; $instr = <$inFH> }
    if ( defined($instr) ) {
        say "input is $instr";
    }
    $
      my $inFH;
      if ( defined($infile) ) {
          open my $inFH, '<', $infile or die "opening input file $infile, $!";
      }
      my $instr;
      { local $/ = undef; $instr = <$inFH> }
      if ( defined($instr) ) {
          say "input is $instr";
      }

      $inFH is being abused! You declare it, then you declare it again in an inner scope (open my $inFH ...). Whatever value it gets in that inner scope is forgotten as soon as that scope ends, so the outer $inFH is never opened, which is exactly what the two warnings are telling you. Then you slurp from it ... but the file is already closed anyway, because a lexical filehandle is closed automatically when it goes out of scope, see When do filehandles close?.

      So perhaps something like this?:

      my $instr;
      if ( defined($infile) ) {
          my $inFH;
          open $inFH, '<', $infile or die "opening input file $infile, $!";
          { local $/ = undef; $instr = <$inFH> }
          close $inFH;    # just polite to close it
      }
      if ( defined($instr) ) {
          say "input is $instr";
      }
      else {
          die "sorry, you did not specify an input either via a file or via [other ways of specifying an instr] ";
      }
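The scoping problem described above can be reproduced without any files at all. This is only a minimal sketch: the sub names shadowed and not_shadowed are made up for illustration, and a plain string stands in for the filehandle.

```perl
use strict;
use warnings;

# Mimics the buggy pattern: a second "my" inside the block creates a
# brand-new variable that hides the outer one.
sub shadowed {
    my $fh;                        # outer variable, never assigned
    if (1) {
        my $fh = 'inner value';    # new variable, gone when the block ends
    }
    return $fh;                    # still undef
}

# Mimics the fix: no "my" inside the block, so the outer variable is set.
sub not_shadowed {
    my $fh;
    if (1) {
        $fh = 'inner value';       # assigns to the outer variable
    }
    return $fh;
}

print "shadowed:     ", ( defined( shadowed() )     ? "defined" : "undef" ), "\n";
print "not_shadowed: ", ( defined( not_shadowed() ) ? "defined" : "undef" ), "\n";
```

The first line reports undef and the second reports defined. Note that Perl does not warn about the inner declaration: the '"my" variable masks earlier declaration' warning only fires when both declarations sit in the same scope.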