http://qs1969.pair.com?node_id=1073009

reqnode has asked for the wisdom of the Perl Monks concerning the following question:

Hello, my question is how to avoid writing 'use utf8' in every child script that the main script includes with do(). Imagine that I have Japanese function names and all the scripts use these functions. From an aesthetic point of view, putting 'use utf8' in every script seems useless when I could tell perl once that all my script sources use UTF-8. Any suggestions on how to do it?

Replies are listed 'Best First'.
Re: avoid writing 'use utf8' in every script
by Jim (Curate) on Feb 02, 2014 at 02:13 UTC

    Use a Unicode byte order mark in your Perl script files instead.

    As perldoc utf8 explains…

    Because it is not possible to reliably tell UTF-8 from native 8 bit encodings, you need either a Byte Order Mark at the beginning of your source code, or "use utf8;", to instruct perl.

    You can't bequeath or inherit Perl's comprehension of a source code file's character encoding to or from another file because the two files might be in two different encodings.

    Jim

Re: avoid writing 'use utf8' in every script
by kcott (Archbishop) on Feb 02, 2014 at 01:52 UTC

    G'day reqnode,

    What you really want may be possible; however, what you describe is illegal syntax.

    Consider this example script (see Note below):

    #!/usr/bin/env perl
    
    use utf8;
    
    sub smiley {
        print "Smiley\n";
    }
    
    sub ☺ {
        print "Smiley\n";
    }
    

    Here's what happens when you run this:

    $ pm_script_with_utf8.pl
    Illegal declaration of anonymous subroutine at ./pm_script_with_utf8.pl line 9.

    The documentation for the utf8 pragma even tells you this:

    "One can have Unicode in identifier names, but not in package/class or subroutine names."

    If you provide a short code example, that might serve to illustrate exactly what you're trying to achieve.

    [Note: I've used <pre>, instead of <code>, tags for the code containing Unicode characters. This is to allow the actual characters, instead of character entity references, to be displayed. Please do the same with any such code you post.]

    -- Ken

      I took Japanese lessons in primary school. That was a long time ago, and I've forgotten much of it. However, I'm pretty sure that "☺" is not a Japanese word.

      use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name

      Q. In your UTF-8 Perl script, which characters in the subroutine name "smiley" aren't Unicode characters?

      A. None of them. In fact, every character in your Perl script is a Unicode character.

      (The Perl documentation is inaccurate. It uses the word "Unicode" incorrectly in the sentence you quoted.)

      Jim

        Update: I replied to the wrong post. It should've been to Re^2: avoid writing 'use utf8' in every script, making this lame joke even lamer. If anyone with "movenodeability" wants to transplant this reply, it would be appreciated!

        tobyink:

        Odd, I always thought ☺ resembled the Kanji for Simiru, as in Simiru, Mirikuru gaaru!

        </end_of_lame_joke>

        ...roboticus

        Unicode: Making a job hard to do correctly (like i18n) nearly freakin' impossible.

      Let's say I have 2 scripts:
      main.pl code:
      #!perl
      use utf8;
      do 'main_include.pl';
      sub über
      {
          return shift;
      }
      main_include.pl code:
      use utf8;
      über('hi');
      

      So let's say I have a lot of files like main_include.pl and every one of them uses the über() function.
      I am trying to find a way to tell perl that all included scripts will use utf8-named functions, so there is no need to indicate it every time I call them.
Re: avoid writing 'use utf8' in every script (eval)
by Anonymous Monk on Feb 02, 2014 at 01:28 UTC

    doed( $foo );

    use Path::Tiny qw/ path /;

    sub doed {
        my( $filetoeval ) = @_;
        my $toeval = "use utf8; " . path( $filetoeval )->slurp_utf8;
        eval $toeval; ## do-ed
    }
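A minimal sketch of the same idea using only core modules, in case Path::Tiny is not installed. The name do_utf8 and the temporary demo file are hypothetical, made up for illustration. It assumes the included files really are UTF-8 and that the calling script does not enable the unicode_eval feature. Unlike a real do FILE, the evaluated code can see the helper's lexical variables.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: behaves roughly like "do FILE", but decodes the
# file as UTF-8 and compiles it with "use utf8;" prepended.
sub do_utf8 {
    my ($file) = @_;
    open my $fh, '<:encoding(UTF-8)', $file
        or die "Cannot open $file: $!";
    my $source = do { local $/; <$fh> };   # slurp the whole file
    close $fh;

    my $result = eval( "use utf8;\n" . $source );
    die $@ if $@;                          # re-throw compile/runtime errors
    return $result;                        # value of the last statement
}

# Demo: write a small included script as raw UTF-8 bytes.
# "\xC3\xBC" is the UTF-8 encoding of "ü", so the file defines "über".
my ($out, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
binmode $out;
print {$out} "sub \xC3\xBCber { return uc shift }\n\xC3\xBCber('hi');\n";
close $out or die "Cannot close $path: $!";

my $value = do_utf8($path);
print "$value\n";
```

The included file carries no 'use utf8' line of its own; the wrapper supplies it once, at the price of every include going through a string eval.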
Re: avoid writing 'use utf8' in every script
by hdb (Monsignor) on Feb 03, 2014 at 08:05 UTC

    Teach your editor to add "use utf8;" whenever you create a ".pl" file.

      I am an old jerk and love creating programs with the least code possible, to run on 2 MB of RAM.

        That is a sentiment I have a lot of sympathy for, but then, would it not be better to stick to ASCII (7 bits)?

        use utf8; does not result in any code.

        >perl -MO=Concise,-exec -e"use utf8; $x='abc';"
        1  <0> enter
        2  <;> nextstate(main 7 -e:1) v:U,{
        3  <$> const[PV "abc"] s
        4  <#> gvsv[*x] s
        5  <2> sassign vKS/2
        6  <@> leave[1 ref] vKP/REFC
        -e syntax OK

        It has a compile-time effect.
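A small sketch of what that compile-time effect looks like. The same source bytes are compiled twice through string eval, once without and once with the pragma. The variable names are made up for illustration, and the sketch assumes the unicode_eval feature is not enabled, since that feature makes string eval ignore use utf8.

```perl
use strict;
use warnings;

# Source text containing a string literal whose content is the two
# UTF-8 bytes C3 BC (the encoding of "ü").
my $src = "length '\xC3\xBC'";

# Compiled without the pragma: the literal is two separate bytes.
my $without = eval $src;
die $@ if $@;

# Compiled with the pragma: the same bytes are read as one character.
my $with = eval "use utf8; $src";
die $@ if $@;

print "without use utf8: $without\n";   # length in bytes
print "with use utf8:    $with\n";      # length in characters
```

The pragma adds no ops to the program; it only changes how the parser reads the source text.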