do to read from UTF-8 files/array.

nikolay has asked for the wisdom of the Perl Monks concerning the following question:

Good day.

Am I correct in supposing that there is no way to read an array (in UTF-8 encoding) with «do» when the array contains non-English characters?

I tried this way:

File «q» contains:

q~(?^u:йцу(\W))~, qq~фыв$1~

Script:

use utf8::all;
use Encode;
# This returns unreadable (garbled) characters.
@a=do 'q';
# So I decode them (even though utf8::all is in use):
$a[0]=decode( 'UTF-8', $a[0] );
$a[1]=decode( 'UTF-8', $a[1] );
# I save the decoded data (for later use):
open SVITOK, '>q';
$sod='q~'.$a[0].'~, qq~'.$a[1].'~'."\n";
print SVITOK $sod;
close SVITOK;
# Now I read it again, and it returns unreadable characters again.
@a=do 'q';

Replies are listed 'Best First'.
Re: do to read from UTF-8 files/array. by choroba (Cardinal) on Sep 16, 2015 at 13:57 UTC
do evaluates the file as code. If the file contains UTF-8 characters, it should start with `use utf8;`. Then no decoding is needed, but you have to prepend the pragma to the output file, too: `print SVITOK "use utf8;$sod";` لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
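The round trip described in this reply can be sketched roughly as follows. This is only an illustration, not the poster's script: the temporary file stands in for the thread's file «q», and the names `$fh`, `$path` and the simplified `q~...~` contents are assumptions made for the example.

```perl
use strict;
use warnings;
use utf8;                        # this script's own literals are UTF-8
use File::Temp qw(tempfile);

# Hypothetical data file: a temp file stands in for the thread's «q».
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh, ':encoding(UTF-8)'; # encode characters to UTF-8 bytes on output

# Prepend "use utf8;" so the literals in the file are decoded
# when do() compiles it as Perl code.
print {$fh} "use utf8; q~йцу~, q~фыв~\n";
close $fh or die "close failed: $!";

# do() evaluates the file and returns the value of its last expression.
my @a = do $path;
die "do failed: ", ($@ || $!) unless @a;

binmode STDOUT, ':encoding(UTF-8)';
printf "%s has %d characters\n", $_, length($_) for @a;
```

Each string should report 3 characters. If the `use utf8;` prefix is left out of the written file, the same strings come back as 6 undecoded bytes each, which is the garbling described in the question.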
Re^2: do to read from UTF-8 files/array. by nikolay (Beadle) on Sep 18, 2015 at 09:54 UTC
Thank you very much, Choroba! I didn't think that «use utf8;» could be placed right in the array body!
Re^3: do to read from UTF-8 files/array. by choroba (Cardinal) on Sep 18, 2015 at 10:19 UTC
It's not the array body, it's the code that produces the array. لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: do to read from UTF-8 files/array. by Anonymous Monk on Sep 16, 2015 at 13:34 UTC
The outermost character-encoding that you show here is HTML Entities ... are you first decoding that? I don't see it.
Re^2: do to read from UTF-8 files/array. by choroba (Cardinal) on Sep 16, 2015 at 13:46 UTC
PerlMonks can't display some characters in the `<code>` tags. You should use `<pre>` instead, or write the code in a way it doesn't contain the characters themselves. لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
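One way to write code that does not contain the characters themselves is to use `\x{...}` escapes. The sketch below spells the thread's two sample strings («йцу» and «фыв») by code point; the variable name `@a` is borrowed from the question, and the rest is an assumption made for illustration.

```perl
use strict;
use warnings;

# The two sample strings written with code-point escapes only,
# so the source stays plain ASCII and needs no "use utf8;".
my @a = ("\x{439}\x{446}\x{443}", "\x{444}\x{44B}\x{432}");

binmode STDOUT, ':encoding(UTF-8)';   # encode characters on output
print join(', ', @a), "\n";
print length($a[0]), "\n";            # counts characters, not bytes
```

Because the escapes already produce character strings, no `decode` call is needed before using them.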