Weird encoding after grabbing filenames

by Nik (Initiate)
on Jun 16, 2009 at 19:42 UTC

Nik has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have the following code:
#!/usr/bin/perl -w
use strict;
use CGI::Carp qw/fatalsToBrowser/;
use CGI qw/:standard/;
use DBI;
use Encode;
print "Content-Type: text/html\n\n";
my @files = glob "$ENV{'DOCUMENT_ROOT'}/data/text/*.txt";
my @menu_files = map {/([^\/]+)\.txt$/} @files;
Encode::from_to(@menu_files, 'ISO-8859-7', 'utf8');
print "@files";
print "@menu_files";
I don't know why, but when I get all the files from the '/data/text' folder and print them, the path that precedes each file appears fine, while the filename itself appears as squares (weird encoding).
Here is the output of this code: http://tech-nikos.gr/cgi-bin/test.pl
I tried converting the encoding from Greek to UTF-8, but the output is the same whether I do the conversion or not. Any ideas why?

Replies are listed 'Best First'.
Re: Weird encoding after grabbing filenames
by moritz (Cardinal) on Jun 16, 2009 at 19:48 UTC
    print "Content-Type: text/html\n\n";

    Always include the encoding, i.e. Content-Type: text/html; charset=utf-8\n\n (or let CGI generate that for you, since you use it anyway).

    Also, you have to pass a scalar as the first argument to from_to, not an array. Read the docs for usage information.
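
To illustrate the second point, here is a minimal sketch of what from_to does with a single scalar: it rewrites the bytes in place and returns the new length in octets (undef on failure). The variable $name and its sample bytes are made up for illustration; they stand in for one entry of @menu_files.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Hypothetical sample: the Greek letters alpha, beta, gamma as ISO-8859-7 bytes.
my $name = "\xE1\xE2\xE3";

# from_to() modifies its first argument in place and returns the
# length of the converted data in octets, or undef on failure.
my $len = from_to($name, 'ISO-8859-7', 'UTF-8');

# Each of these letters takes two bytes in UTF-8, so three bytes become six.
my $hex = join ' ', map { sprintf '%02X', ord($_) } split //, $name;
printf "converted to %d bytes: %s\n", $len, $hex;
```

Because the conversion happens in place on one scalar, an array has to be handled one element at a time.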

      Yes, I do use the CGI equivalent in longer scripts, namely 'print header( -charset=>'utf-8' );'.

      But I need to re-encode all the filenames (@menu_files) to UTF-8 in one step. Can't it be done without a repetition loop?
      Why is the path of the files shown correctly while the filename appears like this?

        can't it be done without a repetition loop?

        You want to repeat an action without a loop?

        Well, I suppose you could do

        from_to($menu_files[0], 'ISO-8859-7', 'UTF-8') if @menu_files >= 1;
        from_to($menu_files[1], 'ISO-8859-7', 'UTF-8') if @menu_files >= 2;
        from_to($menu_files[2], 'ISO-8859-7', 'UTF-8') if @menu_files >= 3;
        from_to($menu_files[3], 'ISO-8859-7', 'UTF-8') if @menu_files >= 4;
        die("Need more!") if @menu_files >= 5;

        Does it count as a loop if the repeating is done by the person rather than the computer?

        Or if all you want to do is hide the loop

        sub from_to_multi {
            my $fr = shift;
            my $to = shift;
            from_to($_, $fr, $to) for @_;
        }
        from_to_multi('ISO-8859-7', 'UTF-8', @menu_files);

        But then you end up with two loops. One to place the elements on the stack, and one to process the elements on the stack.
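
If the goal is simply a single statement that yields all the converted names, one option not mentioned above is map combined with Encode's decode and encode, which builds a new list instead of changing the array in place. This is only a sketch: the sample bytes in @menu_files and the name @utf8_names are invented for the example, and map is of course still a loop under the hood.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample data: alpha, beta, gamma as ISO-8859-7 bytes,
# plus a plain ASCII name.
my @menu_files = ("\xE1\xE2\xE3", "readme");

# Decode each name to Perl characters, then encode it as UTF-8 bytes.
# The original array is left untouched.
my @utf8_names = map { encode('UTF-8', decode('ISO-8859-7', $_)) } @menu_files;

print scalar(@utf8_names), " names converted\n";
```

Keeping the original bytes around can be handy if the raw names are still needed later, for example to open the files.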

        When I visited your web page (http://tech-nikos.gr/cgi-bin/test.pl), I was able to get a sensible display by telling my browser to treat the page as iso-8859-7 (greek). But I gather you want the text to be in utf8, which I think would be a good idea.

        As you follow moritz's good advice, you have to respect the docs regarding Encode::from_to(). Here's an easy way to do the required loop in a single line of code:

        Encode::from_to($_, 'ISO-8859-7', 'utf8') for (@menu_files);
        The path strings show up fine because they are just plain ASCII characters; only the file names are non-ASCII, and if the web server and browser don't agree on the encoding of those non-ASCII characters, the result is just noise.

        (updated to fix grammar in first sentence)
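
The point about ASCII can be checked directly. The sketch below uses a made-up path (the directory from the question followed by three Greek letters as ISO-8859-7 bytes) and converts it to UTF-8. The ASCII bytes come through unchanged, and only the Greek letters get a different byte sequence. The names $greek_path, $utf8_path and hex_bytes are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical path: ASCII directory, then alpha beta gamma in ISO-8859-7.
my $greek_path = "/data/text/\xE1\xE2\xE3.txt";

# Re-encode the same text as UTF-8.
my $utf8_path = encode('UTF-8', decode('ISO-8859-7', $greek_path));

# Render a byte string as space-separated hex for comparison.
sub hex_bytes {
    return join ' ', map { sprintf '%02X', ord($_) } split //, $_[0];
}

print "ISO-8859-7: ", hex_bytes($greek_path), "\n";
print "UTF-8:      ", hex_bytes($utf8_path),  "\n";

# The 11-byte ASCII prefix "/data/text/" is identical in both encodings.
my $same_prefix = substr($greek_path, 0, 11) eq substr($utf8_path, 0, 11);
print "ASCII prefix identical: ", ($same_prefix ? "yes" : "no"), "\n";
```

A browser that guesses the wrong charset will therefore still render the directory part correctly and garble only the file name, which matches what the question describes.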
