If it's possible and someone knows a way to actually implement this, please let me know.

Here's the sort of thing I had in mind -- it's limited but simple, and will trap the most likely problems (but you'll need to figure out what to do in your cgi application when those problems come up). I haven't tested it, except to confirm that it compiles, and to make sure that this sort of operation works as hoped for (at least, it did on macosx):

my_open( FH, ">", "foo.bar" ) or die "foo.bar: $!";
#...
sub my_open {
    my ( $fh, $mode, $name ) = @_;
    open( $fh, $mode, $name );
}
Unfortunately, if the caller tries to pass a lexically scoped scalar as the filehandle arg, that doesn't work. There's a way around that, but I haven't tried to look it up. (Maybe other monks know how off the top of their heads.) Since the OP code appears to be using the old UPPERCASE style file handles, the module as provided should do okay.
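For what it's worth, one way around the lexical-handle limitation seems to be opening through $_[0] directly: the elements of @_ are aliases to the caller's own variables, so open can autovivify the handle in the caller's scalar instead of in a local copy. A minimal sketch, not tested on Windows; the name my_open_lex is made up for illustration, and the in-memory "file" is only there so the example needs no disk access:

```perl
use strict;
use warnings;

# Hypothetical variant of my_open: it opens through $_[0], which is an
# alias to the caller's variable, rather than through a copy of it.
sub my_open_lex {
    my ( undef, $mode, $name ) = @_;
    return open( $_[0], $mode, $name );
}

# Write to an in-memory file (a reference to a scalar) instead of disk.
my $buffer = '';
my_open_lex( my $fh, '>', \$buffer ) or die "open failed: $!";
print {$fh} "hello\n";
close($fh) or die "close failed: $!";

print $buffer;
```

The same trick should carry over to opendir, which also autovivifies an undefined scalar into a directory handle.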

To work this into your cgi apps, store the code as "GreekFile.pm" in one of the @INC paths, and edit your cgi scripts that do file i/o so they include:

use GreekFile qw/gr_open gr_opendir gr_readdir gr_glob/; # or just the relevant subset of these functions
Then, wherever you have open( FH, "<$filename" ), simply change that to gr_open( FH, "<", $filename ), assuming that $filename holds utf8-encoded bytes (Encode's from_to works on byte strings, not on decoded character strings). Do the same for opendir, readdir and glob calls. Just use utf8 strings in your app -- all the conversion to and from CP1253 for file names is handled inside this module.

package GreekFile;

=head1 NAME

GreekFile -- for converting Greek file names between utf8 and CP1253 in MS-Windows

=head1 SYNOPSIS

  gr_open( FILEHANDLE, $mode, $utf8name );
  gr_opendir( DIRHANDLE, $utf8name );
  $utf8_name  = gr_readdir( DIRHANDLE );
  @utf8_names = gr_readdir( DIRHANDLE );
  @utf8_names = gr_glob( $utf8glob );

=head1 DESCRIPTION

On a Windows system that uses single-byte CP1253 Greek characters
(similar to ISO-8859-7) for naming files and directories, the functions
provided by this module will allow a utf8-based application to work
smoothly, by automatically converting file name strings between these
two encodings as needed.

This is presented as a "trial" or "proof-of-concept" version; it is
limited in many ways, and does not support a lot of the flexibility of
Perl's "open" and "opendir" functions. For example, it does not support
the use of lexically-scoped scalar variables as file handles. The
limitations could be fixed with some looking up in manuals...

Handles may be given as bareword names (in callers that do not
"use strict") or as glob references such as \*FH. A bareword name is
resolved in the caller's package.

gr_open and gr_opendir return the same success or failure values that
the normal "open" and "opendir" functions would return. Likewise,
gr_readdir behaves like normal readdir: it will return either a single
file name or a list of file names, depending on whether it is called in
a scalar or list context.

The functions that take utf8 strings as input parameters (gr_open,
gr_opendir and gr_glob) will do the conversion to CP1253 inside an eval
block. If the conversion fails (either because the input string was not
valid utf8, or because it contained valid characters that fall outside
the CP1253 character set), they will return undef (an empty list in list
context), $! will be set to EINVAL, and $@ will hold the error message
from the failed eval. ($! cannot hold the message itself: assigning a
non-numeric string to $! only sets errno to 0.)

Error checking is not done on the file names that are read via readdir
and glob. At worst, if a file name on disk contains single-byte
characters that are not defined in the CP1253 character map, the
conversion to utf8 will include "\x{FFFD}" for each such character.

=cut

use strict;
use warnings;

use Exporter;
use Encode qw(from_to);
use Errno  qw(EINVAL);
use Symbol qw(qualify_to_ref);

our @ISA       = qw(Exporter);
our @EXPORT_OK = qw(gr_open gr_opendir gr_readdir gr_glob);

# Convert a utf8 byte string (passed by alias) to CP1253 in place.
# Returns true on success; on failure sets $! and leaves the reason in $@.
sub _to_cp1253 {
    return 1 if eval { from_to( $_[0], "utf8", "cp1253", Encode::FB_CROAK ); 1 };
    $! = EINVAL;
    return;
}

# Turn a bareword handle name such as "FH" into a glob reference in the
# given package; references are passed through unchanged.
sub _handle {
    my ( $h, $pkg ) = @_;
    return ref $h ? $h : qualify_to_ref( $h, $pkg );
}

sub gr_open {
    my ( $fh, $mode, $name ) = @_;
    _to_cp1253( $name ) or return;
    my $glob = _handle( $fh, scalar caller );
    return open( $glob, $mode, $name );
}

sub gr_opendir {
    my ( $dh, $name ) = @_;
    _to_cp1253( $name ) or return;
    my $glob = _handle( $dh, scalar caller );
    return opendir( $glob, $name );
}

sub gr_readdir {
    my ( $dh ) = @_;
    my $glob = _handle( $dh, scalar caller );
    if ( wantarray ) {
        my @names = readdir( $glob );
        from_to( $_, "cp1253", "utf8" ) for @names;
        return @names;
    }
    my $name = readdir( $glob );    # undef at end of directory
    from_to( $name, "cp1253", "utf8" ) if defined $name;
    return $name;
}

sub gr_glob {
    my ( $glb ) = @_;
    _to_cp1253( $glb ) or return;
    my @names = glob( $glb );
    from_to( $_, "cp1253", "utf8" ) for @names;
    return @names;
}

1;
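To see what the conversions inside the module amount to, here is a small stand-alone sketch using only Encode's from_to. The sample file names are made up for illustration; it shows a Greek name going from utf8 to CP1253 and back, and a name with a character outside CP1253 being refused under FB_CROAK:

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Greek alpha, beta, gamma plus ".txt", written as utf8-encoded bytes.
my $name = "\xCE\xB1\xCE\xB2\xCE\xB3.txt";

# utf8 -> CP1253: each Greek letter becomes a single byte.
my $cp = $name;
from_to( $cp, "utf8", "cp1253", Encode::FB_CROAK );
printf "cp1253 bytes: %s\n",
    join ' ', map { sprintf '%02X', ord } split //, $cp;

# CP1253 -> utf8 restores the original byte string.
my $back = $cp;
from_to( $back, "cp1253", "utf8" );
print "round trip ok\n" if $back eq $name;

# U+4E2D (a CJK ideograph) has no CP1253 equivalent, so FB_CROAK dies
# and the eval returns false, leaving the reason in $@.
my $bad = "\xE4\xB8\xAD.txt";
my $ok  = eval { from_to( $bad, "utf8", "cp1253", Encode::FB_CROAK ); 1 };
print "conversion refused: ", ( $ok ? "no" : "yes" ), "\n";
```

Note that from_to changes its first argument in place, which is why each step works on a copy.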

In reply to Re^3: Encodings problem by graff
in thread Encodings problem by Nik
