A while back I asked about using encodings with filtering from the command line. Now I am wondering how to handle modules that have encodings.

It looks like when a use (or require/import) is processed, the encoding/filtering is reset to the default while that module is compiled. That's fine unless the module itself is written in some other encoding and possibly needs a filter.

Any ideas? Thanks!

Note: If there's no way to do it "aboveboard" I am willing to hack Perl itself. I've been poking around in the source looking for a place to have it insert a "use encoding qw(whatever),Filter=>1" in the token stream going to the compiler.


UPDATE 30 August 2005

I didn't think there was going to be an easy way so I went the hack Perl route. Here's what I did.

I examined the code for handling "preambles" (i.e., the stuff that works with PL_preambleav) to see how it works. From tracing how tokens are handled in toke.c, I saw that I couldn't just hook into this as-is because the preamble code doesn't run at the right time: normally it is handled just before the main script file is processed. But the code to insert arbitrary strings into the token stream is all there, so I copied it to where it would work and tweaked it slightly.

I looked at where the file name is handled for requires in PP(pp_require) in pp_ctl.c. I added some code there to query the file and see if it needed a non-ASCII encoding. If so, it sticks the right "use encoding..." into PL_preambleav.
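For anyone curious what the "query the file" step could look like, here is a rough standalone C sketch. It is not the code from my patch and it doesn't touch any Perl internals. It assumes "needs a non-ASCII encoding" can be approximated by "contains a byte above 0x7F", and that the caller already knows which encoding name to use. The names file_has_non_ascii and build_encoding_preamble are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the file contains any byte outside
 * 7-bit ASCII, 0 if it is pure ASCII, -1 if it cannot be opened.
 * Assumption: a high byte is a good enough hint that an encoding is needed. */
static int file_has_non_ascii(const char *path)
{
    FILE *fp;
    int c;
    int found = 0;

    assert(path != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* getc() yields each byte as an unsigned char value, so > 0x7F is safe. */
    while ((c = getc(fp)) != EOF) {
        if (c > 0x7F) {
            found = 1;
            break;
        }
    }
    fclose(fp);
    return found;
}

/* Hypothetical helper: writes the "use encoding ..." line into buf when the
 * file needs one. Returns the string length, 0 if nothing is needed (buf is
 * then empty), or -1 if the file is unreadable or buf is too small.
 * encoding_name is supplied by the caller; choosing it is out of scope here. */
static int build_encoding_preamble(const char *path, const char *encoding_name,
                                   char *buf, size_t buflen)
{
    int status;
    int n;

    assert(encoding_name != NULL && buf != NULL && buflen > 0);
    buf[0] = '\0';

    status = file_has_non_ascii(path);
    if (status <= 0)
        return status;

    n = snprintf(buf, buflen, "use encoding qw(%s), Filter=>1;", encoding_name);
    if (n < 0 || (size_t)n >= buflen) {
        buf[0] = '\0';
        return -1;
    }
    return n;
}
```

In the actual patch, the resulting string is what would get pushed onto PL_preambleav from pp_require before the module is compiled.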

And that's all it took; it works just great (with minimal testing done). This probably isn't the best (or even a good) way of handling it but it works for my purposes.


In reply to encoding with filtering for modules by BillSeurer
