I have a set of HTML files that contain placeholders such as {{menu}}. Think of them as template files. No, don't tell me I should use a real template engine; this is just a temporary workaround.

The first line of each HTML file is #!perl parse.pl, so the webserver knows to hand the file to perl, and it does. perl, and with it parse.pl, receives the file.

Two things then have to be done: the shebang line has to be removed, and {{menu}} has to be replaced with the menu. Here is the code.
#!perl

# Obligate
use strict;
use warnings;

binmode STDOUT;
binmode STDOUT, ":utf8";

print "Content-Type: text/html\n\n";

my $menu = do { open MENU, "<menu.htm"; local $/ = 0; <MENU> };

while (my $input = <>) {
    $input =~ s/\{\{menu\}\}/$menu/ieg;
    $input =~ s/^#!(.+?)$//g;
    print $input;
}

When the script processes a plain text file, everything works: the #! line disappears and {{menu}} gets replaced by the menu.
But when the file is encoded as UTF-8, the script still prints the page, yet neither regex has any effect.
As I read the documentation, regexes should work on both plain bytes and characters. Apparently, they don't.
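
A minimal sketch for checking that reading of the documentation. It assumes two things the post does not confirm: that the file might start with a UTF-8 byte order mark, or that the editor might have saved it as UTF-16 rather than UTF-8. The sample strings are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical template line, encoded two different ways.
my $line    = "{{menu}}\n";
my $utf8    = encode('UTF-8',    $line);   # same bytes as ASCII for this text
my $utf16le = encode('UTF-16LE', $line);   # a NUL byte after every character

# The placeholder pattern applied to the raw bytes of each encoding.
my $utf8_hit  = ($utf8    =~ /\{\{menu\}\}/i) ? 1 : 0;
my $utf16_hit = ($utf16le =~ /\{\{menu\}\}/i) ? 1 : 0;

# A shebang line preceded by a UTF-8 byte order mark (bytes EF BB BF).
my $bom_line = "\xEF\xBB\xBF#!perl parse.pl\n";
my $bom_hit  = ($bom_line =~ /^#!/) ? 1 : 0;

print "UTF-8 bytes match placeholder:    $utf8_hit\n";   # prints 1
print "UTF-16LE bytes match placeholder: $utf16_hit\n";  # prints 0
print "BOM-prefixed line matches ^#!:    $bom_hit\n";    # prints 0
```

If the real file behaves like the first case, the regexes themselves are not the problem. A hex dump of the first few bytes of the file would show which case applies.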

What am I doing wrong?
Or, what should I do differently?

Oh yeah: This is perl, v5.8.3 built for MSWin32-x86-multi-thread
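
One possible direction, as a sketch only: decode the input explicitly, drop a byte order mark if there is one, and then run the two substitutions on characters. The helper name render_page and the sample data are hypothetical, and the sketch assumes the file really is UTF-8. It uses only the Encode module, which ships with perl 5.8.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes the raw page bytes and the menu text,
# returns the finished page as UTF-8 bytes.
sub render_page {
    my ($raw, $menu) = @_;
    my $page = decode('UTF-8', $raw);    # bytes -> characters
    $page =~ s/^\x{FEFF}//;              # drop a byte order mark, if present
    $page =~ s/^#![^\n]*\n?//;           # drop the shebang (first line only)
    $page =~ s/\{\{menu\}\}/$menu/ig;    # fill in the placeholder
    return encode('UTF-8', $page);       # characters -> bytes
}

# Made-up input: BOM, shebang, one line with a non-ASCII character, placeholder.
my $raw  = "\xEF\xBB\xBF#!perl parse.pl\n<p>caf\xC3\xA9</p>\n{{menu}}\n";
my $menu = "<ul><li>Home</li></ul>";

my $out = render_page($raw, $menu);
print $out;
```

Separately from the encoding question, local $/ = 0 in the code above sets the record separator to the string "0" instead of enabling slurp mode; a plain local $/; is the usual idiom for reading a whole file at once.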





"2b"||!"2b";$$_="the question"
Besides that, my code is untested unless stated otherwise.
magnum unum bovem audivisti

In reply to Regexp not performed when presented utf-8 data by muba
