in reply to Special & Accented chars in nodes titles ==> [à la française]

Titles are Latin-1 text (and not HTML). Do you need any non-Latin-1 characters for French? Link descriptions [that come after the pipe (|) in links] are HTML.

So don't use HTML entities in titles. If you have a hard time typing accented characters on your keyboard, then you can type HTML entities someplace other than the title and then cut'n'paste the rendered characters into the title.
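As a rough illustration of that workaround (a Python sketch, not anything PerlMonks itself runs; the helper name `title_from_entities` is made up for this example), one could decode the entities first and confirm the result still fits in Latin-1:

```python
import html


def title_from_entities(text: str) -> str:
    """Decode HTML entities to literal characters, insisting on Latin-1."""
    decoded = html.unescape(text)
    # Raises UnicodeEncodeError if any character falls outside Latin-1.
    decoded.encode("latin-1")
    return decoded


# Hypothetical title typed with entities because the keyboard lacks accents.
plain_title = title_from_entities("&agrave; la fran&ccedil;aise")
```

The returned string holds the literal accented characters, which is what would then be pasted into the title field.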

A worse problem would be if your browser tried to send UTF-8 to PerlMonks instead of Latin-1, but that doesn't appear to be the case.
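To see why UTF-8 input would be worse, here is a small Python sketch (the title string is only an example) comparing the bytes each encoding produces and what a Latin-1 reader would make of the UTF-8 ones:

```python
title = "\u00e0 la fran\u00e7aise"  # "à la française"

latin1_bytes = title.encode("latin-1")  # one byte per character
utf8_bytes = title.encode("utf-8")      # two bytes for each accented character

# What a site that assumes Latin-1 would store if it received the UTF-8 bytes.
misread = utf8_bytes.decode("latin-1")

print(len(latin1_bytes), len(utf8_bytes))
print(ascii(misread))
```

Each accented character turns into two unrelated Latin-1 characters, which is the familiar garbling seen when the two encodings are confused.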

Note how the title for my reply displays correctly but the title of your node displays as HTML entities.

Update: Another problem could be if you have your browser set to override the content encoding that PerlMonks sends out with every page we serve. Our pages are served in Latin-1, but if you tell your browser to use Latin-2 or something despite what the web site tells it to do, then some accented characters will be displayed as the wrong character.
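The override problem can be mimicked in a few lines of Python (a sketch only; the sample title is an assumption): encode as Latin-1, the way the page is served, then decode as Latin-2, the way an overriding browser would read it.

```python
title = "\u00e0 la fran\u00e7aise"  # "à la française"

served = title.encode("latin-1")           # bytes as the site sends them
displayed = served.decode("iso-8859-2")    # browser forced to Latin-2

# Collect the positions where the displayed character differs.
wrong = [(a, b) for a, b in zip(title, displayed) if a != b]
print(ascii(wrong))
```

Only some characters change, because Latin-1 and Latin-2 agree on many byte values and differ on others.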

- tye        

Re^2: Special & Accented chars in nodes titles ==> [à la française] (!ents)
by dfaure (Chaplain) on Jun 28, 2004 at 17:04 UTC
    Do you need any non-Latin-1 characters for French?

    Hopefully, non-Latin-1 chars are not required for French. Anyway, it is nice to have PerlMonks/Everything encoding behaviour spelled out somewhere (I may add a note to the SDC Master Plan Wiki about it)...

    Thanks

    ____
    HTH, Dominique
    My two favorites:
    If the only tool you have is a hammer, you will see every problem as a nail. --Abraham Maslow
    Bien faire, et le faire savoir...

      Do you need any non-Latin-1 characters for French?
      Hopefully, non-Latin-1 chars are not required for French.

      Well, there is Œ / œ ... but there does not seem to be a consensus on whether they are truly required for French. ;-)

      print "Just another Perl ${\(trickster and hacker)},"
      The Sidhekin proves Sidhe did it!

        there is Œ / œ ... but there does not seem to be a consensus on whether they are truly required for French

        That depends on what your requirement is. If typography is a concern, then they are mandatory: œuf, cœur and œuvre spring to mind. This is actually a very good litmus test for seeing how your server and browser speak to each other. Sometimes you see little diamonds, sometimes nothing, sometimes an OE ligature. You can also use either ISO Latin-9 or an HTML entity such as &oelig;, if those alternatives are available.
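A quick way to check the claim about œ, sketched in Python (the helper `fits` is a made-up name for this example): try to encode a word in Latin-1 and in Latin-9 (ISO-8859-15), and fall back to a numeric character reference where the encoding has no slot for it.

```python
word = "c\u0153ur"  # "cœur"


def fits(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


in_latin1 = fits(word, "latin-1")      # œ (U+0153) has no Latin-1 byte
in_latin9 = fits(word, "iso-8859-15")  # Latin-9 includes Œ and œ

# Entity fallback: unencodable characters become numeric references.
as_entity = word.encode("latin-1", errors="xmlcharrefreplace")
print(in_latin1, in_latin9, as_entity)
```

The entity fallback only helps where the text is interpreted as HTML, which on PerlMonks means not in titles.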

        There's also the AE ligature, that appears in both English and French, but fortunately that's part of ISO Latin-1. Unfortunately it's a rarer beast in French, and probably now considered archaic in English, apart from ægis and præternatural. Encyclopædia seems pretty archaic these days.

        Then of course there is the problem of the correct use of space around French punctuation characters. Guillemets, question and exclamation marks, semi-colons and probably a few other glyphs should have a thin non-breaking space before them (or after them in the case of the left guillemet).

        In the olde days this was rendered with an ISO entity ( but then your renderer needs to be programmed to deal with it ). Otherwise the modern alternatives appear to be the Unicode THIN SPACE, U+2009 (   ) or NARROW NO-BREAK SPACE, U+202F (   ) entities.

        Note that the three different entities have been used to add spaces inside the three parentheses in the above paragraph (but no spacing here). What you see is what your browser gives you.
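For what it's worth, the spacing rule described above can be approximated mechanically. The following Python sketch is deliberately naive (the function name `french_spacing` and the exact punctuation set are assumptions for illustration, and it would also mangle things like URLs containing `?`); it puts a NARROW NO-BREAK SPACE before question marks, exclamation marks, semi-colons and right guillemets, and after left guillemets:

```python
import re

NNBSP = "\u202f"                   # NARROW NO-BREAK SPACE
SPACES = " \u00a0\u2009\u202f"     # spaces we are willing to replace

# Optional existing space, then a run of ?/! or a single ; or right guillemet.
BEFORE = re.compile("[" + SPACES + "]*([?!]+|[;\u00bb])")
# Left guillemet followed by optional existing space.
AFTER = re.compile("(\u00ab)[" + SPACES + "]*")


def french_spacing(text: str) -> str:
    """Normalise spacing around a few French punctuation marks."""
    text = BEFORE.sub(lambda m: NNBSP + m.group(1), text)
    text = AFTER.sub(lambda m: m.group(1) + NNBSP, text)
    return text


sample = french_spacing("\u00abBonjour\u00bb ; \u00e7a va ?")
print(ascii(sample))
```

Whether the result looks right still depends on the encoding and on the browser, since U+202F lies outside Latin-1 and would have to travel as an entity or in a Unicode encoding.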

        Did I say typography is fun?

        - another intruder with the mooring of the heat of the Perl

      nice to have PerlMonks/Everything encoding behaviour

      This is specific to PerlMonks. I don't know what other Everything installations use these days, but PerlMonks used to interpret titles as HTML until I fixed it because it was causing problems and had the potential for even more abuses.

      - tye        

        Well, to sum up all that stuff, it seems that PM was initially designed with HTML in mind and then patched several times, ending up supporting Latin-1 encoding on input/output but nothing else. Am I right?

        I suspect PM's storage was set up with the default table charset (which is latin-1). Am I right again?

        ____
        HTH, Dominique