Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hey all: I have a quick question/problem. I have a program that converts files (what kind is not important) into HTML. The problem is that when this prog turns the files into HTML, what should be an <li> shows up as ï, ascii(357). I tried a simple replace, but I am not too good with regex yet. How can I change this funny character into the <ul> or <li> that I want? Any help or hints are greatly appreciated!
    Re: funny character
    by Abigail-II (Bishop) on Jun 06, 2002 at 17:30 UTC
      Well, it cannot be ASCII 0357, as there are only 128 (0200) code points defined in ASCII.

      What you want is a simple replace:

      s/\0357/&bull;/g
      (Or whatever entity you want).

      Abigail

        The character he is talking about is character value 239. That is 0357 in octal notation.
        Update: changed "ASCII". Good catch, belg4mit.
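To make the octal/decimal point concrete, here is a minimal sketch in plain Perl. The helper name is_ascii is made up for illustration and is not from the thread; it only assumes that "ASCII" means the 7-bit range 0-127.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): true if a code point
# falls inside 7-bit ASCII, i.e. 0..127 (0..0177 octal).
sub is_ascii {
    my ($code) = @_;
    return $code >= 0 && $code < 128;
}

# 0357 is a Perl octal literal; its decimal value is 239.
my $code = 0357;
printf "octal 0357 = decimal %d = hex %X\n", $code, $code;

# oct() parses an octal string; sprintf "%o" goes the other way.
printf "oct('357') = %d\n", oct('357');
printf "239 in octal = %o\n", 239;

# 239 is above 127, so it cannot be an ASCII code point.
print is_ascii($code) ? "inside ASCII\n" : "outside ASCII\n";
```

So both posters are right in a sense: the value is 239 decimal, written 357 in octal, and either way it is outside what ASCII defines.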

        Makeshifts last the longest.

          Errm, that wasn't exactly Abigail-II's point (octal vs. decimal). ASCII only defines 0-127; 128-255 need the 8th bit and are non-standard (inasmuch as random countries sometimes did random things. Heck, Germany even mucked about with []\{}|~). The char in question is probably from ISO-8859-1. More on character sets.

          --
          perl -pew "s/\b;([mnst])/'$1/g"

        Greetings,

        Looks like you did not test...

        c:\temp>perl -e"s/\357/<li>/g&&print $_ while(<>)" funny.txt
        <li><li><li>

        c:\temp>perl -e"s/\0357/<li>/g&&print $_ while(<>)" funny.txt

        c:\temp>
        funny.txt contains 3 samples of the offending character.
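A rough sketch of why the two one-liners behave differently, assuming the input holds the raw byte 239 (as funny.txt apparently does). The sub name fix_bullets and the sample string are made up for illustration. Perl reads an octal escape as at most three digits, so \0357 is taken as \035 followed by a literal "7".

```perl
use strict;
use warnings;

# Hypothetical helper: replace every byte 239 (octal 357) with a tag.
sub fix_bullets {
    my ($text) = @_;
    $text =~ s/\357/<li>/g;    # \357 is one escape: the single byte 239
    return $text;
}

# Made-up sample containing three offending bytes.
my $sample = "\357one \357two \357three";

print fix_bullets($sample), "\n";

# \0357 is parsed as \035 (byte 29) followed by a literal "7",
# so it never matches byte 239 and the string stays unchanged.
(my $unchanged = $sample) =~ s/\0357/<li>/g;
print $unchanged eq $sample ? "\\0357 matched nothing\n"
                            : "\\0357 matched\n";

# Writing the byte in hex sidesteps the digit-count ambiguity.
print "same byte\n" if "\357" eq "\xEF" and "\xEF" eq chr(239);
```

If the leading zero feels safer to you, \xEF is probably the less surprising way to spell the same byte.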
        Cheers,
        alf
        You can't have everything: where would you put it?