In one of my hobby-related tasks I create schedules for the workers staffing rings at a dog trial. One of the annoyances I run into is when I have cells that contain something like

20" - 24" walk through
the quotes and the dashes end up being morphed into wide (multi-byte) characters when I save the spreadsheet as HTML. When those wide characters get sent to a browser they end up looking like garbage instead of the dashes and quotes that I started with.

Using the od -c command, I found an offending sequence of octal bytes that showed up everywhere this problem occurred. What follows is a one-liner that replaces that sequence with a plain double quote quite nicely.
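For anyone who wants to reproduce the inspection step, here is a minimal sketch. The sample string is a stand-in of my own for a spreadsheet cell, and forcing LC_ALL=C is an assumption I'm making so that od prints raw bytes on systems where it would otherwise decode multi-byte characters.

```shell
# Feed a right curly double quote (UTF-8 bytes 342 200 235 in octal) through od.
# LC_ALL=C asks od to show raw bytes rather than decoded characters.
printf '20\342\200\235\n' | LC_ALL=C od -c
```

The three octal values 342 200 235 should show up in the dump right after the "2" and "0".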

perl -spi -e 's/\342\200\235/"/g' dvgsdc.jsp

Re: Saving spreadsheets to HTML that have quotes in them.
by jhourcle (Prior) on Sep 22, 2006 at 16:29 UTC

    You'd be better off just declaring the correct character set for your page:

    <meta http-equiv="content-type" content="text/html; charset=utf-8">

    Otherwise, every other non-ASCII character (em dashes, en dashes, other 'smart' quotes, etc.) is going to give you problems as well.

          You'd be better off just declaring the correct character set for your page:

      Didn't really want to have to hand-edit something that shouldn't need editing in the first place. I'm not sure if I didn't try that at one point and have it not work.


      Peter L. Berghold -- Unix Professional
      Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
        Didn't really want to have to hand-edit something that shouldn't need editing in the first place. I'm not sure if I didn't try that at one point and have it not work.

        Who said anything about hand editing?

        perl -pi -e 's#</head>#<meta http-equiv="content-type" content="text/html; charset=utf-8"></head>#' dvgsdc.jsp

        You could also adjust the server to send the correct HTTP header, so it's not left to the browser to assume.

        If you're already sending a charset other than utf-8, then yes, it might take a little more effort, but odds are, it could be replaced just as easily with a script.