in reply to Mangling HTML to protect content, and finding stolen HTML content

One solution may be to reduce the amount of content in the HTML itself by using graphics for much of it. This seems to be a growing trend on sites I've visited recently. It requires some creative work for search engine submission, because the text is no longer in the page for the engines to index, but if the important text is in a graphic, it is possibly less vulnerable to theft, particularly if the graphic contains a proper copyright notice. You can also encode data in the graphic using a process called steganography. See this site for some tools to help you out. I could not find a CPAN module for this.

If you have encoded data in the image, and it is stolen, you should be able to use a decoding tool to show that it is indeed your image. Combined with a copyright embossed on the image, you are probably much safer.
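To make the encode-then-decode idea concrete, here is a minimal sketch of one common steganography technique, least-significant-bit (LSB) embedding. It is not necessarily what the tools mentioned above do. The function names `embed` and `extract`, the 4-byte length prefix, and the use of a plain byte string as a stand-in for raw 8-bit pixel data are all assumptions made for illustration.

```python
def embed(pixels: bytes, message: bytes) -> bytearray:
    """Hide message in the lowest bit of successive pixel bytes."""
    # Prefix the message with its length so extract() knows where to stop.
    payload = len(message).to_bytes(4, "big") + message
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for this message")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[i] = (out[i] & 0xFE) | bit
    return out


def extract(pixels: bytes) -> bytes:
    """Recover a message written by embed()."""

    def read_bytes(start: int, count: int) -> bytes:
        data = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (pixels[start + b * 8 + k] & 1)
            data.append(value)
        return bytes(data)

    if len(pixels) < 32:
        raise ValueError("image too small to hold a length prefix")
    length = int.from_bytes(read_bytes(0, 4), "big")
    if 32 + length * 8 > len(pixels):
        raise ValueError("no plausible hidden message found")
    return read_bytes(32, length)


if __name__ == "__main__":
    # Stand-in for raw pixel data; a real image would be decoded first.
    image = bytes(range(256)) * 4
    marked = embed(image, b"(c) 2002 traveler")
    print(extract(marked))
```

Each pixel byte changes by at most 1, so the mark is invisible to the eye. A mark like this does not survive lossy recompression such as saving as JPEG, so a sketch like this only suits lossless formats.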

(You could also display your catalog as PDF, but there may be issues regarding plug-ins, load time, etc.)

HTH, --traveler

Re: Re: Mangling HTML to protect content, and finding stolen HTML content
by isotope (Deacon) on Nov 08, 2002 at 18:30 UTC
    I'm fortunate enough to have DSL, but for the majority of Americans (dunno about the rest of the world), waiting 5 minutes for all the 'text' graphics to load over their good old analog modems might make them think twice about shopping there. Broadband is nowhere near universal in the US.

    --isotope
    http://www.skylab.org/~isotope/
      I don't have DSL, either. I know it's an issue, but if the graphics have low enough resolution, they can load pretty fast. In fact, I have seen some graphic sites load faster than some HTML sites when the HTML has lots of complex rendering to do. It may take some experimentation to find the best mix of graphics and HTML, but these days some graphics seem to load very fast, even over slow links.

      --traveler
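
For anyone wanting to put rough numbers on the modem question, here is a back-of-the-envelope sketch. The function name `download_seconds`, the image sizes, and the nominal 56 kbit/s link rate are assumptions for illustration. Real modem throughput is lower than the nominal rate, and latency and protocol overhead are ignored.

```python
def download_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Idealized transfer time: payload bits divided by the link rate."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    modem = 56_000  # nominal rate of an analog modem, in bits per second
    # Hypothetical page: ten 'text' graphics at two different file sizes.
    for size in (5_000, 50_000):
        total = 10 * download_seconds(size, modem)
        print(f"10 images of {size} bytes: about {total:.0f} s")
```

Because the time scales linearly with file size, cutting the resolution or colour depth of each graphic pays off directly in load time.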