PerlMonks  

Re: (nrd) Mangling HTML to protect content, and finding stolen HTML content

by newrisedesigns (Curate)
on Nov 08, 2002 at 16:48 UTC ( [id://211467]=note )


in reply to Mangling HTML to protect content, and finding stolen HTML content

What makes you think it's something so sophisticated as a robot? It's probably someone with a browser cutting and pasting your copy. CSS and HTML entities won't help you there.

Use a Server-Side Include (the #include virtual directive) to run a Perl script that logs visits to the page. Log the visit frequency, the IP address (for comparing netblocks), and the User-Agent. Also put a disclaimer in both a visible text footer and a comment in the source of each page, stating that "all copyright violators will be prosecuted" along with other relevant legalese.
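To make the logging idea concrete, here is a minimal sketch. It is written in Python rather than Perl purely for illustration; the same logic ports directly to a Perl CGI script. It assumes the script is called as a CGI-style target, so the visitor's details arrive in the environment as REMOTE_ADDR and HTTP_USER_AGENT, and that the server also exposes DOCUMENT_URI for the including page (check your server's SSI documentation). The function names, the log path, and the tab-separated layout are all made up for this example.

```python
import ipaddress
from collections import Counter
from datetime import datetime, timezone


def format_visit(environ, now=None):
    """Build one tab-separated log line from CGI-style environment variables."""
    now = now or datetime.now(timezone.utc)
    ip = environ.get("REMOTE_ADDR", "-")
    # Tabs inside the User-Agent would break the column layout, so flatten them.
    agent = environ.get("HTTP_USER_AGENT", "-").replace("\t", " ")
    # DOCUMENT_URI is assumed to be set by the server for SSI includes.
    page = environ.get("DOCUMENT_URI", environ.get("REQUEST_URI", "-"))
    return "\t".join([now.strftime("%Y-%m-%dT%H:%M:%SZ"), ip, page, agent])


def append_visit(log_path, environ):
    """Append the current visit to the log file (log_path is hypothetical)."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(format_visit(environ) + "\n")


def netblock(ip, prefix=24):
    """Return the network an address falls in, e.g. 192.0.2.0/24."""
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))


def visits_per_netblock(lines, prefix=24):
    """Count logged visits per netblock so heavy visitors stand out."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        try:
            counts[netblock(fields[1], prefix)] += 1
        except ValueError:
            # Skip lines whose address column is missing or malformed.
            continue
    return counts
```

In the CGI script itself you would call append_visit("visits.log", os.environ) and print an empty response body. Running visits_per_netblock over the log file later shows which netblocks keep coming back.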

If your client base is a select few, use authentication to prevent the general populace from viewing the content.

Post a follow-up if this doesn't cover what you want.

John J Reiser
newrisedesigns.com

Re: Re: (nrd) Mangling HTML to protect content, and finding stolen HTML content
by earthboundmisfit (Chaplain) on Nov 08, 2002 at 17:21 UTC
    It's probably someone with a browser cutting and pasting your copy.

    I agree. We call these types of distributors 'trunk slammers' (mostly because, before the advent of the web, they sold their products from the trunks of their cars and offered zero aftermarket support). Most of them are not too bright and would view automated copy theft as something akin to reading ancient Greek.

    One of the strategies we've adopted to thwart unwanted viewing of our product info is to offer preferred customer discounts and require login before we serve up the goodies. On the stuff we do allow the general public to view, we pepper the HTML with custom tags and CSS class ids. You'd be surprised how infrequently the thieves bother to remove something like <p class="DD15893wankerbeans"> text </p> -- more proof in my mind that they are not too sophisticated in or concerned about their thievery. Hunting down stolen text is simply a matter of creating our own robots to search out these custom class names.
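The checking half of such a robot might look something like the sketch below (Python used for illustration; this is not the poster's actual robot). It parses a page you have already fetched and reports which of your marker class names show up in class attributes. The names MarkerFinder and find_markers are invented for this example, and crawling or fetching the suspect pages is left out.

```python
from html.parser import HTMLParser


class MarkerFinder(HTMLParser):
    """Collect which known marker class names appear in a page's tags."""

    def __init__(self, markers):
        super().__init__()
        self.markers = set(markers)
        self.found = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.found.update(self.markers.intersection(value.split()))


def find_markers(html, markers):
    """Return the sorted marker class names present in the given HTML text."""
    finder = MarkerFinder(markers)
    finder.feed(html)
    finder.close()
    return sorted(finder.found)
```

For example, find_markers(page_source, ["DD15893wankerbeans"]) returns the marker if the copied paragraph tag is still in the page. Only class attributes are matched, so the same string appearing in visible text is not counted.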

      earthboundmisfit++

      The CSS would work if they copied your source, which I doubt the real idiots would do. Other than that, that's a great idea. You could go so far as to include a
      <div style="display: none;">Don't be an idiot and steal this page. randomtexteasilyfoundviasearchengine </div>
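If you want a different search string on every page, one possible approach (my assumption, not something described above) is to derive it from a private secret and the page path, so you can regenerate it later without storing a list. A rough Python sketch follows; the function names, the "m" prefix, and the default notice text are all hypothetical.

```python
import hashlib
import hmac
from html import escape


def page_marker(secret, page_path, length=16):
    """Derive a stable, hard-to-guess token for one page from a private secret."""
    digest = hmac.new(secret.encode("utf-8"), page_path.encode("utf-8"), hashlib.sha256)
    # Prefix with a letter so the token also works as a CSS class name.
    return "m" + digest.hexdigest()[:length]


def hidden_notice(secret, page_path, notice="Do not copy this page."):
    """Return a display:none div carrying the notice and the page's token."""
    token = page_marker(secret, page_path)
    return f'<div style="display: none;">{escape(notice)} {token}</div>'
```

The same secret and path always give the same token, so a later search for page_marker(secret, "/some/page") tells you exactly which page was lifted. As with the class-name trick, this only helps when the thief copies the source rather than the rendered text.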

      Good stuff. You don't even need a "discount" to compel someone to sign in. From my experience, most web users will sign up for anything, as long as the process isn't too complicated. And if the copy thief signs in or makes an account, you have his or her personal information. Crafty.

      Of course, you (generally speaking) shouldn't use this information for anything beyond counteracting theft; if you do, outline it in the company's privacy policy, so users know exactly what's going on. I doubt your business wants a PR black eye for "stealing user information." </disclaimer>

      John J Reiser
      newrisedesigns.com
