in reply to Untainting URLs and their descriptions

For your URLs, about all you can do is make sure they are syntactically valid (they look like URLs) and then check them with the LWP module to make sure they actually retrieve a page. If a URL passes these two tests it should be acceptable (who knows what the content of the page it links to is, but at least it "works" as a URL).
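A minimal sketch of those two checks might look like this. The regex is an assumption on my part, a rough "looks like an http(s) URL" pattern rather than a full RFC 3986 validator, and the liveness check assumes LWP::UserAgent is installed:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Test 1 -- syntactic check: does the string look like an http/https URL?
# (A rough pattern for illustration, NOT a complete RFC 3986 validator.)
sub looks_like_url {
    my ($url) = @_;
    return $url =~ m{^https?://[\w.-]+(?::\d+)?(?:/[\w./%?&=~;+-]*)?$}i ? 1 : 0;
}

# Test 2 -- liveness check: a HEAD request via LWP is enough to see
# that the URL actually retrieves something, without fetching the body.
sub url_is_live {
    my ($url) = @_;
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new( timeout => 10 );
    return $ua->head($url)->is_success;
}
```

Only a URL that passes both subs would be accepted; note that `url_is_live` needs network access, so you may want to skip it in offline tests.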

There is more danger in accepting free-form text from users. It could contain many harmful tags (harmful either directly, by doing naughty things, or indirectly, by just screwing up your formatting). A good technique is to allow a limited set of HTML tags and drop any others (HTML::Parser is good for this). Definitely drop the tags that could be a major source of problems (script, frame, form, iframe, etc.). If you want to be paranoid, just drop all HTML tags from user input and you will probably be OK.
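To illustrate the whitelist idea, here is a regex-based sketch (the tag list is my own example). For production use, HTML::Parser is the robust choice, since regexes can be fooled by malformed HTML and this sketch leaves the *content* of a dropped tag in place:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# An example whitelist of harmless formatting tags (pick your own).
my %allowed = map { $_ => 1 } qw(b i em strong p br);

# Keep whitelisted tags (attributes stripped), drop everything else.
sub filter_tags {
    my ($html) = @_;
    $html =~ s{<\s*(/?)\s*(\w+)[^>]*>}{
        $allowed{ lc $2 } ? "<$1" . lc($2) . ">" : ""
    }gex;
    return $html;
}
```

Note that `<script>evil()</script>` becomes just `evil()` here: the tags are gone, but the text between them survives, which is one more reason to reach for HTML::Parser when it matters.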

  • Comment on Re: Untainting URLs and their descriptions