in reply to An internet garbage filter

Haven't tried running it, but I would suggest converting your banned_sites hash into an array of regexes. That way you don't have separate lists, with the fixed hosts in the hash and the host regexes in the is_banned_site method.
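
A rough sketch of that idea, written in Python only as an illustration (the original program is Perl and its list is not shown in this thread). The host names and patterns below are made up; only the names banned_sites and is_banned_site come from the post, and BANNED_PATTERNS is a hypothetical stand-in for the converted list.

```python
import re

# Hypothetical patterns; the real banned_sites entries are not shown in the thread.
BANNED_PATTERNS = [
    # A fixed host, written as a fully anchored regex.
    re.compile(r"^ads\.example\.com$"),
    # A whole family of hosts: tracker.example and any subdomain of it.
    re.compile(r"^(?:.+\.)?tracker\.example$"),
]

def is_banned_site(host):
    """Return True if any pattern in the single list matches the host name."""
    host = host.lower()
    return any(p.search(host) for p in BANNED_PATTERNS)
```

With everything in one list there is only one place to edit, at the cost of trying each pattern in turn for every host checked.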

Re: Re: An internet garbage filter
by pg (Canon) on Oct 27, 2003 at 07:21 UTC

    For anyone using this program, my suggestion is to keep a big hash for banned sites with fixed names, and only a much smaller array for banned sites expressed as regexps.

    For example, if there are four sites you want to ban:

    • a.foo.com
    • b.foo.com
    • c.foo.com
    • d.foo.com

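    It is better to put them all in the hash of fixed sites instead of using a regexp, unless the site has a rich variety of names. A hash lookup costs about the same however many hosts the hash holds, while every regexp in the array has to be tried in turn. If a site only has three or four different names, put them in the hash.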
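
A minimal sketch of that split, in Python purely for illustration (a Python set plays the role of the Perl hash here). The four foo.com hosts are the ones from the example above; the bar.example pattern and the names BANNED_HOSTS and BANNED_PATTERNS are hypothetical.

```python
import re

# Fixed host names go in the set: one lookup, no matter how many entries.
BANNED_HOSTS = {"a.foo.com", "b.foo.com", "c.foo.com", "d.foo.com"}

# Only sites with a rich variety of names get a regex.
# Hypothetical example: ad1.bar.example, ad2.bar.example, ad17.bar.example, ...
BANNED_PATTERNS = [
    re.compile(r"^ad\d+\.bar\.example$"),
]

def is_banned_site(host):
    """Check the exact-match set first, then fall back to the short regex list."""
    host = host.lower()
    if host in BANNED_HOSTS:
        return True
    return any(p.search(host) for p in BANNED_PATTERNS)
```

Most lookups are settled by the set, so the regex list is only scanned for hosts that are not already listed by name.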