slloyd has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to write a proxy that will only proxy a preset list of URLs (for my kids). Can I do that with HTTP::Proxy? For some reason, I cannot wrap my brain around how...
#!perl
use HTTP::Proxy;

my $port = 8080;

# initialisation
my $proxy = HTTP::Proxy->new( port => $port );

print "Running proxy on port $port\n";

# this is a MainLoop-like method
$proxy->start;
UPDATE: Thanks for all the replies, but with the exception of tirwhan, no one really seemed to want to answer my question. I guess it is much more fun to advise me on my parenting skills. Anyway, for anyone else who might be looking, here is the finished code:
#!perl
use strict;
use HTTP::Proxy;
use HTTP::Proxy::Engine;
use HTTP::Proxy::Engine::NoFork;
use HTTP::Proxy::HeaderFilter::simple;

my $port = 8080;

# Hosts the proxy is allowed to fetch from
my %Allowed = (
    'pbskids.org'      => 1,
    'www.basgetti.com' => 1,
    'disney.com'       => 1,
);

my $proxy = HTTP::Proxy->new( port => $port );
$proxy->push_filter(
    request => HTTP::Proxy::HeaderFilter::simple->new(
        sub {
            my ( $self, $headers, $message ) = @_;
            my $host = $headers->{host};
            $host = lc( strip($host) );
            my $uhost = getUniqueHost($host);
            $uhost = lc( strip($uhost) );
            my $allow = $Allowed{$host} || $Allowed{$uhost} || 0;
            if ( !$allow && $host !~ /localhost$/is ) {
                print "$host not allowed\n";
                # Rewrite the Host header to localhost and blank the Cookie header
                $headers->{host}   = "localhost";
                $headers->{cookie} = '';
            }
        }
    )
);
print "Running proxy on port $port\n";
$proxy->start;

#########################
# Reduce a three-part host name (www.example.com) to its last two
# parts (example.com); IP addresses and shorter names pass through unchanged.
sub getUniqueHost {
    my $inhost = shift || return;
    my $uhost;
    if ( $inhost =~ /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/ ) { $uhost = $inhost; }
    elsif ( $inhost =~ /([A-Z0-9\-]+)\.([A-Z0-9\-]+)\.([A-Z0-9\-]+)/is ) {
        $uhost = $2 . '.' . $3;
    }
    else { $uhost = $inhost; }
    $uhost = lc($uhost);
    return $uhost;
}

###############
# Trim leading and trailing whitespace
sub strip {
    my $str = shift;
    if ( length($str) == 0 ) { return; }
    $str =~ s/^[\r\n\s\t]+//s;
    $str =~ s/[\r\n\s\t]+$//s;
    return $str;
}
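For anyone adapting the script above: the host lookup could also be written as a walk up the domain labels, so that a port suffix or a deeper subdomain (for example games.disney.com) still matches a whitelisted parent. This is only a sketch, not the code the script uses; the function name host_allowed and the parent-domain matching rule are assumptions of mine.

```perl
use strict;
use warnings;

# Whitelist taken from the script above.
my %ALLOWED = map { $_ => 1 } qw(pbskids.org www.basgetti.com disney.com);

# Hypothetical helper: true if $host, or any parent domain of it
# with at least two labels, is on the whitelist.
sub host_allowed {
    my ($host) = @_;
    return 0 unless defined $host && length $host;

    $host = lc $host;
    $host =~ s/:\d+\z//;    # drop an optional :port suffix
    $host =~ s/\.\z//;      # drop a trailing dot, if any

    my @labels = split /\./, $host;
    while ( @labels >= 2 ) {
        return 1 if $ALLOWED{ join '.', @labels };
        shift @labels;      # try the parent domain next
    }
    return 0;
}

for my $host (qw(pbskids.org Games.Disney.com:80 notdisney.com)) {
    printf "%-22s %s\n", $host, host_allowed($host) ? 'allowed' : 'blocked';
}
```

Note that this rule treats every subdomain of a listed site as allowed, which may or may not be what you want for a kids' whitelist.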

-------------------------------
by me
http://www.basgetti.com
http://www.kidlins.com

Replies are listed 'Best First'.
Re: Can I use Http::Proxy to intercept and deny URLs?
by tirwhan (Abbot) on Feb 26, 2006 at 07:37 UTC

    Did you read the documentation? Take another look and read the section on FILTERS, this should answer your question.
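As a rough illustration of what a whitelist request filter might look like: the sketch below refuses any host that is not on a list by answering with a 403 instead of forwarding the request. The names %allowed, is_denied and make_whitelist_filter are made up for this example, the host list is borrowed from the question, and the use of $self->proxy->response to short-circuit a request reflects my reading of the HTTP::Proxy documentation, so check it against your installed version. It only considers plain HTTP requests.

```perl
use strict;
use warnings;

# Hypothetical whitelist of exact host names.
my %allowed = map { $_ => 1 } qw(pbskids.org disney.com);

# Hypothetical helper: 1 if the host should be refused, 0 otherwise.
sub is_denied {
    my ($host) = @_;
    return 1 unless defined $host;
    return $allowed{ lc $host } ? 0 : 1;
}

# Build the request filter. The HTTP::Proxy modules are loaded lazily,
# so the helper above can be tried out without them installed.
sub make_whitelist_filter {
    require HTTP::Proxy::HeaderFilter::simple;
    require HTTP::Response;

    return HTTP::Proxy::HeaderFilter::simple->new(
        sub {
            my ( $self, $headers, $message ) = @_;

            # Host of the requested URL; undef if it cannot be determined.
            my $host = eval { $message->uri->host };
            return unless is_denied($host);

            my $shown = defined $host ? $host : 'this host';
            my $res   = HTTP::Response->new( 403, 'Forbidden' );
            $res->content_type('text/plain');
            $res->content("Access to $shown is not allowed.\n");

            # Assumption: setting the proxy's response in a request
            # filter makes the proxy return it without fetching anything.
            $self->proxy->response($res);
        }
    );
}
```

It would be attached with something like $proxy->push_filter( request => make_whitelist_filter() ) before calling $proxy->start.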

    Also, be aware that filtering proxies are really just a weak band-aid and relatively easy to get around.


    All dogma is stupid.
      Also, be aware that filtering proxies are really just a weak band-aid and relatively easy to get around.

      Blacklist proxies and content-filtering proxies are weak. Whitelist proxies, such as he's talking about using, are not so weak, although their usefulness is limited to situations where it's acceptable to block pretty much the whole internet with a few exceptions.

      (I'm assuming here that he's going to run the proxy on the firewall, not on the desktop, and that the firewall will be set up to drop any unproxied traffic. Otherwise of course they'll just change the browser setting so it doesn't use the proxy.)

      This wouldn't be my approach to internet access for children, granted. My approach would be to keep the PC in the living room, where they can't use it without being observed. That assumes there's ALWAYS an adult with them, but there isn't any other sane way to raise children, IMO. In any case, if they're left unsupervised there is *NOTHING* you can do to prevent them from viewing random content on the internet (or, worse, on television), because they'll view it at a friend's house.


      Sanity? Oh, yeah, I've got all kinds of sanity. In fact, I've developed whole new kinds of sanity. Why, I've got so much sanity it's driving me crazy.
        Whitelist proxies, such as he's talking about using, are not so weak,

        I largely agree with your post. Just to clarify: for a whitelist proxy to be effective you need to run it on a separate gateway host which firewalls your network from the Internet (as you say). You also need to

        • Drop all egress traffic at the firewall (not just HTTP), and run a filtering proxy for any services you wish to use (e.g. DNS, SMTP, POP/IMAP, FTP)
        • Disallow encrypted connections (no HTTPS).
        • Be very careful in your list of sites to allow (e.g. no search engines or sites which allow posting of HTML)

        At that point you've crippled the Internet connection to the point of very limited usefulness and set yourself up for a whole lot of work (and you're still not 100% secure, those are just the more obvious avenues of circumvention). Internet censorship is really hard, very seldom reasonably justifiable and a really stupid thing to do in the context of a family IMO.


        All dogma is stupid.
Re: Can I use Http::Proxy to intercept and deny URLs?
by spiritway (Vicar) on Feb 26, 2006 at 07:36 UTC

    I can well understand your concern over having your kids exposed to some of the unsavory parts of the Internet, but it seems that it might do you better to talk to your kids and keep in good communication with them, than trying to block their access to certain Websites. As you've mentioned previously, your kids are pretty good with the computer, so they would likely be able to bypass your efforts.

    It seems to me that if you used a proxy via Perl, it would be fairly simple to stop it. A CTRL-ALT-Delete would activate the Task Manager, and your kids could then either kill the application, or simply kill the processes that are running that have to do with proxies. AFAIK, it's also fairly simple to just use the browser's preferences to disable or bypass a proxy.

    Having said all that, you might want to try the documentation found at HTTP::Proxy, which appears to have some of the missing pieces for your script. Good luck in your endeavors.

      A CTRL-ALT-Delete would activate the Task Manager, and your kids could then either kill the application, or simply kill the processes that are running that have to do with proxies.

      Not if the proxy is run as administrator and the kids' accounts do not have that privilege.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        ...or if the proxy lives on another machine on the intranet, and only that proxy can head out the firewall.

        --MidLifeXis

Re: Can I use Http::Proxy to intercept and deny URLs?
by madizen (Sexton) on Feb 26, 2006 at 19:22 UTC

    I've been through the basic filtering pros and cons arguments many times (and will be many times again I'm sure), as custodian of network connectivity at a public library system. Without rehashing good points already raised, let me just throw in that where I work the feds gave us an ultimatum "filter or lose certain funds" and we chose to lose those funds on two grounds: (a) there is no technology that actually puts us in compliance with the laws as written, and (b) trying would be more expensive than losing said funds. We just can't put a federal judge inside each of our computers (we tried, but it was a tight fit, and their robes kept igniting on the heat sinks ;).

    That said, we never considered white listing a viable solution. In the case of a parent trying to put parental controls on their home network, I'd personally agree that's a fine thing (though the method of making the access points themselves public and well-supervised is far better, as noted by others). If you must white list, consider a reasonably simple canned solution like Public Web Browser.

    OMG... I cannot believe I just recommended not using Perl.

    Assuming your kids' computer runs Windows, you can easily type your list of allowed sites into its proxy settings (as Administrator) and put it into kiosk mode to let regular users access your preferred sites only.

      Wow! Thanks for all your replies! Now that you have all determined what kind of parent I am, let me fill you in a bit more.

      I have four children, all girls, ages 11, 9, 6, and 4. Our computers are in the living room. Our internet provider also provides a pornography filter on our internet connection.

      I teach my kids about the dangers of the internet all the time. My intention for a whitelist program is to provide a protection until they understand. Even my 4 year old gets on the internet -- there is no reason to expose her to garbage if I can prevent it.

      Now for me. I love to figure things out and build solutions where I can... I could perhaps use an existing program, but then if I wanted it to work differently, I would not have the control to fix it. So far, except for the lack of a good threading model, Perl has been my language of choice for hundreds of solutions I have written.

      Now you have the inside story folks... Thanks again for your comments. :)

      -------------------------------
      by me
      http://www.basgetti.com
      http://www.kidlins.com

        Well, it seems like you're a Windows person, but pretty well versed in technology. Have you thought about setting up a Linux box as a proxy? There is plenty of software out there that does this sort of thing (iptables is the one that comes to mind now), and most modern distributions have GUIs, so they are pretty easy to learn. If the firewall/proxy is on a separate machine, it'd be harder to get past.
Re: Can I use Http::Proxy to intercept and deny URLs?
by blue_cowdawg (Monsignor) on Feb 27, 2006 at 21:52 UTC
        I am trying to write a proxy that will only proxy a preset list of URLs (for my kids).

    First off, let me just say that I am not a big fan of using technology to solve "discipline issues." Having said that let me move on. I think that I do understand your motivation even if I don't 100% agree with it.

    I would be more inclined to use either an Apache web server as a proxy, Privoxy, or some other "out of the box" solution rather than rolling my own. You could create a logic tree with the proxy that says "allow these sites, but drop requests elsewhere."

    An anecdote: a major telecommunications firm that I used to work for asked that I put up a keyword-based proxy filter for employees wanting to surf the internet from work. Long story short, that was all well and good (not really) until the American Cancer Society ran a special program about women's health. You guessed it: when employees tried to go there the filter blocked them, since the site had the word "breast" in it and that was on the exclusion list at the orders of the Executive Committee :-D
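To make the failure mode in that anecdote concrete, here is a small sketch contrasting a keyword blacklist with a host whitelist. The URL, the word list and the whitelist entry are all invented for illustration; none of them come from the filter described above.

```perl
use strict;
use warnings;

my @blocked_words = qw(breast);                       # hypothetical exclusion list
my %allowed_hosts = ( 'health.example.org' => 1 );    # hypothetical whitelist entry

# Keyword blacklist: refuse any URL containing a listed word.
sub blocked_by_keyword {
    my ($url) = @_;
    for my $word (@blocked_words) {
        return 1 if index( lc $url, $word ) >= 0;
    }
    return 0;
}

# Host whitelist: allow only URLs whose host is listed.
sub allowed_by_host {
    my ($url) = @_;
    my ($host) = $url =~ m{^https?://([^/:]+)}i;
    return ( defined $host && $allowed_hosts{ lc $host } ) ? 1 : 0;
}

my $url = 'http://health.example.org/breast-cancer-screening';    # made-up URL
printf "keyword filter blocks it: %s\n", blocked_by_keyword($url) ? 'yes' : 'no';
printf "host whitelist allows it: %s\n", allowed_by_host($url)    ? 'yes' : 'no';
```

The keyword check rejects a legitimate health page because it only sees the substring, while the whitelist decision depends solely on which host was requested.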


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg