LWP and Proxy

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: LWP and Proxy by Joost (Canon) on Oct 14, 2006 at 18:01 UTC
"why have LWP connect to a proxy? I'm assuming it's for people who want to hide their identity."

Many companies have proxies as the only way to connect to the web. There are many reasons to use a proxy: to cache requests and improve performance, to monitor and/or fix pages, to block suspicious content, etc.

"When it connects to a proxy and fetches, let's say, a file, does the proxy server actually do the request and not the computer you're on?"

Yes, the proxy is the computer doing the request. Some proxies supply an "X-Forwarded-For" header that should show the request chain, but anonymizing proxies obviously don't do that, and besides, there is no way to guarantee that the X-Forwarded-For header is correct.

"I'm asking this, not to build an automation script or anything like that, but to understand how easy it may be for someone"

It's extremely easy. All you have to do is set the "http_proxy" environment variable to a proxy address.

"What should it profit a man, if he should win a flame war, yet lose his cool?"
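To make the "http_proxy" point concrete, here is a minimal sketch of two ways to point LWP::UserAgent at a proxy. The proxy address is hypothetical. One caveat worth hedging: as far as I know, LWP::UserAgent only reads the environment when env_proxy is called (LWP::Simple does that for you), so setting the variable alone may not be enough.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical proxy address, used for illustration only.
my $proxy_url = 'http://proxy.example.com:8080/';

# Option 1: read the proxy from the environment. env_proxy looks at
# variables such as http_proxy at the moment it is called.
my $ua_env = LWP::UserAgent->new;
{
    local $ENV{http_proxy} = $proxy_url;   # normally set in the shell
    $ua_env->env_proxy;
}

# Option 2: name the proxy explicitly for the http scheme.
my $ua = LWP::UserAgent->new;
$ua->proxy('http', $proxy_url);

# Called with only a scheme, proxy() returns the current setting.
print "env:      ", $ua_env->proxy('http'), "\n";
print "explicit: ", $ua->proxy('http'), "\n";

# A request would now go through the proxy (not run here):
# my $response = $ua->get('http://www.example.com/');
```

Either way, the target site sees a connection coming from the proxy, not from the machine running the script.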
Re: LWP and Proxy by ikegami (Patriarch) on Oct 14, 2006 at 18:01 UTC
"why have LWP connect to a proxy?"

To escape a firewalled network, mainly. Proxies can also provide caching. (For big web sites, sometimes you're actually connecting to a proxy. The proxy serves and caches the static content, while requests for dynamic content are forwarded to a server farm.)

"When it connects to a proxy and fetches, let's say, a file, does the proxy server actually do the request and not the computer you're on?"

Your computer sends a request to the proxy. Your computer waits while the proxy repeats the request to the server. Finally, the proxy repeats to your computer the answer from the server.

"would my access logs record the IP address from the proxy server or from the person's computer IP."

The proxy's. The user often isn't even on the internet, which is why he's using a proxy.
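A small sketch of what the origin server can know about such a request. The helper name claimed_chain and the addresses are hypothetical. The only address the server observes directly is the TCP peer (the proxy); anything in an X-Forwarded-For header is a claim made by whoever sent the request, so the sketch keeps the two apart.

```perl
use strict;
use warnings;

# Hypothetical helper: build the chain of addresses a request claims to
# have passed through. Only the last element ($remote_addr, the TCP peer)
# is observed by the server; the rest comes from a header that any client
# or proxy can forge or omit.
sub claimed_chain {
    my ($remote_addr, $xff) = @_;
    my @claimed;
    if (defined $xff) {
        for my $part (split /,/, $xff) {
            $part =~ s/^\s+|\s+$//g;          # trim whitespace
            push @claimed, $part if length $part;
        }
    }
    return (@claimed, $remote_addr);
}

# Example with documentation-range addresses (assumed values).
my @chain = claimed_chain('203.0.113.7', '192.0.2.10, 198.51.100.4');
print join(' -> ', @chain), "\n";
```

In this example the access log would normally show 203.0.113.7, the proxy, regardless of what the header says.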
Re: LWP and Proxy by jbert (Priest) on Oct 14, 2006 at 18:30 UTC
You get a (mostly-reliable) IP address from a network (TCP) connection. If the client is connecting to the proxy and the proxy is connecting to your site, you'll see the IP address of the proxy. As noted by others, most proxies will supply the IP address of the client in an HTTP header, but this isn't what will get logged to your web server logs (I think; maybe Apache and IIS have that as an option?).

But for your case, I don't think you actually care whether someone being abusive is connecting via a proxy or not. If an IP address is being abusive, you probably want to suspend that IP address or apply some other sanction (disabling signups from that IP, rate-limiting, etc.). In fact, if the IP you suspend is a proxy, you're hurting everyone else who might be legitimately using it, so you should probably be slightly less willing to ban the IP if you happen to know it is a proxy.

If you're concerned about mass sign-ups, you may want to try other countermeasures. If you notice a spike in signups, perhaps disable them. Some character recognition (CAPTCHA-style) appears to be the state-of-the-art test for a human at the moment, but I'm told (fairly unreliably, but the idea seems sound) that even these are being foiled by people who (quite cleverly) re-present the captcha to humans visiting their own site. For example, someone runs a porn site, and to get some freebies people have to type in the answer to a captcha. So the attacker can serve the captcha from your sign-in page to a human, get the answer posted back to their own server, and then pass it on to yours.

Adding a challenge/response over email is probably worthwhile too, if you're able to ask your users for an email address.
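The rate-limiting idea can be sketched as a sliding window of sign-up timestamps per IP address. This is only an illustration: the function name allow_signup, the limit of 5 and the one-hour window are assumptions, and the counts live in memory, so a real site would need shared storage.

```perl
use strict;
use warnings;

my $WINDOW = 3600;   # window length in seconds (assumed value)
my $MAX    = 5;      # sign-ups allowed per IP per window (assumed value)
my %seen;            # ip address => array ref of recent sign-up times

# Hypothetical check: returns 1 if this IP may sign up now, 0 otherwise.
# $now is optional and defaults to the current time.
sub allow_signup {
    my ($ip, $now) = @_;
    $now = time unless defined $now;
    my $hits = $seen{$ip} ||= [];
    # Forget sign-ups that have fallen out of the window.
    @$hits = grep { $_ > $now - $WINDOW } @$hits;
    return 0 if @$hits >= $MAX;
    push @$hits, $now;
    return 1;
}
```

Because a proxy address may stand for many legitimate users, the limit for a known proxy would probably be set more generously than for an ordinary address.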
Re: LWP and Proxy by Anonymous Monk on Oct 14, 2006 at 18:20 UTC
Makes so much more sense to me now. I didn't expect the concept to be that simple. Argh, no wonder it's so easy for people to send out spam and build those brute-force/automation scripts.

Thanks for the replies,
Pete