Kanishka has asked for the wisdom of the Perl Monks concerning the following question:

I am a Perl novice. I have written a program to retrieve stock quotes from www.CSE.lk. We are using a proxy server to browse the web. Every time I run this program the result is retrieved from the proxy cache, not from the web. I need to know how to get around the proxy. Can anyone help me?
use LWP::Simple;
use File::stat;

print "CSE Stock Quote v1.1\n";

while ($company == ""){
FLAG:
    print "ENTER COMPANY CODE:";
    chop ($company = <STDIN>);   # Get the Company code from the command line
    if ($company =~ /EXIT/i){
        last;   #exit the program when typed 'EXIT'
    }
    else{
        $reponse = get("http://203.115.0.6/servlet/RealIndicesServlet");   # get the information from this URL
        open(OUTFILE, ">c:\\CSE.txt") or die "Can't open out File: $!\n";   # Make and OPen a file
        print OUTFILE $reponse;   # SAve information In a file
        close OUTFILE;
        open(WORKFILE, "c:\\CSE.txt") or die "Can't open work File: $!\n";   #Re open CSE.txt for get information
        $lineNo = 0;
        while(($ligne=<WORKFILE>) ne ""){   #Read file line by line
            $lineNo++;
            if ($ligne =~ /^$company$/ix){   #Match the Company code
                $CompanyLNo = $lineNo;   #Get the line number of the company code
                print "$CompanyLNo\n";
                open(WORKFILE2, "c:\\COMLIST.txt") or die "Can't open work File: $!\n";
                while(($comlist=<WORKFILE2>) ne ""){
                    if ($comlist =~ /$company/ix ){
                        @comname = split (/\,/, $comlist);
                    }
                }
            }
            elsif($company =~ /\d\s/){   # avoids invalid company names with digits
                goto FLAG;
            }
        }
        close(WORKFILE);
        open FILE, "c:\\CSE.txt";   # re open CSE.txt
        @lines = (<FILE>)[$CompanyLNo+1..$CompanyLNo+3];   # get data in 2nd to 4th lines which preceeds the company code line
        close (FILE);
        $filepath = "c:\\CSE.txt";
        stat($filepath);
        unlink $filepath or die "Couldn't delete $filepath: $!";
        if (@comname[1] =~ /NULL/){
            print "STOCK PRICE OF Company Name N/A \n";
        }
        else{
            print ("STOCK PRICE OF ", @comname[1]);
        }
        print ("Now Price Rs: ", $NowPrice = @lines[0]);
        if ($NowPrice =~ /\+/){
            print ("Price UP By Rs: ", $PriceChange = @lines[1]);
        }
        elsif ($NowPrice =~ /\-/){
            print ("Price DOWN By Rs: ", $PriceChange = @lines[1]);
        }
        elsif ($NowPrice =~ /\~/){
            print ("Price UNCHANGED By Rs: ", $PriceChange = @lines[1]);
        }
        print ("Quantity Traded ", $Quantity = @lines[2],"\n");
    }
}

Re: How to Get around Proxy
by tachyon (Chancellor) on Oct 27, 2003 at 07:48 UTC

    Use LWP::UserAgent for more fine-grained control. You can also fake being IE (via agent) rather than LWP, which makes it just that bit harder to catch you (see below). If you can't bypass the proxy, one reasonably reliable trick to make proxies refresh is to add something to the query string each time: either ?random_stuff=12345 or &random_stuff=12345, depending on whether there is already a query string. The proxy sees this as a new URL and so goes to fetch it; the target server will *usually* ignore the extra data (although some scripts *will* implode, deliberately or accidentally). As always YMMV, but this seems the most reliable method of making sure you have fresh data.
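
A minimal sketch of the query string trick, using only core Perl. The helper name cache_busted_url and the parameter name nocache are made up for illustration, and the sketch assumes the target server ignores parameters it does not recognise. It does not handle URLs that contain a #fragment.

```perl
use strict;
use warnings;

# Append a throwaway parameter so a caching proxy sees a URL it has not stored.
# 'nocache' is a hypothetical parameter name; any name the server ignores will do.
sub cache_busted_url {
    my ($url) = @_;
    my $separator = $url =~ /\?/ ? '&' : '?';    # '&' if a query string already exists
    my $token     = time() . int( rand(1_000_000) );    # digits that change on every call
    return $url . $separator . 'nocache=' . $token;
}

print cache_busted_url('http://203.115.0.6/servlet/RealIndicesServlet'), "\n";
print cache_busted_url('http://example.com/quotes?code=ABC'), "\n";
```

The returned string can be passed to get() from LWP::Simple, or to HTTP::Request->new, in place of the fixed URL.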

    If you are trying to rip data from most live feed stock price websites they will almost certainly block you at some stage. You are neither the first nor the last to want something for nothing. There are plenty of relatively low cost XML data feeds about. If you are parsing the data out it generally breaks when the web designers update the site.

    use LWP::UserAgent;
    use HTTP::Request;
    use Data::Dumper;

    my $domain = 'http://blah.com';
    my $URL    = "$domain/";   # page to fetch

    my $ua = LWP::UserAgent->new;
    $ua->agent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Reading Logs - You must be bored ;-)');

    # use proxy....
    my $proxy = 'http://some.proxy.au:8080';
    $ua->proxy( 'http', $proxy );

    # or don't use proxy......
    # $ua->no_proxy($domain,...)

    my $request  = HTTP::Request->new( 'GET', $URL );
    my $response = $ua->request( $request );
    print Dumper $response;

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: How to Get around Proxy
by inman (Curate) on Oct 27, 2003 at 14:56 UTC
    Try setting the If-Modified-Since header in the request to the current time. This should make your proxy get the latest version rather than use a cached version.

    The code below uses LWP in debug mode to work through the problem.

    #! /usr/bin/perl -w
    use strict;
    use warnings;
    use LWP::Debug qw(+);   #for debugging info
    use LWP::UserAgent;
    use HTTP::Request;
    use HTTP::Response;
    use HTTP::Date;

    # Create a user agent that uses the proxy info from the environment
    my $ua = LWP::UserAgent->new(env_proxy=>1);

    # Create the request
    my $request = HTTP::Request->new('GET'=>'http://www.perlmonks.com/');

    # Add the If-Modified-Since header
    $request->push_header (if_modified_since=>time2str(time));

    #Submit your request
    my $response = $ua->request($request);

    if ($response->is_success) {
        print "Retrieved URL OK\n";
    }
    else {
        print $response->status_line . "\n";
    }

    inman

      It is probably a good idea to use (time - 86400) or some time yesterday just to make sure that time sync errors don't cause any issues. ++ as this method is what is supposed to work. As we all know however some (if not all) proxies play fairly fast and loose with caching which is why I suggested the ugly query string hack.
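
A minimal sketch of that adjustment, assuming HTTP::Request and HTTP::Date (both are installed alongside LWP) are available. The helper name conditional_request is made up for illustration. It only builds the request and prints it; nothing is sent over the network.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Date qw(time2str);

# Build (but do not send) a GET request whose If-Modified-Since date lies one
# day before the supplied time, so clock skew between client and proxy cannot
# push the date into the future.
sub conditional_request {
    my ( $url, $now ) = @_;
    my $request = HTTP::Request->new( GET => $url );
    $request->header( 'If-Modified-Since' => time2str( $now - 86400 ) );
    return $request;
}

my $request = conditional_request( 'http://www.perlmonks.com/', time );
print $request->as_string;
```

The request object can then be handed to $ua->request(...) exactly as in the code above.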

      cheers

      tachyon


        Somehow your tricks don't seem to work. Will someone check my code instead and give me an idea of where I have gone wrong?
Re: How to Get around Proxy
by Art_XIV (Hermit) on Oct 27, 2003 at 13:42 UTC
    Another alternative might be to google for some low-price/free SOAP/Web Service stock tickers. It would probably save you some parsing work, too.