Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow monks. I'm having issues with the REST::Google::Search module. I'm trying to iterate through the pages of Google search results for the query I enter and then determine which page a specific site is on. It might be my lack of understanding of the cursor, which seems to contain the next iteration of results I need to view. I've tried doing it manually, but if you run the code you'll see my issue: it reports the website on page '0' when it is actually on page '4'. I've posted my code below.
use warnings;
use strict;
use Data::Dumper;
use REST::Google::Search;

REST::Google::Search->http_referer('http://example.com');

SearchGoogle( 0 );

sub SearchGoogle {
    my $start = shift;
    #print "Looking from position: $start\n";

    my $res = REST::Google::Search->new(
        q     => 'perl regex',
        start => $start,
        rsz   => 'small',
    );

    if ( $res->responseStatus != 200 ) {
        SearchGoogle( $start );
    }

    my $data   = $res->responseData;
    my $cursor = $data->cursor;
    my $pages  = $cursor->pages;

    printf "current page index: %s\n", $cursor->currentPageIndex;

    my @results = $data->results;
    my $found   = 0;

    foreach my $r (@results) {
        if ($r->url =~ /spaweditor.com/) {
            $found = 1;
            print "FOUND WEBSITE: on page " . $cursor->currentPageIndex . "\n";
            exit;
        }
    }

    if ( !$found ) {
        $start += 4;
        SearchGoogle( $start );
    }
}
Can someone provide an example or point me in the right direction?

Thanks

Re: REST google search issues?
by Khen1950fx (Canon) on Jan 07, 2010 at 02:37 UTC
    I tried to find a definitive answer for you, but no luck. Instead I went with the example in the pod for REST::Google. Here's the example:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use REST::Google;

    REST::Google->service('http://ajax.googleapis.com/ajax/services/search/web');
    REST::Google->http_referer('http://example.com');

    my $res = REST::Google->new(
        q     => 'perl regex',
        start => 4,
    );

    die "response status failure" if $res->responseStatus != 200;

    my $data = $res->responseData;

    use Data::Dumper::Concise;
    warn Dumper ( $data );
    OK. It gets a little tricky here. start => $start doesn't do anything except return an index of 0. So I figured the problem was indeed the "start" value. As I understand it now, each page index corresponds to a start value like this (page index = start):

    0 = 0
    1 = 4
    2 = 8
    3 = 12
    4 = 16

    As it stands, your start value is 0, so it returns 0. If you want page index 4, then

    my $res = REST::Google->new( q => 'perl regex', start => 16, );
    That will give you page index 4.
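
The mapping listed above looks like plain multiplication: start offset = page index × results per page. Here is a minimal sketch of that arithmetic, assuming rsz => 'small' yields 4 results per page and 'large' yields 8, as reported in this thread. The helper names page_to_start and start_to_page are hypothetical and not part of REST::Google.

```perl
use strict;
use warnings;

# Results per page for each rsz setting, as observed in this thread
# (assumption: 'small' => 4, 'large' => 8).
my %RESULTS_PER_PAGE = ( small => 4, large => 8 );

# Hypothetical helper: zero-based page index -> value for the start parameter.
sub page_to_start {
    my ( $page_index, $rsz ) = @_;
    my $per_page = $RESULTS_PER_PAGE{ $rsz // 'small' }
        or die "unknown rsz value\n";
    return $page_index * $per_page;
}

# Hypothetical helper: start offset -> zero-based page index.
sub start_to_page {
    my ( $start, $rsz ) = @_;
    my $per_page = $RESULTS_PER_PAGE{ $rsz // 'small' }
        or die "unknown rsz value\n";
    return int( $start / $per_page );
}

# Print the start offset for page indexes 0 through 4 with rsz => 'small'.
printf "page %d starts at %d\n", $_, page_to_start( $_, 'small' ) for 0 .. 4;
```

With 'small' this gives the same 0, 4, 8, 12, 16 sequence as the list above; with 'large' the step would be 8 instead.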
      Hmm, I'm still not sure I understand, and the documentation isn't that helpful. If I understand you correctly, the cursor represents a hash table of page references (e.g. 16 => page 4)?

      Thanks for having a look at my code.

      Another thing I don't understand is that the param rsz => 'small' returns only 4 search results per page and 'large' returns 8. How would I see all 10?

      Anyone?

        For the number of results pages, quoting the Google docs:

        Note: The maximum number of results pages is based on the type of searcher. Local search supports 4 pages (or a maximum of 32 total results) and the other searchers (Blog, Book, Image, News, Patent, Video, and Web) support 8 pages (for a maximum total of 64 results).
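
Given that cap, a search loop only needs to try a bounded set of start offsets. The sketch below assumes 4 results per page (rsz => 'small') and the 8-page limit quoted above. The names find_page and $fetch are hypothetical: the callback stands in for whatever actually performs the request (for example a wrapper around REST::Google::Search->new) and is assumed to return a list of result URLs. It uses a loop instead of recursion and gives up after the last page rather than searching forever. The stub data is made up so the example runs without network access.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the result pages until a URL matches $pattern.
# $fetch is a code ref taking a start offset and returning a list of URLs;
# in real use it would wrap the actual search request (an assumption here).
sub find_page {
    my ( $fetch, $pattern, %opt ) = @_;
    my $per_page  = $opt{per_page}  // 4;    # rsz => 'small'
    my $max_pages = $opt{max_pages} // 8;    # limit quoted from the Google docs

    for my $page ( 0 .. $max_pages - 1 ) {
        my @urls = $fetch->( $page * $per_page );
        last unless @urls;                   # no more results
        return $page if grep { $_ =~ $pattern } @urls;
    }
    return;                                  # not found within the limit
}

# Stub fetcher with made-up data: 32 fake URLs, target placed at offset 17.
my @fake_results = map { "http://site$_.example/" } 0 .. 31;
$fake_results[17] = 'http://www.spaweditor.com/';
my $stub = sub {
    my ($start) = @_;
    return () if $start > $#fake_results;
    my $end = $start + 3;
    $end = $#fake_results if $end > $#fake_results;
    return @fake_results[ $start .. $end ];
};

my $page = find_page( $stub, qr/spaweditor\.com/ );
print defined $page ? "found on page index $page\n" : "not found\n";
```

Passing the fetcher in as a code ref keeps the paging logic separate from the HTTP call, so the loop can be checked against canned data before pointing it at the real service.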