in reply to Re: running an example script with WWW::Mechanize* module
in thread running an example script with WWW::Mechanize* module
"If the site doesn't like what your script is doing when you're signed in, they can let you know. I really would like to work up this example, but with WMC instead. My partner and I will watch Netflix, HBO, Amazon, and we're always trying to match the actors up to where we've seen them last, so I will get on my android and make the actual keystrokes on other occasions."
Have you looked at the many utilities on CPAN for scraping data from IMDB? While packages exist, here is a short (suboptimal) proof of concept for accessing the data, just to get you started:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use feature 'say';
    use Mojo::URL;
    use Mojo::Util qw(trim);
    use Mojo::UserAgent;

    my $imdburl = 'http://www.imdb.com/search/title?title=Caddyshack';

    # pretend to be a browser
    my $uaname = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.93 Safari/537.36';

    my $ua = Mojo::UserAgent->new;
    $ua->max_redirects(5)->connect_timeout(20)->request_timeout(20);
    $ua->transactor->name( $uaname );

    # find search results
    my $dom = $ua->get( $imdburl )->res->dom;

    # assume first match
    my $filmurl = $dom->find('a[href^=/title]')->first->attr('href');

    # extract film id
    my $filmid = Mojo::URL->new( $filmurl )->path->parts->[-1];

    # get details of film
    $dom = $ua->get( "https://www.imdb.com/title/$filmid/" )->res->dom;

    # print film details
    say trim( $dom->at('div.title_wrapper > h1')->text ) . ' (' . trim( $dom->at('#titleYear > a')->text ) .')';

    # print actor/character names
    foreach my $cast ( $dom->find('table.cast_list > tr:not(:first-child)')->each ){
        say trim ($cast->at('td:nth-of-type(2) > a')->text ) . ' as ' . trim ( $cast->at('td.character')->all_text );
    }
Output:
    Caddyshack (1980)
    Chevy Chase as Ty Webb
    Rodney Dangerfield as Al Czervik
    Ted Knight as Judge Elihu Smails
    Michael O'Keefe as Danny Noonan
    Bill Murray as Carl Spackler
    Sarah Holcomb as Maggie O'Hooligan
    Scott Colomby as Tony D'Annunzio
    Cindy Morgan as Lacey Underall
    Dan Resin as Dr. Beeper
    Henry Wilcoxon as The Bishop
    Elaine Aiken as Mrs. Noonan
    Albert Salmi as Mr. Noonan
    Ann Ryerson as Grace
    Brian Doyle-Murray as Lou Loomis
    Hamilton Mitchell as Motormouth
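One caveat with the proof of concept: it calls `->first->attr(...)` and `->at(...)->text` directly, and both `first` (on an empty collection) and `at` (with no match) return undef, so the script dies as soon as the site's markup changes. A minimal sketch of a guard follows. The helper name `text_at` and the sample markup are my own assumptions for illustration, not anything IMDB actually serves.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;
use Mojo::Util qw(trim);

# Hypothetical helper: trimmed text of the first element matching
# $selector, or $default when at() finds nothing and returns undef.
sub text_at {
    my ( $dom, $selector, $default ) = @_;
    my $node = $dom->at($selector);
    return defined $node ? trim( $node->text ) : $default;
}

# Made-up markup standing in for a film page that has a title but no year.
my $sample = Mojo::DOM->new(
    '<div class="title_wrapper"><h1> Example Film </h1></div>'
);

say text_at( $sample, 'div.title_wrapper > h1', 'unknown title' );
say text_at( $sample, '#titleYear > a',         'unknown year' );
```

In the script you would swap each bare `$dom->at(...)->text` for a call like this, so a missing element prints a placeholder instead of killing the run.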
See also Mojo::UserAgent, Mojo::DOM, Mojo::URL and ojo (should you want one-liners). You could adapt the above to print all matches and prompt for which one you want, rather than assuming the first one (since there are remakes, sequels, prequels, etc.), or to let you select an actor and return the details of all the other films/shows they have been in.
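To give an idea of the "print all matches and prompt" adaptation, here is a rough sketch. It works on a made-up HTML snippet so it runs without touching the network; the sub names `list_matches` and `choose_match` and the sample markup are assumptions of mine, and the real search page will need its own selector tuning.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Mojo::DOM;

# Collect [title, href] pairs for every title link, skipping links with
# no text (e.g. poster images) and hrefs that were already seen.
sub list_matches {
    my ($dom) = @_;
    my ( %seen, @matches );
    for my $link ( $dom->find('a[href^="/title/"]')->each ) {
        my $href  = $link->attr('href');
        my $title = $link->text;
        $title =~ s/^\s+|\s+$//g;
        next if $title eq '' || $seen{$href}++;
        push @matches, [ $title, $href ];
    }
    return @matches;
}

# Map a 1-based answer typed by the user to a match; returns undef
# (in scalar context) if the answer is not a number in range.
sub choose_match {
    my ( $answer, @matches ) = @_;
    return unless defined $answer && $answer =~ /^\s*(\d+)\s*$/;
    my $n = $1;
    return if $n < 1 || $n > @matches;
    return $matches[ $n - 1 ];
}

# Hypothetical markup standing in for a search results page.
my $html = <<'HTML';
<div>
  <a href="/title/tt0000001/"><img src="poster.jpg"></a>
  <a href="/title/tt0000001/">First Example</a>
  <a href="/title/tt0000002/">Second Example</a>
  <a href="/name/nm0000001/">Not a title</a>
</div>
HTML

my @matches = list_matches( Mojo::DOM->new($html) );
printf "%d: %s\n", $_ + 1, $matches[$_][0] for 0 .. $#matches;

# In an interactive script the answer would come from <STDIN>;
# it is hard-coded here so the sketch runs unattended.
my $pick = choose_match( '2', @matches );
print "You picked $pick->[0] ($pick->[1])\n" if $pick;
```

In the real script, `$pick->[1]` would take the place of `$filmurl`, and the rest of the lookup stays as it is.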
Update: If you would prefer some sort of web interface to the results, wrap the above in a Mojolicious::Lite app.
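A bare-bones sketch of what that wrapper might look like. The `/film` route and the `lookup_film` sub are hypothetical names of mine; `lookup_film` is a stub that returns an empty cast list so the app runs offline, and in practice it is where the Mojo::UserAgent lookups from the script would go.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Mojolicious::Lite;

# Hypothetical stub: replace the body with the search and cast
# extraction from the script. Returns no cast so no network is needed.
sub lookup_film {
    my ($title) = @_;
    return { query => $title, cast => [] };
}

# GET /film?title=Caddyshack renders the lookup result as JSON
get '/film' => sub {
    my $c     = shift;
    my $title = $c->param('title');

    # Reject requests without a usable title parameter
    return $c->render( json => { error => 'title parameter required' }, status => 400 )
        unless defined $title && length $title;

    $c->render( json => lookup_film($title) );
};

app->start;
```

Saved as, say, film.pl, it can be started with `perl film.pl daemon` and queried at http://127.0.0.1:3000/film?title=Caddyshack. JSON keeps the sketch short; a template with a search form would be the obvious next step.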