in reply to Re: getting content of an https website
in thread getting content of an https website
Thanks, tangent, that's got it. With a little help from HTML::Tree, this suffices:
    use strict;
    use warnings;
    use feature 'say';
    use LWP::UserAgent;
    use HTML::Tree;

    my $url = 'https://berniesanders.com/issues/racial-justice/';

    # Present a browser-like identity; the string is sent as the User-Agent header.
    my $ua = LWP::UserAgent->new();
    $ua->agent('Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/31.0');

    my $response = $ua->get($url);
    die 'Request failed: ' . $response->status_line unless $response->is_success;
    my $content = $response->content;

    if ( $content =~ m/enemy/i ) {
        say "enemy found";
    }
    else {
        my $tree = HTML::Tree->new();
        $tree->parse($content);
        $tree->eof;    # signal end of input so the parser finishes the tree
        print $tree->as_text;
    }
I've seen code like this before, and I thought I actually needed to have the browser in question installed, but apparently not. Am I correct to think that the string needs to have nothing to do with the actual machine it runs on? Is the string you used a good general-purpose choice for such queries?
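As far as I can tell, the agent string is nothing more than text that ends up in the User-Agent request header. Here is a minimal sketch to check that without touching the network. It assumes that `prepare_request` applies the agent's default headers to a request object the way I expect, and the variable names are mine:

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Same string as in the script above; nothing here inspects the local machine.
my $agent = 'Mozilla/5.0 (X11; Linux i586; rv:31.0) Gecko/20100101 Firefox/31.0';

my $ua = LWP::UserAgent->new();
$ua->agent($agent);

# Build the request but do not send it.
my $request = HTTP::Request->new(
    GET => 'https://berniesanders.com/issues/racial-justice/' );

# Assumption: prepare_request copies the default headers (User-Agent included)
# onto the request without performing any network I/O.
$ua->prepare_request($request);

# Show the request line and headers that would go out on the wire.
print $request->as_string;
```

If the header printed is the string I set, then the server only ever sees whatever I choose to claim.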
I'd like to raise a related question, given that we're barely warmed up here. I've always wanted the functionality of having mechanized events happen and then having an actual browser open. I don't know if one browser works better than another for this, but I use Chrome for most of my day-to-day browsing. Clearly, I would have to define a path to the executable, which I believe is here:
    Directory of C:\Program Files (x86)\Google\Chrome\Application

    08/22/2015  03:42 AM    <DIR>          .
    08/22/2015  03:42 AM    <DIR>          ..
    08/14/2015  12:43 PM    <DIR>          44.0.2403.155
    08/22/2015  03:42 AM    <DIR>          44.0.2403.157
    08/17/2015  10:23 PM           813,896 chrome.exe
    06/03/2013  04:26 PM            18,546 master_preferences
    06/19/2014  02:37 AM    <DIR>          Plugins
    08/22/2015  03:42 AM               399 VisualElementsManifest.xml
How might I open the url from the original post in this browser?
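For what it's worth, here is the kind of thing I have in mind, as a rough sketch and not a tested solution. It assumes that chrome.exe lives in the directory listed above and that it accepts a URL as its first command-line argument. It uses the Windows-only `system(1, ...)` form, which starts the process without waiting for it to exit. The variable names are my own:

```perl
use strict;
use warnings;

# Assumed location, taken from the directory listing above.
my $chrome = 'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe';
my $url    = 'https://berniesanders.com/issues/racial-justice/';

# Passing a list avoids shell quoting trouble with the spaces in the path.
my @cmd = ( $chrome, $url );

if ( $^O eq 'MSWin32' && -e $chrome ) {
    # On Windows, system(1, LIST) spawns the program and returns immediately.
    system( 1, @cmd );
}
else {
    # Anywhere else, or if the path is wrong, just report what would be run.
    print "Would run: @cmd\n";
}
```

The mechanized part (LWP, form filling and so on) would run first, and this would come last to hand the final URL to the real browser.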
Re^3: getting content of an https website
by Anonymous Monk on Sep 01, 2015 at 03:40 UTC
by Aldebaran (Curate) on Sep 01, 2015 at 07:54 UTC
Re^3: getting content of an https website
by Anonymous Monk on Sep 07, 2015 at 17:46 UTC