
Hello Datz_cozee75,

I'm getting nothing

Not surprising, really, since the (new!) target website contains no <div> tags. :-)

Removing the outer foreach and calling look_down(_tag => 'p') on $tree, I get:

    17:18 >perl 1631_SoPW.pl
    Wide character in say at 1631_SoPW.pl line 54.
    You can view the sky as seen from various cities around the globe by clicking on the name of a city below. If you don't know the latitude and longitude of your observing site, click on the closest city in the table—unless you're far away from that city, the sky map will be reasonably accurate.
    nix!
    17:19 >

Which seems about right: a search through the source for that website finds only this one <p>...</p>-delimited paragraph.

Hope that helps,

Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re^4: regex to get random quote
by Aldebaran (Curate) on May 17, 2016 at 08:02 UTC

    Thx, I didn't mean to switch the value of $site in this instance. I'm able to look at output with this:

        #! /usr/bin/perl
        use warnings;
        use strict;
        use 5.010;
        use open ':std', OUT => ':utf8';
        use HTML::TreeBuilder 5 -weak;

        my $site = 'http://motivationgrid.com/50-inspirational-quotes-to-live-by/';
        my $tree = HTML::TreeBuilder->new_from_url($site);

        foreach my $e ($tree->look_down(_tag => 'p')) {
            say $e->as_text;
        }

      Great! Now, with a little filtering and a bit of cleanup:

          #! perl
          use strict;
          use warnings;
          use open ':std', OUT => ':utf8';
          use HTML::TreeBuilder 5 -weak;

          my $site = 'http://motivationgrid.com/50-inspirational-quotes-to-live-by/';
          my $tree = HTML::TreeBuilder->new_from_url($site);
          my @quotes;

          for ($tree->look_down(_tag => 'p')) {
              if ((my $t = $_->as_text) =~ m{ ^ \d+ \. \s+ }x) {
                  $t =~ s{ \x{2019} }{'}gx;
                  $t =~ s{ \xA0 }{ }gx;
                  $t =~ s{ \x{2013} }{--}gx;
                  push @quotes, $t;
              }
          }

          print "$_\n" for @quotes;

      you’ve got 50 motivational quotes.
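To see what the filter and the three substitutions do without fetching the page, here is a minimal offline sketch. The strings in @paragraphs are hypothetical stand-ins for the values returned by as_text (only the "38." quote is taken from this thread, and its curly apostrophe is an assumption); the regexes are the same ones used in the script.

```perl
#! perl
use strict;
use warnings;
use open ':std', OUT => ':utf8';

# Hypothetical paragraph texts standing in for the $_->as_text results.
my @paragraphs = (
    "Share this article",                                   # no leading number: skipped
    "38. Don\x{2019}t live your fears, live your dreams.",  # curly apostrophe (assumed)
    "7. A made-up\xA0quote \x{2013} Somebody",              # no-break space and en dash
);

my @quotes;
for my $t (@paragraphs) {
    # Keep only paragraphs that start with "<digits>. " (a numbered quote).
    next unless $t =~ m{ ^ \d+ \. \s+ }x;

    $t =~ s{ \x{2019} }{'}gx;    # right single quotation mark -> ASCII apostrophe
    $t =~ s{ \xA0 }{ }gx;        # no-break space -> ordinary space
    $t =~ s{ \x{2013} }{--}gx;   # en dash -> two hyphens
    push @quotes, $t;
}

print "$_\n" for @quotes;
```

Because of /x, the spaces inside each pattern are ignored; the replacement halves ({'}, { }, {--}) are taken literally.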

      Hope that helps,

      Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

        Thanks Athanasius, this completes the task set forth in this thread. I have to confess, however, that I'm having trouble seeing why my attempt to trim it further does not alter the ultimate string:

            C:\cygwin64\home\Fred\pages2\list>perl scraper4.pl
            38. Don't live your fears, live your dreams.
            38. Don't live your fears, live your dreams.

            C:\cygwin64\home\Fred\pages2\list>type scraper4.pl
            #! perl
            use strict;
            use warnings;
            use open ':std', OUT => ':utf8';
            use HTML::TreeBuilder 5 -weak;

            my $site = 'http://motivationgrid.com/50-inspirational-quotes-to-live-by/';
            my $tree = HTML::TreeBuilder->new_from_url($site);
            my @quotes;

            for ($tree->look_down(_tag => 'p')) {
                if ((my $t = $_->as_text) =~ m{ ^ \d+ \. \s+ }x) {
                    $t =~ s{ \x{2019} }{'}gx;
                    $t =~ s{ \xA0 }{ }gx;
                    $t =~ s{ \x{2013} }{--}gx;
                    push @quotes, $t;
                }
            }

            my $randomelement = $quotes[rand @quotes];
            print "$randomelement\n";
            $randomelement =~ s/^ \d+ \. \s+//;
            print "$randomelement\n";

        Otherwise, I'm really pleased.
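A likely explanation for the unchanged string: the final substitution in scraper4.pl has no /x modifier, so every space in the pattern has to match a literal space, and a string that begins with "38." can never match. The sketch below hard-codes the quote from the output above as a stand-in for $randomelement (an assumption, so that it runs without the network) and compares the two forms.

```perl
#! perl
use strict;
use warnings;

# Stand-in for $quotes[rand @quotes], hard-coded so the sketch runs offline.
my $original = "38. Don't live your fears, live your dreams.";

# Without /x the pattern demands a literal space right after the start of
# the string, so nothing matches and the copy is left unchanged.
(my $without_x = $original) =~ s/^ \d+ \. \s+//;

# With /x the whitespace in the pattern is ignored, so the leading
# "<digits>. " prefix is matched and removed.
(my $with_x = $original) =~ s/^ \d+ \. \s+//x;

print "$without_x\n";
print "$with_x\n";
```

The same effect can be had without /x by writing the pattern compactly as s/^\d+\.\s+//.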