An alternative would be to request a JSON object rather than the rendered web page. Do this by appending .json to the end of your URL, like so:

https://www.reddit.com/r/unitedkingdom/comments/58m2hs/i_daniel_blake_is_released_today/.json

I've no idea what tool you're using further down the line for analysis, but HTML seems like an odd format in which to store such data. Here is a short example that simply prints the name of each poster and their comment:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Mojo::UserAgent;

    my $url = 'https://www.reddit.com/r/unitedkingdom/comments/58m2hs/i_daniel_blake_is_released_today/.json';

    # Fetch the page as JSON and decode it into a Perl data structure
    my $ua   = Mojo::UserAgent->new;
    my $data = $ua->get( $url )->res->json;

    # The response is a list of listings; each keeps its items under data->children
    foreach my $comment ( @{$data} ) {
        foreach my $child ( @{ $comment->{'data'}->{'children'} } ) {
            print $child->{'data'}->{'author'} . " posted:\n";
            print $child->{'data'}->{'body'} . "\n" if ( $child->{'data'}->{'body'} );
        }
    }

You'll need the Mojo::UserAgent module:

    # install via cpan
    cpan Mojo::UserAgent

    # or via cpanm
    cpanm Mojo::UserAgent

From the brief example above you can see how to get just what you want, or add some other bells and whistles. The example isn't particularly pretty in its output; I'll leave that as an exercise for you. You can examine the JSON in a browser (some plugins exist to prettify the content), or you can use something like json_pp to print it from the command line.
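One such bell or whistle might be following replies, since the example above only prints the top-level items. Here is a minimal sketch, assuming each comment's replies field is either an empty string or another listing of the same shape (that is my reading of the JSON, so check it against the real response). The sub name walk_comments and the inline sample data are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;    # core module, used here only to decode the sample

# Hypothetical sample standing in for one listing of the real response.
my $sample = decode_json(<<'JSON');
{ "kind": "Listing", "data": { "children": [
  { "kind": "t1", "data": { "author": "alice", "body": "Top-level comment",
      "replies": { "kind": "Listing", "data": { "children": [
        { "kind": "t1", "data": { "author": "bob", "body": "A reply", "replies": "" } }
      ] } } } },
  { "kind": "more", "data": { "children": [ "abc123" ] } }
] } }
JSON

# Return one "author: body" line per comment, indented by reply depth.
sub walk_comments {
    my ( $listing, $depth ) = @_;
    $depth //= 0;
    my @lines;
    foreach my $child ( @{ $listing->{data}{children} } ) {
        my $c = $child->{data};
        next unless defined $c->{body};    # skip entries without a comment body
        push @lines, ( '  ' x $depth ) . "$c->{author}: $c->{body}";
        # Assumption: replies is '' when there are none, else a nested listing.
        push @lines, walk_comments( $c->{replies}, $depth + 1 )
            if ref $c->{replies} eq 'HASH';
    }
    return @lines;
}

print "$_\n" for walk_comments($sample);
```

In the fetching script you would call walk_comments on each element of @{$data} instead of on the sample. Entries without a body, such as the post itself, are skipped, much as the original loop only prints a body when one exists.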

Update: I read some other comments you made. If you're trying to do this for various sub-reddits, you can easily adapt the above example to:
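A rough sketch of one such adaptation (my own guess, not code from the post), assuming that a sub-reddit's front page also answers to the .json suffix and returns a single listing whose children carry title and author fields. The helper names listing_url and post_lines are made up for illustration, and the sub-reddit names are taken from the command line:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the JSON URL for a sub-reddit's front page (assumed URL pattern).
sub listing_url {
    my ($subreddit) = @_;
    return "https://www.reddit.com/r/$subreddit/.json";
}

# Turn a decoded listing into "author: title" lines.
sub post_lines {
    my ($listing) = @_;
    return map  { "$_->{data}{author}: $_->{data}{title}" }
           grep { defined $_->{data}{title} }
           @{ $listing->{data}{children} || [] };
}

# Only touch the network when sub-reddit names are given as arguments.
if (@ARGV) {
    require Mojo::UserAgent;
    my $ua = Mojo::UserAgent->new;
    foreach my $subreddit (@ARGV) {
        my $listing = $ua->get( listing_url($subreddit) )->res->json;
        unless ( ref $listing eq 'HASH' ) {
            warn "No usable JSON for $subreddit\n";
            next;
        }
        print "== $subreddit ==\n";
        print "$_\n" for post_lines($listing);
    }
}
```

Saved under a name of your choosing, say subreddits.pl, it could be run as: perl subreddits.pl unitedkingdom perl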


In reply to Re: Question regarding web scraping by marto
in thread Question regarding web scraping by Lisa1993
