The following is a malformed regular expression:
while ($CONTENT =~ <div class=\"usertext-body may-blank-within md-container \"><div class=\"md\">(.+?)<\/div><\/div><\/form><ul class=\"flat-list buttons\"> //gs )
At the very least, it is missing the leading s/.
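For comparison, here is a minimal sketch of a well-formed version of that loop. It assumes the goal is to visit each comment body in turn, so it uses a plain match (m{...}gs) rather than a substitution; whether a substitution was actually intended is not clear from the snippet. The sample input in $CONTENT is made up for illustration and stands in for the fetched page.

```perl
use strict;
use warnings;

# Hypothetical sample input; in the real script $CONTENT would hold the fetched page.
my $CONTENT = join '',
    '<div class="usertext-body may-blank-within md-container "><div class="md">',
    '<p>First comment</p>',
    '</div></div></form><ul class="flat-list buttons">',
    '<div class="usertext-body may-blank-within md-container "><div class="md">',
    '<p>Second comment</p>',
    '</div></div></form><ul class="flat-list buttons">';

my @comments;

# m{...} as the delimiter means neither the slashes nor the quotes need escaping.
# /g continues from the previous match position on each pass, /s lets . match newlines.
while ($CONTENT =~ m{<div class="usertext-body may-blank-within md-container "><div class="md">(.+?)</div></div></form><ul class="flat-list buttons">}gs) {
    push @comments, $1;
}

print "$_\n" for @comments;
```

Even when well-formed, a pattern like this breaks as soon as the page markup changes slightly, which is why a real HTML parser is the more robust choice.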
Personally, I suggest extracting the content with HTML::TreeBuilder and XPath or CSS selectors (via HTML::TreeBuilder::XPath and HTML::Selector::XPath).
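A rough sketch of the parser-based approach with HTML::TreeBuilder::XPath might look like the following. The sample markup and the XPath expression are assumptions modelled on the class names in the regular expression above; the real page structure may differ, so the query would need adjusting against the actual HTML.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample markup modelled on the class names in the regex above.
my $html = <<'HTML';
<html><body>
<div class="usertext-body may-blank-within md-container "><div class="md"><p>First comment</p></div></div>
<div class="usertext-body may-blank-within md-container "><div class="md"><p>Second comment</p></div></div>
</body></html>
HTML

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# Select each div.md directly inside a div whose class attribute mentions usertext-body,
# then reduce each matching element to its text content.
my @comments = map { $_->as_text }
    $tree->findnodes('//div[contains(@class, "usertext-body")]/div[@class="md"]');

print "$_\n" for @comments;

# Free the tree explicitly once it is no longer needed.
$tree->delete;
```

The query only names the two elements that matter, so unrelated changes elsewhere on the page (extra attributes, whitespace, surrounding form markup) do not affect it.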
Also note that Reddit has an API available, so you may not need to scrape at all and can get the comments in a machine-readable format directly.
There are also many Reddit modules on CPAN, and Reddit::Client appears to use the Reddit API.
In reply to Re: Question regarding web scraping by Corion, in thread Question regarding web scraping by Lisa1993