This script has proven to be more difficult than originally planned and more difficult than it's worth :(

I get:

NAME: Text: Delay:
NAME: Text: Delay:
NAME: Text: Delay:
(the same empty record, repeated many more times)

So I think I misused part of your script, because I'm not getting errors or anything. Did I mess up anywhere with this?

use strict;
use CGI qw/:standard/;
use HTML::Tree;
use LWP::Simple;

print header, start_html('test printing');

#my $count;
#until ($count eq "5") {
#$count++;
my $funky = "http://www.allpoetry.com/chat//page=1";
my $content = get($funky);

my $tree = HTML::Tree->new();
$tree->parse($content);

# retrieve the text and split into lines
my @lines = split "<br>", $tree->as_text;

local $/;
my @good_lines;
my $good_lines;

for my $lines (@lines) {
    $lines =~ s/\)/\)<br>/g;

    while($lines =~ m/Next Chatter \>(.*?)\< Previous Chatter/gs){
        $good_lines = $1;
        push @good_lines,$good_lines;
    }

    foreach (@good_lines){
        my @lines = split /<br>/;
        foreach (@lines){
            next unless $_;
            /([^:]+): (.+) \((\d+) minutes ago\)/;
            my( $name, $text, $delay ) = ( $1, $2, $3 );
            print "NAME:$name\nText:$text\nDelay:$delay\n\n";
        }
    }
}
Sorry I keep bugging you, I promise this'll be the last time (I think I'll give up for a while if nothing else works ((note to self: this is why you stopped using HTML:: modules in the first place)) ). Thanks for your help!
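One thing that might be worth checking: in the code above, the result of the inner match is never tested, so the print runs whether or not the regex matched, and $1, $2 and $3 get no fresh values on a failed match. That may be why every field comes out empty. Below is a minimal sketch of testing the match first. The sample lines and the helper name parse_line are made up for illustration and are not real data from the site.

```perl
use strict;
use warnings;

# Hypothetical sample lines shaped like the "name: text (N minutes ago)"
# records the regex in the post expects. Not real data from the site.
my @lines = (
    'alice: hello there (3 minutes ago)',
    'no record on this line',
    'bob: anyone around? (12 minutes ago)',
);

# parse_line is a made-up helper name for this sketch.
# Returns (name, text, delay) when the line matches, an empty list otherwise.
sub parse_line {
    my ($line) = @_;
    if ( $line =~ /([^:]+): (.+) \((\d+) minutes ago\)/ ) {
        return ( $1, $2, $3 );
    }
    return;
}

for my $line (@lines) {
    my @fields = parse_line($line);
    if (@fields) {
        my ( $name, $text, $delay ) = @fields;
        print "NAME:$name\nText:$text\nDelay:$delay\n\n";
    }
    else {
        # Printing the unmatched line shows what the regex was really given.
        print "NO MATCH: $line\n\n";
    }
}
```

If every line lands in the NO MATCH branch, the text reaching the regex is probably not in the shape it expects, and the printed lines should show what it actually looks like.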

In reply to Re: Re: html parsing/regex by coldfingertips
in thread html parsing/regex by coldfingertips