spx2 has asked for the wisdom of the Perl Monks concerning the following question:
I want to parse a piece of HTML with WWW::Mechanize ($mech is a WWW::Mechanize object). There is a string in the page that I want to capture, but none of the regex approaches I have tried seem to work.
I will paste the relevant piece of the HTML here; there are no other pieces like it on the page.
<div style="font-size: 14px; font-weight: bold; border-bottom: 1px solid black; margin-bottom: 5px; padding-bottom: 2px;">
Cristina's Stats
<\/div>
Now I am kind of suspicious of that ' and think it may be messing things up, but I am not sure. OK, we will talk about this later.
The following is the code I wrote to match the HTML above:
$_ = qq/
<div style="font-size: 14px; font-weight: bold; border-bottom: 1px solid black; margin-bottom: 5px; padding-bottom: 2px;">
Cristina's Stats
<\/div>
/;
/>.(.*)s Stats.*<\/div>/s;
print $1;
What I am sure of is that the regex does skip over the newline and goes on to take the name Cristina, which is what I want it to match, so I am pretty close. I am not sure how WWW::Mechanize comes up with ' : is it represented as a single character, or does it appear as written in $mech->content? (I did not check that... :| )
Hmmm, look how PerlMonks displays it: '
The fact is that the regex works fine with the code above, but against the real web content in $mech->content it does not work as expected: it does not match anything at all.
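One thing worth checking, offered as a guess rather than a diagnosis: a bare /.../ tests $_, not $mech->content, so the pattern has to be bound to the page text with =~. Beyond that, on a full page the greedy (.*) under /s starts right after the first ">" in the document, so $1 can hold far more than the name. The sketch below uses an invented page (the html, body and p wrapper is hypothetical, since the real page is not shown) to illustrate both points.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $mech->content: the div from the post wrapped
# in invented surrounding markup (the real page is not shown in the post).
my $page = <<'HTML';
<html><body>
<p>intro</p>
<div style="font-weight: bold;">
Cristina's Stats
</div>
</body></html>
HTML

# Bind the pattern to the string explicitly; a bare /.../ tests $_ instead.
my ($captured) = $page =~ />.(.*)s Stats.*<\/div>/s;

# The greedy (.*) started right after the first ">" in the page,
# so the capture includes the earlier markup as well as the name.
print "captured ", length($captured), " characters\n";
print "includes earlier markup\n" if $captured =~ /<p>intro/;
```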
How can I fix this regex? Or what other WWW::Mechanize methods or properties could help solve the problem?
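A minimal sketch of one possible tightening, under stated assumptions: the apostrophe may arrive either as a literal ' or as an entity such as &#39; or &apos; (the post does not show the raw bytes), and the name sits in the same text node as "s Stats". The helper name stats_owner is made up for illustration and is not part of WWW::Mechanize.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize): returns the name that
# precedes "'s Stats", whether the apostrophe is literal or an entity.
sub stats_owner {
    my ($html) = @_;
    return unless defined $html;

    # [^<>]*? keeps the capture inside a single text node and is non-greedy;
    # the alternation accepts ', &#39; (optional leading zeros), or &apos;.
    if ($html =~ m{>\s*([^<>]*?)(?:'|&#0*39;|&apos;)s\s+Stats\s*<}) {
        return $1;
    }
    return;
}

# Two sample inputs modelled on the snippet in the post.
my $literal = qq{<div style="x">\nCristina's Stats\n</div>};
my $entity  = qq{<div style="x">\nCristina&#39;s Stats\n<\\/div>};

print stats_owner($literal), "\n";
print stats_owner($entity),  "\n";
```

With a real object this would be called as stats_owner($mech->content). Using \s* instead of a single "." tolerates any amount of whitespace after the tag, and excluding < and > from the capture stops it from running across earlier markup.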
Replies are listed 'Best First'.

Re: regex and WWW::Mechanize parsing a ->content of mechanize object
by Cody Pendant (Prior) on Jul 01, 2007 at 03:16 UTC

Re: regex and WWW::Mechanize parsing a ->content of mechanize object
by c4onastick (Friar) on Jul 01, 2007 at 02:51 UTC
by spx2 (Deacon) on Jul 01, 2007 at 03:07 UTC