It seems as though you're having the same problem with get_text that I was having. When you call the get_text method, it "massages" the contents of the text. For instance, I had an item with a link in it, and it removed the link and just gave me the plain text. In your case it's converting the &nbsp; entities to something else (I got things like á2á on WinXP, ActivePerl 5.6). Here's a pseudo work-around:
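use strict;
use HTML::TokeParser;

# Slurp the whole DATA section in one read
local $/;
my $lines = <DATA>;
my $p = HTML::TokeParser->new(\$lines);
while (my $token = $p->get_token) {
    # Only text tokens ('T') are of interest; $token->[1] holds their text
    print "$1\n" if $token->[0] eq 'T' && $token->[1] =~ /^ (\d{1,2}) $/;
}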
__END__
<td>1</td>
<td> 2 </td>
<td>10</td>
<td> 20 </td>
Output:
2
20
Note: it will only work if you can guarantee that the data comes directly after the <td> tag (i.e., no <div>, <p>, etc.).
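To see the "massaging" described at the top for yourself, here is a minimal sketch that compares a raw text token from get_token with what get_text returns for the same cell. The helper names raw_text and decoded_text are made up for illustration, and the sample cell is an assumption about what the original data looked like. The decoded characters are printed as hex code points so the result doesn't depend on your console's code page.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: return the first text token exactly as get_token hands it over.
sub raw_text {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html);
    while (my $token = $p->get_token) {
        return $token->[1] if $token->[0] eq 'T';
    }
    return;
}

# Hypothetical helper: return the text right after the first <td>, as get_text reports it.
sub decoded_text {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html);
    $p->get_tag('td') or return;
    return $p->get_text;
}

# Assumed sample cell, not taken from the original question.
my $cell = '<td>&nbsp;2&nbsp;</td>';
printf "raw:     %s\n", raw_text($cell);
printf "decoded: %s\n", join ' ', map { sprintf '%02X', ord } split //, decoded_text($cell);
```

If the two lines differ, get_text has decoded the entities, and a pattern written against the HTML source will no longer match its result.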
HTH
Update: Better code now. What I had before was this: if you find a td tag, get the next token, which should be text, and see if it matches the pattern. Now, instead, I check whether the token is a text token and whether it matches our pattern. That should be more reliable.
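If you also want to ignore matching text that sits outside a table cell, one possible variation is to keep track of whether you're inside a td while walking the tokens. This is only a sketch: extract_cells is a hypothetical helper name, the sample markup is invented, and it keeps the same pattern as the code above.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: collect matching numbers from text tokens found inside <td> ... </td>.
sub extract_cells {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html);
    my $depth = 0;    # how many <td> elements are currently open
    my @found;
    while (my $token = $p->get_token) {
        my ($type, $value) = @$token;
        if ($type eq 'S' && $value eq 'td') {
            $depth++;
        }
        elsif ($type eq 'E' && $value eq 'td') {
            $depth-- if $depth;
        }
        elsif ($type eq 'T' && $depth && $value =~ /^ (\d{1,2}) $/) {
            push @found, $1;
        }
    }
    return @found;
}

# Invented sample: one match outside a cell, one nested in a <div>, one plain cell.
print "$_\n" for extract_cells('<p> 7 </p><td><div> 2 </div></td><td> 20 </td>');
```

Because the check is on text tokens rather than on "the token right after td", a div or p wrapped around the number inside the cell doesn't get in the way.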
Update 2: added use strict; :P
Update 2.5: code change thanks to Aristotle
--
Rock is dead. Long live paper and scissors!