Hello, Monks...

I am writing a small script to hashify an HTML table. The table is large, but completely homogeneous (thank goodness). So, without further ado, I give you the HTML:

<tr><td><b><a href=i386/zh-xcin-2.3.04.tgz-long.html>zh-xcin-2.3.04.tgz</a></b></td><td>&nbsp&nbsp&nbsp <i>chinese input utility for X </i></td><td>[ <a href=ftp://ftp.openbsd.org/pub/OpenBSD/2.8/packages/i386/zh-xcin-2.3.04.tgz>FTP Site 1</a> ]</td><td> [ <a href=ftp://ftp1.usa.openbsd.org/pub/OpenBSD/2.8/packages/i386/zh-xcin-2.3.04.tgz>FTP Site 2</a> ]</td></tr>
So, for simplicity I zapped the \r\n that was lurking in there and now have something that's a big brick of HTML (which I will spare all of you; nobody ever said HTML was pretty). So I have the following code:

    my @fields = split '<tr><td><b>', $input;
    foreach my $field (@fields) {
        # what i really wanted to do was...
        # (undef, $names{$1}) =~ m// but that didnt work either
        # so I added the $foo and $bar.
        my ($foo, $bar) = $field =~ m!^<a href=.*>(.*)</a></b></td><td>&nbsp{3}<i>(.*)</i>.*$!x;
        $names{$foo} = $bar;
        print "$foo == $bar\n";
    }

If I print $field I do get my HTML, so I know $field is okay... I think the problem is the regex. In fact, I'm 90% sure it's the regex. But where is it wrong given the data? It looks fine to me.

Thanks
brother dep.

--
transcending "coolness" is what makes us cool.
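
For what it's worth, here is a minimal sketch of one way the loop might be written so that it matches the sample row. It assumes every row looks exactly like the one quoted above, and the names $name and $desc are made up for illustration. Three things in the original pattern stand out against that data: under /x the space in "<a href=" is ignored unless it is escaped, the {3} quantifier applies only to the final "p" of "&nbsp" rather than to the whole entity, and the sample has a space between the last "&nbsp" and "<i>" that the pattern does not allow for.

```perl
use strict;
use warnings;

# One row of the table on a single line, copied from the sample in the post.
my $input =
      '<tr><td><b><a href=i386/zh-xcin-2.3.04.tgz-long.html>zh-xcin-2.3.04.tgz</a></b></td>'
    . '<td>&nbsp&nbsp&nbsp <i>chinese input utility for X </i></td>'
    . '<td>[ <a href=ftp://ftp.openbsd.org/pub/OpenBSD/2.8/packages/i386/zh-xcin-2.3.04.tgz>FTP Site 1</a> ]</td></tr>';

my %names;
for my $field (split /<tr><td><b>/, $input) {
    next unless length $field;    # split yields an empty leading field

    # $name and $desc are illustrative names, not from the original code.
    my ($name, $desc) = $field =~ m{
        ^<a\ href=[^>]*>        # under /x a literal space has to be escaped
        ([^<]*)                 # package name (the link text)
        </a></b></td>\s*<td>
        (?:&nbsp;?){3}\s*       # group first, so the count covers the whole entity
        <i>\s*(.*?)\s*</i>      # description, with surrounding spaces trimmed
    }x or next;                 # skip rows that do not match

    $names{$name} = $desc;
    print "$name == $desc\n";
}
```

On the sample row this should print "zh-xcin-2.3.04.tgz == chinese input utility for X". The negated classes [^>]* and [^<]* are a design choice rather than a requirement: they keep each piece of the match inside its own tag instead of relying on a greedy .* backtracking to the right spot.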


In reply to Regex Exercise by deprecated
