
Hello, Monks...

I am writing a small script to hashify an HTML table. The table is large, but completely homogeneous (thank goodness). So, without further ado, I give you the HTML:

<tr><td><b><a
href=i386/zh-xcin-2.3.04.tgz-long.html>zh-xcin-2.3.04.tgz</a></b></td><td>&nbsp&nbsp&nbsp
<i>chinese input utility for X
</i></td><td>[ <a href=ftp://ftp.openbsd.org/pub/OpenBSD/2.8/packages/i386/zh-xcin-2.3.04.tgz>FTP Site
1</a> ]</td><td>
[ <a href=ftp://ftp1.usa.openbsd.org/pub/OpenBSD/2.8/packages/i386/zh-xcin-2.3.04.tgz>FTP Site 2</a> ]</td></tr>

So, for simplicity, I zapped the \n\r that was lurking in there and now have a big brick of HTML (which I will spare all of you; nobody ever said HTML was pretty). So I have the following code:

my @fields = split '<tr><td><b>', $input;
foreach my $field (@fields) {
  # What I really wanted to do was...
  # (undef, $names{$1}) =~ m//, but that didn't work either,
  # so I added $foo and $bar.
  my ($foo, $bar) = $field =~
    m!^<a href=.*>(.*)</a></b></td><td>&nbsp{3}<i>(.*)</i>.*$!x;
  $names{$foo} = $bar;
  print "$foo == $bar\n";
  }

If I print $field I do get my HTML, so I know $field is okay... I think the problem is the regex. In fact, I'm 90% sure it's the regex. But where is it wrong given the data? It looks fine to me.
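For what it's worth, here is a minimal sketch of where the pattern may be going wrong, not a definitive fix. Two things stand out. First, a quantifier binds only to the item right before it, so `&nbsp{3}` asks for `&nbs` followed by `ppp` rather than three copies of the entity; `(?:&nbsp){3}` repeats the whole thing. Second, under `/x` a literal space in the pattern is ignored, so `<a href=` in the pattern really means `<ahref=`. The flattened `$input` below is an assumption (one row, with the line break after `<a` turned into a space), and `$name` and `$desc` are just illustrative names.

```perl
use strict;
use warnings;

# One row, flattened by hand for this sketch (an assumption about what
# the real input looks like once the line breaks are gone).
my $input =
    '<tr><td><b><a href=i386/zh-xcin-2.3.04.tgz-long.html>zh-xcin-2.3.04.tgz</a></b></td>'
  . '<td>&nbsp&nbsp&nbsp<i>chinese input utility for X</i></td>'
  . '<td>[ <a href=ftp://ftp.openbsd.org/pub/OpenBSD/2.8/packages/i386/zh-xcin-2.3.04.tgz>FTP Site 1</a> ]</td>'
  . '<td>[ <a href=ftp://ftp1.usa.openbsd.org/pub/OpenBSD/2.8/packages/i386/zh-xcin-2.3.04.tgz>FTP Site 2</a> ]</td></tr>';

my %names;
foreach my $field (split /<tr><td><b>/, $input) {
    # split returns an empty leading chunk before the first row; skip it.
    next unless length $field;

    my ($name, $desc) = $field =~ m!
        ^<a\s*href=[^>]*>     # opening anchor; \s* because a bare space is ignored here
        (.*?)</a></b></td>    # first capture: package name
        <td>(?:&nbsp){3}      # the whole entity three times
        <i>(.*?)</i>          # second capture: description
    !x or next;

    $names{$name} = $desc;
    print "$name == $desc\n";
}
```

The `\s*` is there so the pattern tolerates either a space or nothing between `<a` and `href`, depending on how the line breaks were removed. The non-greedy `(.*?)` and `[^>]*` are optional tightening, not part of the original pattern.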

Thanks
brother dep.

--
transcending "coolness" is what makes us cool.

In reply to Regex Exercise by deprecated
