Greetings deprecated,

Well, this isn't the best or most compact regex in the world, but I've tested it, and it works in the trials I've run. Give it a try. Someone with more regex experience could probably shorten my match string somewhat, I suspect.

    use strict;

    my $input = "<tr><td><b><a href=i386/zh-xcin-2.3.04.tgz-long.html>zh-xcin-2.3.04.tgz</a></b></td><td>&nbsp&nbsp&nbsp<i>chinese input utility for X</i></td><td>[ <a href=ftp://ftp.openbsd.org/pub/OpenBSD/2.8/packages/i386/zh-xcin-2.3.04.tgz>FTP Site 1</a> ]</td><td>[ <a href=ftp://ftp1.usa.openbsd.org/pub/OpenBSD/2.8/packages/i386/zh-xcin-2.3.04.tgz>FTP Site 2</a> ]</td></tr>";

    my %data;

    # Every table row begins with '<tr><td><b>', so split on that marker.
    # The first chunk is whatever precedes the first row; discard it.
    my @fields = split '<tr><td><b>', $input;
    shift @fields;

    foreach my $field (@fields) {
        # Captures, in order: file URL, file name, description, FTP link 1, FTP link 2.
        # Skip rows that don't match, so stale captures are never printed.
        ($data{fileurl}, $data{filename}, $data{description}, $data{ftp1}, $data{ftp2}) =
            $field =~ m#^<a href=(.*?)>(.*?)</a></b></td><td>&nbsp&nbsp&nbsp<i>(.*?)</i></td><td>\[ <a href=(.*?)>.*?</a> ]</td><td>\[ <a href=(.*?)>.*#
            or next;

        print "$data{filename} == $data{description}\n";
    }

Yeah, I know. It's a bit clunky. If you know of additional constants in your specific situation, you may be able to streamline this better than I have. Anyway, good luck!
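One possible way to make the match string easier to read, offered only as a sketch: the /x modifier lets the pattern span several lines with comments, and negated character classes such as [^>]* can stand in for the non-greedy (.*?) groups. The sample row below is hypothetical (the ftp.example.org URLs and the foo-1.0 package name are made up for illustration), and the sketch assumes attribute values never contain '>' and link text never contains '<'.

```perl
use strict;
use warnings;

# Hypothetical, shortened row in the same shape as the sample input
# (already split off from the leading '<tr><td><b>').
my $row = '<a href=i386/foo-1.0.tgz-long.html>foo-1.0.tgz</a></b></td>'
        . '<td>&nbsp&nbsp&nbsp<i>example package</i></td>'
        . '<td>[ <a href=ftp://ftp.example.org/foo-1.0.tgz>FTP Site 1</a> ]</td>'
        . '<td>[ <a href=ftp://ftp1.example.org/foo-1.0.tgz>FTP Site 2</a> ]</td></tr>';

# Under /x, whitespace in the pattern is ignored, so literal spaces
# are written as "\ " and each piece can carry its own comment.
my $row_re = qr{
    ^<a\ href=([^>]*)>([^<]*)</a></b></td>        # file URL, file name
    <td>(?:&nbsp)+<i>([^<]*)</i></td>             # description
    <td>\[\ <a\ href=([^>]*)>[^<]*</a>\ \]</td>   # first FTP link
    <td>\[\ <a\ href=([^>]*)>                     # second FTP link
}x;

# The list assignment is true only when the pattern matched.
if (my ($fileurl, $filename, $description, $ftp1, $ftp2) = $row =~ $row_re) {
    print "$filename == $description\n";
}
```

The pattern is not actually shorter than the original, but each captured field sits on its own commented line, which should make it easier to adjust if the page layout changes.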

-Gryphon.


In reply to Re: Regex Exercise by gryphon
in thread Regex Exercise by deprecated
