Hi there. I have some data in the format below (a credit card statement), and I've been trying to adapt a Google-posted Perl script to parse the data into a QIF file for import into Microsoft Money.
<tr><td bgcolor='#DCDCDC' align='left' width='100'><font face='arial,helvetica' size='-2'>&nbsp;28 Jun 2001</font></td><td bgcolor='#DCDCDC' align='left' width='300'><font face='arial,helvetica' size='-2'>&nbsp;HMV UK LTD NOTTINGHAM GB</font></td><td bgcolor='#DCDCDC' align='right' width='75'><font face='arial,helvetica' size='-2'>&pound;10.99 &nbsp;</font></td></tr><tr><td bgcolor='#DCDCDC' align='left' width='100'><font face='arial,helvetica' size='-2'>&nbsp;28 Jun 2001</font></td><td bgcolor='#DCDCDC' align='left' width='300'><font face='arial,helvetica' size='-2'>&nbsp;MARKS SPENCER NOTTINGHAM 06 GB</font></td><td bgcolor='#DCDCDC' align='right' width='75'><font face='arial,helvetica' size
I want to find certain tags using this code snippet and output a QIF file. The problem I'm having is in matching the start and end tags.
$start="<td bgcolor=\'#DCDCDC\' align=\'left\' width=\'300\'><font face='arial,helvetica' size='-2'>&nbsp;";
$end="</font></td>";
while (<>) {
    if (/$start(.*?)$end/g) {
        print "\n\n\nDOODAH:".$1."\n";
    }
}
It never seems to match the start tag, and if I change the start tag to something simpler like
$start="<td bgcolor";
then Perl never seems to stop when it hits something that matches the $end variable. I've been going round and round on this and I just can't figure it out, so any advice would be greatly appreciated.

Cheers,
moonlord

In reply to html tag matching confusion by moonlord
