You're trying to parse something that you're generating from a CGI? So you have control over what's being generated in the first place. Then why use HTML, which is difficult to parse? Generate an alternate output that can be parsed more easily (or used directly by whatever it is you're trying to do).

This is exactly what SOAP, WDDX, XML, and all those other acronyms are for. (They do have some overhead, but you're sure to get your data across cleanly.) Here's another simple way to pass data out of your CGI:

use Data::Dumper;
print "Content-type: text/plain\n\n", Dumper($my_data);

CGIs don't have to generate HTML. XML can be your friend. So can plain text, when used right (tab-delimited, CSV, etc.).
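
As a rough sketch of the plain-text route, here is one way a CGI might emit tab-delimited records and how the receiving side might read them back with nothing more than split. The record data, the field names, and the subs render_tsv and parse_tsv are all made up for illustration, and the sketch assumes no value contains a tab or a newline (if yours can, a real CSV module is the safer choice). Note that parse_tsv works on the raw CGI output; if you fetch it over HTTP, your client will normally hand you the body with the header already removed.

```perl
use strict;
use warnings;

# Hypothetical records; in a real CGI these would come from your own data.
my @records = (
    { id => 1, name => 'alpha', status => 'ok' },
    { id => 2, name => 'beta',  status => 'failed' },
);
my @fields = qw(id name status);

# Producer side: build a text/plain response, one tab-delimited record per line,
# with a header row naming the columns.
sub render_tsv {
    my ($fields, $records) = @_;
    my $out = "Content-type: text/plain\n\n";
    $out .= join("\t", @$fields) . "\n";
    for my $rec (@$records) {
        $out .= join("\t", map { $rec->{$_} } @$fields) . "\n";
    }
    return $out;
}

# Consumer side: drop the CGI header, then split lines on newlines and
# fields on tabs. Assumes values never contain tabs or newlines.
sub parse_tsv {
    my ($response) = @_;
    my (undef, $body) = split /\n\n/, $response, 2;
    my @lines = split /\n/, $body;
    my @names = split /\t/, shift @lines;
    my @rows;
    for my $line (@lines) {
        my %row;
        @row{@names} = split /\t/, $line, -1;   # -1 keeps trailing empty fields
        push @rows, \%row;
    }
    return \@rows;
}

my $response = render_tsv(\@fields, \@records);
print $response;

my $rows = parse_tsv($response);
print "parsed $rows->[1]{name}: $rows->[1]{status}\n";
```

No regexps against HTML are needed on the receiving end; the format is simple enough that two splits recover the data.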


In reply to Re^3: regexp text parsing issue. by jhourcle
in thread regexp text parsing issue. by Anonymous Monk
