I want to write a subroutine that parses BB tags, but I'm not sure how to go about it. Replacing them with HTML isn't hard, but what if someone opens a tag and doesn't close it, or closes one that was never opened (e.g. a stray [/B])? I could count the opened and closed tags, check whether they match, and either print an error message or correct the markup if they don't.
The problem is that a regex match returns true or false in scalar context, which I don't want, and the list of captures in list context. I could capture all the tags into a list and check that, but don't you guys usually warn that capturing is really slow?
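
Here is roughly what I have in mind for the counting. It is only a sketch: the tag list (b, i, u, code) and the sub name unbalanced_tags are made up for illustration, and it only compares totals, so it would not notice tags closed in the wrong order.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the names of tags whose open and close
# counts differ. The tag list is an assumption, not a full BBCode set.
sub unbalanced_tags {
    my ($text) = @_;
    my %depth;
    # m//g in scalar context walks through the matches one at a time,
    # so no big list of captures is ever built.
    while ($text =~ m{\[(/?)(b|i|u|code)\]}gi) {
        # $1 is "/" for a closing tag and empty for an opening tag.
        $depth{lc $2} += $1 ? -1 : 1;
    }
    return grep { $depth{$_} != 0 } sort keys %depth;
}

my @bad = unbalanced_tags('[b]bold [i]both[/i] oops');
print "unbalanced: @bad\n" if @bad;
```

A non-empty return list would be the cue to print an error or attempt a correction.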

Also, I am using >>\d+ to refer to other posters; for example, >>1 refers to the first poster. But I don't want this substitution (wrapping it in an <a> tag) to happen inside a [code] tag, so how do I do this? I was thinking of some look-behind/look-ahead thing like /(?<!\[code\])>>\d+(?!\[\/code\])/ but will that work if a user were to write, for example:
'[code]print "funny text";[/code] >>4 blargh [code]5>>1 is 2[/code]'

The desired output would be:
'<span class="code">print &quot;funny text&quot;;</span> <a href="#4">&gt;&gt;4</a> blargh <span class="code">5&gt;&gt;1 is 2</span>'

But I suspect that the ">>4" won't get substituted with an <a> tag. (Entities are already taken care of by escapeHTML, so don't worry about that.)
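
An alternative to the look-behind that I'm considering is to split the text on [code] blocks and only run the substitution on the pieces in between. A rough sketch, assuming [code] blocks are not nested and that the text has already been through escapeHTML (so ">>" arrives as "&gt;&gt;"); the sub name link_refs is made up:

```perl
use strict;
use warnings;

# Hypothetical helper; assumes already-escaped input and no nested [code].
sub link_refs {
    my ($text) = @_;
    # The capturing group makes split return the [code] blocks too, so
    # odd-numbered pieces are code and even-numbered pieces are plain text.
    my @parts = split m{(\[code\].*?\[/code\])}is, $text;
    for my $i (0 .. $#parts) {
        if ($i % 2) {
            # Code block: swap the BB tags for a span, leave the body alone.
            $parts[$i] =~ s{^\[code\](.*)\[/code\]$}{<span class="code">$1</span>}is;
        }
        else {
            # Plain text: turn each escaped >>N into a link.
            $parts[$i] =~ s{&gt;&gt;(\d+)}{<a href="#$1">&gt;&gt;$1</a>}g;
        }
    }
    return join '', @parts;
}

print link_refs('[code]print &quot;funny text&quot;;[/code] &gt;&gt;4 blargh [code]5&gt;&gt;1 is 2[/code]'), "\n";
```

An unclosed [code] tag would simply be treated as plain text here, which may or may not be what I want.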

Also, is it possible to get all this done as valid XHTML (i.e. tags closed strictly in the reverse order they were opened) without too much work?
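
For the nesting, I imagine a stack would do it: push each opening tag, and on a closing tag pop until the matching one comes off. A rough sketch, assuming a made-up tag-to-HTML mapping (%html and bb_to_xhtml are illustrative names only); it closes anything left open at the end and drops closing tags that have no opener:

```perl
use strict;
use warnings;

# Illustrative mapping only; a real tag set would be larger.
my %html = (b => 'strong', i => 'em', u => 'u');

sub bb_to_xhtml {
    my ($text) = @_;
    my @open;        # stack of currently open tag names
    my $out = '';
    # The capturing group makes split keep the tags as separate pieces.
    for my $piece (split m{(\[/?(?:b|i|u)\])}i, $text) {
        if ($piece =~ m{^\[(/?)(b|i|u)\]$}i) {
            my ($slash, $name) = ($1, lc $2);
            if (!$slash) {
                push @open, $name;
                $out .= "<$html{$name}>";
            }
            elsif (grep { $_ eq $name } @open) {
                # Close inner tags first so the output nests properly.
                while (@open) {
                    my $top = pop @open;
                    $out .= "</$html{$top}>";
                    last if $top eq $name;
                }
            }
            # A closing tag with no matching opener is silently dropped.
        }
        else {
            $out .= $piece;
        }
    }
    # Close whatever the poster forgot, innermost first.
    $out .= "</$html{$_}>" for reverse @open;
    return $out;
}

print bb_to_xhtml('[b]bold [i]both[/b] rest'), "\n";
```

Inner tags that get closed early are not reopened afterwards, so the rendering can differ from what the poster intended.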

2005-09-03 Retitled by Arunbear, as per Monastery guidelines
Original title: 'BBCode'


In reply to BBCode parser/validator needed by OnionKnight
