I'm working on some Wiki-like auto-linking code that scans text for known words or strings and, when it finds one, replaces it with an HTML hyperlink. For example, if "MySQL" is in the list of known strings, then the code turns that word into a hyperlink when it appears in a sentence like "using MySQL or another database".
I am trying to come up with a regex that will perform this substitution but only when the string is:
- not found in between a pair of anchor tags (<a href...> ... </a>).
- not inside a pair of angle brackets (for example if the string "foo" appears in the "alt" attribute of an "IMG" tag I obviously don't want to turn it into a hyperlink!).
Without these special provisions, if someone ever manually wraps the word MySQL (or a sentence containing it) inside anchor tags, I end up with nested anchor tags, which are invalid HTML.
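To make the failure concrete, here is a minimal sketch in Python (chosen only for illustration; it is not the actual auto-linking code). The `KNOWN` table, the example URL, and the function name `naive_autolink` are all hypothetical.

```python
import re

# Hypothetical lookup table: known string -> link target (the URL is made up).
KNOWN = {"MySQL": "https://example.com/wiki/MySQL"}

# One alternation of all known strings, each escaped, matched on word boundaries.
WORDS = re.compile(r"\b(" + "|".join(re.escape(w) for w in KNOWN) + r")\b")


def naive_autolink(html):
    """Link every occurrence of a known string, ignoring the surrounding markup."""
    def link(match):
        word = match.group(1)
        return '<a href="%s">%s</a>' % (KNOWN[word], word)
    return WORDS.sub(link, html)


# Plain text is handled as intended.
plain = naive_autolink("using MySQL or another database")

# Text that was already linked by hand ends up with an anchor inside an anchor.
nested = naive_autolink('see <a href="/db">the MySQL docs</a>')

# A word inside an attribute value gets a tag injected into the attribute.
mangled = naive_autolink('<img alt="MySQL logo" src="logo.png">')
```

In the second and third calls the substitution fires inside existing markup, which is exactly the case the two conditions above are meant to rule out.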
I've seen various regexps for matching anchors or other tags, but I can't figure out how to match something that's not inside an anchor or a tag... I've tried all sorts of nasty look-behind/look-ahead stuff but nothing that works yet. Sometimes it gets so ugly that I start wondering if I have to write some kind of recursive HTML tokenizer (ugh)... Any ideas?
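One direction that may avoid look-behind entirely is to match the things that must be skipped first (a whole anchor element, or any single tag) and only then the known strings, copying skipped matches through unchanged. The following is a rough sketch in Python, assuming Python 3.6+ for the scoped `(?is:...)` flags; `KNOWN`, the URL, and `autolink` are hypothetical names. It is not a real HTML parser: a `>` inside an attribute value, comments, or script blocks would still confuse it.

```python
import re

# Hypothetical lookup table: known string -> link target (the URL is made up).
KNOWN = {"MySQL": "https://example.com/wiki/MySQL"}

# Longest strings first, so a longer known string wins over a shorter prefix.
words = "|".join(re.escape(w) for w in sorted(KNOWN, key=len, reverse=True))

TOKEN = re.compile(
    # Alternative 1: an entire <a ...>...</a> element (case-insensitive,
    # '.' also matches newlines), consumed in one piece.
    r"(?P<skip>(?is:<a\b[^>]*>.*?</a\s*>)"
    # Alternative 2: any other single tag, including its attributes.
    r"|<[^>]*>)"
    # Alternative 3: a known string in ordinary text.
    r"|\b(?P<word>" + words + r")\b"
)


def autolink(html):
    """Link known strings, leaving anchors and tag interiors untouched."""
    def replace(match):
        if match.group("skip") is not None:
            return match.group(0)  # existing markup: copy through unchanged
        word = match.group("word")
        return '<a href="%s">%s</a>' % (KNOWN[word], word)
    return TOKEN.sub(replace, html)


linked = autolink("using MySQL or another database")
kept_anchor = autolink('see <a href="/db">the MySQL docs</a>')
kept_attr = autolink('<img alt="MySQL logo" src="logo.png">')
mixed = autolink('MySQL and <a href="/x">MySQL</a>')
```

The idea relies on the regex engine trying the alternatives left to right at each position: once an anchor or tag has been consumed as a "skip" match, the scan resumes after it, so the known-string alternative never gets a chance to match inside it. The same alternation should carry over to a Perl substitution with a code replacement, though that is untested here.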