PerlMonks  


Tricky wrote:

In reply to Ovid and Abigail's comments : the TokeParser module is a great idea, the problem is that my remit is to investigate how regexps can be applied to reformatting HTML pages. I have a regexp for a background colour attribute, though the '#' character treats all characters following as a comment!

I'm not sure what you mean by your statement that your "remit is to investigate how regexps can be applied to reformatting HTML pages". If, by that, you mean that someone else has tasked you with this, then they have made a mistake. If someone comes to me and says "Ovid, I need you to deflea my cat. Here, use this shotgun", then I know that person made a mistake that's all too common in business. In short, the mistake is to say "here's a solution, let's see how we can make it fit our problem." That's absolutely the wrong way to go about things.

Mind you, it's an easy thing to do. I suspect that cyanide kills fleas. Therefore, I might ask a friend "how can I use cyanide to deflea my cat?" When that friend tells me to use flea powder, my first instinct shouldn't be "but I've got all of this cyanide handy, how do I use that?" Instead, a better tactic is to revisit the original problem. How do I remove the fleas from my HTML ... er ... cat? If the proposed solution is better than mine, I should be willing to swallow my pride and go with the best solution. Heck, if all politicians believed that, we'd have a much better country :)

Just for giggles, let's look at some valid HTML tags:

<a href="foobar.txt" onclick="javascript:go_boom()">stuph</a>
<A HREF =foobar.txt ONCLICK='javascript:go_boom()'>stuph</a>
<A HREF = 'foobar.txt' ONCLICK= 'javascript:go_boom()' > stuph </a >
<font color="#FAFA519">test</font>
<font color="FAFA519">test</font>
<font color="fafa519">test</font>
<font color=fafa519>test</font>
<font color='fafa519'>test</font>
<font color=fafa519 >test</font>

Do you like all of those font tags? Most browsers will render all of them identically. That's a great example of why most regular expressions will fail: a pattern has to cope with every variation in case, quoting, and whitespace, and patterns that do are tough to write.
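To make the point concrete, here is a minimal sketch, written in Python rather than Perl purely for illustration. It runs a deliberately naive pattern and a real HTML parser over the six font tags listed above. The names NAIVE, ColorCollector, naive_matches and parsed_colors are made up for this example, and Python's standard html.parser is only standing in for a tokenizing parser such as TokeParser; it is not the same module.

```python
import re
from html.parser import HTMLParser

# The six font tags from the list above, one per entry.
FONT_TAGS = [
    '<font color="#FAFA519">test</font>',
    '<font color="FAFA519">test</font>',
    '<font color="fafa519">test</font>',
    '<font color=fafa519>test</font>',
    "<font color='fafa519'>test</font>",
    '<font color=fafa519 >test</font>',
]

# Hypothetical naive pattern: assumes double quotes, a leading '#',
# and no stray whitespace before the closing '>'.
NAIVE = re.compile(r'<font color="#([0-9A-Fa-f]+)">')


class ColorCollector(HTMLParser):
    """Collects the color attribute of every <font> start tag."""

    def __init__(self):
        super().__init__()
        self.colors = []

    def handle_starttag(self, tag, attrs):
        # The parser hands over lowercased tag and attribute names,
        # with any surrounding quotes already removed from the values.
        if tag == "font":
            for name, value in attrs:
                if name == "color":
                    self.colors.append(value)


def naive_matches(tags):
    """Return the tags that the naive regex recognises."""
    return [t for t in tags if NAIVE.search(t)]


def parsed_colors(tags):
    """Return the color values found by the parser, in order."""
    collector = ColorCollector()
    for t in tags:
        collector.feed(t)
    collector.close()
    return collector.colors


print(len(naive_matches(FONT_TAGS)), "of", len(FONT_TAGS), "tags matched by the naive regex")
print(len(parsed_colors(FONT_TAGS)), "of", len(FONT_TAGS), "colors found by the parser")
```

The naive pattern only recognises the first tag, while the parser reports a color value for every one of them. You could keep bolting alternations onto the regex to catch each variant, but that is exactly the work a parser has already done for you.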

But just to show you that I'm a good sport about how to deflea your cat, here's a link to Tom Christiansen's article, HTML Hacking with Regular Expressions. Enjoy!
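As for the '#' problem in the question quoted at the top: I'm guessing here, but that symptom usually means the pattern was written in extended mode (Perl's /x), where an unescaped '#' starts a comment that runs to the end of the line. The sketch below reproduces the effect in Python, whose re.VERBOSE flag behaves the same way on this point. The bgcolor patterns and the names broken and fixed are my own illustration, not the original poster's regexp.

```python
import re

# Broken: in verbose mode the unescaped '#' on the second line starts a
# comment, so the hex-digit part of the pattern is silently discarded.
broken = re.compile(r'''
    bgcolor \s* = \s* "     # attribute name and opening quote
    #[0-9a-f]{6}"
''', re.VERBOSE | re.IGNORECASE)

# Fixed: a '#' inside a character class (or written as \#) is literal.
fixed = re.compile(r'''
    bgcolor \s* = \s* "     # attribute name and opening quote
    [#] ([0-9a-f]{6}) "     # literal hash, six hex digits, closing quote
''', re.VERBOSE | re.IGNORECASE)

hex_sample = '<body bgcolor="#FFFFFF">'
named_sample = '<body bgcolor="red">'

# The broken pattern has collapsed to: bgcolor\s*=\s*"
# so it happily matches a named color it was never meant to match.
print(bool(broken.search(named_sample)))

# The fixed pattern rejects the named color and captures the hex digits.
print(fixed.search(named_sample))
print(fixed.search(hex_sample).group(1))
```

Escaping the hash fixes that one symptom. It does nothing for the quoting, case, and whitespace variations shown earlier.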

Cheers,
Ovid

New address of my CGI Course.


In reply to How to Deflea a Cat by Ovid
in thread Regexps to change HTML tags/attributes by Tricky
