Help! I'm working on a class project in which we cache copies of HTML files. The problem: the 'href' and 'src' attributes need to be rewritten. Relative links have to become absolute ("hard") links, so that our database only has to hold the HTML text and not the images, etc. There are myriad ways of writing href and src attributes: no quotes, quotes, relative, relative with '..'s, leading slashes, trailing slashes, ones with 'http://', ones with only 'www', etc. I need to find every 'href' and 'src' link in the HTML and make it an absolute link. Any ideas? Is there a module that does this, or do I have to write a million regexps? I need some help...
'Mad Props' to anyone who can shed some light...
Adam
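
One way to avoid the million regexps is to let an HTML parser find the attributes and a URL library resolve them against the page's own address. Below is a minimal sketch of that idea in Python, using only the standard library (`html.parser` and `urllib.parse.urljoin`). The names `LinkAbsolutizer`, `absolutize`, and `URL_ATTRS` are made up for illustration, and it assumes you know the URL each page was fetched from. It is a sketch, not a complete solution: it ignores `<base href>` and CDATA sections, lowercases tag and attribute names, and normalizes attribute quoting. In Perl the same two pieces should be available from CPAN (for example HTML::Parser for the parsing and `URI->new_abs` for the resolution).

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# Attributes whose values are treated as URLs (hypothetical name; extend as needed).
URL_ATTRS = {"href", "src"}


class LinkAbsolutizer(HTMLParser):
    """Re-emit the HTML, resolving href/src values against base_url."""

    def __init__(self, base_url):
        # convert_charrefs=False keeps entities in text (e.g. &amp;) as written.
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.out = []

    def _tag(self, tag, attrs, closer):
        parts = [tag]
        for name, value in attrs:
            if value is None:  # bare attribute, e.g. <input disabled>
                parts.append(name)
                continue
            if name in URL_ATTRS:
                # urljoin handles '..', leading slashes and already-absolute URLs.
                value = urljoin(self.base_url, value.strip())
            # The parser hands back unescaped values, so escape them again.
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.out.append("<?%s>" % data)


def absolutize(html_text, base_url):
    """Return html_text with every href/src made absolute relative to base_url."""
    parser = LinkAbsolutizer(base_url)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Calling `absolutize(page_text, "http://example.com/a/b/page.html")` would turn `src="pic.gif"` into `src="http://example.com/a/b/pic.gif"` and `href=../up.html` into `href="http://example.com/a/up.html"`. Note that a link written as just `www.foo.com` has no scheme, so it is resolved as a relative path (which is also how browsers read it); if you want those treated as hosts, you would need to special-case them before calling `urljoin`.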