This is a good suggestion, but in my case I am very limited in what I can do for the user as far as the HTML, and all comments are removed (and are to be removed by client request) from all pages processed. I go into some specifics in one of my earlier replies, but to rephrase and recap what I am doing:
- Retrieve remote document via HTTP ( LWP::UserAgent, HTTP::Request )
- Parse document for local storage and confirm that it's format isn't horribly disgusting ( HTML::TreeBuilder )
- Allow editing of title tag, meta tags, anchor tag title attribute, and img tag alt attribute.
The forms for the editing are created by relying on where each tag is located inside of the element array created by HTML::TreeBuilder. That is if a person selects alt tags as way they want to edit each img tag is located using the look_down method in an array context:
my @img = $tree->look_down('_tag', 'img');
my $count;
my $form;
foreach my $element (@img) {
# make a form element
$form .= # call to CGI function, name = "img-$count"
$count++;
}
return $form;
Then when they submit the form the $count is referenced and the appropriate img tags alt content is replaced.
But this is all moot since the issue was and is that HTML::TreeBuilder is "supposed" to handle bad HTML, since it uses HTML::Parser and one of the goals of HTML::Parser is to work with documents that are really out there, the example given should work with HTML::TreeBuilder and in fact it does, part of my problem was not turning off implicit_tags as one of my other replies above states. The implicit_tags is unique to the HTML::TreeBuilder module and it attempts to correct badly formated HTML, which 98% of the time is most likely a good thing, but at least the author designed in the ability to turn off that behavior in the 2% of the times it isn't a good thing.
I have tested my ideas and have confirmed that setting that flag allows for the conditions I need, but results in a different anomaly, which I have contacted the author of the module about.
Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
Read Where should I post X? if you're not absolutely sure you're posting in the right place.
Please read these before you post! —
Posts may use any of the Perl Monks Approved HTML tags:
- a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
| |
For: |
|
Use: |
| & | | & |
| < | | < |
| > | | > |
| [ | | [ |
| ] | | ] |
Link using PerlMonks shortcuts! What shortcuts can I use for linking?
See Writeup Formatting Tips and other pages linked from there for more info.