PerlMonks  

Rebuilding changed tags with HTML::TokeParser

by swiftone (Curate)
on Dec 13, 2002 at 23:14 UTC ( [id://219777]=perlquestion )

swiftone has asked for the wisdom of the Perl Monks concerning the following question:

I have some code that alters A HREF tags/attributes inside an HTML document and displays the results. I have it working fine, but there's a part where I'm sure there's an easier way to do it:

I have the code:

my $tag;
if ($change) {
    # construct modified tag
    # Wow this is ugly
    $tag = '<' . $token->[1] . ' '
         . join(' ', map({ "$_ = '" . $token->[2]{$_} . "'" } keys(%{ $token->[2] })))
         . '>';
}
else {
    # output original tag
    # This is nice
    $tag = $token->[0] eq "T" ? $token->[1] : $token->[-1];
}
where $token is an HTML::TokeParser token for a starting A tag that contains an HREF attribute (otherwise $change is not set)

Outputting unchanged tags is so elegant (idiom relayed to me by ChemBoy), surely there's a nicer way to convert a token I modify back into HTML? Thanks in advance.

Replies are listed 'Best First'.
Re: Rebuilding changed tags with HTML::TokeParser
by Ovid (Cardinal) on Dec 13, 2002 at 23:37 UTC

    Use HTML::TokeParser::Simple. First, I designed it to be a drop-in replacement, so you could use the module without any changes in your code except for the use statement and the constructor call. Then, as you refactor, you use the module to 'clean up' the nasty bits. The following is untested and assumes that the token is from HTML::TokeParser::Simple. It also requires the latest version because the "as_is" method is new.

    my $tag;
    if ($change) {
        my $tag_type   = $token->return_tag;
        my $attributes = make_attributes($token);
        $tag = "<$tag_type $attributes>";
    }
    else {
        $tag = $token->as_is;
    }

    sub make_attributes {
        my $token = shift;
        my $seq   = $token->return_attrseq;    # array ref: attribute names in source order
        my $attr  = $token->return_attr;       # hash ref: attribute name => value
        # $attr is a reference, so its values are read with $attr->{...}
        return join ' ', map { qq|$_="$attr->{$_}"| } @$seq;
    }

    My version will also preserve the attribute sequence. Also note that the token, as mentioned, is still an array reference, so you can access its elements directly if you must, or do everything more simply by using the supplied methods. Since the token data is also the instance data, if you change the arrayref's data directly, you are also changing the instance data, which is why the above code still works.

    Hmm... this gives me more ideas of what could be included in the module.

    Cheers,
    Ovid

    New address of my CGI Course.
    Silence is Evil (feel free to copy and distribute widely - note copyright text)

Re: Rebuilding changed tags with HTML::TokeParser
by Ionizor (Pilgrim) on Dec 13, 2002 at 23:39 UTC

    Generally what I'll do for a situation like this is something like:

    # Copy the hash for readability
    # (S|C)ould be updated to reference instead of copy
    my %attribs = %{ $token->[2] };
    $tag = '<a';
    foreach my $name (keys %attribs) {
        $tag .= " $name=$attribs{$name}";
    }
    $tag .= '>';

    I find this easier to read than the code you have above, though it's probably not as efficient. I'm sure one of the more experienced Monks will one-up me here but this is what works for me. The above snippet (s|c)ould be updated to reference the hash instead of copying it but I want to be fairly certain the code I give you works.

    Update: Ovid has a better solution. See "one of the more experienced Monks..." above.
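
For anyone who wants to try the approaches above without installing anything, here is a minimal sketch in core Perl. It assumes the documented HTML::TokeParser start-token layout, ["S", $tag, $attr, $attrseq, $text], and uses a hand-built token in place of one returned by get_token. The helper name rebuild_start_tag is hypothetical and not part of either module. It also assumes the attribute values in the token are entity-decoded, so it escapes & and " before writing them back out; treat that step as an assumption to check against your parser settings.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::TokeParser): rebuild a start tag
# from a token laid out as ["S", $tag, \%attr, \@attrseq, $text].
sub rebuild_start_tag {
    my ($token) = @_;
    my (undef, $tag, $attr, $attrseq) = @$token;

    my @pairs;
    for my $name (@$attrseq) {    # walk attrseq to keep the original order
        my $value = $attr->{$name};
        # Minimal escaping so the value is safe inside double quotes.
        $value =~ s/&/&amp;/g;
        $value =~ s/"/&quot;/g;
        push @pairs, qq{$name="$value"};
    }
    return '<' . join(' ', $tag, @pairs) . '>';
}

# A hand-built token standing in for what get_token() would return.
my $token = [
    'S', 'a',
    { href => '/old?x=1&y=2', title => 'Say "hi"' },
    [ 'href', 'title' ],
    '<a href="/old?x=1&amp;y=2" title=\'Say "hi"\'>',
];

$token->[2]{href} = '/new?x=1&y=2';    # the modification
print rebuild_start_tag($token), "\n";
```

Iterating over the attribute sequence rather than the hash keys is what keeps href ahead of title in the output, which is the same point Ovid makes about preserving attribute order.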
