Re: XML::TokeParser::Simple - pretty much like HTML::TokeParser::Simple

I'm very envious. It looks like XML::TokeParser is a much easier module to work with. The tokens returned by HTML::TokeParser differ slightly in structure depending on whether they were generated via get_tag or get_token, which forces me to constantly synchronize the tokens internally and restore them to their original state when I'm done. I should be shot for having to write this:

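sub _synch_arrays {
    # Normalize a token so it always starts with a type code, and
    # report which method (get_token or get_tag) produced it.
    my $array_ref = shift;
    my $tag_func  = GET_TOKEN;

    # get_token results already begin with a key of %token; anything
    # else must have come from get_tag.
    if ( !exists $token{ $array_ref->[0] } ) {
        $tag_func = GET_TAG;
        # get_tag marks an end tag with a leading slash on the tag name.
        if ( '/' ne substr $array_ref->[0], 0, 1 ) {
            unshift @$array_ref, 'S';
        }
        else {
            unshift @$array_ref, 'E';
        }
    }
    return ( $array_ref, $tag_func );
}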
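
To make the mismatch concrete, here is a minimal, self-contained sketch of the two token shapes and what the normalization does to them. The constants GET_TOKEN and GET_TAG and the contents of %token are assumed stand-ins (the snippet above does not show how they are defined), and the tokens are built by hand rather than read from a parser.

```perl
use strict;
use warnings;

# Assumed stand-ins: the original definitions are not shown in the post.
use constant GET_TOKEN => 0;
use constant GET_TAG   => 1;
my %token = map { $_ => 1 } qw( S E T C D PI );    # get_token type codes

# Same logic as _synch_arrays, renamed to mark it as a sketch.
sub synch_arrays {
    my $array_ref = shift;
    my $tag_func  = GET_TOKEN;

    if ( !exists $token{ $array_ref->[0] } ) {
        $tag_func = GET_TAG;
        my $type = '/' eq substr( $array_ref->[0], 0, 1 ) ? 'E' : 'S';
        unshift @$array_ref, $type;
    }
    return ( $array_ref, $tag_func );
}

# Shape returned by get_token for <a href="x">: the type code comes first.
my @from_token = ( 'S', 'a', { href => 'x' }, ['href'], '<a href="x">' );

# Shapes returned by get_tag: no type code, and end tags carry a slash.
my @start_from_tag = ( 'a', { href => 'x' }, ['href'], '<a href="x">' );
my @end_from_tag   = ( '/a', '</a>' );

my ( $t1, $f1 ) = synch_arrays( \@from_token );
my ( $t2, $f2 ) = synch_arrays( \@start_from_tag );
my ( $t3, $f3 ) = synch_arrays( \@end_from_tag );

print join( ' ', $t1->[0], $t2->[0], $t3->[0] ), "\n";
```

Note that the end tag keeps its leading slash after normalization ('E', '/a', '</a>'), so later code still has to know where the token came from, which is why the function also returns $tag_func.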

I think your code might be a bit cleaner than mine, too. Hmmm... maybe time for another update :( I also happened to break my $functions xor $methods rule. Whoops. I'm a hypocrite :)

Cheers,
Ovid

New address of my CGI Course.
Silence is Evil (feel free to copy and distribute widely - note copyright text)


Re: Re: XML::TokeParser::Simple - pretty much like HTML::TokeParser::Simple
by PodMaster (Abbot) on Jun 08, 2003 at 11:48 UTC

D'oh.

I mean, why would you call get_tag and then test whether the result is a tag or a processing instruction, when it can only be a tag?

I quickly fixed this and then I got reminded again that an XML::TokeParser::Token doesn't have a constructor -- yuck.

Then I thought maybe I should force get_tag to return a proper token, but that would break backwards compatibility, and I sure don't wanna do that.

Then I think to myself I should forget all this nonsense, and have

XML::TokeParser::Token::StartTag
XML::TokeParser::Token::EndTag
XML::TokeParser::Token::PI
XML::TokeParser::Token::Comment
XML::TokeParser::Token::Text

package XML::TokeParser::Token;
sub is_text                { return 0; }
sub is_comment             { return 0; }
sub is_pi                  { return 0; }
sub is_tag                 { return 0; }
sub is_start_tag           { return 0; }
sub is_end_tag             { return 0; }
sub raw                    { return $_[0]->[-1]; }

package XML::TokeParser::Token::Text;
# use vars::i '@ISA' => 'XML::TokeParser::Token'; # i'll probably put vars::i on cpan also
use vars '@ISA';
@ISA = 'XML::TokeParser::Token';

sub is_text                { return 1; }
sub text                   { return $_[0]->[-2]; }

sub is_end_tag {
    if( $_[0]->[0] eq 'E'
        or ( @{$_[0]} == 2 && substr( $_[0]->[0], 0, 1 ) eq '/' )
    ){
        if(defined $_[1]){
            return 1 if $_[0]->[1] eq $_[1];
        } else {
            return 1;
        }
    }
    return 0;
}
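
For comparison, here is a rough sketch of how that end-tag check might look once each token type has its own class. Everything in it is an assumption made for illustration: the Sketch:: package names, the bless_token helper, and the token layout ('E', tag name, raw text) are hypothetical and are not taken from the released module.

```perl
use strict;
use warnings;

package Sketch::Token;
# Base class: every predicate is false unless a subclass overrides it.
sub is_tag       { return 0; }
sub is_start_tag { return 0; }
sub is_end_tag   { return 0; }
sub raw          { return $_[0]->[-1]; }

package Sketch::Token::StartTag;
our @ISA = ('Sketch::Token');
sub is_tag       { return 1; }
sub is_start_tag { return 1; }

package Sketch::Token::EndTag;
our @ISA = ('Sketch::Token');
sub is_tag { return 1; }

# With the class settled at construction time, the only thing left to
# check is the optional tag name.
sub is_end_tag {
    my ( $self, $name ) = @_;
    return 1 unless defined $name;
    return $self->[1] eq $name ? 1 : 0;
}

package main;

# Hypothetical factory: pick a class from the token's type code.
my %class_for = (
    S => 'Sketch::Token::StartTag',
    E => 'Sketch::Token::EndTag',
);

sub bless_token {
    my $token = shift;
    my $class = $class_for{ $token->[0] } || 'Sketch::Token';
    return bless $token, $class;
}

my $end   = bless_token( [ 'E', 'title', '</title>' ] );
my $start = bless_token( [ 'S', 'title', {}, [], '<title>' ] );

print "end tag matches 'title'\n" if $end->is_end_tag('title');
print "start tag is not an end tag\n" unless $start->is_end_tag;
```

The shape checks in is_end_tag above disappear because the decision is made once, when the token is blessed, instead of on every method call.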

MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
I run a Win32 PPM repository for perl 5.6x+5.8x. I take requests.
** The Third rule of perl club is a statement of fact: pod is sexy.


Re: Re: Re: XML::TokeParser::Simple - pretty much like HTML::TokeParser::Simple

by Ovid (Cardinal) on Jun 10, 2003 at 04:58 UTC

I've been thinking about that for a while. I was considering a few other options that I might want to toss into the code, and somehow I never quite got around to it. What you propose is a heck of a lot cleaner and will clear up some other issues. I guess I was the bad lazy. I hope you don't mind if I steal your code :)

Incidentally, if you haven't seen it, HTML::TokeParser::Simple is now at version 2.1 and has three HTML munging methods added that cover some very common situations that people keep wanting to deal with.

Cheers,
Ovid

