in reply to XML::TokeParser::Simple - pretty much like HTML::TokeParser::Simple

I'm very envious. It looks like XML::TokeParser is a much easier module to work with. The tokens returned by HTML::TokeParser are slightly different in structure depending on whether the token was generated via get_tag or get_token, thus forcing me to constantly synchronize the tokens internally and restore them to their original state when I'm done. I should be shot for having to write this:

sub _synch_arrays {
    my $array_ref = shift;
    my $tag_func  = GET_TOKEN;
    if ( ! grep { $array_ref->[0] eq $_ } keys %token ) {
        $tag_func = GET_TAG;
        if ( '/' ne substr $array_ref->[0], 0, 1 ) {
            unshift @$array_ref, 'S';
        }
        else {
            unshift @$array_ref, 'E';
        }
    }
    return ( $array_ref, $tag_func );
}
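For anyone reading along without the module source to hand, here is a minimal, self-contained sketch of what that routine is compensating for. The constants and the `%token` table are stand-ins invented for illustration (the real module defines its own), and the sample arrays follow the shapes HTML::TokeParser documents for get_token and get_tag.

```perl
use strict;
use warnings;

# Assumed stand-ins for the module's internals (hypothetical values).
use constant GET_TOKEN => 0;
use constant GET_TAG   => 1;
my %token = map { $_ => 1 } qw( S E T C D PI );

# Prefix a get_tag-style array with the type marker that get_token
# would have supplied, and report which call produced the array.
sub synch_arrays {
    my $array_ref = shift;
    my $tag_func  = GET_TOKEN;
    if ( !exists $token{ $array_ref->[0] } ) {
        $tag_func = GET_TAG;
        my $type = '/' eq substr( $array_ref->[0], 0, 1 ) ? 'E' : 'S';
        unshift @$array_ref, $type;
    }
    return ( $array_ref, $tag_func );
}

# get_token shape for a start tag: [ 'S', $tag, \%attr, \@attrseq, $text ]
my ( $from_token, $func1 ) =
    synch_arrays( [ 'S', 'a', { href => '/' }, ['href'], '<a href="/">' ] );

# get_tag shape for an end tag: [ "/$tag", $text ]
my ( $from_tag, $func2 ) = synch_arrays( [ '/a', '</a>' ] );

print "type=$from_token->[0] source=$func1\n";
print "type=$from_tag->[0] tag=$from_tag->[1] source=$func2\n";
```

The returned flag is what lets the caller strip the marker again afterwards, which is the "restore them to their original state" part.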

I also think your code might be a bit cleaner than mine, too. Hmmm... maybe time for another update :( I also happened to break my $functions xor $methods rule. Whoops. I'm a hypocrite :)

Cheers,
Ovid

New address of my CGI Course.
Silence is Evil (feel free to copy and distribute widely - note copyright text)

Re: Re: XML::TokeParser::Simple - pretty much like HTML::TokeParser::Simple
by PodMaster (Abbot) on Jun 08, 2003 at 11:48 UTC
    After all this time, and finally getting closer and closer to releasing XML::TokeParser (one which has this functionality built-in), I finally took another look at this thread and realized I too need to do something like that.

    D'oh.

    I mean, why would you call get_tag and then test to see whether it's a tag or a processing instruction, when it can only be a tag?

    I quickly fixed this, and then I was reminded again that an XML::TokeParser::Token doesn't have a constructor -- yuck.

    Then I thought maybe I should force get_tag to return a proper token, but that would break backwards compatibility, and I sure don't wanna do that.

    Then I think to myself I should forget all this nonsense, and have

    • XML::TokeParser::Token::StartTag
    • XML::TokeParser::Token::EndTag
    • XML::TokeParser::Token::PI
    • XML::TokeParser::Token::Comment
    • XML::TokeParser::Token::Text
    Might as well take full advantage of blessed references. Something like
    package XML::TokeParser::Token;

    sub is_text      { return 0; }
    sub is_comment   { return 0; }
    sub is_pi        { return 0; }
    sub is_tag       { return 0; }
    sub is_start_tag { return 0; }
    sub is_end_tag   { return 0; }
    sub raw          { return $_[0]->[-1]; }

    package XML::TokeParser::Token::Text;

    # use vars::i '@ISA' => 'XML::TokeParser::Token'; # i'll probably put vars::i on cpan also
    use vars '@ISA';
    @ISA = 'XML::TokeParser::Token';

    sub is_text { return 1; }
    sub text    { return $_[0]->[-2]; }
    Thoughts/Comments? I think maybe that's what i'll do, because
    sub is_end_tag {
        if ( $_[0]->[0] eq 'E'
            or ( @{ $_[0] } == 2 && substr( $_[0]->[0], 0, 1 ) eq '/' ) )
        {
            if ( defined $_[1] ) {
                return 1 if $_[0]->[1] eq $_[1];
            }
            else {
                return 1;
            }
        }
        return 0;
    }
    does not look so hot. *sigh*
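To make the comparison concrete, here is a rough sketch of how the class-per-token idea might play out once raw arrays are blessed into the right package. Everything in it is an assumption for illustration: the `My::Token` namespace, the `make_token` factory, the `_matches` helper, and the array layouts (`[ 'S', $tag, ..., $raw ]`, `[ 'E', $tag, $raw ]`, `[ 'T', $text, $raw ]`) are not taken from any released module.

```perl
use strict;
use warnings;

# Hypothetical namespace, so nothing here is mistaken for the real module.
package My::Token;

sub is_text      { return 0; }
sub is_start_tag { return 0; }
sub is_end_tag   { return 0; }
sub raw          { return $_[0]->[-1]; }

# Shared helper: true when no name is given, or when the tag name matches.
sub _matches {
    my ( $self, $name ) = @_;
    return 1 unless defined $name;
    return $self->[1] eq $name ? 1 : 0;
}

package My::Token::StartTag;
our @ISA = ('My::Token');
sub is_start_tag { return $_[0]->_matches( $_[1] ); }

package My::Token::EndTag;
our @ISA = ('My::Token');
sub is_end_tag { return $_[0]->_matches( $_[1] ); }

package My::Token::Text;
our @ISA = ('My::Token');
sub is_text { return 1; }
sub text    { return $_[0]->[-2]; }

package main;

# Map the type marker in slot 0 to a class (assumed layout).
my %class_for = (
    S => 'My::Token::StartTag',
    E => 'My::Token::EndTag',
    T => 'My::Token::Text',
);

# Hypothetical factory: bless a raw token array into its class.
sub make_token {
    my $array_ref = shift;
    my $class = $class_for{ $array_ref->[0] } || 'My::Token';
    return bless $array_ref, $class;
}

my $end = make_token( [ 'E', 'title', '</title>' ] );
print $end->is_end_tag('title') ? "end of title\n" : "something else\n";
```

The type check that is_end_tag had to do by inspecting the array now happens once, in the factory, and each predicate shrinks to a one-liner.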


    MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
    I run a Win32 PPM repository for perl 5.6x+5.8x. I take requests.
    ** The Third rule of perl club is a statement of fact: pod is sexy.

      I've been thinking about that for a while. I was considering a few other options that I might want to toss in the code and somehow never quite get around to it. What you propose is a heck of a lot cleaner and will clear up some other issues. I guess I was the bad lazy. I hope you don't mind if I steal your code :)

      Incidentally, if you haven't seen it, HTML::TokeParser::Simple is now at version 2.1 and has three HTML munging methods added that cover some very common situations that people keep wanting to deal with.

      Cheers,
      Ovid
