in reply to Which one is the better Regex?

It depends on how you define "better".

If we look at the first set, both are wrong. Sure, in many cases they will extract the title, but in some cases they will not. For instance, the first regex assumes that any < starts a tag, which is not the case. Furthermore, both regexes assume that comments and CDATA marked sections do not exist. Note also that the latter one uses both (?i) and /i; one of them can be omitted.
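To make the comment problem concrete, here is a minimal sketch. The sample document and the pattern are assumptions made up for illustration; the pattern is a typical "trivial" title regex, not the one from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample document: a commented-out title precedes the real one.
my $html = <<'HTML';
<html><head>
<!-- <title>Old title</title> -->
<title>Real title</title>
</head><body></body></html>
HTML

# A typical trivial title regex (illustrative assumption).
# It has no idea that the first match sits inside a comment.
my ($naive) = $html =~ m{<title>(.*?)</title>}is;

print "naive regex found: $naive\n";   # prints "naive regex found: Old title"
```

The regex happily returns the title that was commented out, which is exactly the kind of case where it "will not" extract the right thing.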

As for the last set of regexes, both are so horribly wrong that talking about which one is better carries no meaning at all. It's like asking "what's better to eat with fries? Yellow or Wednesday?".

As the FAQ says, if you want to extract elements or remove tags, PARSE the HTML; don't use trivial regexes.
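As a rough sketch of what parsing can look like, the following uses HTML::TokeParser from the HTML::Parser distribution on CPAN (not a core module, so it is assumed to be installed). The sample document is made up for illustration, and this is only one of several parsers that would do the job.

```perl
use strict;
use warnings;
use HTML::TokeParser;   # from the HTML::Parser distribution (assumed installed)

# Hypothetical sample document with a commented-out title before the real one.
my $html = <<'HTML';
<html><head>
<!-- <title>Old title</title> -->
<title>Real title</title>
</head><body></body></html>
HTML

# Passing a reference to a scalar makes the parser read from the string.
my $p = HTML::TokeParser->new(\$html);

my $title;
# get_tag() skips ahead to the next <title> start tag; comments are
# separate tokens, so nothing inside them is seen as a tag.
if ($p->get_tag('title')) {
    # Text up to the next tag, with surrounding whitespace trimmed.
    $title = $p->get_trimmed_text;
}

print defined $title ? "Title: $title\n" : "No title found\n";
```

Because the parser tokenizes the comment as a comment, the commented-out title should be skipped and the real one returned.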

Abigail

Re: Re: Which one is the better Regex?
by Rufnex (Novice) on Feb 27, 2003 at 14:11 UTC
    Ok ... I have to write a better regex ;o) BTW, I'm a newbie to this topic. Do you have an example for both things?

    thx

      As I said, you have to PARSE the HTML text - you shouldn't attempt to solve it with a single regex.

      Abigail

      Check out YAPE::HTML -- It is pure perl (ie regexes)
      #!/usr/bin/perl
      use YAPE::HTML;
      use Data::Dumper;
      use warnings;
      use strict;

      my $content = " <html> <title> yes a title </title> <body> yes a body </body> </html> ";

      my $parser = YAPE::HTML->new($content);
      my $extor  = $parser->extract( 'title' => [] );

      while (my $chunk = $extor->()) {
          print Dumper $chunk;
          print $/, '>>>>', $chunk->text()->[0]->string(), '<<<<', $/ x 5;
      }

      __END__
      $VAR1 = bless( {
          'TYPE' => 'tag',
          'ATTR' => {},
          'TAG' => 'title',
          'TEXT' => [
              bless( {
                  'TYPE' => 'text',
                  'TEXT' => ' yes a title '
              }, 'YAPE::HTML::text' )
          ],
          'IMPLIED' => '',
          'CLOSED' => 1
      }, 'YAPE::HTML::tag' );

      >>>> yes a title <<<<

