Re: Which one is the better Regex?

in reply to Which one is the better Regex?

It depends on how you define "better".

If we look at the first set, both are wrong. Sure, in many cases they will extract the title, but in some cases they will not. For instance, the first regex assumes that any < starts a tag, which is not the case. Furthermore, both regexes assume that comments and CDATA marked sections do not exist. Note also that the latter one uses both (?i) and /i; one of them can be omitted.
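To make the comment problem concrete, here is a minimal sketch. The regex in it is a hypothetical stand-in for a "trivial" title extractor, not one of the regexes from the original question, and the sample documents are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical "trivial" title extractor, for illustration only.
# It is NOT one of the regexes from the original question.
sub naive_title {
    my ($html) = @_;
    return $html =~ m{<title>(.*?)</title>}is ? $1 : undef;
}

# Made-up sample documents.
my $plain     = '<html><head><title>Real title</title></head></html>';
my $commented = '<html><head><!-- <title>Old title</title> -->'
              . '<title>Real title</title></head></html>';

print naive_title($plain), "\n";      # prints "Real title"
print naive_title($commented), "\n";  # prints "Old title": taken from inside the comment

# (?i) inside the pattern and /i after it both request case-insensitive
# matching, so using both is redundant.
my $upper = '<TITLE>Shout</TITLE>';
print "inline (?i) matches\n" if $upper =~ m{(?i)<title>};
print "/i modifier matches\n" if $upper =~ m{<title>}i;
```

The regex has no notion of "inside a comment", so it happily returns the commented-out title.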

As for the last set of regexes, both are so horribly wrong that asking which one is better carries no meaning at all. It's like asking "What's better to eat with fries? Yellow or Wednesday?".

As the FAQ says, if you want to extract elements or remove tags, PARSE the HTML; don't use trivial regexes.
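As one possible way to follow that advice, here is a small sketch using HTML::TokeParser from the HTML::Parser distribution on CPAN. The choice of module is an assumption on my part (the FAQ does not prescribe one), and the helper name parsed_title and the sample document are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;   # CPAN module, part of the HTML::Parser distribution

# Hypothetical helper: return the text of the first real <title> element,
# or undef if the document has none.
sub parsed_title {
    my ($html) = @_;
    my $p = HTML::TokeParser->new(\$html)
        or die "Cannot create parser: $!";
    # get_tag() skips over text and comments until it sees a <title> start tag.
    return $p->get_tag('title') ? $p->get_trimmed_text : undef;
}

# Made-up sample document with a commented-out title before the real one.
my $html = '<html><head><!-- <title>Old title</title> -->'
         . '<title>Real title</title></head></html>';

my $title = parsed_title($html);
print defined $title ? "$title\n" : "no title found\n";
```

Because the parser tokenizes the comment as a comment, the commented-out title is never reported as a tag.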

Abigail



Re: Re: Which one is the better Regex?
by Rufnex (Novice) on Feb 27, 2003 at 14:11 UTC

thx

Re: Which one is the better Regex?

by Abigail-II (Bishop) on Feb 27, 2003 at 14:30 UTC

Abigail

Re: Re: Re: Which one is the better Regex?

by PodMaster (Abbot) on Feb 28, 2003 at 07:45 UTC

#!/usr/bin/perl

use strict;
use warnings;

use YAPE::HTML;
use Data::Dumper;

my $content = "
    <html>
        <title>
            yes a title
        </title>
        <body>
            yes a body
        </body>
    </html>
";

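# Parse the document and ask for an iterator over <title> tags.
# The empty array ref means no particular attributes are required.
my $parser = YAPE::HTML->new($content);
my $extor = $parser->extract( 'title' => []);

# Each call to the iterator returns the next matching chunk.
while (my $chunk = $extor->()) {
    print Dumper $chunk;
    # Print the text inside the tag, delimited so the whitespace is visible.
    print $/, '>>>>', $chunk->text()->[0]->string(), '<<<<', $/ x 5;
}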
__END__

$VAR1 = bless( {
                 'TYPE' => 'tag',
                 'ATTR' => {},
                 'TAG' => 'title',
                 'TEXT' => [
                             bless( {
                                      'TYPE' => 'text',
                                      'TEXT' => '
            yes a title
        '
                                    }, 'YAPE::HTML::text' )
                           ],
                 'IMPLIED' => '',
                 'CLOSED' => 1
               }, 'YAPE::HTML::tag' );

>>>>
            yes a title
        <<<<

MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
I run a Win32 PPM repository for perl 5.6x+5.8x. I take requests.
** The Third rule of perl club is a statement of fact: pod is sexy.

In Section Seekers of Perl Wisdom