Re: cmp two HTML fragments

Nice. Have you considered turning this into a module?

Another way to do this is to use HTML::PrettyPrinter or somesuch and do a string-wise comparison. That makes it easier to see how the code differs (using string diff tools) if needed, but it's probably a lot slower.

There's an (inherited) bug in your code: it leaks memory. You need to break the circular references in each tree by calling its delete method:

sub cmpHtml {
    ...

    my $cmp = cmpHtmlElt ($root1, $root2);

    $_->delete for $root1, $root2;

    return $cmp;
}
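To illustrate why the explicit delete matters, here is a minimal sketch using only core Perl. DemoNode is a hypothetical stand-in invented for this example, not the HTML::Element class; it only assumes what is said above, namely that parent and child nodes reference each other. Reference counting alone cannot free such a cycle, so the nodes survive until something breaks it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tree node; NOT the real HTML::Element class.
package DemoNode;

our $destroyed = 0;    # counts nodes that Perl has actually freed

sub new {
    my ($class, $parent) = @_;
    my $self = bless { parent => $parent, children => [] }, $class;
    # The parent holds the child and the child holds the parent: a cycle.
    push @{ $parent->{children} }, $self if $parent;
    return $self;
}

# Break the parent <-> child cycle by hand, children first.
sub delete {
    my ($self) = @_;
    $_->delete for @{ $self->{children} };
    %$self = ();
}

sub DESTROY { $destroyed++ }

package main;

# Without delete: the cycle keeps both nodes alive after the scope ends.
{
    my $root = DemoNode->new;
    DemoNode->new($root);
}
my $freed_without_delete = $DemoNode::destroyed;

# With delete: both nodes are freed once $root goes out of scope.
{
    my $root = DemoNode->new;
    DemoNode->new($root);
    $root->delete;
}
my $freed_with_delete = $DemoNode::destroyed - $freed_without_delete;

print "freed without delete: $freed_without_delete\n";
print "freed with delete:    $freed_with_delete\n";
```

In a long-running process or a large test suite, every comparison that skips the delete call would leave two whole trees behind in the same way.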

As an aside, I'd like to share this little trick:

$cmp = EXPR;
return $cmp if $cmp;

which you make plenty of use of, can be replaced with

{ return EXPR || next }

(assuming scalar context) though that may be a bit too obfuscated to use in public code. :-)
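For anyone puzzling over why that works: a bare block behaves like a loop that runs once, so next simply leaves the block, and return binds more loosely than ||, so the whole EXPR || next is evaluated first. Here is a small sketch of both styles side by side; the record fields (name, age) and the sub names are made up for illustration.

```perl
use strict;
use warnings;

# Verbose style: store each comparison, return it if it is non-zero.
sub cmp_verbose {
    my ($x, $y) = @_;
    my $cmp;

    $cmp = $x->{name} cmp $y->{name};
    return $cmp if $cmp;

    $cmp = $x->{age} <=> $y->{age};
    return $cmp if $cmp;

    return 0;
}

# Terse style: a true (non-zero) result is returned at once; a false
# result triggers next, which exits the bare block and falls through
# to the following statement.
sub cmp_terse {
    my ($x, $y) = @_;

    { return $x->{name} cmp $y->{name} || next }
    { return $x->{age} <=> $y->{age} || next }

    return 0;
}

my $al_30 = { name => 'al',  age => 30 };
my $al_40 = { name => 'al',  age => 40 };
my $bob   = { name => 'bob', age => 20 };

for my $pair ([$al_30, $bob], [$al_30, $al_40], [$al_30, $al_30]) {
    printf "%2d %2d\n", cmp_verbose(@$pair), cmp_terse(@$pair);
}
```

Both subs should agree on every pair. The terse form saves a temporary variable, at the cost of readers having to know how next behaves in a bare block.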

lodin


Replies are listed 'Best First'.

Re^2: cmp two HTML fragments
by GrandFather (Saint) on Feb 10, 2008 at 22:37 UTC

As it happens, the code shown was pretty transient anyway. For the module test suite that I wrote the code for, I replaced it with:

    my $root1 = HTML::TreeBuilder->new ();
    my $root2 = HTML::TreeBuilder->new ();

    $root1->parse_content ($rendered)->elementify ()
      ->delete_ignorable_whitespace ();
    $root2->parse_content ($expected)->elementify ()
      ->delete_ignorable_whitespace ();

    is ($root1->as_HTML (undef, '   ', {}),
        $root2->as_HTML (undef, '   ', {}), $testName);

so that I'd get better diagnostics (I see the two HTML fragments when the test fails). However, with a little tweaking to give a traceback, the original code would be even better in a test context because it would highlight the difference by reducing the clutter. That version might almost be worth turning into a module.

Perl is environmentally friendly - it saves trees
