I'm trying to use Algorithm::Diff to mark up the differences between two plain text sequences (one old, one new) by striking through the deleted bits and highlighting the new bits.

Here is the code:

[snip]
% my @old_body = split(/\b/, $old_body);
% my @new_body = split(/\b/, $new_body);
% my $diff = Algorithm::Diff->new(\@old_body, \@new_body);
% while ($diff->Next()) {
%   if ($diff->Diff()) {
<span class="old_text">
%     foreach ($diff->Items(1)) {
<% display_pre($_) %>
%     }
</span>
<span class="new_text">
%     foreach ($diff->Items(2)) {
<% display_pre($_) %>
%     }
</span>
%   } else {
%     foreach ($diff->Same()) {
<% display_pre($_) %>
%     }
%   }
% }
[snip]

It seems to work really quickly sometimes and really slowly other times, which is odd.

I profiled it with Devel::DProf and got the following:

%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 47.5   46.88 46.888 123881   0.0000 0.0000  Algorithm::Diff::_replaceNextLargerWith
 17.3   17.05 64.094      1   17.056 64.093  Algorithm::Diff::_longestCommonSubsequence
 0.31   0.304  0.304  16299   0.0000 0.0000  HTML::Mason::Request::print
 0.16   0.153 64.717      1   0.1528 64.716  HTML::Mason::Request::call_next
 0.11   0.106  0.154   5418   0.0000 0.0000  HTML::Mason::Commands::BEGIN
 0.09   0.086  0.101      1   0.0855 0.1010  Algorithm::Diff::_withPositionsOfInInterval
 0.07   0.070  0.070      3   0.0233 0.0233  HTML::Mason::Interp::apply_escapes
 0.06   0.063  0.063   8911   0.0000 0.0000  Algorithm::Diff::__ANON__
 0.05   0.050  0.060      7   0.0072 0.0086  HTML::Mason::Request::comp
 0.04   0.038  0.038   5394   0.0000 0.0000  HTML::Entities::encode_entities
 0.03   0.030 64.947      1   0.0301 64.947  HTML::Mason::ApacheHandler::handler
 0.03   0.030 64.124      1   0.0300 64.123  Algorithm::Diff::LCSidx
 0.02   0.020  0.020      1   0.0200 0.0200  DBI::connect
 0.01   0.010  0.010      9   0.0011 0.0011  vars::import
 0.01   0.010  0.010      1   0.0100 0.0100  Apache::Session::DESTROY

Any ideas as to what is causing the lag?

-Andrew.


Andrew Tomazos  |  andrew@tomazos.com  |  www.tomazos.com

In reply to Optimizing Algorithm::Diff by tomazos
