Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

Re: String Comparison & Equivalence Challenge

by tybalt89 (Monsignor)
on Mar 15, 2021 at 03:08 UTC ( [id://11129641]=note: print w/replies, xml ) Need Help??


in reply to String Comparison & Equivalence Challenge

#!/usr/bin/perl use strict; # https://perlmonks.org/?node_id=11129602 use warnings; use Algorithm::Diff qw(traverse_sequences); use Tk; use Tk::ROText; my $file = 'KJV_fixed.csv'; # download from 11129602 (see above) my $verses = do { local (@ARGV, $/) = $file; <> }; my $theverse = '19,14,1'; my $thebody = ''; my $font = 'times 12'; my $bestcount = 20; my $mw = MainWindow->new; $mw->geometry( '+0+200' ); $mw->title('String Comparison & Equivalence Challenge'); $mw->Button(-text => 'Exit', -command => sub {$mw->destroy}, )->pack(-side => 'bottom', -fill => 'x'); my $ro = $mw->Scrolled(ROText => -scrollbars => 'osoe', -height => 20, -width => 100, -font =>$font, -spacing3 => 5, -wrap => 'word', )->pack(-side => 'bottom', -fill => 'both', -expand => 1); my $button = $mw->Button(-text => 'Find Closest', -command => \&closes +t, )->pack(-side => 'bottom', -fill => 'x'); my $thetext = $mw->Scrolled(ROText => -scrollbars => 'osoe', -height => 4, -wrap => 'word', -font =>$font, )->pack(-side => 'bottom', -fill => 'x', -expand => 0); $mw->Label(-text => 'Verse', -fg => 'blue', )->pack(-side => 'left', -fill => 'x'); my $entry = $mw->Entry( -textvariable => \$theverse, )->pack(-side => 'left', -fill => 'x'); $entry->focus; $mw->Button(-text => 'Find Verse', -command => \&findverse, )->pack(-side => 'left', -fill => 'x'); $entry->bind('<Return>' => \&findverse); $mw->Entry(-textvariable => \$bestcount, -width => 5, )->pack(-side => 'right'); $mw->Label(-text => 'How Many', -fg => 'blue', )->pack(-side => 'right'); MainLoop; exit; sub findverse { $thetext->delete('1.0' => 'end'); $thebody = ''; if( $verses =~ /^$theverse,(.*)/m ) { $thebody = $1; $thetext->insert(end => $thebody); closest(); } else { $thetext->insert(end => '*** Verse not found ***'); } } sub closest { $ro->delete('1.0' => 'end'); $ro->insert(end => "\nSearching ..." ); $mw->update; my @matches; while( $verses =~ /^((\d+,\d+,\d+),(.*))/gm ) { $2 eq $theverse and next; my $whole = $1; my $text = $3; my @sortfrom = sort my @from = $thebody =~ /\w+/g; my @sortto = sort my @to = $text =~ /\w+/g; my $total = my $good = 0; traverse_sequences( \@from, \@to, my $actions = { MATCH => sub {$total += 10; $good += 10}, DISCARD_A => sub {$total++}, DISCARD_B => sub {$total++}, } ); traverse_sequences( \@sortfrom, \@sortto, $actions ); push @matches, sprintf "%06d %s", $good / ++$total * 1e6, $whole; } $ro->insert(end => ' Sorting ...' ); $mw->update; @matches = reverse sort @matches; $ro->delete('1.0' => 'end'); $ro->insert(end => join "", map s/^(\d\d)\d*/$1%/r . "\n", @matches[0 .. $bestcount - 1] ); }

It takes about 2.6 seconds to go through the entire file searching for matches on my machine.

Replies are listed 'Best First'.
Re^2: String Comparison & Equivalence Challenge
by Polyglot (Chaplain) on Mar 15, 2021 at 04:06 UTC

    I took several minutes to install Tk.pm so that I could run this. I was really curious what it actually accomplished in just 2.6 seconds--or if, perhaps, it would be much different on my computer. Alas, I have perl 5.12, which is not compatible with the s///r pragma. I tried a couple of things to work around that, but feel like I'm working in the dark here, never having entirely grasped map or what it does. ...at the risk of appearing foolish, here's what I tried for that line that didn't work.

    $ro->insert(end => join "", map { do { (my $s = $_ ) =~ s/^(\d\d)\d* +/$1%/ . "\n"; $s }, @matches[0 .. $bestcount - 1] }); #RESULT: #Useless use of concatenation (.) or string in void context at KJV_Ver +seMatcher_PM_Script.pl line 88. #syntax error at KJV_VerseMatcher_PM_Script.pl line 89, near "})"

    Blessings,

    ~Polyglot~

      $ro->insert(end => join "", map s/^(\d\d)\d*/$1%/r . "\n", @matches[...] );
      $ro->insert(end => join "", map { do { (my $s = $_ ) =~ s/^(\d\d)\d*/$1%/ . "\n"; $s }, @matches[...] });

      Try:
          $ro->insert(end => join "", map { (my $r = $_) =~ s/^(\d\d)\d*/$1%/;  "$r\n"; } @matches[0 .. $bestcount - 1]);

      Update:

      ... the s///r pragma.
      NB: s///r is not a pragma. /r is a modifier (update: available from Perl version 5.14 on) of the s/// operator.


      Give a man a fish:  <%-{-{-{-<

        Thank you very much! The script works with that adjustment. BTW, "pragma" is not even in my dictionary--so I'm not able to easily verify its usage. I guess I'm a bit vocabulary challenged today.

        Blessings,

        ~Polyglot~

Re^2: String Comparison & Equivalence Challenge
by Polyglot (Chaplain) on Mar 15, 2021 at 05:23 UTC

    Thank you very much!

    With Anomalous Monk's help, I was able to run the script. It takes about 15.1 seconds on my computer--and that is checking for the similarities of one verse. But, wow! Tk made it like a GUI! I'd never used Tk before--so that was a new experience. Interesting. I might have to try experimenting more with that at some point when I have time. For now, I just need a simple text-based process, and, let's see, about five and half days to index the entire Bible at that level.

    Now I just need to set up a loop and redirect the output to a file. That shouldn't be too hard, should it? (I managed to do the latter already. Perhaps if I can isolate the logic from the GUI it might make it more efficient.)

    Blessings,

    ~Polyglot~

      I found Perl/Tk to be a steep learning curve for all options of all widgets, but very convenient for everyday use of a few, familiar widgets. A couple of good books, both from O'Reilly:

      • Mastering Perl/Tk, by Stephen Lidie and Nancy Walsh
      • Learning Perl/Tk, by Nancy Walsh


      Give a man a fish:  <%-{-{-{-<

Re^2: String Comparison & Equivalence Challenge
by LanX (Saint) on Mar 15, 2021 at 12:56 UTC
    How does this rank

    They are corrupt

    Vs

    Corrupt are they

    ?

    One or three matches?

    10/3 or 30/3 ?

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11129641]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others perusing the Monastery: (6)
As of 2024-04-19 10:17 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found