PerlMonks  

Re: similar texts !?

by BrowserUk (Patriarch)
on Jul 12, 2003 at 13:10 UTC ( [id://273619]=note )


in reply to similar texts !?

The problem with the filenames I've seen for mp3s and the like is that everyone tends to classify them differently. The words used may be the same, but the order tends to get switched around. Some classify by the musician's surname/first name/album/track, others by any number of permutations of those plus other stuff.

You might get somewhere if you stripped non-alphas and spaces, and then used String::Approx, String::Similarity, Text::Levenshtein or, if speed is a concern, Text::LevenshteinXS, though I've had trouble getting the latter to compile.
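
As a rough, language-neutral sketch of the idea (written in Python purely for illustration, not using any of the Perl modules above): normalize each filename by dropping everything except letters, then compare the results with an edit distance. The function names `normalize`, `levenshtein` and `similarity` are hypothetical, lowercasing is an added assumption, and the hand-rolled distance function only stands in for what a module such as Text::Levenshtein would provide.

```python
import re


def normalize(name: str) -> str:
    """Lowercase (an assumption) and drop everything except ASCII letters."""
    return re.sub(r"[^a-z]", "", name.lower())


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Score in [0, 1]; 1.0 means the normalized names are identical."""
    na, nb = normalize(a), normalize(b)
    longest = max(len(na), len(nb))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(na, nb) / longest
```

Names that differ only in case, punctuation, spacing or track numbers normalize to the same string. Names whose words appear in a different order will still score poorly, which is the reordering problem described above.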


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

