To be as accurate as possible, I think you're going to have to make a few passes with this program. First off, as Roy Johnson put it, compute a checksum on every file and compare them. Any files with the same checksum are, for all practical purposes, identical. Next, compare the filenames to find potential duplicates that may not be tagged or may be inaccurately tagged. Finally, compare the ID tags to find duplicate copies of songs that may have different filenames and differ slightly (different remixes, a copy missing the last few seconds, different bit-rates, etc.).
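Here's a rough sketch of the first two passes, just to show the shape of it. It's in Python and everything in it is my own assumption, not anything from the original program: the function names (`file_digest`, `group_by_checksum`, `normalize_name`), the choice of SHA-256, and the `*.mp3` pattern are all made up for illustration. The filename normalization in particular is a crude heuristic (it will mangle a title that genuinely starts with a number), so treat its matches as candidates only.

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_by_checksum(root, pattern="*.mp3"):
    """Pass 1: map digest -> sorted paths, keeping only digests seen more than once."""
    groups = defaultdict(list)
    for path in Path(root).rglob(pattern):
        if path.is_file():
            groups[file_digest(path)].append(str(path))
    return {d: sorted(p) for d, p in groups.items() if len(p) > 1}


def normalize_name(path):
    """Pass 2 helper: reduce a filename to a loose comparison key (heuristic)."""
    stem = Path(path).stem.lower()
    stem = re.sub(r"^\d+\s*[-._]*\s*", "", stem)  # drop a leading track number
    stem = re.sub(r"[^a-z0-9]+", " ", stem)       # punctuation/underscores -> space
    return stem.strip()
```

Files that share a `normalize_name` key but not a checksum are the ones worth a closer look in the tag-comparison pass.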
Most importantly, I think you need to have your code output a file for a human to review, not do the deleting itself. If/when you complete it, I'd like to suggest that you post it on PM. I'm sure there are hundreds of others who could benefit from that!
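For the review file, something along these lines might do. Again, this is only a sketch in Python under my own assumptions: `write_report` and the report layout are hypothetical, and `groups` is assumed to be a dict mapping some label (a checksum, a normalized filename, a tag string) to the list of paths that matched on it. Note that it only writes a text file and never touches the music files themselves.

```python
def write_report(groups, out_path):
    """Write candidate duplicate groups to a text file for a human to review.

    groups: dict mapping a label to a list of file paths.
    Returns the number of groups written. Nothing is deleted.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for key in sorted(groups):
            paths = groups[key]
            if len(paths) < 2:
                continue  # a single file is not a duplicate candidate
            count += 1
            out.write(f"# group {count}: {key}\n")
            for p in sorted(paths):
                out.write(f"    {p}\n")
            out.write("\n")
    return count
```

Whoever reviews the file can then decide which copy in each group to keep and do the deleting by hand.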