in reply to Re^2: Duplicates in Directories
in thread Duplicates in Directories
Then, of course, as you rightly said, if the files don't come in alphabetical order, or if there is any doubt, it is just as easy to use Perl's sort facility, and it will take only a fraction of a second with 10,000 files.
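For example, assuming the names are already in a @files array (a hypothetical variable for this sketch), a plain lexical sort is all it takes:

my @sorted_files = sort @files;   # default string comparison, fast for 10k names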
Sorting data up front to get better performance (avoiding lookups) is sometimes very effective. I do it quite commonly in a slightly different context, to compare pairs of very large files that would not fit in a hash: I sort both files on the comparison key (using the *nix sort utility) and then read both files in parallel in my Perl program, to detect records missing from either file, or differences in attributes between records having the same comparison key.
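For what it's worth, here is a minimal sketch of that parallel read. The file names, the tab-separated layout, and the first-field key are illustrative assumptions for the example, not the actual code I use; both inputs are assumed to be pre-sorted on the key, e.g. with sort -t "$(printf '\t')" -k1,1.

#!/usr/bin/perl
use strict;
use warnings;

# hypothetical input files, already sorted on the first tab-separated field
open my $old, '<', 'old_sorted.txt' or die "old_sorted.txt: $!";
open my $new, '<', 'new_sorted.txt' or die "new_sorted.txt: $!";

my $old_line = <$old>;
my $new_line = <$new>;
while (defined $old_line and defined $new_line) {
    chomp(my $o = $old_line);
    chomp(my $n = $new_line);
    my ($o_key, $o_rest) = split /\t/, $o, 2;
    my ($n_key, $n_rest) = split /\t/, $n, 2;
    if ($o_key lt $n_key) {
        # key exists only in the old file
        print "Missing from new file: $o_key\n";
        $old_line = <$old>;
    }
    elsif ($o_key gt $n_key) {
        # key exists only in the new file
        print "Missing from old file: $n_key\n";
        $new_line = <$new>;
    }
    else {
        # same key in both files: compare the attributes
        print "Attributes differ for $o_key\n"
            if ($o_rest // '') ne ($n_rest // '');
        $old_line = <$old>;
        $new_line = <$new>;
    }
}
# whatever remains in either file has no counterpart in the other
while (defined $old_line) {
    my ($key) = split /\t/, $old_line;
    print "Missing from new file: $key\n";
    $old_line = <$old>;
}
while (defined $new_line) {
    my ($key) = split /\t/, $new_line;
    print "Missing from old file: $key\n";
    $new_line = <$new>;
}

Since each file is read sequentially exactly once, memory usage stays constant no matter how large the files are, which is the whole point of sorting first instead of loading one side into a hash.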
Replies are listed 'Best First'.

Re^4: Duplicates in Directories
  by huck (Prior) on Oct 09, 2017 at 18:55 UTC
  by Laurent_R (Canon) on Oct 09, 2017 at 19:39 UTC
  by soonix (Chancellor) on Oct 09, 2017 at 20:01 UTC
  by Laurent_R (Canon) on Oct 09, 2017 at 21:24 UTC
  by huck (Prior) on Oct 09, 2017 at 20:28 UTC
  by Laurent_R (Canon) on Oct 09, 2017 at 21:21 UTC
  by huck (Prior) on Oct 09, 2017 at 22:40 UTC