Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
•Re: sorting an array with an array
by merlyn (Sage) on Oct 01, 2002 at 15:37 UTC
This'll be OK up to 100 filenames or so. If you have more, you'll probably want to cache the substr-naming mapping using a Schwartzian Transform or other device. Or you could even go for a merge sort, which will scale to thousands of filenames.
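The snippets from this reply are not preserved here. Purely as an illustration, a Schwartzian Transform for this problem might look like the sketch below. It assumes filenames that start with a three-letter month abbreviation followed by the day of the month (for example Jan3.log); that naming scheme, and the names %month_index and sort_by_date, are assumptions made for the sketch, not merlyn's code.

```perl
use strict;
use warnings;

my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my %month_index;
@month_index{@months} = 0 .. $#months;    # Jan => 0, ..., Dec => 11

# Hypothetical helper: returns the filenames sorted by month, then by day.
# Read the pipeline from the bottom up.
sub sort_by_date {
    return
        map  { $_->[0] }                                        # 3. unwrap the filename
        sort { $a->[1] <=> $b->[1] or $a->[2] <=> $b->[2] }     # 2. compare the cached keys
        map  {                                                  # 1. compute the keys once per file
            my ($mon, $day) = /^([A-Za-z]{3})(\d+)/;
            [ $_, $month_index{$mon}, $day ];
        } @_;
}

my @files = qw(Mar5.log Jan20.log Dec01.log Jan3.log);
print join(' ', sort_by_date(@files)), "\n";
```

Because the day is compared numerically, this sketch does not depend on leading zeros. It does assume every name matches the pattern; an unexpected name would produce undefined keys and warnings.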
by tye (Sage) on Oct 01, 2002 at 16:08 UTC
"1 file for each day of the year"

Your first case assumes one file per month (well, it doesn't sort within each month, which was requested). Your second case doesn't deal with the possibility of a lack of leading zeros in days of the month.

- tye (a leading zero himself)
(tye)Re: sorting an array with an array
by tye (Sage) on Oct 01, 2002 at 16:03 UTC
Simply replace the month name with a string that sorts as you like (using the default sort), then undo that replacement after the sort.

So we set up %mo so that $mo{Jan} is 'a' and $mo{a} is 'Jan'. Then we read the "grep sort grep" from the bottom up: take the unsorted list of file names, take the first 3 letters of each file name and replace them with the appropriate letter, sort the list of modified file names, replace the first letter of each modified file name with the month abbreviation, and return the sorted list.

Note that this assumes that we have a leading zero in front of single-digit days of the month and that all of the month names are capitalized exactly as we expect. If not, then we need to account for that as well. (It also doesn't deal with invalid file names.)

Update: You see, this is why Perl needs a 'filter' primitive. Neither map nor grep is quite the right tool for this job, and map seems almost seductively right for it, which results in people making mistakes like the one I just made. In the first snippet, I originally had 'map' instead of 'grep'. 'map' will give back the return value from the s/// operator, which is not the modified string. The second snippet had the same mistake, but when I fixed it I also decided to deal with filenames that aren't named as we expected.

It is easy to characterize this as an abuse of grep. I tend to agree. (:

- tye (just that sort of guy... sort of)
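tye's snippets did not survive in this copy of the thread. The "grep sort grep" pipeline described above could be sketched roughly as follows. The wrapper name sort_month_files and the .log filenames are made up for illustration, and the copy into @names is an addition so that the caller's array is left alone.

```perl
use strict;
use warnings;

my @mo = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my %mo;
@mo{@mo}        = ('a' .. 'l');    # $mo{Jan} is 'a', ..., $mo{Dec} is 'l'
@mo{'a' .. 'l'} = @mo;             # $mo{a} is 'Jan', ..., $mo{l} is 'Dec'

# Hypothetical wrapper. It works on a copy because s/// inside grep
# would otherwise modify the caller's array through the $_ alias.
# Read the pipeline from the bottom up.
sub sort_month_files {
    my @names = @_;
    return
        grep { s/^(.)/$mo{$1}/ }      # single letter back to month abbreviation
        sort
        grep { s/^(...)/$mo{$1}/ }    # month abbreviation to single letter
        @names;
}

my @files = qw(Mar05.log Jan20.log Dec01.log Jan03.log);
print join(' ', sort_month_files(@files)), "\n";
```

As the post says, this relies on leading zeros for single-digit days and on exact capitalization. grep keeps only the elements for which s/// succeeded, which is the "abuse" being admitted to.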
by Abigail-II (Bishop) on Oct 01, 2002 at 16:48 UTC
Abigail
by tye (Sage) on Oct 01, 2002 at 16:51 UTC
Yes, it isn't that hard to write one. But you missed one aspect in your version: filter { s/a/b/ } @list shouldn't modify the original @list. Let me go dig up my version...

Update: Make that two features. (: I also think that 'filter' in a scalar context should return the concatenated values (a count is pretty useless). Though I've only tested this a little.

- tye (filtering shouldn't be straining)
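Neither Abigail's nor tye's implementation is preserved here. A minimal sketch of a filter with the two properties tye asks for (the original list is left untouched, and scalar context returns the concatenated values) might look like this. It is an illustration under those stated requirements, not either author's code.

```perl
use strict;
use warnings;

# Sketch of a 'filter' primitive: run the block on a copy of each
# element and return the (possibly modified) copies.
sub filter (&@) {
    my ($code, @copies) = @_;    # @copies holds copies, so the caller's list is safe
    $code->() for @copies;       # 'for' aliases $_ to each copy in turn
    return wantarray ? @copies : join '', @copies;
}

my @list    = qw(apple banana);
my @changed = filter { s/a/b/ } @list;    # list context: the modified copies
my $joined  = filter { s/a/b/ } @list;    # scalar context: concatenated values

print "@changed\n";    # bpple bbnana
print "@list\n";       # apple banana
print "$joined\n";     # bpplebbnana
```

The (&@) prototype is what allows the block to be passed without the sub keyword, as in the filter { s/a/b/ } @list example from the post; it only takes effect when the sub is declared before the call is compiled.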
by Abigail-II (Bishop) on Oct 01, 2002 at 16:54 UTC
by bart (Canon) on Oct 01, 2002 at 21:52 UTC
"In the first snippet, I originally had 'map' instead of 'grep'. 'map' will give back the return value from the s/// operator, which is not the modified string."

(from another follow-up:)

In that case, make a copy. And return $_.
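bart's code is not preserved here. One way to read "make a copy, and return $_" with map is sketched below; @list and the substitution are placeholders borrowed from tye's filter { s/a/b/ } @list example.

```perl
use strict;
use warnings;

my @list = qw(apple banana cherry);

# Assigning to @copy first means s/// edits the copy, not @list.
# Ending the block with $_ returns the modified string rather than
# the success/failure value of s///.
my @changed = map { s/a/b/; $_ } (my @copy = @list);

print "@changed\n";    # bpple bbnana cherry
print "@list\n";       # apple banana cherry
```

Unlike the grep version, an element on which the substitution fails (cherry here) is still returned.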
by Aristotle (Chancellor) on Oct 04, 2002 at 21:59 UTC
However, I really find this silly, as you can simply do the following:
Makeshifts last the longest.
Re: sorting an array with an array
by bart (Canon) on Oct 01, 2002 at 22:14 UTC
So now, for any 3-character month string with the proper case, $months[$monthindex{$monthname}] will return the original string. That's why I called it inversion: it inverts the function that is the array lookup by index.

With this, you can do a plain, stupid direct sort on the hash value. Or you can do a more sophisticated version with a Schwartzian Transform, caching the substr or, better still, the monthindex for the file.

In general, I think it would be smarter to do the month lookup in a unified case, such as all lower case, so you can just as well sort "sep", "Sep", or "SEP".
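bart's snippets are missing from this copy. The inversion and the direct sort on the hash value could be sketched as below, with the lower-casing suggested in the last paragraph folded in. The names @months and %monthindex come from the post; sort_by_month and the sample filenames are made up for the sketch.

```perl
use strict;
use warnings;

my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Invert the array: lower-cased month name => its position in @months.
my %monthindex;
@monthindex{ map { lc($_) } @months } = 0 .. $#months;

# Direct sort on the hash value; the lookup is repeated for every comparison.
sub sort_by_month {
    return sort {
        $monthindex{ lc(substr($a, 0, 3)) } <=> $monthindex{ lc(substr($b, 0, 3)) }
            or lc($a) cmp lc($b)    # same month: compare the rest, ignoring case
    } @_;
}

my @files = qw(sep02.txt Jan15.txt SEP01.txt feb28.txt);
print join(' ', sort_by_month(@files)), "\n";
```

With lower-cased keys, $months[$monthindex{lc $monthname}] gives back the canonical spelling rather than the original string. The tie-breaker is a string comparison, so it still relies on leading zeros in the day.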
Re: sorting an array with an array
by BrowserUk (Patriarch) on Oct 02, 2002 at 06:49 UTC
The first thing I noticed was that there isn't any need to use a hash for the lookup function. Instead of a hash or an array, you can just use index into a string containing the three-character month abbreviations in the appropriate order.
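BrowserUk's code is not preserved in this copy. The index-into-a-string idea could be sketched like this; the sub name and the sample filenames are assumptions for illustration.

```perl
use strict;
use warnings;

# Month abbreviations in calendar order: index() gives 0 for Jan, 3 for Feb, ...
my $months = 'JanFebMarAprMayJunJulAugSepOctNovDec';

sub sort_by_month_index {
    return sort {
        index($months, substr($a, 0, 3)) <=> index($months, substr($b, 0, 3))
            or $a cmp $b    # same month: fall back to a plain string compare
    } @_;
}

my @files = qw(Mar05.log Jan20.log Xyz01.log Dec01.log Jan03.log);
print join(' ', sort_by_month_index(@files)), "\n";
```

index returns -1 for a prefix it cannot find, so a name like Xyz01.log sorts ahead of January instead of breaking the sort. One caveat of this sketch: a prefix that happens to straddle two abbreviations (such as "anF") would also be found in the string.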
The nice thing is that misspelt filenames will just get sorted to the top (or bottom) rather than breaking the sort.

Given that the OP said that there was one file per day, a maximum of 366 files, I decided to compare this simple sort against an ST sort (I stole bart's implementation). The results showed that for 366 filenames, a higher-resolution timer was needed to tell them apart. That got me wondering what the breakpoint would be if there were more than 1 file per day, and the results are surprising. For up to 100 files per day, a total of 36,600 filenames, the simple sort outperformed the ST by a substantial margin.

Then I thought maybe the hash lookup was the significant factor, so I modified the ST to use index instead of a hash to ensure I was comparing apples with apples. Even then, the simple sort outperforms the ST up to 100 files per day.

However, the results are nearly identical. It seems that the overhead of creating all those little arrays is significant enough that you need to check that the repetitive function being cached really costs more than that overhead. The simple sort obviously has no memory overhead at all.

Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!
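The benchmark code and its output are not preserved here. A comparison along these lines could be set up with the core Benchmark module, roughly as sketched below. The test data (28 zero-padded days per month, .log names) and the sub names are made up, and no timings are claimed; results will depend on the machine and the Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use List::Util qw(shuffle);

my @months = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my $months = join '', @months;

# Made-up test data: 28 days per month, zero-padded, in random order.
my @files = shuffle map {
    my $mon = $_;
    map { sprintf('%s%02d.log', $mon, $_) } 1 .. 28;
} @months;

# Simple sort: the index lookup is repeated on every comparison.
sub simple_sort {
    return sort {
        index($months, substr($a, 0, 3)) <=> index($months, substr($b, 0, 3))
            or $a cmp $b
    } @_;
}

# Schwartzian Transform: the index lookup is done once per filename.
sub st_sort {
    return
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] or $a->[0] cmp $b->[0] }
        map  { [ $_, index($months, substr($_, 0, 3)) ] } @_;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    simple => sub { my @sorted = simple_sort(@files) },
    st     => sub { my @sorted = st_sort(@files) },
});
```

To look for the breakpoint discussed above, the size of @files can be increased, for example by generating several names per day.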