Sorting array by number of occurrences of a char

Spida has asked for the wisdom of the Perl Monks concerning the following question:

Re: Sorting array by number of occurrences of a char by sauoq (Abbot) on Oct 04, 2002 at 15:22 UTC
Use tr/// to count the commas, and use a Schwartzian Transform so that each string's count is computed only once instead of on every comparison.
`#!/usr/bin/perl`
`my @unsorted = qw(a,b,c c,b,a a,a,a,a b,b);`
`my @sorted = map { $_->[1] } sort { $b->[0] <=> $a->[0] } map { [ tr/,// , $_ ] } @unsorted;`
`print "@sorted\n";`
-sauoq "My two cents aren't worth a dime.";
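For anyone unfamiliar with the counting idiom, here is a minimal sketch of what tr/// is doing on its own. The variable names `$string` and `$commas` are just illustrative and are not part of sauoq's code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = 'a,a,a,a';

# With an empty replacement list and no /d modifier, tr/// leaves the
# string untouched and simply returns how many characters it matched.
my $commas = ( $string =~ tr/,// );

print "$string has $commas commas\n";    # prints: a,a,a,a has 3 commas
```

Because the string is not modified, the same expression is safe to use on `$_`, `$a` or `$b` inside map and sort blocks.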
Re: Re: Sorting array by number of occurrences of a char by zigdon (Deacon) on Oct 04, 2002 at 15:26 UTC
Gah, sauoq beat me to it. This is the famous Schwartzian Transform. If this is your 5th Perl script, it's time you started using it! :) Update: Either I'm blind, or sauoq has edited his node. This node serves no purpose now :) -- Dan
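In case the explanation is not to hand, the transform can be unrolled into its three stages. This is only a sketch of the general pattern applied to the sample data from this thread; the intermediate array names (`@decorated`, `@ordered`) are made up for illustration, and the usual form chains the three steps into one expression.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @unsorted = ( 'a,b,c', 'c,b,a', 'a,a,a,a', 'b,b' );

# Stage 1 (decorate): pair each string with its comma count,
# giving a list of [ count, string ] array references.
my @decorated = map { [ tr/,//, $_ ] } @unsorted;

# Stage 2 (sort): compare the precomputed counts, highest first.
# No counting happens inside the comparison any more.
my @ordered = sort { $b->[0] <=> $a->[0] } @decorated;

# Stage 3 (undecorate): throw the counts away and keep the strings.
my @sorted = map { $_->[1] } @ordered;

print "@sorted\n";
```

The string with three commas comes out first and the one with a single comma last. The relative order of the two strings that tie on two commas depends on how the sort treats equal elements.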
Re: Re: Re: Sorting array by number of occurrences of a char by sauoq (Abbot) on Oct 04, 2002 at 15:43 UTC
"Either I'm blind, or sauoq has edited his node. This node serves no purpose now :)"
Yes, zigdon, you caught me. Originally I just pasted my quick one-liner in and hit submit instead of preview. Your node does serve a purpose though. You have that nice little link that explains the ST very thoroughly. :-) This was the text of the node when zigdon first saw it:
`$ perl -le '@unsorted = qw(a,b,c c,b,a a,a,a,a b,b); @sorted = map { $_->[1] } sort { $b->[0] <=> $a->[0] } map { [tr/,// , $_] } @unsorted; print "@sorted"'`
`a,a,a,a a,b,c c,b,a b,b`
-sauoq "My two cents aren't worth a dime.";
Re: Sorting array by number of occurrences of a char by BrowserUk (Patriarch) on Oct 04, 2002 at 18:51 UTC
Depending on your data, you may not need the complexity of the ST sort. If your data is as your samples show, with each string made up of single chars separated by commas, then the number of commas rises in step with the string length, so you could get away with `@sorted = sort { length($b) <=> length($a) } @unsorted;` which is very fast in Perl and will outperform the ST in every case.
If, however, your data consists of comma-separated, variable-length elements, then you'll need to use tr/// as shown above. Even so, depending on the length of the elements and the size of the array, recalculating the comma count on each comparison can still win over allocating the small anonymous arrays used by the ST.
Then again, efficiency may not be a consideration, in which case the following simple sort is easier to follow: `my @sorted = sort { $b =~ tr/,// <=> $a =~ tr/,// } @unsorted;`
If you're interested in seeing how this can be a win for small and medium amounts of data, Read more...
Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!
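To check the recount-versus-ST trade-off on your own data, something along these lines may help. It is a rough sketch, not the benchmark from the original post: the test data is invented (1000 strings built from a hypothetical `@words` list), the sub names are made up, and the timings will vary by machine and Perl version, so no results are claimed here. Only the core Benchmark module is assumed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Invented test data: 1000 comma-separated strings with 1 to 7 fields
# of varying length, so length() would NOT stand in for the comma count.
my @words    = qw(a bb ccc dddd eeeee);
my @unsorted = map {
    my $n = $_;
    join ',', map { $words[ ( $n + $_ ) % @words ] } 0 .. $n % 7;
} 1 .. 1000;

# Plain sort: recount the commas on every comparison.
sub by_recount {
    my @s = sort { ( $b =~ tr/,// ) <=> ( $a =~ tr/,// ) } @unsorted;
    return @s;
}

# Schwartzian Transform: count once per string, sort on the stored count.
sub by_st {
    my @s = map  { $_->[1] }
            sort { $b->[0] <=> $a->[0] }
            map  { [ tr/,//, $_ ] } @unsorted;
    return @s;
}

# Run each contender for at least one CPU second and print a comparison table.
cmpthese( -1, { recount => \&by_recount, st => \&by_st } );
```

Changing the size of `@unsorted` and the length of the fields is the quickest way to see where, if anywhere, the balance tips on your system.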