Spida has asked for the wisdom of the Perl Monks concerning the following question:

Hi! I have an array with entries like "a,a,a,a"; "b"; "c,c,c" and I want to sort them in descending order by the number of occurrences of the char ",", so that the result would be "a,a,a,a"; "c,c,c"; "b". I have no idea how to do that; it'll be about my 5th Perl script ;-) Please help. Thanks, Spida

Re: Sorting array by number of occurrences of a char
by sauoq (Abbot) on Oct 04, 2002 at 15:22 UTC

    Use tr/// to count the commas and use a Schwartzian Transform to limit the amount of work that has to be done.

    #!/usr/bin/perl
    my @unsorted = qw(a,b,c c,b,a a,a,a,a b,b);
    my @sorted = map  { $_->[1] }
                 sort { $b->[0] <=> $a->[0] }
                 map  { [ tr/,// , $_ ] }
                 @unsorted;
    print "@sorted\n";
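For readers new to the idiom, here is a minimal sketch that splits the transform into its three stages so each can be inspected on its own. The intermediate variable names (@counts, @pairs, @ordered) are illustrative assumptions and do not appear in the original post.

```perl
use strict;
use warnings;

my @unsorted = qw(a,b,c c,b,a a,a,a,a b,b);

# tr/,// with an empty replacement list counts the commas
# and leaves the string unchanged.
my @counts = map { $_ =~ tr/,// } @unsorted;

# Stage 1: pair each string with its precomputed comma count.
my @pairs = map { [ tr/,//, $_ ] } @unsorted;

# Stage 2: sort the pairs by count, highest first.
my @ordered = sort { $b->[0] <=> $a->[0] } @pairs;

# Stage 3: discard the counts and keep only the strings.
my @sorted = map { $_->[1] } @ordered;

print "counts: @counts\n";    # counts: 2 2 3 1
print "sorted: @sorted\n";
```

The point of the pairing step is that each string's commas are counted once, rather than every time the sort compares that string with another.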
    -sauoq
    "My two cents aren't worth a dime.";
    

      Gah, sauoq beat me to it. This is the famous Schwartzian Transform. If this is your 5th perl script, it's time you start using it! :)

      Update: Either I'm blind, or sauoq has edited his node. This node serves no purpose now :)

      -- Dan

        Either I'm blind, or sauoq has edited his node. This node serves no purpose now :)

        Yes, zigdon, you caught me. Originally I just pasted my quick one-liner in and hit submit instead of preview.

        Your node does serve a purpose though. You have that nice little link that explains the ST very thoroughly. :-)

        This was the text of the node when zigdon first saw it:

        $ perl -le '@unsorted = qw(a,b,c c,b,a a,a,a,a b,b); @sorted = map { $_->[1] } sort { $b->[0] <=> $a->[0] } map { [tr/,// , $_] } @unsorted; print "@sorted"'
        a,a,a,a a,b,c c,b,a b,b
        -sauoq
        "My two cents aren't worth a dime.";
        
Re: Sorting array by number of occurrences of a char
by BrowserUk (Patriarch) on Oct 04, 2002 at 18:51 UTC

    Depending on your data, you may not need the complexity of the ST sort.

    If your data is as your samples show, with each entry a string of single chars separated by commas, then the number of commas grows with the string length, so you could get away with

    @sorted = sort { length($b) <=> length($a) } @unsorted;

    which is very fast in Perl and will outperform the ST in every case.

    If, however, your data consists of comma-separated, variable-length elements, then you'll need to use tr/// as shown above. Depending on the length of the elements and the size of the array, recalculating the comma count in each comparison can still win over allocating the small anonymous arrays used by the ST. Then again, efficiency may not be a consideration, in which case the following simple sort is easier to follow:

    my @sorted = sort { $b =~ tr/,// <=> $a =~ tr/,// } @unsorted;
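To make the difference between the two approaches concrete, here is a small sketch on made-up data whose elements vary in length. The sample strings are assumptions chosen for illustration, not taken from the original question.

```perl
use strict;
use warnings;

# Hypothetical data: the pieces between the commas vary in length.
my @unsorted = ('alpha,beta', 'a,b,c,d', 'supercalifragilistic', 'x,y,z');

# Sorting by length is only a stand-in for the comma count when
# every element is a single character; here it gives the wrong order.
my @by_length = sort { length($b) <=> length($a) } @unsorted;

# Recounting the commas inside each comparison is always correct.
my @by_commas = sort { ($b =~ tr/,//) <=> ($a =~ tr/,//) } @unsorted;

print "by length: @by_length\n";
# by length: supercalifragilistic alpha,beta a,b,c,d x,y,z
print "by commas: @by_commas\n";
# by commas: a,b,c,d x,y,z alpha,beta supercalifragilistic
```

The length-based sort puts the entry with no commas first, which is why it is only safe when the data really does look like the samples in the question.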

    If you're interested in seeing how this can be a win for small and medium amounts of data,


    Cor! Like yer ring! ... HALO dammit! ... 'Ave it yer way! Hal-lo, Mister la-de-da. ... Like yer ring!