Samn has asked for the wisdom of the Perl Monks concerning the following question:

Re: dictionary sorting
by Trimbach (Curate) on Mar 13, 2002 at 13:48 UTC
    It IS pretty simple: use the "cmp" operator that you've already quoted. Your example is complicated because you spend 2 lines eliminating non-word characters and underscores, but without that it can be reduced to:
    @temp = sort {lc($a) cmp lc($b)} @temp;
    How simple is that? :-D
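
    To make the difference concrete, here is a minimal sketch comparing the default sort with the case-folded one. The sample data is made up for illustration and is not from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the original question.
my @temp = ('banana', 'Cherry', 'apple', 'Apple_pie', 'date');

# Default sort compares by character code, so uppercase sorts before lowercase.
my @plain = sort @temp;

# Case-insensitive: compare lowercased copies, leave the originals untouched.
my @folded = sort { lc($a) cmp lc($b) } @temp;

print "plain:  @plain\n";
print "folded: @folded\n";
```

    Note that the underscore in 'Apple_pie' still takes part in the comparison here; stripping such characters is what the extra lines in the original code were for.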

    Gary Blackburn
    Trained Killer

Re: dictionary sorting
by dreadpiratepeter (Priest) on Mar 13, 2002 at 14:46 UTC
    This is another job for the Schwartzian Transform. Your code is dreadfully inefficient because it does the stripping and case-conversion during every pass through the sort. Try this:
    @temp = map {$_->[1]} sort {$a->[0] cmp $b->[0]} map {($da = lc $_) =~ s/[\W_]+//g;[$da,$_]} @temp;
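
    The same one-liner, spread out so the three stages are visible (read it bottom to top). This is only a sketch: the sample data and the variable names $key and @sorted are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen so punctuation and case get in the way.
my @temp = ('co-op', 'Cook', 'cool_er', 'Coat', "can't");

my @sorted =
    map  { $_->[1] }                  # 3. keep only the original string
    sort { $a->[0] cmp $b->[0] }      # 2. compare the precomputed keys
    map  {                            # 1. build [key, original] once per item
        (my $key = lc $_) =~ s/[\W_]+//g;
        [ $key, $_ ];
    } @temp;

print "$_\n" for @sorted;
```

    Each string is lowercased and stripped exactly once, instead of once per comparison.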

    This can be made even more efficient if you know the maximum length of your strings. We can use pack to build an intermediary string that we can sort using the low-level (in C) default sort. Try this:
    @temp = map {(unpack ("A100A100",$_))[1]} sort map {($da = lc $_) =~ s/[\W_]+//g;pack("A100A100",$da,$_)} @temp;
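
    Here is the packed version laid out step by step, as a sketch under the same assumption of a 100-character limit. The names $MAX, $fmt and @sorted are made up for illustration. Two caveats that follow from how the "A" format behaves: pack truncates anything longer than the stated width, and unpack strips trailing spaces from the recovered string.

```perl
use strict;
use warnings;

my $MAX = 100;                 # assumed upper bound on string length
my $fmt = "A$MAX A$MAX";       # space-padded key, then space-padded original

# Hypothetical sample data.
my @temp = ('co-op', 'Cook', 'cool_er', 'Coat', "can't");

my @sorted =
    map  { (unpack $fmt, $_)[1] }     # 3. pull the original back out
    sort                              # 2. default string sort, no Perl callback
    map  {                            # 1. glue key and original into one string
        (my $key = lc $_) =~ s/[\W_]+//g;
        pack $fmt, $key, $_;
    } @temp;

print "$_\n" for @sorted;
```

    Because the key is padded to a fixed width, it always occupies the front of the packed string, so a plain string comparison orders by key first and only then by the original text.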

    Hope this helps

    -pete
    "I am Jack's utter lack of disbelief"
Re: dictionary sorting
by knobunc (Pilgrim) on Mar 13, 2002 at 17:35 UTC

    Damn... I thought you were going to start a discussion on real dictionary sorting where all sorts of odd rules pertain (and I am not convinced that a computer can do most of them).

    In its simplest form you have to decide how spaces will sort: does a complete word sort before a partial word? Then you have to make similar decisions about punctuation (keeping a distinction between leading punctuation and punctuation that occurs within a word). Next add the complication of ignored prefixes (does d'Annunzio, Gabriele sort under d or A?). These rules may also vary depending on whether the thing is a place or a person (e.g. Saint Paul: does it sort as p (for Paul, Saint) or s (for the city)?).
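
    The first of those decisions, how spaces sort, can be sketched in a few lines. The two key functions below are illustrative assumptions (often called word-by-word and letter-by-letter ordering), not an established standard, and the entries are made up. Prefixes and the person/place distinction are not handled at all.

```perl
use strict;
use warnings;

# Hypothetical entries.
my @entries = ('Newark', 'New York', 'Newton', 'New Haven');

# Word-by-word: keep spaces, so a complete word sorts before a longer one.
sub word_key   { my $k = lc $_[0]; $k =~ s/[^a-z0-9 ]+//g; return $k }

# Letter-by-letter: drop spaces too, so only letters and digits are compared.
sub letter_key { my $k = lc $_[0]; $k =~ s/[^a-z0-9]+//g;  return $k }

# Sort a list by a precomputed key (same transform as earlier in the thread).
sub sort_by {
    my ($keyfn, @items) = @_;
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ $keyfn->($_), $_ ] } @items;
}

my @by_word   = sort_by(\&word_key,   @entries);
my @by_letter = sort_by(\&letter_key, @entries);

print "word-by-word:     ", join('; ', @by_word),   "\n";
print "letter-by-letter: ", join('; ', @by_letter), "\n";
```

    With the first rule "New York" lands ahead of "Newark"; with the second it falls to the end of the list. Neither is wrong, which is rather the point.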

    I read a few books on this from the library and discovered that the rules are pretty convoluted. Fortunately there appears to be an attempt to make the rules more computer-friendly, so that a program does not need to know what something is in order to sort it.

    -ben