in reply to Natural sorting

After seeing various complaints about Sort::Naturally, and then having questions and concerns about another option (Sort::Key::Natural), I "grew my own". It may not be elegant (it is a brute-force method), but it handles my cases OK. It may also be easier to customize because it does not use any "tricks".

I have no problem with CPAN packages in general, but many of them seem confusing to use, or seem to be rather large files for what should be simple functions, and some do not seem to work well :(. This example may not be the fastest, most elegant, or "cleanest" code, but it does the job I need, which in most cases boils down to sorting in "Microsoft" file name order (for my benefit). For the other cases where I may use this, I doubt anybody would notice something incorrect anyway.

use strict;
use warnings;

#figure this out later#my $separator_characters=" _-\\\\\\\."; # these are the separator characters ready for use in a regex

sub __natural_split_item
{
    my ($item) = @_;
    my @out_array;
    #
    # by brute force we will separate a string into the output array
    # by these cases as they are found:
    #
    #   separator characters
    #   numerics
    #   alpha/non numeric
    #
    # we may need to add other stuff in the future...
    #
    # when we change to a different type we will push the previous data into
    # the output array.
    #
    my $last_type  = 0;     # set to no previous
    my $last_split = "";
    if (defined $item)      # we may get an undefined item....
    {
        for my $i (0 .. length($item) - 1)
        {
            my $thischar = substr $item, $i, 1;   # will need to do something for unicode if we need it....
            my $thistype;   # just do not use zero and do not duplicate values.
            #
            # separator characters may need to be handled a little differently.
            # first, we need to change them into spaces and second we need to
            # push them on the output array as single characters.  this may fix
            # the __ versus _ sort cases where __ shows up first but we do the _
            # first.
            #
            # note that the number selected for $thistype does not matter - we are
            # only looking for a change in separator type.
            #
            #figure this out later#if($thischar =~ /[$separator_characters]/) # this character is a separator character
            if ($thischar =~ /[ _\\\.]/)   # this character is a separator character (space, _, \ or .)
            {
                $thistype = 100;
                $thistype++ if $last_type == $thistype;   # repeated separators of any sort must each become their own item
                $thischar = " ";
            }
            elsif ($thischar =~ /^\d$/) { $thistype = 200; }   # this character is a numeric character
            else                        { $thistype = 300; }   # this character is a non numeric

            if (   $last_type != 0            # if we have a previous type (0 means no previous)
                && $last_type != $thistype    # and we have changed type
                && $last_split ne ""          # and we have something to push (we should if we get to this test)
               )
            {
                push @out_array, $last_split;
                $last_split = "";
            }
            $last_type = $thistype;
            $last_split .= $thischar;   # add this character to the current item
        }
        push @out_array, $last_split;   # add the last component
    }
    return @out_array;
}

# usage: my @sorted = sort my_custom_natural_sort @names;
sub my_custom_natural_sort
{
    my @a_list = __natural_split_item($a);
    my @b_list = __natural_split_item($b);
    for (my $i = 0; defined $a_list[$i] && defined $b_list[$i]; $i++)
    {
        next if $a_list[$i] eq $b_list[$i];   # move on to next item if this split component is the same
        if ($a_list[$i] =~ /^\d+$/ && $b_list[$i] =~ /^\d+$/)   # if both numeric do a numeric test first
        {
            return $a_list[$i] <=> $b_list[$i] if $a_list[$i] != $b_list[$i];   # simple numeric compare if not same value
            return $a_list[$i] cmp $b_list[$i] if $a_list[$i] ne $b_list[$i];   # ascii compare so we can handle leading zero variations if not same string
        }
        return (lc $a_list[$i]) cmp (lc $b_list[$i]) if (lc $a_list[$i]) ne (lc $b_list[$i]);   # compare as same case first
        #return ($a_list[$i]) cmp ($b_list[$i]) if ($a_list[$i]) ne ($b_list[$i]);   # then compare as case sensitive (maybe)
    }
    # compare the number of items - longer list should be later.
    # if the items are the same then it does not matter....
    return scalar(@a_list) <=> scalar(@b_list);
}
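
For comparison, here is a more compact sketch of the same three-way split (separators, digit runs, everything else) done with one regex instead of a character loop. This is not the code above, only my assumption about how it could be condensed. The names split_natural and natural_cmp are made up for illustration, and an empty string gives no pieces here rather than one empty piece.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into digit runs, single separator
# characters (each mapped to one space), and runs of everything else.
sub split_natural {
    my ($s) = @_;
    return () unless defined $s;
    my @parts;
    # \G continues from where the previous match ended, so the string is
    # consumed left to right one piece at a time.
    while ($s =~ /\G(?:(\d+)|([ _\\.])|([^\d _\\.]+))/gc) {
        push @parts, defined $1 ? $1 : defined $2 ? ' ' : $3;
    }
    return @parts;
}

# Hypothetical comparator: takes two strings, returns -1, 0 or 1.
sub natural_cmp {
    my ($x, $y) = @_;
    my @xs = split_natural($x);
    my @ys = split_natural($y);
    while (@xs && @ys) {
        my ($p, $q) = (shift @xs, shift @ys);
        next if $p eq $q;
        if ($p =~ /^\d+\z/ && $q =~ /^\d+\z/) {
            # numeric value first, then string compare for leading zeros
            return ($p <=> $q) || ($p cmp $q);
        }
        my $r = lc($p) cmp lc($q);    # case-insensitive compare
        return $r if $r;
    }
    # one list ran out: the one with pieces left over sorts later
    return scalar(@xs) <=> scalar(@ys);
}

my @files  = ('file10.txt', 'File2.txt', 'file1.txt', 'file_2b.txt');
my @sorted = sort { natural_cmp($a, $b) } @files;
print "$_\n" for @sorted;
```

Because natural_cmp takes its arguments explicitly instead of reading $a and $b, it has to be wrapped in a sort block as shown, but it can also be called directly when testing a single pair of names.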