diweooy has asked for the wisdom of the Perl Monks concerning the following question:

I have an array full of hashes with paths to files or folders. I want to sort it naturally/alphabetically but to have folder on top, subfolder on top of the folder etc. same view like in windows explorer or some other file manager with all subfolders open.
my @arr = ( { 'path' => '/c/', dir => '/' }, { 'path' => '/c/a1.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/a2.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/bb/aa1.mp3', 'file' => 1, 'dir' => '/c/bb' }, { 'path' => '/c/bb/aa2.mp3', 'file' => 1, 'dir' => '/c/bb' }, { 'path' => '/c/bb/', 'dir' => '/c' }, { 'path' => '/c/cc/', 'dir' => '/c' }, { 'path' => '/c/aa/', 'dir' => '/c' } ); @arr = sort sort_files @arr; foreach (@arr) { my %h = %{$_}; print $h{'path'}."\n"; } # putting the folders on the top works, but... see output sub sort_files { # if both in the same folder and one is folder, put it top if ($$a{'dir'} eq $$b{'dir'}) { if ($$a{'file'} && !$$b{'file'}) { return 1; } elsif (!$$a{'file'} && $$b{'file'}) { return -1; } } # compare lc $$a{'path'} cmp lc $$b{'path'}; }
this is the output:
/c/ /c/aa/ /c/bb/ /c/cc/ /c/a1.mp3 <<<< file! it should at the end of the list /c/a2.mp3 <<<< file! should at the end of the list /c/bb/aa1.mp3 <<<< this should be below /c/bb/ /c/bb/aa2.mp3 <<<< this should be below /c/bb/
if I just sort it lexigraphicaly, it is sorted nicely, but the folders are not on the top of the parent folder. creative ideas?

Replies are listed 'Best First'.
Re: sorting tree with folders on top
by salva (Canon) on Apr 07, 2009 at 11:37 UTC
    Using Sort::Key:

    (Note that this code really does a natural sort inside the path parts so, for instance, "a2" goes before "a12")

    use Sort::Key::Natural qw(mkkey_natural); use Sort::Key qw(keysort); my @arr = ( { 'path' => '/c/', dir => '/' }, { 'path' => '/c/a1.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/a2.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/bb/aa1.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/bb/aa2.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/bb/aa12.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/bb/', 'dir' => '/c' }, { 'path' => '/c/cc/', 'dir' => '/c' }, { 'path' => '/c/aa/', 'dir' => '/c' } ); sub mkkey_fspath { my $path = shift; join("\x00/", map(mkkey_natural($_), split(/\/+/, $path))) } my @arr = keysort { if ($_->{file}) { my ($dir, $file) = $_->{path} =~ m|^((?:.*/)?)([^/]*$)|; join("\x01/", mkkey_fspath($dir), mkkey_fspath($file)); } else { my $dir = $_->{path}; $dir =~ s|/+$||; mkkey_fspath($dir); } } @arr; foreach (@arr) { my %h = %{$_}; print $h{'path'}."\n"; }
    update: as diweooy pointed out there was an error on the key generation sub.
      output:
      /c/ /c/aa/ /c/bb/ /c/cc/ << this should be below /c/bb/*.mp3 /c/bb/aa1.mp3 /c/bb/aa2.mp3 /c/bb/aa12.mp3 /c/a1.mp3 /c/a2.mp3
      it looks better on the first look, but /c/bb/*.mp3 should be below /c/bb, not below /c/cc.
Re: sorting tree with folders on top
by jethro (Monsignor) on Apr 07, 2009 at 11:54 UTC

    Here is a hack that does what you want:

    use File::Basename; sub sort_files { my ($aname,$apath,$asuff)=fileparse $$a{'path'}; my ($bname,$bpath +,$bsuff) = fileparse $$b{'path'}; if ($$a{'file'}) { $aname= 'zzzzzzzz'.$aname; } if ($$b{'file'}) { $bname= 'zzzzzzzz'.$bname; } lc $apath.$aname cmp lc $bpath.$bname; }

    I call it hack because it hopes that you never have a dir that is called 'zzzzzzzzzzzzz'. You could use some other character as prefix that is higher than z in the ascii sequence and hoepfully not a valid filename character, but what if that sequence changes or some valid character is even further up in the sequence.

    A really solid solution would have to check between files A and B:
    1) if A is a directory and in the same directory as B and B is a file then A must come first. Or vice versa
    2) If A is in a subdirectory of the directory B is in and B is a file then A must come first. Or vice versa
    The second test can be done with a simple regex. Must do something for my day job so I leave that to you ;-)

      I was trying to follow 1) and 2), but I did not have much success:
      sub sort_files { my $adir = $1 if $$a{path} =~ m/^(.*)\//; my $bdir = $1 if $$b{path} =~ m/^(.*)\//; # a and b in the same dir if ($adir eq $bdir) { if ($$a{'file'} && !$$b{'file'}) { return -1; } elsif (!$$a{'file'} && $$b{'file'}) { return 1; } return lc $$a{'path'} cmp lc $$b{'path'}; } # b in subdir of a if ($bdir =~/^$adir\// && $$a{'file'} && !$$b{'file'}) { return -1; # a in subdir of b } elsif ($adir =~/^$bdir\// && $$b{'file'} && !$$a{'file'}) { return 1; } lc $$a{'path'} cmp lc $$b{'path'}; }

        I got the right result with this code. I left my debug print statements in there so you can test some more. I'm not sure it gets all border cases. Interestingly the same-dir rule turned out to be a special case of the subdir rule (or not? Please test some more. Really)

        sub sort_files { print $$a{'path'},' - ', $$b{'path'}," : "; my $adir = $$a{'dir'}; my $bdir = $$b{'dir'};; # b in subdir of a if ($bdir =~/^$adir/ && $$a{'file'}) { print "dir 1\n"; return 1; # a in subdir of b } elsif ($adir =~/^$bdir/ && $$b{'file'}) { print "dir -1\n"; return -1; } # a and b in the same dir #if ($$a{'dir'} eq $$b{'dir'}) { # if ($$a{'file'} && !$$b{'file'}) { # print "f -1\n"; return -1; # } elsif (!$$a{'file'} && $$b{'file'}) { # print "f 1\n"; return 1; # } # } print " ",lc $$a{'path'} cmp lc $$b{'path'},"\n"; return lc $$a{'path'} cmp lc $$b{'path'}; }

        Note I changed your code so that $adir is simply $$a{'dir'} as it should be the same. Also put 2) before 1) and removed the condition in 2) that b should not be a file

      ;-)) ... I like this one :).
Re: sorting tree with folders on top
by svenXY (Deacon) on Apr 07, 2009 at 10:23 UTC
    Hi,
    just a quick question for clarification: You want the files of subfolders to appear directly below their folder, but not the files of the parent folder. Those you want to appear below all subfolders - right? Does that make sense to you?
    Regards,
    svenXY

    ++ for the working sample code
      I want the files of the folder to appear just below their folder. if there are subfolders (with or without files), display these first. files have existing hash key 'file'. folders do not. this is maybe more understandable:
      A-Folder AA5-Folder | ----- AAA5-File1 | ----- AAA5-File2 BB5-Folder CC5-Folder A-File1 A-File2
      A-File1/2 are at the end of the list, because they are files. maybe this screenshot helps more: http://img212.imageshack.us/img212/3610/bildschirmfoto1u.png -> files are below subfolders
Re: sorting tree with folders on top
by svenXY (Deacon) on Apr 07, 2009 at 10:37 UTC
    sub sort_files {
            # if both in the same folder and one is folder, put it top
            if ($$a{'dir'} eq $$b{'dir'}) {

    this if only compares whether the hash claims that they are in what you call "dir" - here it is always "/c", hence you will put all folders on top and all files below.

    What you need - and that's why I asked beforehand - is some way of finding out that files are either in your "main"/parent directory (then sort them downwards) or if they are in a subdir (then list them directly below their subdirectory).
    Regards,
    svenXY
      my mistake. it should be:
      my @arr = ( { 'path' => '/c/', dir => '/' }, { 'path' => '/c/a1.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/a2.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/bb/aa1.mp3', 'file' => 1, 'dir' => '/c/bb' }, # th +is is /c/bb { 'path' => '/c/bb/aa2.mp3', 'file' => 1, 'dir' => '/c/bb' }, # th +is is /c/bb { 'path' => '/c/bb/', 'dir' => '/c' }, { 'path' => '/c/cc/', 'dir' => '/c' }, { 'path' => '/c/aa/', 'dir' => '/c' } );
      key "dir" is only here to make it faster to compare where which file/subfolder is located. dir is nothing else but everything from the path until last slash. path =~ s/^(.*)(\/.*)$/$1/;
      Add this
      #fix dir for my $i (@arr){ $i->{dir} = $1 if $i->{path} =~ m{^(.+)/}; }
      one step closer :)
      /c/ /c/a1.mp3 /c/a2.mp3 /c/aa/ /c/bb/ /c/bb/aa1.mp3 /c/bb/aa2.mp3 /c/cc/
Re: sorting tree with folders on top
by ELISHEVA (Prior) on Apr 07, 2009 at 20:26 UTC

    I wasn't clear from your post if portability was an issue. Your sample data lists *nix style file names but your point of reference is "Windows Explorer". If so, the answers given so far assume *nix file naming conventions and may not be portable. For portable solutions, you'll need to use File::Spec.

    Here is an example showing how to use it. In addition to the sorting it also calculates the information needed to properly indent names "like Windows Explorer":

    use strict; use warnings; use File::Spec; my @arr = ( { 'path' => '/c/', dir => '/' }, { 'path' => '/c/a1.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/a2.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/bb/aa1.mp3', 'file' => 1, 'dir' => '/c/bb' }, { 'path' => '/c/bb/aa2.mp3', 'file' => 1, 'dir' => '/c/bb' }, { 'path' => '/c/bb/', 'dir' => '/c' }, { 'path' => '/c/cc/', 'dir' => '/c' }, { 'path' => '/c/aa/', 'dir' => '/c' } ); my $TAB_WIDTH=3; my $FORMAT="%-30s%s\n"; #----------------------------------------- # The sorting function #----------------------------------------- sub parsePathPortably { my $hPathInfo = shift; my $sPath = $hPathInfo->{path}; my $bFile = $hPathInfo->{file}; my ($sVol, $sDir, $sLocal) = File::Spec->splitpath($sPath, !$bFile); # File::Spec->splitdir assumes $dir is a relative path # without this line the first element of @aDirs on *nix is # an empty directory (because nothing is before root) my $sRelPath = File::Spec->abs2rel($sDir, File::Spec->rootdir()); my @aDirs =File::Spec->splitdir($sRelPath); # get indentation level and local file name my $iDepth = scalar(@aDirs); unless ($bFile) { $iDepth--; $sLocal = pop @aDirs; } # reconstruct the path, but before each directory name # segment insert 0x000 and before each file name segment # insert 0x001. It is important to insert something # before *all* directory name elements to insure # that the "b" segment of /a/b/c and /a/b sort properly # # we also insert before file names on the off,off chance # that you run into a file system where there is a file # name that begins with 0x000. DOS and most *nix systems # prohibit 0x000 in file names, but not all # - see [Wikipedia://Filename] for a list of OS's where # 0x000 does not appear to be in the list of reserved or # prohibited characters, among them: OS2, various Mac OS's my $k = File::Spec->catpath ($sVol, File::Spec->catdir(map{ord(0).$_} @aDirs) , ord($bFile ? 1 : 0) . $sLocal); return $k => [ $iDepth, $sPath, $sLocal ]; } #-------------------------------------------------- # Demonstration of how to use the sorting function #-------------------------------------------------- my $hPaths = { map { parsePathPortably($_) } @arr }; foreach my $k (sort keys %$hPaths) { my ($iDepth, $sPath, $sLocal) = @{$hPaths->{$k}}; my $sIndent = ' ' x ($TAB_WIDTH*$iDepth); printf($FORMAT, "$sIndent$sLocal", $sPath); }

    This produces the following output:

    c /c/ aa /c/aa/ bb /c/bb/ aa1.mp3 /c/bb/aa1.mp3 aa2.mp3 /c/bb/aa2.mp3 cc /c/cc/ a1.mp3 /c/a1.mp3 a2.mp3 /c/a2.mp3
Re: sorting tree with folders on top
by ig (Vicar) on Apr 07, 2009 at 20:36 UTC

    Maybe you would like this solution:

    use strict; use warnings; my @arr = ( { 'path' => '/c/', dir => '/' }, { 'path' => '/c/a1.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/a2.mp3', 'file' => 1, 'dir' => '/c' }, { 'path' => '/c/bb/aa1.mp3', 'file' => 1, 'dir' => '/c/bb' }, { 'path' => '/c/bb/aa2.mp3', 'file' => 1, 'dir' => '/c/bb' }, { 'path' => '/c/bb/', 'dir' => '/c' }, { 'path' => '/c/cc/', 'dir' => '/c' }, { 'path' => '/c/aa/', 'dir' => '/c' } ); @arr = sort sort_files @arr; foreach (@arr) { my %h = %{$_}; print $h{'path'}."\n"; } # putting the folders on the top works, but... see output sub sort_files { my $apath = $$a{path}; my $bpath = $$b{path}; $apath =~ s/([^\/]*\/)/\x00$1/g; $bpath =~ s/([^\/]*\/)/\x00$1/g; $apath cmp $bpath; }

    If performance is an issue, you could pre-process your array to include the modified path strings so that you only have to calculate them once.

Re: sorting tree with folders on top
by rovf (Priest) on Apr 07, 2009 at 10:25 UTC

    I somehow miss in your code a test whether a path represents a directory or a plain file (or a symlink to a directory or to a file etc.).

    -- 
    Ronald Fischer <ynnor@mm.st>
      hash key 'file' represents files. folders do not have this key. that's why:
      if ($$a{'file'} && !$$b{'file'}) { return 1; } elsif (!$$a{'file'} && $$b{'file'}) { return -1; }
Re: sorting tree with folders on top
by svenXY (Deacon) on Apr 07, 2009 at 10:54 UTC
    Maybe it's better if you first tell us where this hash comes from and what it is good for. Do you create it yourself? Why this file-field? What is the dir-field good for? Why do files below c/bb also have dir set to /c? - sorry, missed the last posting which kind of clarifies this
    Regards,
    svenXY