
Actually, I was trying to sort on the first column only (the hash values), not the file names. How do I sort on just the first column but still print both the hash and the file name, ordered by hash? Thanks once again for all the help.

Re^3: Sorting an array of hashes and filenames
by talexb (Chancellor) on Jan 15, 2009 at 03:44 UTC

    Since you're fairly new here, I'm not going to rag on you too much, but you need to understand something -- this site welcomes questions about Perl, but it's important to be able to explain:

    • What you're trying to do
    • What result you're expecting
    • What result you're actually getting
    • What code you're using to get that result -- a complete script would be ideal.
    In your case, we have a fragment of Perl that doesn't compile. This isn't helpful to us, so our answers may not be very useful to you. A complete, runnable example that shows the problem is far more useful to us.

    Now, please note the copious (and probably unnecessary) comments and the logical variable names in my script. Your original code had an array called 'hashes'. That's only confusing initially, but it's not a great name; I would have used 'digests' or something like that. The biggest problem with your code, though, is that it had no comments.

    Once more, with feeling:

    • What you're trying to do (sort an array of hashes, except it's not an AoH, but an array of MD5 digests. Except that they also have filenames)
    • What result you're expecting (a list, sorted by hash value)
    • What result you're actually getting (we aren't told, except that it's not sorted)
    • What code you're using to get that result. (We got a code fragment that doesn't compile even when we clean up the braces, and the note that @array contains some data looks useful until you observe that this variable never appears in the code fragment.)
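For the question as asked (sort on the first column only, print the whole line), here is a minimal sketch. It assumes each array element is a single string of the form "digest filename" separated by whitespace; that layout, the variable name @digest_lines, and the helper sort_by_digest are guesses for illustration, since the original @array was never shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: one "digest filename" string per element.
my @digest_lines = (
    'b78c8fafb9a5d4df6b36dcd35c56f6aa /home/test/file1.src',
    'f49f62c0d885ea4687d3149ea95c98fd /home/test/test2.src',
    '6ebd09e0c2ff94316410b1444fbbb37a /home/test/file4.src',
    'b1a087078b62487c1d4c02f4c943af09 /home/test/file10.src',
);

# Return the lines ordered by their first whitespace-separated column only.
# Read the map/sort/map chain from the bottom up.
sub sort_by_digest {
    my @lines = @_;
    return map  { $_->[1] }                          # 3. keep the whole line
           sort { $a->[0] cmp $b->[0] }              # 2. compare digests only
           map  { [ (split(' ', $_, 2))[0], $_ ] }   # 1. pair digest with line
           @lines;
}

print "$_\n" for sort_by_digest(@digest_lines);
```

The filename never takes part in the comparison, so two lines with the same digest may come out in either order; add a second comparison on the whole line if that matters.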

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

      Thanks for the tip and code. Here is the code that I have that is still not sorting the data correctly. I would like to sort based on the first value as shown under expected results.
      #!/usr/bin/perl
      use strict;
      use File::Find;
      use File::stat;
      use Digest::MD5;

      find(\&search, "/home/test/");

      sub search
      {
          my %result;

          # Make sure we only look at files and end in src extn.
          if ( -f && /\.src$/i )
          {
              foreach my $files ( $File::Find::name )
              {
                  open(FH, $files) or die "Can't open '$files': $!";
                  binmode(FH);

                  # Get/Create MD5 for the file
                  my $hashValue = Digest::MD5->new->addfile(*FH)->hexdigest;

                  # Store the filename, indexed by hash value, into an array. This
                  # allows us to store multiple files with the same hash value.
                  push( @{ $result{$hashValue} }, $files );
              }

              # Dump out the result hash, sorting by the hash values.
              foreach my $key ( sort keys %result )
              {
                  # Dump out the array of filenames indexed by this hash value. We
                  # could have sorted this list too if we wanted.
                  foreach my $filename ( @{ $result{$key} } )
                  {
                      print "$key -> $filename\n";
                  }
              }
          }
      }

      Current results:

      b78c8fafb9a5d4df6b36dcd35c56f6aa -> /home/test/file1.src
      f49f62c0d885ea4687d3149ea95c98fd -> /home/test/test2.src
      6ebd09e0c2ff94316410b1444fbbb37a -> /home/test/file4.src
      b1a087078b62487c1d4c02f4c943af09 -> /home/test/file10.src

      Expected results:

      6ebd09e0c2ff94316410b1444fbbb37a -> /home/test/file4.src
      b1a087078b62487c1d4c02f4c943af09 -> /home/test/file10.src
      b78c8fafb9a5d4df6b36dcd35c56f6aa -> /home/test/file1.src
      f49f62c0d885ea4687d3149ea95c98fd -> /home/test/test2.src

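        I think you may have misunderstood how File::Find works -- when you call find, it calls your (mis-named) search function once for each file and directory that it finds. Because %result is declared inside search, it starts out empty on every call, so each call prints only the one file it has just seen, in the order find visits them, and nothing ever gets sorted across files.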
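To see the once-per-entry behaviour for yourself, here is a small self-contained sketch. It builds a scratch directory with File::Temp so it does not depend on /home/test/; the file names and the variables $calls and @seen are made up for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a throwaway directory holding three empty files (names are made up).
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(a.src b.src c.txt)) {
    open( my $fh, '>', "$dir/$name" ) or die "Can't create '$dir/$name': $!";
    close($fh);
}

my $calls = 0;    # how many times find() invoked the callback
my @seen;         # full paths of the .src files it was shown

# find() calls this once per directory entry. $_ holds the bare name and
# $File::Find::name holds the full path.
sub wanted {
    $calls++;
    push @seen, $File::Find::name if -f && /\.src$/i;
}

find( \&wanted, $dir );

# Only now, after find() has returned, is the whole list available to sort.
print "callback ran $calls times\n";
print "matched: $_\n" for sort @seen;
```

Anything declared with my inside wanted would be recreated on each of those calls, which is why the collected data has to live outside the callback.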

        I would rename that function to storeHashvalue to better reflect what it's actually doing. Then I'd move the declaration of %result out to file scope, so that it persists across calls, and dump out the results of your file scanning after the call to find has returned.

        So your code would look something like this:

        #!/usr/bin/perl
        use strict;
        use File::Find;
        use File::stat;
        use Digest::MD5;

        my %result;

        find(\&storeHashvalue, "/home/test/");

        # Dump out the result hash, sorting by the hash values.
        foreach my $key ( sort keys %result )
        {
            # Dump out the array of filenames indexed by this hash value. We
            # could have sorted this list too if we wanted.
            foreach my $filename ( @{ $result{$key} } )
            {
                print "$key -> $filename\n";
            }
        }

        sub storeHashvalue
        {
            # Make sure we only look at files and end in src extn.
            if ( -f && /\.src$/i )
            {
                foreach my $files ( $File::Find::name )
                {
                    open(FH, $files) or die "Can't open '$files': $!";
                    binmode(FH);

                    # Get/Create MD5 for the file
                    my $hashValue = Digest::MD5->new->addfile(*FH)->hexdigest;

                    # Store the filename, indexed by hash value, into an array. This
                    # allows us to store multiple files with the same hash value.
                    push( @{ $result{$hashValue} }, $files );
                }
            }
        }

        I haven't tried this. Last night I was at home. Right now I'm on the job.
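If you also want the filenames under each digest in a stable order, a sketch of that extra sort follows. It works on an in-memory hash shaped like %result, so it runs without touching the disk. The helper name report_lines is hypothetical, and the copy_of_file4.src entry is invented here to show two files sharing one digest.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data shaped like %result: digest => [ filenames ].
# copy_of_file4.src is a made-up duplicate for illustration.
my %result = (
    'b78c8fafb9a5d4df6b36dcd35c56f6aa' => ['/home/test/file1.src'],
    '6ebd09e0c2ff94316410b1444fbbb37a' => [
        '/home/test/file4.src',
        '/home/test/copy_of_file4.src',
    ],
    'f49f62c0d885ea4687d3149ea95c98fd' => ['/home/test/test2.src'],
);

# Build the report lines: digests in sorted order, and within each digest
# the filenames in sorted order as well.
sub report_lines {
    my ($by_digest) = @_;
    my @lines;
    for my $digest ( sort keys %$by_digest ) {
        for my $filename ( sort @{ $by_digest->{$digest} } ) {
            push @lines, "$digest -> $filename";
        }
    }
    return @lines;
}

print "$_\n" for report_lines( \%result );
```

Returning the lines instead of printing them inside the loops keeps the sorting logic easy to check separately from the directory scan.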

        Alex / talexb / Toronto
