I'm trying to clean up duplicate files on my hard drive. I'm recursing through directories and saving all the file names, with data on path etc., in a hash. When a filename is already in the hash, I save its name into a second hash that holds only the dupe names. My problem is with storing the file data (more accurately, my problem is ignorance of Perl data structure management ...). Here's a snippet showing roughly what I'm trying to do:
    my %fileinfo = ();   # name -> (array-of-path+size+etc)
    my $href = \%fileinfo;
    my %dupes = ();      # name->count
    my $dref = \%dupes;
    #------------------------
    foreach   # for each file, recursing through directory tree
    {
        # here $_ is each file name
        if( exists %$href->{ $_ } )   # if filename seen already
        {
            %$dref->{ $_ } = 1;       # then record in %dupes
        }
        @filedata = ( $fpath, $fsize, $fdate );   # but using real data

        # this is where I'm lost -- don't know how to "savedata"
        @savedata = %$href->{ $_ };   # get data saved for filename
        push @savedata, @filedata;    # add new data to saved data
        %$href->{ $_ } = @savedata;   # put new data back in the hash
    }
I want to save each file's data as an array, then add that array to the hash. At the end of all this, for each file in the dupes, I'll display the array of its data, something like:
    myfile.mp3
        c:\dir1\dir2; date=12/3/45; size=12345
        c:\dir3\dir4; date=1/01/01; size=54321
Please tell me how to do the step of adding the new data, the one I've dummied as:
    @savedata = %$href->{ $_ };
    push @savedata, @filedata;
    %$href->{ $_ } = @savedata;
This doesn't work as I'd hoped :-( I'd thought @savedata would end up with lots of little @filedata arrays inside it. Thanks for any advice (and hints on how to unpick @savedata would be great too).
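
One possible way to do the step asked about, offered as a minimal sketch rather than the only approach: a hash slot holds a single scalar, so assigning an array into it stores only the element count, and pushing @filedata onto @savedata flattens the three fields into one long list. Storing a reference to a small anonymous array per file keeps each record separate. The @found list below is hypothetical sample data standing in for the real directory walk, and report_lines is a helper name made up for this illustration; the lookup is written directly against %fileinfo (through the reference it would be $href->{$name}).

```perl
use strict;
use warnings;

# Hypothetical sample records standing in for the real directory walk:
# each is [ name, path, size, date ].
my @found = (
    [ 'myfile.mp3', 'c:\dir1\dir2', 12345, '12/3/45' ],
    [ 'other.txt',  'c:\dir1',      100,   '2/02/02' ],
    [ 'myfile.mp3', 'c:\dir3\dir4', 54321, '1/01/01' ],
);

my %fileinfo;   # name -> reference to an array of [ path, size, date ] records
my %dupes;      # name -> 1 for every name seen more than once

for my $rec (@found) {
    my ( $name, $fpath, $fsize, $fdate ) = @$rec;

    # Seen this name before: record it as a dupe.
    $dupes{$name} = 1 if exists $fileinfo{$name};

    # Push a reference to a new anonymous array. Autovivification creates
    # the outer array the first time a name is seen.
    push @{ $fileinfo{$name} }, [ $fpath, $fsize, $fdate ];
}

# "Unpicking": dereference the outer array, then each inner record.
# Returns one formatted line per stored record for the given name.
sub report_lines {
    my ($name) = @_;
    return map {
        my ( $path, $size, $date ) = @$_;
        "$path; date=$date; size=$size";
    } @{ $fileinfo{$name} };
}

for my $name ( sort keys %dupes ) {
    print "$name\n";
    print "    $_\n" for report_lines($name);
}
```

A single field can also be reached directly: $fileinfo{'myfile.mp3'}[1][0] is the path of the second record stored under that name.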

In reply to data structure advice please by anadem
