in reply to finding top 10 largest files

Rather than holding every file in the hash, could you instead keep only the top 10 entries, updating the list on each iteration? I suppose that might be slow, eh?

Pseudoperl follows... caveat coder.

my @topTen;
while ( moreFiles() ) {    # moreFiles(): hypothetical "next file" iterator
    # build a key that sorts by size, e.g. zero-padded size, then name
    push @topTen, some_munged_key_using_file_size_and_name();
    # trim back to the ten biggest; the guard avoids padding with undefs
    @topTen = ( reverse sort @topTen )[ 0 .. 9 ] if @topTen > 10;
}
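
For what it's worth, here's that idea fleshed out into something runnable, using File::Find for the walk and [size, name] pairs instead of a munged string key (an untested sketch, caveat coder again):

use strict;
use warnings;
use File::Find;

my @top;    # [ size, name ] pairs
find(
    sub {
        return unless -f;                          # plain files only
        push @top, [ -s _, $File::Find::name ];    # reuse the stat from -f
        @top = ( sort { $b->[0] <=> $a->[0] } @top )[ 0 .. 9 ]
            if @top > 10;                          # keep just the ten biggest
    },
    '.'
);
printf "%12d  %s\n", @$_ for @top;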


Sorry if I'm missing something. My brain's not working well today...

Re: Re: finding top 10 largest files
by pelagic (Priest) on Feb 03, 2004 at 16:29 UTC
    ... that's what my solution (see 326175) does ...
    It keeps only the information for the currently known biggest files.
    pelagic
      Ugh! Yep, you're right, my bad... only, the code is too long for my liking. Can it be shorter? Could I use the operating system to pare down the code a bit by pre-gathering a list of all files? Something like

      my @files = (reverse sort `dir /A-D /S`)[0..9];
      In DOS, or ... (big pause)...
      Aw shoot, in DOS the solution isn't even perl:
      dir /A-D /O-S /S
      That recursively lists all files from the current working directory on, sorted by largest file first. I imagine there's a combination of opts to ls that will do the same thing, eh? Maybe
      ls -alSR   (which doesn't sort across directories)
      ls -alR | sort -n -k 5   (maybe? sort needs -n to sort the size column numerically)
      (Except those don't suppress the directory names. Hmmm.)

      Sorry, I meant to write Perl, but it came out rather OS-specific... still, it's a lot smaller than the Perl solution. Is that an Appeal To False Laziness?
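
      A portable compromise might be to keep the one-liner but let Perl do the walking; a rough sketch, assuming a Unix-ish shell for the quoting:

      perl -MFile::Find -e 'find( sub { push @f, [ -s, $File::Find::name ] if -f }, "." );
          printf "%12d %s\n", @$_ for grep { defined } ( sort { $b->[0] <=> $a->[0] } @f )[ 0 .. 9 ]'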
        dir /A-D /O-S /S
        Good try, but that only sorts them within directories: files are still grouped by directory first, then sorted by filesize within each one. [At least, that's how it behaves on my Win2K, ver 5.00.]

        -QM
        --
        Quantum Mechanics: The dreams stuff is made of