Vasek has asked for the wisdom of the Perl Monks concerning the following question:
Hi Monks,
I have a problem. I would like to generate a random playlist for an old mxaudio program (SGI O2, perl version 5). My goal is to take the contents of a directory structure containing a very large number of audio files and generate a list of a given number (say 10 tracks) of unique mp3s. It is important that reading of the start folder is recursive and that it is possible to add exceptions. I attach the code as far as I have got. The unique list will add up, but if the random function selects the same file more than once, then of course with fewer items than I would like. Can you help me with this? Thanks Zsolt.
#!/usr/bin/perl
my $target = "/home/zsolti/Temp";
my @exclude = (
#'2023_02_21_Szentendre_Pilis_EK_oldal'
'webftp'
,'x'
);
my (@audioFiles, %playList);
my $numOfRandFiles = 10;
&fileExplore($target);
#print join("\n", @audioFiles), "\n";
&uploadPlaylist;
print "-" x 80, "\n", "SELECTED FILES:\n", "-" x 80, "\n";
foreach $key (sort {$playList{$a} <=> $playList{$b}} (keys(%playList))
+) {
print "$playList{$key}: $key\n";
}
sub fileExplore {
my $dir = shift;
my $hit = 0;
local *DIR;
opendir DIR, $dir or die "opendir $dir: $!";
my $found = 0;
while ($_ = readdir DIR) {
next if /^\.{1,2}$/;
$FSNode = $_;
my $FSObj = "$dir/$FSNode";
foreach my $exc (@exclude) {$hit = 1 if $FSObj =~ m/\/$exc\//g}
next if $hit;
if (-f $FSObj) {
(my $fExt = $FSObj) =~ s/.*\.(.*$)/$1/i;
push(@audioFiles, $FSObj) if lc($fExt) eq 'mp3';
}
fileExplore($FSObj) if -d $FSObj;
}
closedir DIR;
}
sub uploadPlaylist {
for (my $i = 1; $i <= $numOfRandFiles; $i++) {
$playList{$audioFiles[int rand@audioFiles]} = $i;
}
}
Re: Add a fixed number of unique elements to hash
by jo37 (Deacon) on Mar 05, 2023 at 16:56 UTC
|
I would shuffle the list of files and select the first $numOfRandFiles from it, like
use List::Util 'shuffle';
...
(shuffle(@audiofiles))[0 .. $numOfRandFiles - 1];
Note that this generates a list, not a hash.
If you cannot (or don't want to) use a module, you need to reinvent the wheel.
Greetings, -jo
$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
| [reply] [d/l] [select] |
|
use List::AllUtils qw( sample );
my @selected = sample $numOfRandFiles, @audioFiles;
| [reply] [d/l] |
|
G'day tybalt89,
When initially writing the code in my response,
I checked the List::Util::shuffle() documentation
and noticed the sample() function immediately after it.
I hadn't encountered that previously and decided to give it a go.
Curiously, although it did work as documented, multiple runs produced the same results.
As you can see from my "sample runs" using shuffle(), multiple runs produced different results
(yes, only two runs shown, but I did run it quite a few times).
Looking at source/lib/List/AllUtils.pm,
List::AllUtils::sample() should be identical to List::Util::sample().
I have List::Util v1.62 and Perl v5.36.0.
I'm a bit short on time this morning; I was thinking of investigating further this afternoon
[Aussie timezone: UTC+11:00].
If you have any insights into the behaviour of sample(), please share.
It did seem like sample(), taking a random selection from the array,
was probably a better choice than shuffle(), randomising the entire array and then taking a slice
(obviously, benchmarking needed to confirm this).
On the down side, sample() requires List::Util v1.54:
you'd need at least Perl v5.32.0 (which has v1.55) or an upgrade from CPAN.
| [reply] [d/l] [select] |
|
|
|
|
|
|
| [reply] [d/l] [select] |
|
|
sub uploadPlaylist {
for my $i ( 1 .. $numOfRandFiles ) {
$playList{ splice @audioFiles, int rand @audioFiles, 1 } = $i;
}
}
Naked blocks are fun!
-- Randal L. Schwartz, Perl hacker
| [reply] [d/l] |
|
Thx a lot jwkrahn! Splice is the key for me.
Other solutions that use "factory" modules are out of the question, i.e. this machine is running perl 5.004, which has quite a few syntactical differences compared to today's perl versions. Of course, I could try transplanting the modules, but my only connection to this old SGI machine is via nfs sharing over a software half-speed network interface (since it can no longer be communicated with securely with today's settings).
| [reply] |
Re: Add a fixed number of unique elements to hash
by kcott (Archbishop) on Mar 05, 2023 at 17:42 UTC
|
G'day Zsolt,
Your code uses a lot of older coding styles which I recommend you aim to move away from.
These include: leading '&' on subroutine calls; use of package rather than lexical variables;
and a lack of strict and warnings pragmata.
There are a number of ways to achieve what you want.
In the code below, I've continued your use of opendir.
The core File::Find module is popular,
as are a number of related CPAN modules
— I expect other monks may provide you with examples of those.
I've used fairly generic options for inclusions and exclusions — adapt to your needs.
I created this directory structure for testing (in the spoiler):
Here's the code:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use File::Spec;
use List::Util 'shuffle';
# For production, you'd probably want to read the
# following values from options, config, etc.
my $dir = '/home/ken/tmp/pm_11150231_dir_rand_select';
my @includes = (qr{\.x$}, qr{\.z$});
my @excludes = (qr{skip$}, qr{inc2\/p\.z$});
my $list_length = 3;
my $re = {
inc => qr{(?:@{[join '|', @includes]})},
exc => qr{(?:@{[join '|', @excludes]})},
};
my @all_files;
get_files(\@all_files, $dir, $re);
my @playlist = (shuffle @all_files)[0 .. $list_length - 1];
print "$_\n" for sort @playlist;
sub get_files {
my ($collected, $dir, $re) = @_;
return if $dir =~ $re->{exc};
opendir(my $dh, $dir);
for my $file (grep ! /^(?:\.|\.\.)$/, readdir $dh) {
my $path = File::Spec::->catfile($dir, $file);
next if $path =~ $re->{exc};
if (-d $path) {
get_files($collected, $path, $re);
}
elsif (-f _) {
next unless $path =~ $re->{inc};
push @$collected, $path;
}
else {
# maybe handle other file types here
}
}
return;
}
Here's a couple of sample runs:
$ ./rand_select_files.pl
/home/ken/tmp/pm_11150231_dir_rand_select/inc1/h.x
/home/ken/tmp/pm_11150231_dir_rand_select/inc1/j.z
/home/ken/tmp/pm_11150231_dir_rand_select/inc2/n.x
$ ./rand_select_files.pl
/home/ken/tmp/pm_11150231_dir_rand_select/a.x
/home/ken/tmp/pm_11150231_dir_rand_select/c.z
/home/ken/tmp/pm_11150231_dir_rand_select/inc1/j.z
| [reply] [d/l] [select] |
Re: Add a fixed number of unique elements to hash
by tybalt89 (Monsignor) on Mar 05, 2023 at 22:11 UTC
|
Define "very large number".
Anyways, here's one way I'd do it (who needs recursion?).
#!/usr/bin/perl
use strict; # https://perlmonks.org/?node_id=11150757
use warnings;
use List::AllUtils qw( sample none );
my $target = "/home/zsolti/Temp";
$target = '../mnt/home/old'; # FIXME for testing on my system
my @exclude = (
#'2023_02_21_Szentendre_Pilis_EK_oldal'
'webftp'
,'x'
);
my (@audioFiles, %playList);
my $numOfRandFiles = 10;
my @queue = $target;
while( defined( my $path = pop @queue ) )
{
if( -f $path and $path =~ /\.mp3$/i )
{
push @audioFiles, $path;
}
elsif( -d $path and none { $path =~ m{/\Q$_\E\z} } @exclude ) # prun
+e
{
push @queue, <$path/*>;
}
}
@playList{ sample $numOfRandFiles, @audioFiles } = 1 .. $numOfRandFile
+s;
use Data::Dump 'dd'; dd \%playList;
| [reply] [d/l] |
|
#!/usr/bin/perl
use strict; # https://perlmonks.org/?node_id=11150757
use warnings;
my $target = "/home/zsolti/Temp";
$target = '../mnt/home/old'; # FIXME for testing on my system
my @exclude = (
#'2023_02_21_Szentendre_Pilis_EK_oldal'
'webftp'
,'x'
);
my (@audioFiles, %playList);
my $numOfRandFiles = 10;
my @queue = $target;
while( defined( my $path = pop @queue ) )
{
if( -f $path and $path =~ /\.mp3$/i )
{
push @audioFiles, $path;
}
elsif( -d $path and not grep $path =~ /\/\Q$_\E\z/, @exclude )
{
push @queue, <$path/*>;
}
}
$playList{ splice @audioFiles, rand @audioFiles, 1 } ||= $_
for 1 .. $numOfRandFiles;
printf "%3d %s\n", $playList{$_}, $_ for sort keys %playList;
Though since I don't actually have a perl 5.004 to test on, I do wonder if it works...
| [reply] [d/l] |
|
$ corelist warnings
Data for 2022-05-27
warnings was first released with perl v5.6.0
Replace with '$^W = 1;'.
| [reply] [d/l] [select] |
|
#!/usr/bin/perl
use strict; # https://perlmonks.org/?node_id=11150757
#use warnings; # uncomment for newer perls
my $target = "/home/zsolti/Temp";
$target = '../mnt/home/old'; # FIXME for testing on my system
my @exclude = (
#'2023_02_21_Szentendre_Pilis_EK_oldal'
'webftp'
,'x'
);
my (@audioFiles, %playList);
my $numOfRandFiles = 10;
my @stack = $target;
while( my $path = pop @stack )
{
grep $path =~ /\/\Q$_\E\z/, @exclude and next;
push @audioFiles, grep -f, <$path/*.mp3>;
push @stack, grep -d, <$path/*>;
}
$playList{ splice @audioFiles, rand @audioFiles, 1 or last } = $_
for 1 .. $numOfRandFiles;
printf "%3d %s\n", $playList{$_}, $_ for sort keys %playList;
| [reply] [d/l] |
Re: Add a fixed number of unique elements to hash [UPDATE on sample() vs. shuffle()]
by kcott (Archbishop) on Mar 23, 2023 at 01:43 UTC
|
...
#use List::Util 'shuffle';
use List::Util 'sample';
...
#my @playlist = (shuffle @all_files)[0 .. $list_length - 1];
my @playlist = sample $list_length, @all_files;
#print "$_\n" for sort @playlist;
print "$_\n" for @playlist;
...
Here's some sample runs (the updated script is called "rand_select_files_sample.pl"):
ken@titan ~/tmp/pm_11150231_dir_rand_select
$ ./rand_select_files_sample.pl
/home/ken/tmp/pm_11150231_dir_rand_select/inc2/n.x
/home/ken/tmp/pm_11150231_dir_rand_select/a.x
/home/ken/tmp/pm_11150231_dir_rand_select/c.z
ken@titan ~/tmp/pm_11150231_dir_rand_select
$ ./rand_select_files_sample.pl
/home/ken/tmp/pm_11150231_dir_rand_select/inc1/j.z
/home/ken/tmp/pm_11150231_dir_rand_select/c.z
/home/ken/tmp/pm_11150231_dir_rand_select/inc1/h.x
ken@titan ~/tmp/pm_11150231_dir_rand_select
$ ./rand_select_files_sample.pl
/home/ken/tmp/pm_11150231_dir_rand_select/inc1/j.z
/home/ken/tmp/pm_11150231_dir_rand_select/c.z
/home/ken/tmp/pm_11150231_dir_rand_select/a.x
Clearly, multiple runs are producing different results.
I don't have any results from earlier runs using sample().
Assumption: PEBCAC
| [reply] [d/l] [select] |
|
|