phippsy has asked for the wisdom of the Perl Monks concerning the following question:

Hi! I have a list of files with sequential naming. I've brought this file list into an array. I need to create smaller arrays based on the filename structure, grouping like filenames, and then do 'something' to each array. To be clearer, here's an example:

@array = qw(001.file.a 001.file.b 002.file.a 002.file.b);

I need to extract those files with matching prefixes (e.g. 001) into a separate array, and do 'something'. When finished, extract those files with the next prefix (e.g. 002) and then do 'something'... and on and on. I can't rely on the prefixes following sequentially (e.g. 001, 004, 018), so I'll have to do some sort of comparison of $array[1] to $array[0] and then push if necessary, I think...

I don't need help with the regex/grep portion, but rather with how to iterate through and extract matches, do 'something', then repeat with the next set of matches. Thanks in advance for any help! a:)

Re: iterating over array to create smaller arrays based on pattern match
by samtregar (Abbot) on Apr 17, 2008 at 21:58 UTC
    Sounds to me like what you want to do is sort all the entries into a hash of arrays (HoA), where the key is the prefix. Something like:

    my %sorted;
    foreach my $item (@array) {
        my ($prefix) = $item =~ /^(\w+)/;
        push @{$sorted{$prefix}}, $item;
    }

    Then you can work your way through the list of items for each prefix like this:

    foreach my $prefix (keys %sorted) {
        my @items = @{$sorted{$prefix}};
        # do something with @items...
    }

    Does that make sense?

    -sam
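
A self-contained sketch of the hash-of-arrays approach described above, for anyone who wants something to run directly. The sub name group_by_prefix, the sample file names, and the choice to skip names without a "prefix." part are assumptions made for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Group file names by their leading prefix (the text before the first dot).
# Returns a reference to a hash of arrays: prefix => [ file names ].
# (group_by_prefix is a hypothetical helper name.)
sub group_by_prefix {
    my @files = @_;
    my %groups;
    for my $file (@files) {
        # Assumption: names that do not look like "prefix.rest" are skipped.
        my ($prefix) = $file =~ /^([^.]+)\./
            or next;
        push @{ $groups{$prefix} }, $file;
    }
    return \%groups;
}

# Sample data with non-sequential prefixes, in mixed order.
my @array  = qw(001.file.a 018.file.a 001.file.b 004.file.a 018.file.b);
my $groups = group_by_prefix(@array);

# Sort the keys so the groups are handled in a predictable order.
for my $prefix (sort keys %$groups) {
    my @items = @{ $groups->{$prefix} };
    print "$prefix: @items\n";    # do 'something' with @items here
}
```

A plain string sort is enough here only because the sample prefixes are zero-padded to the same width; for unpadded numbers a numeric sort ({ $a <=> $b }) would be the safer choice.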

Re: iterating over array to create smaller arrays based on pattern match
by FunkyMonk (Bishop) on Apr 17, 2008 at 22:00 UTC
    Iterate over the array and extract the prefixes. Use the prefixes as hash keys and push the filenames onto a hash-of-lists:

    my @array = qw(001.file.a 001.file.b 002.file.a 002.file.b) ;

    my %groups;
    for my $filename ( @array ) {
        my $prefix = (split /\./, $filename)[0];
        push @{ $groups{$prefix} }, $filename;
    }

    for ( keys %groups ) {
        print "group $_ has files @{ $groups{$_} }\n"
    }

    Output:

    group 002 has files 002.file.a 002.file.b
    group 001 has files 001.file.a 001.file.b
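
Group 002 comes out before group 001 in that output because keys returns hash keys in no particular order. If the groups need to be processed in the order their prefixes first appear in the list, one possible sketch is to record that order in a second array. The sub name group_in_order and the sample data are assumptions for illustration only.

```perl
use strict;
use warnings;

# Returns (\@order, \%groups): @order lists each prefix once, in the order
# it was first seen; %groups maps prefix => [ file names ].
# (group_in_order is a hypothetical helper name.)
sub group_in_order {
    my @files = @_;
    my (@order, %groups);
    for my $filename (@files) {
        my $prefix = (split /\./, $filename)[0];
        # Remember a prefix the first time it shows up.
        push @order, $prefix unless exists $groups{$prefix};
        push @{ $groups{$prefix} }, $filename;
    }
    return (\@order, \%groups);
}

my ($order, $groups) =
    group_in_order(qw(018.file.a 001.file.a 018.file.b 001.file.b));

# Walk the prefixes in first-seen order rather than hash order.
print "group $_ has files @{ $groups->{$_} }\n" for @$order;
```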


    Unless I state otherwise, my code all runs with strict and warnings
Re: iterating over array to create smaller arrays based on pattern match
by moritz (Cardinal) on Apr 17, 2008 at 21:55 UTC
    use strict;
    use warnings;

    my @files = qw(001.file.a 001.file.b 002.file.a 002.file.b);
    my @sorted_files;
    for (@files){
        if (m/^(\d+)\./){
            push @{$sorted_files[$1]}, $_;
        } else {
            die "File name with unknown format: '$_'\n";
        }
    }

    for my $bucket (0 .. $#sorted_files){
        print "Processing bucket $bucket\n";
        for (@{$sorted_files[$bucket]}){
            print "\tprocessing file $_\n";
        }
    }

    __END__
    Processing bucket 0
    Processing bucket 1
            processing file 001.file.a
            processing file 001.file.b
    Processing bucket 2
            processing file 002.file.a
            processing file 002.file.b
      Danger! If you ever encounter a file called "10000000000000000.file.a" your program will run out of memory and crash. Perl's arrays are not sparse, so when you store something at $array[10000000000000000] you're going to allocate a great hunk of memory.

      -sam
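
A small sketch of the difference being pointed out here, using a deliberately modest index so it is safe to run. The variable names and the index value 100_000 are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $index = 100_000;    # stand-in for a large numeric prefix

# Array: storing one element at a large index extends the array up to it.
my @by_index;
push @{ $by_index[$index] }, "$index.file.a";
my $array_slots = scalar @by_index;         # every slot up to $index now exists

# Hash: only the keys actually stored exist.
my %by_prefix;
push @{ $by_prefix{$index} }, "$index.file.a";
my $hash_keys = scalar keys %by_prefix;

print "array slots: $array_slots, hash keys: $hash_keys\n";
```

With the array, the loop over 0 .. $#sorted_files would also visit every empty bucket in between, which is another reason a hash keyed by prefix tends to fit this problem better.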

Re: iterating over array to create smaller arrays based on pattern match
by oko1 (Deacon) on Apr 17, 2008 at 22:09 UTC

    Seems like you need to "sub-select" your list. Given that it's not in order, I'd build a hash of arrays (HoA) with the keys corresponding to your selections, then process them in whatever order you wanted.

    #!/usr/bin/perl -w
    use strict;

    my @array = qw(001.file.a 001.file.b 002.file.a 002.file.b);
    my %hash;
    for my $filename (@array){
        $filename =~ /^(\d+)/;
        push @{$hash{$1}}, $filename;
    }

    for my $group (sort { $a <=> $b } keys %hash){
        print "Processing group '$group':\n";
        for my $file (@{$hash{$group}}){
            print "\tProcessing $file:\n";
            ### Do stuff
        }
    }

    Update: Wow, you guys are quick. I hit the 'Comment' link, typed out the code, previewed it, and posted it - and suddenly there were three posts ahead of me where there had been zero. I guess I'd better learn to type faster. :) It's also amusing (but unsurprising) that all of us gave essentially the same answer, so I'm going to '++' everyone 'cause they're so brilliant. :)

    
    -- 
    Human history becomes more and more a race between education and catastrophe. -- HG Wells
    
      Thanks everybody! This is exactly what I was trying to do. Much appreciated! a:)