Peter Keystrokes has asked for the wisdom of the Perl Monks concerning the following question:

Hi there. If I have a folder that contains the following files:

hsa_circ_0000007.fa.txt_palindromes.csv

hsa_circ_0000008.fa.txt_palindromes.csv

hsa_circ_0000009.fa.txt_palindromes.csv

hsa_circ_0000010.fa.txt_palindromes.csv

And I then use:

my @pal_files = glob("*palindromes.csv");

Am I just feeding the plain names into an array or am I capturing the entire file into the array?

My hunch is that I'm just capturing the name of the file.

How then do I capture all the files with the extension of my choice so that I can then open those files to extract data?

Pete.

Re: Capturing and then opening multiple files
by choroba (Cardinal) on Jun 20, 2017 at 20:18 UTC
    If you are unsure what a variable contains, check its contents:
        use Data::Dumper;
        my @pal_files = glob '*palindromes.csv';
        print Dumper(\@pal_files);

    To read the files, just open them in a loop:

        for my $file (@pal_files) {
            open my $IN, '<', $file or die "$file: $!\n";
            while (my $line = <$IN>) {
                # Process the line...
            }
        }

    You can also assign the filenames to @ARGV and use the diamond operator:

        @ARGV = glob '*palindromes.csv';
        while (my $line = <>) {
            # Process the line...
        }
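
A minimal sketch of the @ARGV approach, extended to remember which file each line came from. The sample file contents, the temporary directory and the count_lines helper are made up for illustration; $ARGV is the built-in variable that names the file <> is currently reading. The early return is there because <> falls back to STDIN when @ARGV is empty.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory with two sample files (hypothetical contents).
my $dir = tempdir(CLEANUP => 1);
for my $n (7, 8) {
    my $path = "$dir/hsa_circ_000000$n.fa.txt_palindromes.csv";
    open my $out, '>', $path or die "$path: $!\n";
    print {$out} "header\nrow for $n\n";
    close $out or die "$path: $!\n";
}

# Hypothetical helper: count the lines of every file matching $pattern.
sub count_lines {
    my ($pattern) = @_;
    my %lines_in;
    local @ARGV = glob($pattern);    # glob returns file names only
    return %lines_in unless @ARGV;   # with an empty @ARGV, <> would read STDIN
    while (my $line = <>) {
        $lines_in{$ARGV}++;          # $ARGV is the file currently being read
    }
    return %lines_in;
}

my %count = count_lines("$dir/*palindromes.csv");
printf "%s: %d lines\n", $_, $count{$_} for sort keys %count;
```

The array holds names only; the contents are read when <> opens each file in turn.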

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
      Thank you, I will try this.
Re: Capturing and then opening multiple files
by thanos1983 (Parson) on Jun 20, 2017 at 21:07 UTC

    Hello Peter Keystrokes,

    As fellow monk choroba has already answered your question, I wanted to address the last part of it (How then do I capture all the files with the extension of my choice so that I can then open those files to extract data?).

    Take a look at the module File::Find::Rule. Why do I propose this? Simple: it handles multiple directories and also directory recursion.

    Sample solution (for testing purposes I used the .txt extension; you can use .csv).

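        #!/usr/bin/perl
        use strict;
        use warnings;
        use Data::Dumper;
        use File::Find::Rule;

        # Directories to search: the command-line arguments, or the current directory.
        my @dirs = @ARGV ? @ARGV : ('.');

        # Maximum depth to descend. (Taking this from @ARGV with shift as well
        # would reuse the first directory argument as the depth.)
        my $level = 2;

        my @files = File::Find::Rule->file()
                                    ->name('*.txt')
                                    ->maxdepth($level)
                                    ->in(@dirs);

        print Dumper(\@files);

        __END__

        $ perl main.pl
        $VAR1 = [
                  'test.txt',
                  'counts.txt',
                  'SubFolder/foo.txt'
                ];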

    Update: I updated the output to an array as the user wants an array output.
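
If installing a CPAN module is not an option, the core File::Find module can do a similar recursive search. This is only a sketch of that alternative, not the File::Find::Rule code: the find_by_suffix helper and the sample tree are hypothetical, and there is no depth limit here.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: collect every plain file under @dirs whose name ends in $suffix.
sub find_by_suffix {
    my ($suffix, @dirs) = @_;
    my @found;
    find(
        sub {
            # Inside the callback $_ is the bare name, $File::Find::name the full path.
            push @found, $File::Find::name if -f $_ && /\Q$suffix\E\z/;
        },
        @dirs,
    );
    @found = sort @found;
    return @found;
}

# Throwaway directory tree for the demonstration.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/SubFolder" or die "mkdir: $!\n";
for my $path ("$top/test.csv", "$top/notes.txt", "$top/SubFolder/foo.csv") {
    open my $out, '>', $path or die "$path: $!\n";
    close $out or die "$path: $!\n";
}

print "$_\n" for find_by_suffix('.csv', $top);
```

The returned paths include the directory, so each one can be passed straight to open.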

    Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Capturing and then opening multiple files
by deedo (Novice) on Jun 21, 2017 at 15:57 UTC
    I like to use the following subroutine by passing it a directory name as such:

    my @sourcefiles = &get_files($source_directory);

    The subroutine is along the lines of:

        sub get_files {
            my $files = shift;
            my @file_list;
            opendir(DIR, $files) or die "Can't open directory, $!\n";
            @file_list = readdir(DIR);
            closedir(DIR);
            # Removing directory references '.' and '..' from the listing
            my $rmdir  = shift (@file_list);
            my $rmdir2 = shift (@file_list);
            return @file_list;
        }

    I have some other versions too but can't find them at the minute. Note that readdir returns bare entry names rather than full paths, so prepend the directory name before opening anything; after that you can play around with the array/list that is returned.

      # Removing directory references '.' and '..' from the listing
      my $rmdir = shift (@file_list);
      my $rmdir2 = shift (@file_list);

      Are you sure this works? I just tested and on my system . and .. are returned somewhere in the middle of the list at seemingly random positions, which means your code wouldn't be portable to my system and probably many Linux systems in general. Much better (File::Spec is a core module):

          use File::Spec::Functions qw/no_upwards/;
          ...
          my @file_list = no_upwards readdir($dirhandle);

      Or even better (one difference being that @file_list will contain Path::Class objects that include the directory):

          use Path::Class qw/dir file/;
          my @file_list = dir($dirname)->children;
        Yeah, I appreciate that the removal of . and .. may not be portable to your system, but you could just add a loop to iterate over the array in the subroutine before returning the list, finding both and deleting the elements. Wouldn't be too hard. I write mostly on Windows, where I work, which is why I just shift them off the start of the array.
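
The loop described above can be written as a single grep, so the result no longer depends on where readdir happens to put the two entries. A minimal sketch, assuming the same get_files interface; the lexical directory handle $dh is my own substitution for the bareword DIR.

```perl
use strict;
use warnings;

# Variant of get_files that does not rely on the order of readdir's output.
sub get_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!\n";
    # Drop the '.' and '..' entries wherever they appear in the list.
    my @file_list = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return @file_list;
}

my @sourcefiles = get_files('.');
print "$_\n" for sort @sourcefiles;
```

As with the original, the list contains bare names, not full paths.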
      my @sourcefiles = &get_files($source_directory);

      You don't need the ampersand in Perl 5 to call a function. Quite the opposite is true: You should avoid the ampersand in Perl 5 when calling functions, as it can introduce nasty, hard-to-debug errors. The ampersand was needed in Perl 4, but that was decades ago.
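
One such pitfall, as a small sketch (whether it is the same one the node linked below discusses is an assumption on my part): calling &name with no parentheses hands the callee the caller's current @_. The three subroutine names here are hypothetical.

```perl
use strict;
use warnings;

sub show_args {
    return join ',', @_;
}

sub caller_with_ampersand {
    return &show_args;     # no parentheses: show_args sees this sub's @_
}

sub caller_without_ampersand {
    return show_args();    # explicit empty argument list
}

print "with &:    '", caller_with_ampersand('a', 'b'), "'\n";     # 'a,b'
print "without &: '", caller_without_ampersand('a', 'b'), "'\n";  # ''
```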

      Read more at Re^2: Merge log files causing Out of Memory (just a note on ampersand).

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
      "Removing directory references '.' and '..' from the listing"

      BTW: use Path::Tiny; @paths = path($dir)->children;

      "Returns a list of Path::Tiny objects for all files and directories within a directory. Excludes "." and ".." automatically."

      "Automatically" always sounds good, right?

      See Path::Tiny

      Regards, Karl

      «The Crux of the Biscuit is the Apostrophe»

      perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

Re: Capturing and then opening multiple files
by karlgoethebier (Abbot) on Jun 21, 2017 at 15:05 UTC

    I have no new ideas. Hence I cannibalize my older nodes:

        #!/usr/bin/env perl
        use strict;
        use warnings;
        use Path::Iterator::Rule;
        use Path::Tiny;
        use Data::Dump;
        use feature qw (say);

        my $rule = Path::Iterator::Rule->new;

        my $level  = 1;
        my $suffix = q(*.csv);

        $rule->file->name($suffix);
        $rule->file->max_depth($level);

        my $dir  = q(.);
        my $next = $rule->iter($dir);

        my ( @data, @files );

        while ( defined( my $file = $next->() ) ) {
            my $data = path($file)->slurp_utf8;
            push @data,  $data;
            push @files, $file;
        }

        dd \@data;
        dd \@files;

        __END__

    Slurping is the subject of some dispute. Consider its use as an option.
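
For comparison, a minimal sketch of the non-slurping alternative using only core Perl: each line is handed to a callback as it is read, so only one line is in memory at a time. The process_file helper and the sample data are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: handle one file line by line instead of slurping it.
sub process_file {
    my ($file, $callback) = @_;
    open my $in, '<:encoding(UTF-8)', $file or die "$file: $!\n";
    while (my $line = <$in>) {
        chomp $line;
        $callback->($line);
    }
    close $in;
    return;
}

# Small sample file for the demonstration.
my ($fh, $name) = tempfile(UNLINK => 1);
print {$fh} "a,1\nb,2\n";
close $fh or die "$name: $!\n";

# Collect the first comma-separated field of every line.
my @first_fields;
process_file($name, sub { push @first_fields, (split /,/, $_[0])[0] });
print "@first_fields\n";    # a b
```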

    Regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

    Furthermore I consider that Donald Trump must be impeached as soon as possible