I have a script that looks something like the one below. I want it to search the current directory, open every directory inside it, open all files that match certain REs (fastq files, formatted so that every four lines go together), do some work with those files, and write some results to a file in each directory. (The actual code is much more complex, but since I think I have a structural issue I am showing a simplified version.)
#!user/local/perl
#Created by C. Pells, M. R. Snyder, and N. T. Marshall 2017
#Script trims and merges high throughput sequencing reads from fastq files for a specific primer set
use Cwd;
use warnings;
my $StartTime= localtime;
my $MasterDir = getcwd; #obtains the current directory
opendir (DIR, $MasterDir);
my @objects = readdir (DIR);
closedir (DIR);
foreach (@objects){
    print $_,"\n";
}
my @Dirs = ();
foreach my $O (0..$#objects){
    my $CurrDir = "";
    if ((length ($objects[$O]) < 7) && ($O>1)){ #Checking if the length of the object name is < 7 characters. All samples are 6 or less. removing the first two elements: "." and ".."
        $CurrDir = $MasterDir."/".$objects[$O]; #appends directory name to full path
        push (@Dirs, $CurrDir);
    }
}
foreach (@Dirs){
    print $_,"\n";#checks that all directories were read in
}
foreach my $S (0..$#Dirs){
    my @files = ();
    opendir (DIR, $Dirs[$S]) || die "cannot open $Dirs[$S]: $!";
    @files = readdir DIR; #reads in all files in a directory
    closedir DIR;
    my @AbsFiles = ();
    foreach my $F (0..$#files){
        my $AbsFileName = $Dirs[$S]."/".$files[$F]; #appends file name to full path
        push (@AbsFiles, $AbsFileName);
    }
    foreach my $AF (0..$#AbsFiles){
        if ($AbsFiles[$AF] =~ /_R2_001\.fastq$/m){ #finds reverse fastq file
            my @readbuffer=(); #read in reverse fastq
            my %RSeqHash;
            my $c = 0;
            print "Reading, reversing, complimenting, and trimming reverse fastq file $AbsFiles[$AF]\n";
            open (INPUT1, $AbsFiles[$AF]) || die "Can't open file: $!\n";
            while (<INPUT1>){
                chomp ($_);
                push(@readbuffer, $_);
                if (@readbuffer == 4) {
                    $rsn = substr($readbuffer[0], 0, 45); #trims reverse seq name
                    $cc++ % 10000 == 0 and print "$rsn\n";
                    $RSeqHash{$rsn} = $readbuffer[1];
                    @readbuffer = ();
                }
            }
        }
    }
    foreach my $AFx (0..$#AbsFiles){
        if ($AbsFiles[$AFx] =~ /_R1_001\.fastq$/m){ #finds forward fastq file
            print "Reading forward fastq file $AbsFiles[$AFx]\n";
            open (INPUT2, $AbsFiles[$AFx]) || die "Can't open file: $!\n";
            my $OutMergeName = $Dirs[$S]."/"."Merged.fasta";
            open (OUT, ">", "$OutMergeName");
            my $cc=0;
            my @readbuffer = ();
            while (<INPUT2>){
                chomp ($_);
                push(@readbuffer, $_);
                if (@readbuffer == 4) {
                    my $fsn = substr($readbuffer[0], 0, 45); #trims forward seq name
                    #$cc++ % 10000 == 0 and print "$fsn\n$readbuffer[1]\n";
                    if ( exists($RSeqHash{$fsn}) ){ #checks to see if forward seq name is present in reverse seq hash
                        print "$fsn was found in Reverse Seq Hash\n";
                        print OUT "$fsn\n$readbuffer[1]\n"; #ACUAL OUTPUT FILE IS EMPTY!!!
                    }
                    else {
                        $cc++ % 10000 == 0 and print "$fsn not found in Reverse Seq Hash\n"; #PRINTS THIS FOR EVERY LINE IN INPUT2!!!
                    }
                    @readbuffer = ();
                }
            }
            close INPUT1;
            close INPUT2;
            close OUT;
        }
    }
}
I know that the script works without iterating over folders, because if I run a simplified version within just one folder it works, including using the REs to find file names. But with this version I just get empty output files. From the print statements I inserted in this script, I've determined that Perl can't find the value of $fsn as a key in %RSeqHash, which is filled from INPUT1. I can't understand why, because each file is there and it works when I don't iterate over folders, so I know that the keys match. So either there is something simple I am missing, or I have run into some sort of limit on Perl's memory. Any help is appreciated!
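One thing that may be worth ruling out before suspecting memory is variable scope. In the script above, %RSeqHash is declared with my inside the if block that reads the reverse file, while the lookup happens in a later, separate block. Without use strict, a name used outside the block where it was declared with my refers to a different, empty package variable. The sketch below is only an illustration of that behaviour under those assumptions, not the actual script; the names @reverse_names, @forward_names, %SeqHash, and %OuterHash and the file name are hypothetical stand-ins.

```perl
use warnings;
# No "use strict" here on purpose, to mirror the script above.

# Hypothetical stand-in data: read names as they might appear in both files.
my @reverse_names = ('@read1', '@read2');
my @forward_names = ('@read1', '@read2');

# Pattern 1: the hash is declared with "my" inside the block that fills it.
foreach my $file ('reverse.fastq') {   # hypothetical file name, never opened
    my %SeqHash;                       # lexical: visible only inside this block
    $SeqHash{$_} = 1 for @reverse_names;
}
# Out here, %SeqHash is the (empty) package variable %main::SeqHash,
# not the lexical hash that was filled above.
my $found_inner = grep { exists $SeqHash{$_} } @forward_names;

# Pattern 2: the hash is declared before the block, so both parts share it.
my %OuterHash;
foreach my $file ('reverse.fastq') {
    $OuterHash{$_} = 1 for @reverse_names;
}
my $found_outer = grep { exists $OuterHash{$_} } @forward_names;

print "declared inside the block: $found_inner names found\n";
print "declared before the block: $found_outer names found\n";
```

With use strict added at the top, the lookup in pattern 1 would fail at compile time with a "Global symbol requires explicit package name" error instead of silently finding nothing, which makes this kind of problem much easier to spot.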
In reply to Running a script across multiple directories with multiple output files (problems comparing hash key values) by msnyder424