
It means exactly what it says. It's the same as in Re^5: Combining 3 files: undef, a number, a string ... none of those things is an array reference.
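
A minimal sketch of that point, using Perl's built-in ref to test a value before dereferencing it. The helper name is_array_ref and the sample values are made up for illustration; they are not from the code in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: true only when the argument is an array reference.
sub is_array_ref {
    my ($value) = @_;
    return ref($value) eq 'ARRAY' ? 1 : 0;
}

# Sample values, invented for this sketch.
my @samples = (
    [ 'undef',        undef ],
    [ 'a number',     42 ],
    [ 'a string',     'contig03841' ],
    [ 'a hash ref',   { a => 1 } ],
    [ 'an array ref', [ 1, 2, 3 ] ],
);

for my $sample (@samples) {
    my ($label, $value) = @$sample;
    printf "%-14s %s\n", $label,
        is_array_ref($value) ? 'is an array reference'
                             : 'is NOT an array reference';
}
```

Only the last sample passes the check, so only that one is safe to use as @{ ... }.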

Re^14: Combining 3 files
by garyboyd (Acolyte) on Jun 28, 2011 at 15:48 UTC

    Ok, now I'm more confused than ever and chasing my tail in frustration. This is where I am with the code. Any helpful suggestions are appreciated.

    #!/usr/bin/perl
    #22/06/2011
    #use strict;
    use warnings;
    use File::Slurp;
    use Data::Dumper;

    my @data;
    my @col;
    my $dataset;
    my @fields;
    #my %out;
    my $Hashref;
    my $fileCount;
    my @out;
    my @results;

    open INFILE, "<Primer-Rev1" or die $!;
    open my $outfh, '>', "outputfile.txt" or die $!;

    for my $nr (1..2) {
        for my $line (read_file('Primer-For'.$nr)) {
            my @col = split(/\t/,$line);
            push @{$data[$nr - 1]->{shift(@col)}},\@col;
        }
    }

    while (<INFILE>){
        @col = split(/\t+/, $_);
        chomp (@col);
        my ($header, $length, $tm, $sequence) = @col[0..3];
        # expecting file3 line in @col
        my @results = ( $col[1], $col[2] );
        #print Dumper (\@results)
    }

    for my $dataset (@data) {
        my @out = push @{ $data[ $fileCount ]->{ shift @col } }, \@col;
        #print Dumper (\@data);
    }

    @out = sort {
        my $diff_a = $col[2] - $a->[1];
        $diff_a *= -1 if $diff_a < 0;
        my $diff_b = $col[2] - $b->[1];
        $diff_b *= -1 if $diff_b < 0;
        $diff_a <=> $diff_b;
    } @out;
    print Dumper (\@out);
    push @results, $out[0]->[2];
    #print Dumper (\@results);
    #}

      Ok, now I'm more confused than ever and chasing my tail in frustration.

      You still have

      my @out = push @{ $data[ $fileCount ]->{ ...
      Why?
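
A small sketch of why that line cannot do what it seems to want: push returns the new number of elements in the array it modified, not the array. The variable names and values here are invented for the illustration.

```perl
use strict;
use warnings;

my @list = ('a', 'b');

# push returns the new element count of @list, so @out receives
# a single number rather than the pushed data.
my @out = push @list, 'c';

print "out:  @out\n";     # one element: the new length of @list
print "list: @list\n";    # the data itself lives in @list
```

Sorting @out afterwards therefore sorts one plain number, and treating that number as $out[0]->[2] is where the "not an array reference" complaint comes from.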

        Thanks anonymous monk!!! I now have a breakthrough!!! I think it helped me to sleep on things as well!

        So I removed the:

        my @out = push @{ $data[ $fileCount ]->{ ...

        and now have:

        #!/usr/bin/perl
        #29/06/2011
        #use strict;
        use warnings;
        use File::Slurp;
        use Data::Dumper;

        my @data, my @col, my @fields;
        my $dataset;
        #my %out;
        my $Hashref;
        my $fileCount;
        my @out;
        my @results;

        open INFILE, "<Primer-Rev1" or die $!;
        open my $outfh, '>', "outputfile.txt" or die $!;

        for my $nr (1..2) {
            for my $line (read_file('Primer-For'.$nr)) {
                my @col = split(/\t/,$line);
                push @{$data[$nr - 1]->{shift(@col)}},\@col;
            }
        }

        while (<INFILE>){
            @col = split(/\t+/, $_);
            chomp (@col);
            my ($header, $length, $tm, $sequence) = @col[0..3];
            # expecting file3 line in @col
            #}
            my @results = ( $col[0], $col[3] );
            for my $dataset (@data) {
                my @beef = @{ $dataset->{ $col[0] } };
                @beef = sort {
                    my $diff_a = $col[2] - $a->[1];
                    $diff_a *= -1 if $diff_a < 0;
                    my $diff_b = $col[2] - $b->[1];
                    $diff_b *= -1 if $diff_b < 0;
                    $diff_a <=> $diff_b;
                } @beef;
                push @results, $beef[0]->[2];
                #print Dumper (\@results);
                foreach (@results){ print $_."\n";}
            }
        }

        I checked the output from @results and it is almost what I want. There are, however, some strange things going on.

        output looks like:

        contig03841
        CCAGGTTATTTATTTCAGCGGGAACT
        AGTAGTTCATAATAAAGAGGAGGCTGGT
        contig03841
        CCAGGTTATTTATTTCAGCGGGAACT
        AGTAGTTCATAATAAAGAGGAGGCTGGT
        AGTAGTTCATAATAAAGAGGAGGCTGGA
        contig06486
        GCAAATGGCTCTAAGGATCAGCC
        TTTTCCTGAGCGTTTTCCTGAGC
        contig06486
        GCAAATGGCTCTAAGGATCAGCC
        TTTTCCTGAGCGTTTTCCTGAGC
        CATTTTTCCTGAGCGTTTTCCTGAGT
        contig09294
        GTCGGAGCTCTCTCAGAACCC
        GCCCCAGAAGACATCACCTTCAT
        contig09294
        GTCGGAGCTCTCTCAGAACCC
        GCCCCAGAAGACATCACCTTCAT
        contig100253
        CACTCGAGTTGCAGTTATGTTCCTC
        AGATGATTTGTGCATTATAATTGTAATTTGGGC
        contig100253
        CACTCGAGTTGCAGTTATGTTCCTC
        AGATGATTTGTGCATTATAATTGTAATTTGGGC
        GAGATGATTTGTGCATTATAATTGTAATTTGGGT

        I think the gaps are where there are entries missing from the files. Is there a way to print only those results in @results where there is data from all 3 input files? E.g. so that it outputs

        contig100253
        CACTCGAGTTGCAGTTATGTTCCTC
        AGATGATTTGTGCATTATAATTGTAATTTGGGC
        GAGATGATTTGTGCATTATAATTGTAATTTGGGT

        rather than

        contig100253
        CACTCGAGTTGCAGTTATGTTCCTC
        AGATGATTTGTGCATTATAATTGTAATTTGGGC
        contig100253
        CACTCGAGTTGCAGTTATGTTCCTC
        AGATGATTTGTGCATTATAATTGTAATTTGGGC
        GAGATGATTTGTGCATTATAATTGTAATTTGGGT

        I also want to output the data on one line tab-delimited, so for example

        contig100253 AGATGATTTGTGCATTATAATTGTAATTTGGGC GAGATGATTTGTGCATTATAATTGTAATTTGGGT CACTCGAGTTGCAGTTATGTTCCTC
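
One possible way to get both behaviours, as a minimal sketch rather than a finished solution: build each output line in a helper that gives up as soon as one forward dataset has no entry for the contig, and join the pieces with tabs only when everything is present. The helper name best_line, the in-memory sample data, and the short made-up sequences are assumptions for illustration. The column layout (length, Tm, sequence after the header has been shifted off) is inferred from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one tab-delimited line, or undef when any
# forward dataset has no entry for this contig.
#   $rev  = [ header, length, tm, sequence ]   (one reverse-file record)
#   $data = [ \%for1, \%for2 ], each mapping
#           header => [ [ length, tm, sequence ], ... ]
sub best_line {
    my ($rev, $data) = @_;
    my ($header, undef, $tm, $sequence) = @$rev;

    my @best;
    for my $dataset (@$data) {
        my $candidates = $dataset->{$header};
        # Skip the whole contig if this file has nothing for it.
        return unless ref($candidates) eq 'ARRAY' && @$candidates;

        # Candidate whose Tm is closest to the reverse primer's Tm.
        my ($closest) = sort {
            abs($tm - $a->[1]) <=> abs($tm - $b->[1])
        } @$candidates;
        push @best, $closest->[2];
    }
    return join "\t", $header, @best, $sequence;
}

# Made-up data standing in for the two Primer-For files.
my @data = (
    { contigA => [ [ 20, 58.1, 'AAAA' ], [ 21, 60.2, 'CCCC' ] ],
      contigB => [ [ 22, 59.0, 'GGGG' ] ] },
    { contigA => [ [ 23, 61.0, 'TTTT' ] ] },    # no contigB in this file
);

# Made-up records standing in for lines of the reverse file.
my @reverse = (
    [ 'contigA', 24, 60.0, 'ACGT' ],
    [ 'contigB', 25, 59.5, 'TGCA' ],
);

for my $rev (@reverse) {
    my $line = best_line($rev, \@data);
    print "$line\n" if defined $line;    # contigB is skipped
}
```

In the real script the body of the while (<INFILE>) loop would pass \@col and \@data to such a helper, and print once per input line instead of inside the inner for loop, which is what produces the repeated partial output.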