
Ok, now I'm more confused than ever and chasing my tail in frustration.

You still have

my @out = push @{ $data[ $fileCount ]->{ ...
Why?
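
For context on why that assignment is suspect: Perl's `push` returns the new number of elements in the array, not the array itself, so `my @out = push ...` leaves `@out` holding a single number. A minimal sketch with made-up data (the names `@list` and `@out` here are illustrative only, not taken from your script):

```perl
use strict;
use warnings;

my @list = ('a', 'b');

# push returns the new element count of @list, not the list itself
my @out = push @list, 'c';

print "out:  @out\n";     # the element count
print "list: @list\n";    # the array that was actually extended
```

If the goal is just to append to the nested array, the bare `push` with no assignment is enough.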

Re^16: Combining 3 files
by garyboyd (Acolyte) on Jun 29, 2011 at 08:45 UTC

    Thanks anonymous monk!!! I now have a breakthrough!!! I think it helped me to sleep on things as well!

    So I removed the:

    my @out = push @{ $data[ $fileCount ]->{ ...

    and now have:

    #!/usr/bin/perl
    #29/06/2011
    #use strict;
    use warnings;
    use File::Slurp;
    use Data::Dumper;

    my @data, my @col, my @fields;
    my $dataset;
    #my %out;
    my $Hashref;
    my $fileCount;
    my @out;
    my @results;

    open INFILE, "<Primer-Rev1" or die $!;
    open my $outfh, '>', "outputfile.txt" or die $!;

    for my $nr (1..2) {
        for my $line (read_file('Primer-For'.$nr)) {
            my @col = split(/\t/,$line);
            push @{$data[$nr - 1]->{shift(@col)}},\@col;
        }
    }

    while (<INFILE>){
        @col = split(/\t+/, $_);
        chomp (@col);
        my ($header, $length, $tm, $sequence) = @col[0..3];
        # expecting file3 line in @col
        #}
        my @results = ( $col[0], $col[3] );
        for my $dataset (@data) {
            my @beef = @{ $dataset->{ $col[0] } };
            @beef = sort {
                my $diff_a = $col[2] - $a->[1];
                $diff_a *= -1 if $diff_a < 0;
                my $diff_b = $col[2] - $b->[1];
                $diff_b *= -1 if $diff_b < 0;
                $diff_a <=> $diff_b;
            } @beef;
            push @results, $beef[0]->[2];
            #print Dumper (\@results);
            foreach (@results){ print $_."\n";}
        }
    }

    I checked the output from @results and it is almost generating the output I want. There are, however, some strange things going on.

    The output looks like:

    contig03841 CCAGGTTATTTATTTCAGCGGGAACT AGTAGTTCATAATAAAGAGGAGGCTGGT
    contig03841 CCAGGTTATTTATTTCAGCGGGAACT AGTAGTTCATAATAAAGAGGAGGCTGGT AGTAGTTCATAATAAAGAGGAGGCTGGA
    contig06486 GCAAATGGCTCTAAGGATCAGCC TTTTCCTGAGCGTTTTCCTGAGC
    contig06486 GCAAATGGCTCTAAGGATCAGCC TTTTCCTGAGCGTTTTCCTGAGC CATTTTTCCTGAGCGTTTTCCTGAGT
    contig09294 GTCGGAGCTCTCTCAGAACCC GCCCCAGAAGACATCACCTTCAT
    contig09294 GTCGGAGCTCTCTCAGAACCC GCCCCAGAAGACATCACCTTCAT
    contig100253 CACTCGAGTTGCAGTTATGTTCCTC AGATGATTTGTGCATTATAATTGTAATTTGGGC
    contig100253 CACTCGAGTTGCAGTTATGTTCCTC AGATGATTTGTGCATTATAATTGTAATTTGGGC GAGATGATTTGTGCATTATAATTGTAATTTGGGT

    I think the gaps are where entries are missing from the files. Is there a way to print only those results in @results where there is data from all 3 input files? E.g. it would output

    contig100253 CACTCGAGTTGCAGTTATGTTCCTC AGATGATTTGTGCATTATAATTGTAATTTGGGC GAGATGATTTGTGCATTATAATTGTAATTTGGGT

    rather than:

    contig100253 CACTCGAGTTGCAGTTATGTTCCTC AGATGATTTGTGCATTATAATTGTAATTTGGGC
    contig100253 CACTCGAGTTGCAGTTATGTTCCTC AGATGATTTGTGCATTATAATTGTAATTTGGGC GAGATGATTTGTGCATTATAATTGTAATTTGGGT

    I also want to output the data on one line, tab-delimited, for example:

    contig100253 AGATGATTTGTGCATTATAATTGTAATTTGGGC GAGATGATTTGTGCATTATAATTGTAATTTGGGT CACTCGAGTTGCAGTTATGTTCCTC

        ok, thanks for all your help
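
A minimal sketch of one way to get both things asked for above (skip contigs that are missing from any file, and print one tab-delimited line per contig). It is not a drop-in replacement: `closest_seq` and `build_line` are hypothetical helper names, the sample data is made up, and the column layout (Tm at index 1 and sequence at index 2 of each stored row) is assumed from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: from the rows stored for one contig, return the
# sequence whose Tm (index 1) is closest to $tm. Rows are assumed to be
# [length, tm, sequence], i.e. what remains after shift(@col).
sub closest_seq {
    my ($rows, $tm) = @_;
    my ($best) = sort { abs($tm - $a->[1]) <=> abs($tm - $b->[1]) } @$rows;
    return $best->[2];
}

# Hypothetical helper: build one tab-delimited line, or return undef
# when any dataset has no entry for this contig.
sub build_line {
    my ($col, $data) = @_;
    my ($header, $tm, $sequence) = @{$col}[0, 2, 3];
    my @seqs;
    for my $dataset (@$data) {
        my $rows = $dataset->{$header};
        return unless $rows && @$rows;    # contig missing from this file
        push @seqs, closest_seq($rows, $tm);
    }
    return join("\t", $header, @seqs, $sequence);
}

# Made-up sample data standing in for the two Primer-For files
my @data = (
    {
        c1 => [ [20, 58.0, 'AAAA'], [21, 60.5, 'CCCC'] ],
        c2 => [ [22, 59.0, 'GGGG'] ],
    },
    { c1 => [ [20, 61.0, 'TTTT'] ] },     # no c2 entry here
);

# Made-up rows standing in for Primer-Rev1: header, length, tm, sequence
my @rev = ( ['c1', 20, 60.0, 'ACGT'], ['c2', 20, 59.5, 'TGCA'] );

for my $col (@rev) {
    my $line = build_line($col, \@data);
    next unless defined $line;    # skip contigs missing from any file
    print "$line\n";              # printed once, after all datasets
}
```

With this sample data only `c1` is printed, because `c2` has no entry in the second dataset. The other difference from the script above is that the line is printed once, after the loop over `@data` has finished, instead of inside that loop, where `@results` is printed again after every dataset.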