??? Your first paragraph makes no sense. You can't tell what snp.txt looks like by printing @output. The contents of @output have already been processed: they were changed by a split and a grep. Use an editor or 'less' to look at the file itself.
You might try out the following (this is the same program, with just a Dumper line added before the foreach):
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $line;
my @fields;
my @output;
open(FILE1, 'snp.txt') or die "can't open the file: $!";
open(FILE2, 'chr22.txt') or die "can't open the file: $!";
open(FD, '>test.txt') or die "can't open the file: $!";
# Index chr22.txt: remember the byte offset at which each rs-line starts.
my $position = tell(FILE2);
my %rs;
while ($line = <FILE2>) {
    my ($key) = $line =~ /^(rs\d{5,})\b/;
    if (defined $key) {
        $rs{$key} = $position;
    }
    $position = tell(FILE2);
}
# Collect every rs-number that appears in snp.txt.
while (defined($line = <FILE1>)) {
    my @fields = split(/\s+/, $line);
    push @output, grep /^rs\d{5,}\b/, @fields;
}
# Show both data structures before they are used.
print Dumper(\@output, \%rs);
# For each rs-number that is indexed, jump to its line in chr22.txt and copy it.
foreach (@output) {
    if (exists $rs{$_}) {
        seek(FILE2, $rs{$_}, 0);
        my $line = <FILE2>;
        print FD $line;
    }
}
close FILE1;
close FILE2;
close FD;
With the data I posted, I get the following output:
$VAR1 = [
'rs34569384',
'rs123456',
'rs234567',
'rs753444'
];
$VAR2 = {
'rs123456' => 43,
'rs234567' => 15
};
As you can see, @output (== $VAR1) contains a lot of rs-numbers. %rs (== $VAR2) contains some of the same rs-numbers and the corresponding file positions. You should see the same if you use my data.
Now try it with your data. What is different? Are there rs-numbers that are in both @output and %rs? Do the file position numbers look correct?
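If the Dumper output is too long to compare by eye, something like the following might help answer both questions. It is only a sketch: compare_ids and bad_positions are helper names I made up, and the in-memory string is an invented stand-in for chr22.txt, not your real data. The index is built the same way as in the program above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: which ids have an index entry, and which do not?
sub compare_ids {
    my ($ids, $index) = @_;
    my @found   = grep {  exists $index->{$_} } @$ids;
    my @missing = grep { !exists $index->{$_} } @$ids;
    return (\@found, \@missing);
}

# Hypothetical helper: seek to each stored offset and return the keys
# whose line does not start with the key itself.
sub bad_positions {
    my ($fh, $index) = @_;
    my @bad;
    for my $key (sort keys %$index) {
        seek($fh, $index->{$key}, 0) or die "seek failed: $!";
        my $line = <$fh>;
        push @bad, $key unless defined $line && $line =~ /^\Q$key\E\b/;
    }
    return @bad;
}

# Made-up stand-in for chr22.txt, read through an in-memory filehandle.
my $data = "header line\nrs123456 A G\nrs234567 C T\n";
open(my $fh, '<', \$data) or die "can't open in-memory data: $!";

# Build the index the same way the program above does.
my %rs;
my $position = tell($fh);
while (my $line = <$fh>) {
    my ($key) = $line =~ /^(rs\d{5,})\b/;
    $rs{$key} = $position if defined $key;
    $position = tell($fh);
}

# The rs-numbers from the Dumper output above.
my @output = qw(rs34569384 rs123456 rs234567 rs753444);
my ($found, $missing) = compare_ids(\@output, \%rs);
my @bad = bad_positions($fh, \%rs);

print "in both \@output and %rs: @$found\n";
print "only in \@output: @$missing\n";
print "keys with wrong positions: ", (@bad ? "@bad" : "none"), "\n";
close $fh;
```

To use it on your own files, open chr22.txt instead of the in-memory string and fill @output from snp.txt as before. If the first list comes out empty, the rs-numbers in the two files never match; if the last list is not empty, the stored file positions do not point at the start of the right lines.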