I thought I'd provide another solution. It uses a hash, %seen, as a lookup table built from 'gene_list.txt', and then prints every gene in 'HFR_genes.txt' that is also in 'gene_list.txt'.
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;

# Build a lookup of every line (gene) in the gene list.
my $gene_list = 'gene_list.txt';
open GL, "<", $gene_list or die "Unable to open $gene_list because $!";
my %seen;
while (my $gene = <GL>) {
    $seen{$gene}++;
}
close GL or die "Unable to close $gene_list because $!";

# Print each HFR gene that also appears in the gene list.
my $hfr = 'HFR_genes.txt';
open HFR, "<", $hfr or die "Unable to open $hfr because $!";
while (my $gene = <HFR>) {
    # Print $gene explicitly: the loop variable is $gene, so $_ is not set here.
    print $gene if $seen{$gene};
}
close HFR or die "Unable to close $hfr because $!";
Chris
Update: If neither file contains duplicate lines, the code could be shortened to:
my %seen;
print grep $seen{$_}++, <>;
With the command line supplying the 2 files:
perl your_program.pl gene_list.txt HFR_genes.txt
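To see why the short version works, here is a minimal sketch using in-memory lists in place of the two files. The sample gene names and the helper common_lines() are made up for illustration. The post-increment in $seen{$_}++ returns the old count, so grep rejects a line the first time it is seen and keeps it the second time. That only means "present in both files" under the assumption stated above, that no line is repeated within a single file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for gene_list.txt and HFR_genes.txt.
my @gene_list = ("geneA\n", "geneB\n", "geneC\n");
my @hfr_genes = ("geneB\n", "geneD\n", "geneC\n");

# Return every line that has already been seen earlier in the input.
# $seen{$_}++ evaluates to the count *before* the increment:
# 0 (false) on the first sighting, 1 or more (true) afterwards.
sub common_lines {
    my %seen;
    return grep { $seen{$_}++ } @_;
}

# Same order of inputs as on the command line: gene list first, HFR genes second.
print common_lines(@gene_list, @hfr_genes);
```

Note that the comparison is on whole lines, newline included, so a final line without a trailing newline in one file would not match the same gene followed by a newline in the other.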