Re: Searching each word of a file

Again, it is a matter of building a map from one file, then looking each word up in that map while parsing the second file. The sample below opens two strings as in-memory files in place of the real files. Consider:

use strict;
use warnings;

my $rawData = <<'RAW';
chr1q21    na    S100A3    S100A6    HRNR    DRD5P2    EFNA1
HSA04910_INSULIN_SIGNALING_PATHWAY    na    XRCC5 HRAS    
V$YY1_02    na    B3GALT6    DZIP1    RAB1B    SART3    FLJ20309 
MORF_EIF3S2    na    HCCS    XRCC3    LDHB    LDHA    OXA1L    RPL14 
module_486    na    CYP3A7    C14orf179    JAG2    INTS1    RBM6
CATABOLIC_PROCESS    na    PGD    HNRPD    USE1    RNF217    RNASEH1
RAW

my $mapData = <<'MAP';
XRCC5    SNP_A-1966881    1
EFNA1    SNP_A-1877994    9
HRNR    SNP_A-1919060    2
XRCC5    SNP_A-1966884    1
XRCC5    SNP_A-1966882    1
HRNR    SNP_A-1829030    1
MAP

my %geneMap;

open my $mapIn, '<', \$mapData or die "Failed to open map data: $!";
while (<$mapIn>) {
    chomp;
    my ($gene, @data) = split;
    
    next unless @data >= 2; # Skip if unexpected data format
    $geneMap{$gene}{$data[0]} = $data[1];
}
close $mapIn;

open my $rawIn, '<', \$rawData or die "Failed to open raw data: $!";
while (<$rawIn>) {
    chomp;
    my ($geneset, $ignore, @genes) = split;
    
    next unless @genes; # Skip empty or badly formed line
    
    print "$geneset\n";
    for my $gene (@genes) {
        next unless exists $geneMap{$gene};
        print "\t$gene\t$_\t$geneMap{$gene}{$_}\n"
            for sort keys %{$geneMap{$gene}};
    }
}
close $rawIn;

Prints:

chr1q21
    HRNR    SNP_A-1829030    1
    HRNR    SNP_A-1919060    2
    EFNA1    SNP_A-1877994    9
HSA04910_INSULIN_SIGNALING_PATHWAY
    XRCC5    SNP_A-1966881    1
    XRCC5    SNP_A-1966882    1
    XRCC5    SNP_A-1966884    1
V$YY1_02
MORF_EIF3S2
module_486
CATABOLIC_PROCESS
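
To run the same lookup against real files rather than in-memory strings, something like the following might do. It is only a sketch: the sub names build_gene_map and report_lines, and the idea of passing the two file names on the command line, are my assumptions and not part of the original problem.

```perl
use strict;
use warnings;

# Build gene => { snp => value } from an already opened filehandle.
# (build_gene_map is a hypothetical helper name.)
sub build_gene_map {
    my ($fh) = @_;
    my %geneMap;
    while (my $line = <$fh>) {
        chomp $line;
        my ($gene, $snp, $value) = split ' ', $line;
        next unless defined $value;    # Skip blank or short lines
        $geneMap{$gene}{$snp} = $value;
    }
    return \%geneMap;
}

# Return the report lines for one gene set line of the raw file.
# (report_lines is a hypothetical helper name.)
sub report_lines {
    my ($geneMap, $line) = @_;
    my ($geneset, $ignore, @genes) = split ' ', $line;
    return unless @genes;              # Skip empty or badly formed line
    my @out = ("$geneset\n");
    for my $gene (@genes) {
        next unless exists $geneMap->{$gene};
        push @out, "\t$gene\t$_\t$geneMap->{$gene}{$_}\n"
            for sort keys %{ $geneMap->{$gene} };
    }
    return @out;
}

# Assumed usage: perl lookup.pl <map file> <gene set file>
if (@ARGV == 2) {
    my ($mapFile, $rawFile) = @ARGV;

    open my $mapIn, '<', $mapFile or die "Failed to open $mapFile: $!";
    my $geneMap = build_gene_map($mapIn);
    close $mapIn;

    open my $rawIn, '<', $rawFile or die "Failed to open $rawFile: $!";
    while (my $line = <$rawIn>) {
        chomp $line;
        print report_lines($geneMap, $line);
    }
    close $rawIn;
}
```

Only the map file is held in memory; the gene set file is read one line at a time, so its size should not matter much.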

Perl is environmentally friendly - it saves trees

Re^2: Searching each word of a file by biomonk (Acolyte) on Jul 14, 2008 at 17:30 UTC
Thanks a lot, you really made my day. It's awesome code. I'm very grateful to you.