Deal all, I need to find match between two tab delimited files files like this:
File 1:ID1 1 65383896 65383896 G C PCNXL3 ID1 2 56788990 55678900 T A ACT1 ID1 1 56788990 55678900 T A PRO55 File 2 ID2 34 65383896 65383896 G C MET5 ID2 2 56788990 55678900 T A ACT1 ID2 2 56788990 55678900 T A HLA
what I would like to do is to retrive the matching line between the two file. What I would like to match is everyting after the gene ID So far I have written this code but unfortunately perl keeps giving me the error: use of "Use of uninitialized value in pattern match (m//)" Could you please help me figure out where i am doing it wrong? Thank you in advance!
#!/usr/bin/perl -w use strict; open (INA, $ARGV[0]) || die "cannot to open gene file"; open (INB, $ARGV[1]) || die "cannot to open coding_annotated.var files +"; my @sample1 = <INA>; my @sample2 = <INB>; foreach my $line (@sample1) { my @tab = split (/\t/, $line); my $chr = $tab[1]; my $start = $tab[2]; #my $end = $tab[3]; my $ref = $tab[4]; my $alt = $tab[5]; my $name = $tab[6]; foreach my $item (@sample2){ my @fields = split (/\t/,$item); if ($fields[1]=~ m/$chr(.*)/ && $fields[2]=~ m/$start(.*)/ && +$fields[4]=~ m/$ref(.*)/ && $fields[5]=~ m/$alt(.*)/&& $fields[6]=~ m +/$name(.*)/){ print $line,"\n",$item; } } }
In reply to Regular expression help by ananassa
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |