in reply to If Statements and Regular Expressions
    use strict;
    use warnings;

    open INPUT, "<MOUSE_TF1.txt" or die "can not open MOUSE_TF1.txt: $!";
    open OUTPUT, ">HOX_GENE_TF.txt" or die "can not open HOX_GENE_TF.txt: $!";

    print OUTPUT "MOUSE TRANSCRIPTION FACTORS IN THE HOX GENE FAMILY\n\n\n";
    print OUTPUT "ENSEMBL GENE ID \tSYMBOL \tCHR\tSTART\t\tEND\t\tSTRAND\n";

    while (<INPUT>) {
        chomp;
        my ($id, $cname, $start, $end, $strand, $sym) = split /\t/;

        # select Gene Symbols belonging to "Hox" family and print
        # (variable names are case-sensitive: $sym, not $Sym)
        if ($sym =~ /^Hox/) {
            print OUTPUT join("\t", $id, $sym, $cname, $start, $end, $strand), "\n";
        }
    }

    close INPUT;
    close OUTPUT;
If you are willing to preserve the column order from the input, this simplifies even further to:
    use strict;
    use warnings;

    open INPUT, "<MOUSE_TF1.txt" or die "can not open MOUSE_TF1.txt: $!";
    open OUTPUT, ">HOX_GENE_TF.txt" or die "can not open HOX_GENE_TF.txt: $!";

    print OUTPUT "MOUSE TRANSCRIPTION FACTORS IN THE HOX GENE FAMILY\n\n\n";
    print OUTPUT "ENSEMBL GENE ID \tCHR\tSTART\t\tEND\t\tSTRAND\tSYMBOL\n";

    while (<INPUT>) {
        chomp;
        my @items = split /\t/;

        # select Gene Symbols belonging to "Hox" family and print
        # the gene symbol is the sixth column, i.e. index 5
        if ($items[5] =~ /^Hox/) {
            print OUTPUT "$_\n";
        }
    }

    close INPUT;
    close OUTPUT;
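If you want to check the match itself without touching any files, here is a minimal sketch of the same test pulled out into a sub. The name hox_lines and the sample rows are made up for illustration, and I am assuming the symbol is always the sixth tab-separated column, as in the code above. Rows with fewer than six columns are skipped instead of triggering an "uninitialized value" warning.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns the
# tab-separated lines whose sixth column (the gene symbol) starts with "Hox".
sub hox_lines {
    my @lines = @_;    # work on a copy so the caller's data is untouched
    my @hits;
    for my $line (@lines) {
        chomp $line;
        my @items = split /\t/, $line;

        # skip short or malformed rows rather than match against undef
        next unless defined $items[5];

        # ^ anchors the match to the start of the symbol
        push @hits, $line if $items[5] =~ /^Hox/;
    }
    return @hits;
}

# Made-up sample rows, for illustration only
my @sample = (
    "ID_A\t6\t100\t200\t1\tHoxa1\n",
    "ID_B\t7\t300\t400\t-1\tPhox2a\n",
    "ID_C\t11\t500\t600\t1\tHoxb4\n",
);

print "$_\n" for hox_lines(@sample);
```

Only the Hoxa1 and Hoxb4 rows are printed. The Phox2a row is rejected because of the ^ anchor; without it, /Hox/ still would not match there, since the match is case-sensitive, but a symbol such as "xHoxa1" would slip through.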