pabla23 has asked for the wisdom of the Perl Monks concerning the following question:

Good morning!!! I have this code:

use strict;
use warnings;

my $mio;
my $filename = '/Users/Pabli/Desktop/do_human_mapping.gmt';
my $match    = 'DOID:2055';
unlink("myoutputfilename3.txt");

open(my $file, '<', $filename) or die "open: $!";
while (<$file>) {
    my ($name, $id, @genes) = split /\t/;
    if ($id eq $match) {
        $mio = join("\n", @genes);
        print $mio . "\n";
        open my $out_file, '>>', 'myoutputfilename3.txt' or die "$!";
        print $out_file $mio . "\n";    # print to the file
    }
}

And in another file this:
use strict;
use warnings;

my $mio2;
my $filename = '/Users/Pabli/Desktop/do_human_mapping.gmt';
my $match    = 'APOE';
unlink("myoutputfilename4.txt");

open(my $file, '<', $filename) or die "open: $!";
while (<$file>) {
    my ($name, $id, @genes) = split /\t/;
    if (grep /^$match$/, @genes) {
        $mio2 = $id;
        print $mio2 . "\n";
        open my $out_file, '>>', 'myoutputfilename4.txt' or die "$!";
        print $out_file $mio2 . "\n";    # print to the file
    }
}

I would like to unify these two scripts. The output of the first script is a list in this format:

APOE

FKBP5

CRH

IL2

In fact, in the second script I hard-coded "APOE".

Can someone help me automate the whole thing?

Thanks a lot, Paola

Replies are listed 'Best First'.
Re: Unify two files
by Laurent_R (Canon) on Nov 10, 2014 at 11:27 UTC
    This seems to be more or less what you are looking for:
    use strict;
    use warnings;

    my ($mio, $mio2);
    my $filename = '/Users/Pabli/Desktop/do_human_mapping.gmt';
    my $match    = 'DOID:2055';
    unlink("myoutputfilename3.txt");
    unlink("myoutputfilename4.txt");

    open(my $file, '<', $filename) or die "open: $!";
    open my $out_file3, '>', 'myoutputfilename3.txt' or die "$!";
    open my $out_file4, '>', 'myoutputfilename4.txt' or die "$!";
    while (<$file>) {
        my ($name, $id, @genes) = split /\t/;
        if ($id eq $match) {
            $mio = join("\n", @genes);
            print $mio . "\n";
            print $out_file3 $mio . "\n";    # print to the file
        }
        if (grep /^$match$/, @genes) {
            $mio2 = $id;
            print $mio2 . "\n";
            print $out_file4 $mio2 . "\n";   # print to the file
        }
    }

    I have taken the liberty of moving the opening of the output files out of the while loop, because it seems inefficient to open the file each time you get a match, but it really depends on your data (how often it matches).
      OK, thanks! Now I have this output:

      APOE

      APOE

      FKBP5

      CRH

      With this list I have to go through the same file again and, for each element, find the different IDs associated with it. The file has this format:

      DOID:00001 APOE IL4 RTG5

      DOID:00002 FG6 CRH APOE

      DOID:00003 RTG5 HUTN CRH

      My output should be:

      APOE DOID:00001 DOID:00002

      CRH DOID:00002 DOID:00003

      Thanks for your help! Paola
        I'm guessing you are looking for the genes for a certain ID, and then looking for all the IDs that have those genes. If so, try this:
        #!perl
        use strict;
        use warnings;

        my $match    = 'DOID:2055';
        my $filename = 'do_human_mapping.gmt';
        open(my $fh, '<', $filename) or die "open: $!";

        my @genes   = ();
        my %gene2id = ();
        while (<$fh>) {
            my ($name, $id, @temp) = split /\s+/;
            if ($id eq $match) {
                @genes = @temp;
            }
            else {
                for my $gene (@temp) {
                    push @{ $gene2id{$gene} }, $id;
                }
            }
        }
        for my $gene (@genes) {
            if (exists $gene2id{$gene}) {
                print join ' ', $gene, @{ $gene2id{$gene} }, "\n";
            }
        }
        __DATA__
        DOID:00001 APOE IL4 RTG5
        DOID:00002 FG6 CRH APOE
        DOID:00003 RTG5 HUTN CRH
        DOID:2055 APOE FKBP5 CRH
        poj
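
The gene-to-ID lookup discussed above can also be sketched against in-memory lines, which makes it easy to try without the real file. This is only a sketch under stated assumptions: the helper names build_gene_index and genes_for_id are hypothetical, the sample lines are invented, and each line is assumed to be tab-separated as name, ID, then genes (the layout the original split /\t/ implies).

```perl
use strict;
use warnings;

# Hypothetical helper: build a gene => [ids] index.
# Assumes tab-separated fields: name, id, then the gene symbols.
sub build_gene_index {
    my @lines = @_;
    my %gene2id;
    for my $line (@lines) {
        chomp(my $copy = $line);    # strip the newline so the last gene is clean
        my ($name, $id, @genes) = split /\t/, $copy;
        next unless defined $id;    # skip blank or malformed lines
        push @{ $gene2id{$_} }, $id for @genes;
    }
    return \%gene2id;
}

# Hypothetical helper: return the genes listed for one id (empty list if absent).
sub genes_for_id {
    my ($match, @lines) = @_;
    for my $line (@lines) {
        chomp(my $copy = $line);
        my ($name, $id, @genes) = split /\t/, $copy;
        return @genes if defined $id && $id eq $match;
    }
    return;
}

# Sample data invented for illustration; the set names are placeholders.
my @lines = (
    "set1\tDOID:00001\tAPOE\tIL4\tRTG5\n",
    "set2\tDOID:00002\tFG6\tCRH\tAPOE\n",
    "set3\tDOID:00003\tRTG5\tHUTN\tCRH\n",
    "set4\tDOID:2055\tAPOE\tFKBP5\tCRH\n",
);

my $index = build_gene_index(@lines);
for my $gene (genes_for_id('DOID:2055', @lines)) {
    # List every other id that shares this gene; skip genes found nowhere else.
    my @others = grep { $_ ne 'DOID:2055' } @{ $index->{$gene} || [] };
    next unless @others;
    print join(' ', $gene, @others), "\n";
}
```

With the sample lines this prints "APOE DOID:00001 DOID:00002" and "CRH DOID:00002 DOID:00003"; FKBP5 is skipped because no other ID lists it. Indexing every line first and filtering afterwards means the result does not depend on where the matching ID sits in the file.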
Re: Unify two files
by blindluke (Hermit) on Nov 10, 2014 at 09:52 UTC

    What exactly are you trying to accomplish? You write about "unifying two files", which could mean replacing those two scripts with one, but you also write about your output. Are you trying to write a third script, that "unifies" the output of those two scripts into a combined list (without duplicates, for example)?

    If possible, try looking at your problem in terms of input and output. What is your input? What exactly do you want to produce? It's very difficult to guess what you mean by "unify".

    - Luke