pabla23 has asked for the wisdom of the Perl Monks concerning the following question:

Good morning guys, I have this problem: I have two different scripts. The first one is the following:

use strict;
use warnings;

my ($mio, $mio2);
my @array_with_all_fields=();
my $match = 'DOID:2055';

unlink ("myoutputfilename6.txt");
unlink ("myoutputfilename7.txt");

open(my $file, '<', 'do_human_mapping.gmt') or die "open: $!";
open my $out_file3, '>', 'myoutputfilename6.txt' or die "$!";
open my $out_file4, '>', 'myoutputfilename7.txt' or die "$!";

while (<$file>){
    my ($name,$id,@genes) = split /\t/;
    if ($id eq $match) {
        $mio= join("\n",@genes);
        print $mio."\n";
        print $out_file3 $mio."\n";   # print on file
    }
    if (grep/^$match$/, @genes){
        $mio2=$id;
        print $mio2."\n";
        print $out_file4 $mio2."\n";   # print on file
    }
}

The output of this script is:

APOE

FKBP5

CRH

IL2

The second script is the following:

use strict;
use warnings;

sub trim { my $s=shift; $s=~ s/^\s+$//g; return $s };

my $mio2;
my $match = 'APOE';

unlink ("myoutputfilename4.txt");

open(my $file, '<', 'do_human_mapping.gmt') or die "open: $!";
while (<$file>){
    my ($name,$id,@genes) = split /\t/;
    if (grep/^$match$/, @genes){
        $mio2=$id;
        #print $mio2."\n";
        open my $out_file, '>>', 'myoutputfilename4.txt' or die "$!";
        print $out_file $mio2."\n";   # print on file
    }
}

open (FILE2, 'HumanDO.obo');
my %hash_value = ();
my $Key='';
my $Val='';
while (my $line = <FILE2>) {
    if ($line=~/^id:\s/) {
        my @splitted_string=split(' ',$line);
        $Key=$splitted_string[1];
    }
    if ($line=~/^name:\s/) {
        my @splitted_string=split(':',$line);
        $Val=$splitted_string[1];
    }
    if ( defined $Key and defined $Val){
        $hash_value{trim($Key)}=$Val;
        $Key=undef;
        $Val=undef;
    }
}
close FILE2;

open(FILE,'myoutputfilename4.txt');
while ( my $line =<FILE>){
    my @splitted_string=split(' ',$line);
    foreach my $key (@splitted_string){
        if (exists $hash_value{$key}){
            print "$key name:$hash_value{trim($key)}";
        }
    }
}
close FILE;

The output of this script, for "APOE" ONLY, is:

DOID:2055

DOID:3453

DOID:4532

I want the second script to take the list produced by the first script as input and print, for EACH name (APOE, FKBP5, CRH, ...), the associated "DOID"s. Every name has more than one "DOID".

Thanks for your help, Paola

Re: list in input
by roboticus (Chancellor) on Dec 11, 2014 at 12:49 UTC

    pabla23:

    You can merge your programs fairly easily, it seems. Turn them both into subroutines by wrapping the "interesting" bits:

    # Array declarations, use modules and initialization stuff for
    # both scripts go here

    sub get_list_of_foos {
        my @out_list;

        # the interesting part of your first script

        # in your loop {
            # Replace your print statement with something like:
            push @out_list, $mio2;
        # }

        return @out_list;
    }

    sub report_on_all_foos {
        # get the list of things we want to report
        my @input_list = @_;

        # Now we'll wrap your second script in a for loop so you
        # can do it for each match in the list previously
        # generated
        for my $match (@input_list) {
            # the interesting part of your second script goes here
        }
    }

    # Now, to put it together:
    report_on_all_foos( get_list_of_foos() );

    Overall, it's usually not terribly difficult to merge scripts like this. There are a few details you'll likely have to clean up, such as conflicting variable names, but that's the basic approach. If either script is slow, then the resulting program is likely to be slow, too. If that happens, you may have to look over your script(s) for operations that are repeated often (such as scanning through the files) and put the interesting information into arrays or hashes so you can reuse the results.
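    As a rough illustration of that last point, here is a minimal sketch (not the original poster's code) that reads the mapping data once and keeps two hashes, so each gene's DOIDs can be looked up without rescanning the file. The sub name build_indexes and the in-memory sample are assumptions made for the example: the term names are placeholders, while the IDs and genes are the ones quoted in this thread. With the real data you would open 'do_human_mapping.gmt' instead of the sample string.

```perl
use strict;
use warnings;

# Read GMT-style lines (name <TAB> id <TAB> gene1 <TAB> gene2 ...) once and
# return two lookups: id => [genes] and gene => [ids].
sub build_indexes {
    my ($fh) = @_;
    my ( %genes_for, %doids_for );
    while ( my $line = <$fh> ) {
        chomp $line;    # otherwise the last gene on each line keeps its "\n"
        next unless length $line;
        my ( $name, $id, @genes ) = split /\t/, $line;
        $genes_for{$id} = [@genes];
        push @{ $doids_for{$_} }, $id for @genes;
    }
    return ( \%genes_for, \%doids_for );
}

# Hypothetical sample standing in for do_human_mapping.gmt. The term names
# are placeholders; the IDs and genes are the ones quoted in the thread.
my $sample = join '',
    "term_a\tDOID:2055\tAPOE\tFKBP5\tCRH\tIL2\n",
    "term_b\tDOID:3453\tAPOE\n",
    "term_c\tDOID:4532\tAPOE\n";

open my $fh, '<', \$sample or die "open: $!";
my ( $genes_for, $doids_for ) = build_indexes($fh);
close $fh;

# Step 1: genes of the starting term. Step 2: every DOID for each gene.
my $match = 'DOID:2055';
my @report;
for my $gene ( @{ $genes_for->{$match} || [] } ) {
    my @ids = @{ $doids_for->{$gene} || [] };
    push @report, "$gene: @ids";
}
print "$_\n" for @report;
```

    The chomp matters here: the original scripts get away without it because the regex anchor $ tolerates a trailing newline, but a hash key does not.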

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: list in input
by RonW (Parson) on Dec 11, 2014 at 23:51 UTC

    In the first script, you're opening your output files with:

    open my $out_file3, '>', 'myoutputfilename6.txt' or die "$!";
    open my $out_file4, '>', 'myoutputfilename7.txt' or die "$!";

    In your second script, you open your input file with:

    open(FILE,'myoutputfilename4.txt');

    The input file name in the second script is not the same as either of the output file names in the first script.

    Are you sure the second script is reading what you think it is?

    If you do have the file names correct, then the next step would be to add a

    print STDERR "DEBUG: $line\n";

    as the first statement in the final while loop so you can see what is being read.
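    A small sketch of what that debugging step could look like, under a few assumptions: the helper name debug_line is made up for the example, and an in-memory string stands in for the real input file so the snippet runs on its own. Wrapping the line in brackets makes stray whitespace visible, and $. gives the current input line number.

```perl
use strict;
use warnings;

# Format one input line for debugging. The brackets make trailing
# whitespace visible; chomp removes the newline before printing.
sub debug_line {
    my ( $line_no, $line ) = @_;
    chomp $line;
    return "DEBUG line $line_no: [$line]";
}

# Hypothetical stand-in for the file being read; note the trailing
# space on the second line, which the brackets will expose.
my $data = "DOID:2055\nDOID:3453 \n";

open my $in, '<', \$data or die "open: $!";
my @seen;
while ( my $line = <$in> ) {
    my $msg = debug_line( $., $line );
    print STDERR "$msg\n";
    push @seen, $msg;
    # ... the rest of the loop body would go here
}
close $in;
```

    With a real file, pass the file name to open instead of the string reference, and keep the "or die" so that a wrong file name is reported instead of silently producing an empty loop.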