Waterdrake has asked for the wisdom of the Perl Monks concerning the following question:

Salutations, O Great and Wise Monks. As a total apprentice, I seek your wisdom and knowledge on the following problem: I have two files (xname1 and xname2). The information in xname1 is structured like this:

FBgn0041711 yellow-e FBgn0038150
FBgn0041711 yellow-e FBgn0039896
FBgn0041711 yellow-e FBgn0032601
FBgn0041711 yellow-e FBgn0038151
FBgn0041711 yellow-e FBgn0041713
FBgn0041711 yellow-e FBgn0041710
FBgn0041711 yellow-e FBgn0041712
FBgn0041711 yellow-e FBgn0034856
FBgn0041711 yellow-e FBgn0038105
FBgn0041711 yellow-e FBgn0041709
FBgn0041711 yellow-e FBgn0035328
FBgn0041711 yellow-e FBgn0004034
FBgn0032283 CG7296
FBgn0042110 CG18765 FBgn0039328
FBgn0042110 CG18765 FBgn0039325

The information in the other file, xname2, is structured in the same way:

FBgn0041711|FBtr0082757 F dme-miR-iab-4-5p MIMAT0000412 25 seven_m8
FBgn0041711|FBtr0082757 F dme-miR-2280-3p MIMAT0011787 51 seven_A1
FBgn0041711|FBtr0082757 F dme-miR-955-3p MIMAT0020847 69 eightmer
FBgn0041711|FBtr0082757 F dme-miR-4957-3p MIMAT0020176 43 seven_A1
FBgn0039896|FBtr0089115 F dme-miR-2280-3p MIMAT0011787 9 seven_A1
FBgn0039896|FBtr0089115 F dme-miR-4943-5p MIMAT0020151 12 seven_m8
FBgn0041711|FBtr0082757 F dme-miR-2280-3p MIMAT0011787 51 seven_A1 *
FBgn0039328|FBtr0084849 F dme-miR-977-3p MIMAT0005493 92 seven_A1
FBgn0039328|FBtr0084849 F dme-miR-967-5p MIMAT0005482 92 seven_A1
FBgn0039328|FBtr0084849 F dme-miR-4943-5p MIMAT0020151 143 seven_m8
FBgn0039328|FBtr0084849 F dme-miR-4967-5p MIMAT0020192 99 seven_m8
FBgn0039290|FBtr0084805 F dme-miR-2501-5p MIMAT0012216 583 eightmer

What I am trying to achieve is the following. I need a loop that goes through every line of the first and second files (as arrays), and whenever an FBgn from a line of the first file (either the first or the second FBgn on that line) matches an FBgn in the second file, prints the matching lines from the second file below the line where the match was found. Either the whole line or only the dme-miR name would do, but preferably the dme-miR only. It should look something like this:

FBgn0041711 yellow-e FBgn0038150
FBgn0041711|FBtr0082757 F dme-miR-iab-4-5p MIMAT0000412 25 seven_m8
FBgn0041711|FBtr0082757 F dme-miR-2280-3p MIMAT0011787 51 seven_A1
FBgn0041711|FBtr0082757 F dme-miR-955-3p MIMAT0020847 69 eightmer
FBgn0041711|FBtr0082757 F dme-miR-4957-3p MIMAT0020176 43 seven_A1
FBgn0041711|FBtr0082757 F dme-miR-2280-3p MIMAT0011787 51 seven_A1 *
FBgn0041711 yellow-e FBgn0039896
FBgn0039896|FBtr0089115 F dme-miR-2280-3p MIMAT0011787 9 seven_A1
FBgn0042110 CG18765 FBgn0039328
FBgn0039328|FBtr0084849 F dme-miR-967-5p MIMAT0005482 92 seven_A1
FBgn0039328|FBtr0084849 F dme-miR-4943-5p MIMAT0020151 143 seven_m8
dme-miR-4943-5p

The last line shows the other variant, with the dme-miR only. I'm really sorry if I didn't make this clear, and for my noobish question. All help will be greatly appreciated! Thanks again and have a nice day.

Replies are listed 'Best First'.
Re: Duplicates in arrays and add values
by poj (Abbot) on Mar 09, 2016 at 16:01 UTC

    Try

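    #!perl
    use strict;
    use warnings;

    # Index every line of xname2 by the FBgn ID that precedes the '|'.
    my $file2 = 'xname2.txt';
    open my $in2, '<', $file2 or die "Could not open $file2 : $!";
    my %file2 = ();
    while (<$in2>) {
        next unless /\S/;              # skip blank lines
        my ($FBgn) = split /\|/;
        push @{ $file2{$FBgn} }, $_;
    }
    close $in2;

    # Print each xname1 line that contains a known FBgn,
    # followed by the xname2 lines stored for that FBgn.
    my $file1 = 'xname1.txt';
    open my $in1, '<', $file1 or die "Could not open $file1 : $!";
    while (my $line = <$in1>) {
        next unless $line =~ /\S/;
        for (split /\s+/, $line) {
            next unless /^FBgn/;
            if (exists $file2{$_}) {
                print join '', $line, @{ $file2{$_} };
                delete $file2{$_};     # report each FBgn's matches only once
            }
        }
    }
    close $in1;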
    poj
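
The script above prints the whole matching lines from xname2. For the preferred variant from the question (the dme-miR name only), the same idea could be sketched roughly as below. This is only an illustration under assumptions: the helper names `index_mirnas` and `annotate` are made up for this example, the miRNA name is assumed to always be the third whitespace-separated column of an xname2 record, and the inline records stand in for lines read (and chomped) from the real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map each FBgn ID (the part before '|') to the list
# of miRNA names found in the third column of its xname2 records.
sub index_mirnas {
    my (@records) = @_;
    my %mirnas_for;
    for my $record (@records) {
        my ($id_pair, undef, $mirna) = split ' ', $record;
        next unless defined $mirna;          # skips blank or short records
        my ($fbgn) = split /\|/, $id_pair;
        push @{ $mirnas_for{$fbgn} }, $mirna;
    }
    return \%mirnas_for;
}

# Hypothetical helper: return each xname1 line that mentions an indexed
# FBgn, followed by that FBgn's miRNA names. Like the script above, each
# FBgn is reported only once, at the first line where it appears.
sub annotate {
    my ($mirnas_for, @lines) = @_;
    my %pending = %$mirnas_for;              # shallow copy; caller's index is kept
    my @out;
    for my $line (@lines) {
        for my $fbgn (grep { /^FBgn/ } split ' ', $line) {
            my $names = delete $pending{$fbgn} or next;
            push @out, $line, @$names;
        }
    }
    return @out;
}

# Sample records taken from the question, already chomped.
my @xname2 = (
    'FBgn0041711|FBtr0082757 F dme-miR-iab-4-5p MIMAT0000412 25 seven_m8',
    'FBgn0041711|FBtr0082757 F dme-miR-2280-3p MIMAT0011787 51 seven_A1',
    'FBgn0039896|FBtr0089115 F dme-miR-2280-3p MIMAT0011787 9 seven_A1',
);
my @xname1 = (
    'FBgn0041711 yellow-e FBgn0038150',
    'FBgn0041711 yellow-e FBgn0039896',
    'FBgn0042110 CG18765 FBgn0039325',
);

print "$_\n" for annotate(index_mirnas(@xname2), @xname1);
```

As with the original script, a line of xname1 whose two FBgn IDs both have matches would be printed twice, once per ID. Dropping the `delete` would instead repeat the miRNA names under every line that mentions the ID.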

      Thank you, both of you. I really appreciate what you both did. Poj, man, if you are in Manchester, give me your address and I'll deliver you a couple of beers or whatever you want. Thank you again, masters. Have a nice evening.

Re: Duplicates in arrays and add values
by 1nickt (Canon) on Mar 09, 2016 at 15:34 UTC

    Hi Waterdrake, you'll get better help if you show what you've tried, and how it fails for you.

    Also, if you search this site using the Super Search, you'll find that this is a common problem that has been asked about before, and you'll probably find your answers.

    Post back here when you get stuck, after you've tried some things. Good luck!

    The way forward always starts with a minimal test.