in reply to rename duplicate data

Or a variation of the above solution:

use Modern::Perl;

my %hash;

do {
    chomp;
    $_ = qq|$_"$hash{$1}"| if /(ID=.+)$/ and ++$hash{$1};
    say;
} for <DATA>;

__DATA__
xxxx . yyy 521 916 . + . ID=OSCAR1028v1rpkm9.67
xxxx . xxx 521 567 . + . Parent=OSCAR1028v1rpkm9.67
xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67
xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67
xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67
xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67

Results:

xxxx . yyy 521 916 . + . ID=OSCAR1028v1rpkm9.67"1"
xxxx . xxx 521 567 . + . Parent=OSCAR1028v1rpkm9.67
xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67"1"
xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67
xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67"2"
xxxx . xxx 129 130 . - . Parent=OSCAR1028v1rpkm9.67

Re^2: rename duplicate data
by blacknight (Initiate) on Jun 22, 2012 at 12:46 UTC

    This is what I want, but without the double quotes surrounding the number.

    As I said in my last post, I have very little programming experience. I wrote this script to use your code, but I cannot get it to work correctly. Where is my mistake?

    My data are in a text file, so I wrote this script to read them.

    <code>
    #!/usr/bin/perl;
    use strict;
    use warnings;
    use Modern::Perl;

    my $filename = $ARGV[0];
    my $debug    = $ARGV[1];
    die "\n\tUSAGE: perl $0 exonerate output debug\n\n" unless $ARGV[0];
    die "\n\tERROR: Cannot find the file $ARGV[0]\n\n" unless -e $ARGV[0];
    open(IN,$filename);
    my $ids;
    my %hash;
    do {
        chomp;
        $_ = qq|$_"$hash{$1}"| if /(ID=.+)$/ and ++$hash{$1};
        say;
    } for $filename;
    </code>

    But I get the error "Can't locate Modern::Perl ..."

    I suppose that I don't have this module. Do you have a suggestion to resolve it? Alternatively, I would like to use the code by monsoon:

    <code>
    while(<>){
        chomp;
        if(/(ID=.+)$/){
            if(++$ids{$1} > 1){
                say $_, $ids{$1};
                next;
            }
        }
        say;
    }
    </code>

    Is it right to insert it into my script in this way?

    <code>
    #!/usr/bin/perl;
    use strict;
    use warnings;

    my $filename = $ARGV[0];
    my $debug    = $ARGV[1];
    die "\n\tUSAGE: perl $0 exonerate output debug\n\n" unless $ARGV[0];
    die "\n\tERROR: Cannot find the file $ARGV[0]\n\n" unless -e $ARGV[0];
    open(IN,$filename);
    my $ids;
    while($filename){
        chomp;
        if(/(ID=.+)$/){
            if(++$ids{$1} > 1){
                say $_, $ids{$1};
                next;
            }
        }
        say;
    }
    print say;
    </code>
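A minimal sketch of one way to read the lines from the file itself, offered as an illustration and not as the thread's accepted answer. Both scripts above loop over (or test) the string in `$filename`, so no line of the file is ever read; the loop has to run over a filehandle instead. `say` is available through `use feature 'say'` on Perl 5.10 or later, so Modern::Perl is not required for it, and the counter must be a hash (`%ids`), not a scalar. The helper name `rename_ids_in_file` is made up for this example, and the choice to append the bare count with no quotes is an assumption based on the request above.

```perl
use strict;
use warnings;
use feature 'say';    # core feature since Perl 5.10; no Modern::Perl needed

# Hypothetical helper: reads $filename line by line and returns the lines,
# with a running count appended (without quotes) to every ID=... entry.
sub rename_ids_in_file {
    my ($filename) = @_;

    # Three-argument open with a lexical filehandle, checking for failure.
    open( my $in, '<', $filename ) or die "Cannot open $filename: $!";

    my %ids;    # ID=... string => number of times seen so far
    my @out;
    while ( my $line = <$in> ) {    # reads from the handle, not the name
        chomp $line;
        if ( $line =~ /(ID=.+)$/ ) {
            my $count = ++$ids{$1};
            $line .= $count;
        }
        push @out, $line;
    }
    close $in;
    return @out;
}

# Run only when a file name is given on the command line.
if (@ARGV) {
    say for rename_ids_in_file( $ARGV[0] );
}
```

Saved under a name of your choosing, it would be run as `perl script.pl data.txt`.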

      Hi, blacknight.

      In case the Modern::Perl error is still occurring, try the following to produce the results you wanted w/o "quotes":

      use strict;
      use warnings;

      my %hash;

      do {
          chomp;
          $_ = qq|$_.$hash{$1}| if /(ID=.+)$/ and ++$hash{$1};
          print + "$_\n"
      } for <DATA>;

        Hi Kenosis,

        Thank you very much. You are fantastic.

        The output is right, just as I want it, but now I also get the numbering on non-duplicate IDs:

        .... ID=Locus10035v1rpkm10.18.1

        .... Parent=Locus10035v1rpkm10.18

        .... ID=Locus10035v2rpkm2.50.1

        .... Parent=Locus10035v2rpkm2.50

        .... ID=Locus10073v1rpkm10.09.1

        .... Parent=Locus10073v1rpkm10.09

        .... ID=Locus10182v1rpkm9.88.1

        .... Parent=Locus10182v1rpkm9.88

        .... ID=Locus10210v1rpkm9.81.1

        But I would like the numbering only on duplicates, starting from the second occurrence onward.
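The request above (leave the first occurrence of each ID untouched and suffix only the repeats) could be sketched as follows. Two assumptions are made here: the second occurrence receives `.2`, the third `.3`, and so on, following monsoon's `++$ids{$1} > 1` test quoted earlier in the thread, and the `.` separator is kept from Kenosis's version. The helper name `number_repeats` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: takes chomped lines and returns them with a
# ".N" suffix on the Nth occurrence of an ID, for N >= 2 only.
sub number_repeats {
    my @lines = @_;    # work on a copy of the input
    my %seen;          # ID=... string => number of times seen so far

    for my $line (@lines) {
        next unless $line =~ /(ID=.+)$/;
        my $count = ++$seen{$1};
        $line .= ".$count" if $count > 1;    # first occurrence stays as it is
    }
    return @lines;
}

my @sample = (
    'xxxx . yyy 521 916 . + . ID=OSCAR1028v1rpkm9.67',
    'xxxx . xxx 521 567 . + . Parent=OSCAR1028v1rpkm9.67',
    'xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67',
    'xxxx . yyy 126 130 . - . ID=OSCAR10281vrpkm9.67',
);
print "$_\n" for number_repeats(@sample);
```

With this sample, only the last line changes: it ends in `ID=OSCAR10281vrpkm9.67.2`.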

        Hi everyone. I tried to resolve my issue with renaming duplicate data using this script, but it does not produce a result. Where is my mistake? My aim is a format in which each Parent has the same name as the ID preceding it. This is an input file:

        0409 . mR 21213 23782 12787 + . ID=0035v110.18"1"
        0409 . ex 21213 23782 . + . Parent=0035v110.18
        0409 . mR 22173 24122 9669 - . ID=0035v22.50"1"
        0409 . ex 22173 24122 . - . Parent=0035v22.50
        0409 . mR 86435 89419 14907 - . ID=0073v110.09"1"
        0409 . ex 86435 89419 . - . Parent=0073v110.09
        0409 . mR 76753 78963 10984 + . ID=0182v19.88"1"
        0409 . ex 76753 78963 . + . Parent=0182v19.88
        0409 . mR 40542 45144 20377 - . ID=0210v19.81"1"
        0409 . ex 45014 45144 . - . Parent=0210v19.81
        0409 . ex 44717 44939 . - . Parent=0210v19.81
        0409 . ex 44592 44625 . - . Parent=0210v19.81
        0409 . ex 41343 44469 . - . Parent=0210v19.81
        0409 . ex 41205 41221 . - . Parent=0210v19.81
        0409 . ex 40542 41122 . - . Parent=0210v19.81
        0409 . mR 43128 45064 8216 + . ID=0210v20.31_PRE"1"
        0409 . ex 43128 44469 . + . Parent=0210v20.31_PRE
        0409 . ex 44592 44625 . + . Parent=0210v20.31_PRE
        0409 . ex 44717 44939 . + . Parent=0210v20.31_PRE
        0409 . ex 45014 45064 . + . Parent=0210v20.31_PRE

        <code>
        use warnings;
        use feature "say";
        use Data::Dumper;

        my $filename = $ARGV[0];
        my $debug    = $ARGV[1];
        die "\n\tUSAGE: perl $0 output debug\n\n" unless $ARGV[0];
        die "\n\tERROR: Cannot find the file $ARGV[0]\n\n" unless -e $ARGV[0];
        open(IN,$filename);
        my $row = <IN>;
        my @tabula;
        my $i = 0;
        while ($row = <IN>) {
            $i++;
            @tabula = "$i) $row";
            my $row = split(/\n/,@tabula);
            my @new_tabula = split(/\t/,$row);
            #print @tabula;
            my @field = @tabula;
            foreach (my @field) {
                my $id;
                my $string;
                if($field[2] eq 'mR') {
                    $field[8] =~ /\tID(=.+)'$/;
                    $id = $1;
                    $string = "$field[0]\t.\tmR\t$field[3]\t$field[4]\t$field[5]\t$field[6]\t$field[7]\tID=$id\n";
                }
                elsif($field[2] eq 'ex') {
                    $string = "$field[0]\t.\tex\t$field[3]\t$field[4]\t$field[5]\t$field[6]\t$field[7]\tParent=$id\n";
                }
                print $string if $string;
            }
        }
        </code>
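The stated aim (each Parent line carrying the same name as the ID line before it) could be sketched without splitting the line into fields at all. This is an illustration under stated assumptions, not a drop-in fix: it assumes that the `ID=` or `Parent=` entry is the last whitespace-delimited token on each line, as in the sample input, and that every group of Parent lines directly follows its own ID line. The helper name `sync_parents` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: takes chomped lines and returns them with every
# Parent=... value replaced by the value of the most recent ID=... line.
sub sync_parents {
    my @lines = @_;     # work on a copy of the input
    my $current_id;     # value of the last ID= seen, including any suffix

    for my $line (@lines) {
        if ( $line =~ /ID=(\S+)\s*$/ ) {
            $current_id = $1;               # remember the ID as written
        }
        elsif ( defined $current_id ) {
            # Rewrite the trailing Parent=... to match that ID.
            $line =~ s/Parent=\S+\s*$/Parent=$current_id/;
        }
    }
    return @lines;
}

my @sample = (
    '0409 . mR 21213 23782 12787 + . ID=0035v110.18"1"',
    '0409 . ex 21213 23782 . + . Parent=0035v110.18',
);
print "$_\n" for sync_parents(@sample);
```

For this two-line sample, the second line comes out ending in `Parent=0035v110.18"1"`. Separately, note that in the script above the `my $row = <IN>;` before the `while` loop reads the first line of the file, and the loop then starts from the second line.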