I don't think I understand exactly what you wish to delete, but if I understood you correctly, this is what you wish to achieve:

convert this:

A 83 GLU A 90 GLU
A 163 ARG A 83 ARG
A 222 ARG A 5 ARG
A 229 ALA A 115 ALA
A 257 ALA A 118 ALA
A 328 ASP A 95 ASP
A 83 GLU A 90 GLU
A 163 ARG A 83 ARG
A 222 ARG A 5 ARG
A 83 GLU B 90 GLU
A 163 ARG B 83 ARG
A 222 ARG B 5 ARG

into this:

A 83 GLU B 90 GLU
A 163 ARG B 83 ARG
A 222 ARG B 5 ARG
A 229 ALA A 115 ALA
A 257 ALA A 118 ALA
A 328 ASP A 95 ASP

right?

code :

#!/usr/bin/perl
use strict;

my (%hash, %hash_key);
my $x = 0;

while (<DATA>){
    my @array = split(' ', $_);
    $x++;
    $hash{"$array[1]-$array[4]"} = $_;
    $hash_key{$x} = "$array[1]-$array[4]";
}

foreach my $i (sort {$a <=> $b} keys %hash_key){
    (exists $hash{$hash_key{$i}}) ? (print "$hash{$hash_key{$i}}") : (print "deleted\n");
    delete($hash{$hash_key{$i}}) if (exists $hash{$hash_key{$i}});
}
__DATA__
A 83 GLU A 90 GLU
A 163 ARG A 83 ARG
A 222 ARG A 5 ARG
A 229 ALA A 115 ALA
A 257 ALA A 118 ALA
A 328 ASP A 95 ASP
A 83 GLU A 90 GLU
A 163 ARG A 83 ARG
A 222 ARG A 5 ARG
A 83 GLU B 90 GLU
A 163 ARG B 83 ARG
A 222 ARG B 5 ARG

baxy

UPDATE:

Sorry, I had to go as soon as I posted the reply (reason: girlfriend).

Here is a more elegant solution. The first one has some bugs and limitations due to me being in a hurry ;)

code :

#!/usr/bin/perl
use strict;

my (%hash, %hash_key);                       # hashes
my $x = 0;                                   # counter

while (<DATA>){                              # read the data line by line
    my @array = split(' ', $_);              # split the line on whitespace
    $x++;                                    # global counter
    $hash{$array[1]}->{$array[4]} = $_;      # primary database
    $hash_key{$x} = [$array[1], $array[4]];  # key database (keeps the input order)
}

foreach my $i (sort {$a <=> $b} keys %hash_key){
    my ($first, $second) = @{ $hash_key{$i} };

    # if the record exists in the database (hash) print it, otherwise print 'deleted'
    (exists $hash{$first}->{$second}) ? (print "$hash{$first}->{$second}") : (print "deleted\n");

    # you need the empty lines, so if you reached an empty line, skip the deleting part
    next if ($first eq '');

    # once the entry has been printed, delete it; you don't need duplicates.
    # both orientations are removed, so this covers the 80 90 and the 90 80 situation
    delete($hash{$first}->{$second});
    delete($hash{$second}->{$first});
}
__DATA__
A 83 GLU A 90 GLU
A 163 ARG A 83 ARG
A 222 ARG A 5 ARG
A 229 ALA A 115 ALA
A 257 ALA A 118 ALA
A 328 ASP A 95 ASP
A 83 GLU A 90 GLU
A 163 ARG A 83 ARG
A 222 ARG A 5 ARG
A 83 GLU B 90 GLU
A 163 ARG B 83 ARG
A 222 ARG B 5 ARG

So what happens... When you think about removing duplicates, think about hashes. The first hash (%hash) is the actual database that holds all the data, and the second one (%hash_key) is the database that preserves the order. Once you have hashed your data, all you have to do is print it in the order in which you saved it, using %hash_key. Because a later line with the same two numbers overwrites the earlier one in %hash, the line that gets printed is the last one stored for that pair. The deletion that follows is there so you don't print duplicates, except when the line is blank. You can remove the 'deleted' note if you don't need it.
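
If the two-hash bookkeeping feels heavy, the same idea can probably be sketched with a single %seen hash, letting the input order itself do the job of the second hash. This is only a rough sketch under a few assumptions: the records are whitespace-separated, fields 2 and 5 hold the numbers to compare, a 90/83 pair counts as a repeat of 83/90, and the sub name dedup_pairs is made up for the illustration. Note that it keeps the first line seen for each pair, whereas the code above prints the last one stored.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the input lines with repeated number pairs dropped.
# Assumes fields 2 and 5 of each record are the numbers that identify a pair.
sub dedup_pairs {
    my @lines = @_;
    my (%seen, @kept);
    for my $line (@lines) {
        my @f = split ' ', $line;
        if (@f < 5) {                 # blank or short line: keep it untouched
            push @kept, $line;
            next;
        }
        # sort the two numbers so that 83/90 and 90/83 produce the same key
        my $key = join '-', sort { $a <=> $b } @f[1, 4];
        push @kept, $line unless $seen{$key}++;   # keep only the first occurrence
    }
    return @kept;
}

my @records = (
    "A 83 GLU A 90 GLU\n",
    "A 163 ARG A 83 ARG\n",
    "A 83 GLU B 90 GLU\n",    # same numbers as the first record
    "A 90 GLU A 83 GLU\n",    # reversed pair, also treated as a repeat
);
my @kept = dedup_pairs(@records);
print @kept;
```

If you would rather keep the last occurrence (the B lines in your data), one option is to feed the lines in reverse, then reverse the result again.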

baxy

ps

Also, if you have any questions about the code, just shoot. For example, if you are not familiar with this:

($a ==1) ? (print "yes") : (print "no");
since you stated that you are new to Perl and all...
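
In case it helps, here is a small sketch of what that ?: construct (the conditional operator) does, next to the if/else it stands for. The sub names yes_or_no and yes_or_no_long are made up just for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper using the conditional operator:
# condition ? value-if-true : value-if-false
sub yes_or_no {
    my ($value) = @_;
    return ($value == 1) ? "yes" : "no";
}

# The same decision written as a plain if/else
sub yes_or_no_long {
    my ($value) = @_;
    if ($value == 1) {
        return "yes";
    }
    else {
        return "no";
    }
}

print "1 gives: ", yes_or_no(1), "\n";
print "2 gives: ", yes_or_no_long(2), "\n";
```

In the scripts above the two branches are print statements instead of values, but the test-then-pick-one logic is the same.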

In reply to Re: delete redundant data by baxy77bax
in thread delete redundant data by nurulnad
