davi54 has asked for the wisdom of the Perl Monks concerning the following question:
>sp|O24310|EFTU_PEA Elongation factor Tu, chloroplastic OS=Pisum sativum OX=3888 GN=TUFA PE=2 SV=1
MALSSTAATTSSKLKLSNPPSLSHTFTASASASVSNSTSFR
>sp|Q43467|EFTU1_SOYBN Elongation factor Tu, chloroplastic OS=Glycine max OX=3847 GN=TUFA PE=3 SV=1
MAVSSATASSKLILLPHASSSSSLNSTPFRSSTTNTHKLTP
As shown above, both of these entries have GN=TUFA in their headers. I want to write a script that reads the GN value of every entry, keeps the first entry for each value, and removes all later entries with the same GN value, whether that value is TUFA or anything else. The result should be written to an output file. Can anyone please help me?
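To make the goal concrete, here is a minimal sketch of the behaviour described, not a tested solution for real UniProt files. It assumes that every record starts with a `>` header line, that the GN value contains no whitespace, and that records without a `GN=` field should always be kept (the question does not cover that case). The sub name `dedupe_by_gn` and the file names in the usage comment are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy FASTA records from $in to $out, keeping only the first record
# seen for each GN= value. Records with no GN= field are always kept
# (an assumption, since the question does not say what to do with them).
sub dedupe_by_gn {
    my ($in, $out) = @_;
    my %seen;        # GN values that have already been written
    my $keep = 0;    # true while the current record is being kept

    while (my $line = <$in>) {
        if ($line =~ /^>/) {
            # Header line: decide the fate of the whole record here.
            if ($line =~ /\bGN=(\S+)/) {
                $keep = !$seen{$1}++;   # true only the first time
            }
            else {
                $keep = 1;
            }
        }
        # Sequence lines inherit the decision made at their header.
        print {$out} $line if $keep;
    }
    return;
}

# Usage (hypothetical file names):
#   perl dedupe_gn.pl input.fasta output.fasta
if (@ARGV) {
    die "Usage: $0 input.fasta output.fasta\n" unless @ARGV == 2;
    my ($in_file, $out_file) = @ARGV;
    open my $in,  '<', $in_file  or die "Cannot read $in_file: $!";
    open my $out, '>', $out_file or die "Cannot write $out_file: $!";
    dedupe_by_gn($in, $out);
    close $out or die "Cannot close $out_file: $!";
}
```

With the two sample entries above as input, this sketch would keep the EFTU_PEA record and drop the EFTU1_SOYBN record, because both carry GN=TUFA and the pea entry comes first. The `%seen` hash holds only the GN values, so memory use stays small even for large files.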