Hi fellow Monks!
I have two lines, like the following:
$str1='---DAAAGLRG--G--G-P-LT-I--A--PG----A-----T----LG---G-YG--------
+--------------------------------SVT----------------------------------
+---------------------------------------------------G-------NV-T------
+NN---G----TI----SVANALPSLASSLPGDFRIF---------------------------------
+--------------------------GTLTNAGVVELRGRVVGN--G-LA-V-S------------G--
+------N---Y---VGQN----------------------GAVN-------------MN-TT-------
+--L--AG--D-----------------------------------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+----------------------------------------------------G----------------
+-----A-------PS-------D-TL-LI---------------GGVPA-VATAS---------G----
+K--------T----T---------L--------------------------------------------
+--------------N-----VTNVGG---------------AGAL------------------------
+------------------------------------------TK-SDGI---------RL-VY------
+----------AVNFA-N---------T-------------------G---N-A--F--TLAG----GTV
+S--AG----------------------------------------------------------------
+---------AYSYY--------------LV--KGGV-T-------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+--------------------A-----------------LTG---------EDWYLR-S-----------
+---------------------------------------------------TVPPR-P-DQ---P----
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+-T-QQ--PPF-----------------------------------------------------------
+---------------------S--V-A---DG-TP-ES--I-----------V----------------
+--E--AV-K---N----------------A--AP-DA--------------------------------
+--------------------------------------------------------K-------PEP--
+------------------------------------------------------------V--------
+----------YR---------------------------------------------------------
+----------PEV--PL-YS-----------EVP-----------------------------------
+---------------------------------------------------------------------
+---------------------------------------------A--VARQ-----------------
+-----LG---L-L------------Q--------IDT-F-H----------------DRQ-------G-
+EQG--LL-----AEN-G-S--------------------------------------------------
+---------------------------------------------------------------------
+---------------------------------------------------------------------
+------VP----VSWSRVW-----------GGY---SN------IKQ-NG-------------------
+-------DVTPSY--DGTVW-----G--MQVGQ---DLY-----ADNRP-------SGHRNHYGFF---
+-LGF------SR--AIGDVNGFA--------------------------------------LAQPDL--
+------GVGSLQVN-A-Y-N----L--G--G-YWT-----------------------------H----
+IGPG--------------GWYTDA--------------------------VV--MGS-V--LT---V--
+RTHSN-------------------------------N------NVSGS--T-D--GNA--VTGS-V--E
+AGV--P--I------------SL------G-YG----------L--------------T----L-----
+----E-PQA-QLLW-QWLS-LA--RFND------G-------V--------------------------
+--------SDV----T--W-----NN-GNTFLGR----IG-ARL--------QY-----AFDAN-----
+-GVSWK--------------------PYLRVNVLR--S--FG-S--DD----------RTT-----FG-
+----GS----TT------------------------IG-TQ-VG-------Q--T--AGQIGA-GL-VA
+-Q--LT-KR----GSVYA--T--V--S---Y---------LT-NL-----GG----E----H----QR-
+---T---I--T---GNAGVRW--';
$str2='XXXXXXXXXXX..X..X.X.XX.X..X..XX....X.....X....XX...X.XX........
+................................XXX..................................
+...................................................X.......XX.X......
+XX...X....XX....XXXXX................................................
+.........................................XXX..X.XX.X.X............X..
+......X...X...XXXX......................XXXX.............XX.XX.......
+..X..XX..X...........................................................
+.....................................................................
+.....................................................................
+....................................................X................
+.....X.......XX.......X.XX.XX...............XXX......XX.........X....
+X........X....X.........X............................................
+..............X.....XXXXXX................XXX........................
+..........................................XX.XXXX.........XX.X.......
+...........XXXX.X..X....X.X...................X...X.X..X..XXX......XX
+X..XX................................................................
+.........XXXXX..............XX..XXXX.X...............................
+.....................................................................
+.....................................................................
+.....................................................................
+.....................................................................
+.....................................................................
+....................X..............XXXXXX.........XXXXXX.X...........
+...................................................XXXXX.X.XX...X....
+.....................................................................
+.....................................................................
+.....................................................................
+.....................................................................
+.....................................................................
+.....................................................................
+.....................................................................
+.X.XX..XXX.........XX..X...X.........................................
+.....................X..X.X...XX.XX.XX..X...........X................
+..X..XX.X...X................X..XX.XX................................
+........................................................X.......XXX..
+............................................................X........
+...X..X.X.XX.........................................................
+..........XXX..XX.XX...........XXX...................................
+.....................................................................
+.............................................X..XXXX.................
+.....XX...X.XX.........XXX........XXX.X.X................XXX.......X.
+XXX..XX......XX.X.X..................................................
+.....................................................................
+.....................................................................
+......XX....XIIIIII...........III...II.......XXXX....................
+........XXXXX..XXXXX.....X..XXXXX...XXX.....XXX............XXXXXXX...
+.XXX......XX..XXXXXX.............................................X...
+........XXXXXX.X.X.X....X..X..X.XXX.............................X....
+XXXX...............XXXXX..........................XX..XXX.X..XX...X..
+XXXXXX.XX..XX......................XX......XXXXX..X.X..XXX..XXXX.X..X
+XXX..X..X............XX......X..X..........X..............X....X.....
+....X.XXX.XXXX.XXXX.XX..XXXX......X.......XX....X....................
+X.X.....XXX....X..X.....XX.XXXXXXX....XX.XXX........XX.....XXXXX.....
+.XXXXX....................XXXXXXXXX..X..XX.X..XX....XX...XXXX.....XX.
+....XX....XX............X....XX....XXX.XX.XX.......X..X..XXXXXX.XX.XX
+.X..XX.XX....XXXXX..X..X..X...X.........X...X......X....X....X....XX.
+...X...X..X...XXXXXXXXX';
The goal is to delete from $str2 every position at which $str1 has a -, and to drop the - characters from $str1 as well. The desired output should then be:
DAAAGLRGGGPLTIAPGATLGGYGSVTGNVTNNGTISVANALPSLASSLPGDFRIFGTLTNAGVVELRGR
+VVGNGLAVSGNYVGQNGAVNMNTTLAGDGAPSDTLLIGGVPAVATASGKTTLNVTNVGGAGALTKSDGI
+RLVYAVNFANTGNAFTLAGGTVSAGAYSYYLVKGGVTALTGEDWYLRSTVPPRPDQPTQQPPFSVADGT
+PESIVEAVKNAAPDAKPEPVYRPEVPLYSEVPAVARQLGLLQIDTFHDRQGEQGLLAENGSVPVSWSRV
+WGGYSNIKQNGDVTPSYDGTVWGMQVGQDLYADNRPSGHRNHYGFFLGFSRAIGDVNGFALAQPDLGVG
+SLQVNAYNLGGYWTHIGPGGWYTDAVVMGSVLTVRTHSN------NNVSGSTDGNAVTGSVEAGVPISL
+GYGLTLEPQAQLLWQWLSLARFNDGV----SDVTWNNGNTFLGRIGARLQYAFDANGVSWKPYLRVNVL
+RSFGSDDRTTFGGSTTIGTQVGQTAGQIGAGLVAQLTKRGSVYATVSYLTNLGGEHQRTITGNAGVRW
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.............................
+.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.....XXXXXXXXXXXXXX.XXXXXXXXX
+XXX..XXXXXXXXXXXXX..XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
+XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX..XXXXIIIII
+IIIIIIXXXX..XXXXXXXXXXXXXXXXXXXXXX.....XXXXXXXXXXXXXXXXXX.......X...X
+XXXXXXXXXXXXXXXXXXX.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
+X.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
+XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX..X.X.XXXXXXXXXXXXX
My approach would be to split $str1 and $str2 into single characters and then, for each position in $str1 that is a -, erase the corresponding position in $str2.
The problem is that I have a very large file of such cases, and I reckon split would be rather slow.
Is there a faster way, maybe?
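For reference, one way to avoid split altogether might be a mask built with tr plus a bitwise string AND. The following is only a minimal sketch under some assumptions: both strings are plain ASCII and of equal length, $str2 never contains a NUL byte, and the sub name strip_gap_columns is made up for illustration. Note that under `use v5.28` or `use feature 'bitwise'` the string AND is spelled `&.` instead of `&`.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): drop every column
# where $seq has a '-' from both $seq and $ann.
# Assumes equal-length ASCII strings and no NUL bytes in $ann.
sub strip_gap_columns {
    my ($seq, $ann) = @_;
    die "length mismatch\n" unless length($seq) == length($ann);

    # Build a mask: \xFF where $seq has a residue, NUL where it has a gap.
    my $mask = $seq;
    $mask =~ tr/-/\xFF/c;    # every character other than '-' becomes \xFF
    $mask =~ tr/-/\0/;       # every '-' becomes NUL

    # String bitwise AND: characters under \xFF survive unchanged,
    # characters under NUL turn into NUL.
    my $kept = $ann & $mask;
    $kept =~ tr/\0//d;       # delete the NUL placeholders

    # Remove the gaps from the sequence itself.
    (my $ungapped = $seq) =~ tr/-//d;

    return ($ungapped, $kept);
}

my ($s1, $s2) = strip_gap_columns('--DA-AG-', 'XX.X.XI.');
print "$s1\n$s2\n";
```

Both tr and the string AND make a single pass over the data without building per-character lists, so this should scale better than split on long strings, though it is worth benchmarking on the real file.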