in reply to Re^5: Replacing substrings within hash values
in thread Replacing substrings within hash values

Now that I'm no longer iterating over keys %sequences, is there a way to reset a variable for each key? I now wish to add a second type of edit where letters can be deleted or inserted (rather than replaced). This will shift all remaining positions out of sync, so the changes would need to be tracked so that they can be compensated for. My original idea was to keep a cumulative total of the insertion/deletion sizes and adjust each remaining position accordingly; that total would reset to 0 for each key.

Or would it be better to just create an array, with an entry for each key?


Re^7: Replacing substrings within hash values
by BrowserUk (Patriarch) on Mar 25, 2016 at 13:22 UTC
    I now wish to add a second type of edit where letters can be deleted or inserted (rather than replaced).

    One method would be to load all the edits for a particular sequence into an array, and then perform them in reverse order by position.

    By doing those at the end first, any changes to length do not affect edits for earlier parts of the string.

    In the following I've used the redundant third field to hold the action 'I'nsert, 'D'elete, or 'R'eplace:

    #! perl -slw
    use strict;
    use Inline::Files;
    use Data::Dump qw[ pp ];

    use constant { SEQ => 0, POS => 1, ACT => 2, REP => 3 };

    my %seqs = map{ split "\n", $_ } <FASTA>;
    pp \%seqs;

    my @edits = [ split ' ', <EDITS> ];
    while( 1 ) {
        my @bits = split ' ', <EDITS>;
        if( defined $bits[ 0 ] and $bits[ 0 ] eq $edits[ 0 ][ 0 ] ) {
            push @edits, \@bits;
            next;
        }
        for my $edit ( sort{ $b->[POS] <=> $a->[POS] } @edits ) {
            if( $edit->[ACT] eq 'I' ) {
                substr( $seqs{ '>' . $edit->[SEQ] }, $edit->[POS]-1, 0, $edit->[REP] );
            }
            elsif( $edit->[ACT] eq 'D' ) {
                substr( $seqs{ '>' . $edit->[SEQ] }, $edit->[POS]-1, 1, '' );
            }
            else { ## replace
                substr( $seqs{ '>' . $edit->[SEQ] }, $edit->[POS]-1, 1, $edit->[REP] );
            }
        }
        last unless defined $bits[ 0 ];
        @edits = \@bits;
    }
    pp \%seqs;

    __FASTA__
    >I
    CATCAGTATAAAATGACTAGTAGCTAGATACCACAGATACGATACAACA
    >II
    TACCACAGATACGATACAACACATCAGTATAAAATGACTAGTAGCAGAC
    __EDITS__
    I 2 I I
    I 4 D X
    I 5 R G
    I 7 I C
    II 1 D X
    II 2 I I
    II 3 R T
    II 5 D X
    II 8 R T
    II 10 I I

    I've also used I and X as the 'replacement char' for insert and delete respectively to make verification easier.

    Outputs:

    C:\test>1158701
    {
      ">I" => "CATCAGTATAAAATGACTAGTAGCTAGATACCACAGATACGATACAACA",
      ">II" => "TACCACAGATACGATACAACACATCAGTATAAAATGACTAGTAGCAGAC",
    }
    {
      ">I" => "CIATGGCTATAAAATGACTAGTAGCTAGATACCACAGATACGATACAACA",
      ">II" => "IATCCATAITACGATACAACACATCAGTATAAAATGACTAGTAGCAGAC",
    }
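
    For anyone without Inline::Files installed, the reverse-order idea can probably be sketched as a plain subroutine using only core Perl. The name apply_edits and the [ position, action, text ] layout are assumptions made for this illustration, not part of the code posted above; positions are 1-based and refer to the unedited string, as in the __EDITS__ data.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): apply a list of
# [ $pos, $action, $text ] edits to a string and return the result.
# Positions are 1-based and refer to the ORIGINAL, unedited string.
sub apply_edits {
    my( $seq, @edits ) = @_;

    # Work from the highest position down, so a change in length never
    # shifts the position of an edit that is still waiting to be applied.
    for my $edit ( sort { $b->[0] <=> $a->[0] } @edits ) {
        my( $pos, $act, $text ) = @$edit;
        if( $act eq 'I' ) {
            substr( $seq, $pos - 1, 0, $text );   # insert before $pos
        }
        elsif( $act eq 'D' ) {
            substr( $seq, $pos - 1, 1, '' );      # delete one character
        }
        else {
            substr( $seq, $pos - 1, 1, $text );   # replace one character
        }
    }
    return $seq;
}

# The four edits for sequence I from the thread's sample data.
my $edited = apply_edits(
    'CATCAGTATAAAATGACTAGTAGCTAGATACCACAGATACGATACAACA',
    [ 2, 'I', 'I' ], [ 4, 'D', 'X' ], [ 5, 'R', 'G' ], [ 7, 'I', 'C' ],
);
print "$edited\n";
```

    Fed the four edits for sequence I, this should print the same edited string as the ">I" entry in the second dump above. Two edits at the same position are not handled specially in this sketch.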

Re^7: Replacing substrings within hash values
by poj (Abbot) on Mar 25, 2016 at 11:30 UTC
    a variable for each key

    That's just another hash.

    my %offset = ();

    # inserts
    $offset{$key} += 1; # length of insert

    # deletes
    $offset{$key} -= 1;
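
    To make the per-key offset idea concrete, here is a minimal sketch of how such a hash might be used when edits are applied in ascending position order. The helper name apply_edit, its argument order, and the use of plain sequence names as keys are assumptions for illustration only; the I/D/R action codes and the sample edits are borrowed from the other reply in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not code from the thread): apply one
# edit to $seqs->{$key}, shifting its position by the running offset for
# that key. Edits for a given key must arrive in ascending position order.
sub apply_edit {
    my( $seqs, $offset, $key, $pos, $act, $text ) = @_;

    # 0-based index, corrected by the net length change made so far.
    # A key that has not been edited yet has no entry, which counts as 0.
    my $at = $pos - 1 + ( $offset->{$key} // 0 );

    if( $act eq 'I' ) {
        substr( $seqs->{$key}, $at, 0, $text );
        $offset->{$key} += length $text;    # later positions move right
    }
    elsif( $act eq 'D' ) {
        substr( $seqs->{$key}, $at, 1, '' );
        $offset->{$key} -= 1;               # later positions move left
    }
    else {
        substr( $seqs->{$key}, $at, 1, $text );   # same length, no shift
    }
    return;
}

my %seqs = ( I => 'CATCAGTATAAAATGACTAGTAGCTAGATACCACAGATACGATACAACA' );
my %offset;

apply_edit( \%seqs, \%offset, 'I', @$_ )
    for [ 2, 'I', 'I' ], [ 4, 'D', 'X' ], [ 5, 'R', 'G' ], [ 7, 'I', 'C' ];

print "$seqs{I}\n";
```

    Because each key gets its own hash entry, and a missing entry is treated as 0, nothing needs to be reset explicitly when the edits move on to the next sequence.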
    poj