Re^2: Replacing substrings within hash values

Re^3: Replacing substrings within hash values by BillKSmith (Monsignor) on Mar 24, 2016 at 13:27 UTC
The main problem with your code is that it must read the entire IN file for each sequence. It would work if you rewound the IN file (`seek IN, 0, 0`) at the end of the sequence loop, but this approach is very inefficient: run time would become unacceptable for larger files. Your use of `last` is fine for sequence I. For all other sequences, it would exit the loop before it gets to the processing. Use `next` instead.
Bill
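A minimal sketch of the rewind-and-`next` fix described above, not the original poster's script. The short sequences, the edit lines, and the in-memory filehandle standing in for the IN file are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the poster's data: a small %sequences hash
# and an edit list that would normally be read from the IN file.
my %sequences = (
    I  => 'CATCAGT',
    II => 'TACCACA',
);
my $edits = "I 2 A G\nI 4 C T\nII 1 T C\nII 3 C T\n";

# An in-memory filehandle keeps the sketch self-contained.
open my $in, '<', \$edits or die "Cannot open in-memory file: $!";

for my $key (sort keys %sequences) {
    while (my $line = <$in>) {
        my @F = split ' ', $line;
        # Skip edits that belong to other sequences. Using `last` here
        # would leave the loop before reaching this sequence's edits.
        next if $F[0] ne $key;
        substr($sequences{$key}, $F[1] - 1, 1) = $F[3];
    }
    # Rewind so the next sequence scans the edit list from the start.
    seek $in, 0, 0;
}

print "$_ => $sequences{$_}\n" for sort keys %sequences;
```

The whole edit list is rescanned once per sequence, which is the inefficiency mentioned above: the cost grows with the number of sequences times the number of edits.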
Re^4: Replacing substrings within hash values by K_Edw (Beadle) on Mar 24, 2016 at 13:57 UTC
I see, thank you. The script now works as desired after fixing the usage of substr, using next, and adding seek(IN,0,0). Is there a more efficient way to accomplish this, or is this an unavoidable consequence of the way the script is structured? My thought was that I'd only need to iterate over IN once, as it's sorted: once $F[0] no longer equalled the current key, the script would move on to the next sequence and resume reading IN where it left off.
Re^5: Replacing substrings within hash values by poj (Abbot) on Mar 24, 2016 at 14:20 UTC
Since you can access a sequence using its key, there is no need to loop through them until you need to print.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my %sequences = (
    I  => 'CATCAGTATAAAATGACTAGTAGCTAGATACCACAGATACGATACAACA',
    II => 'TACCACAGATACGATACAACACATCAGTATAAAATGACTAGTAGCAGAC',
);

# Each DATA line: sequence key, 1-based position, old base, new base.
while (<DATA>) {
    my @f = split(/\s+/, $_);
    substr($sequences{$f[0]}, $f[1]-1, 1) = $f[3];
}

# Expected result for sequence I, printed for comparison.
print "CGTTGGCATAAAATGACTAGTAGCTAGATACCACAGATACGATACAACA\n";
for my $key (keys %sequences){
    print $sequences{$key}."\n";
}
__DATA__
I 2 A G
I 4 C T
I 5 A G
I 7 T C
II 1 T C
II 2 A G
II 3 C T
II 5 A C
II 8 G T
II 10 T G
```

poj
Re^6: Replacing substrings within hash values by K_Edw (Beadle) on Mar 24, 2016 at 17:25 UTC
Re^6: Replacing substrings within hash values by K_Edw (Beadle) on Mar 25, 2016 at 11:16 UTC
Re^7: Replacing substrings within hash values by BrowserUk (Patriarch) on Mar 25, 2016 at 13:22 UTC
Re^7: Replacing substrings within hash values by poj (Abbot) on Mar 25, 2016 at 11:30 UTC
Re^5: Replacing substrings within hash values by BillKSmith (Monsignor) on Mar 24, 2016 at 22:48 UTC
I did not realize that you were trying to exploit the ordering of the edits. One way that your original code is wrong is that it ignores an edit which does not belong to the current sequence rather than applying it to the next sequence. This would not be simple to fix. I strongly recommend you change to BrowserUk's algorithm. It would be very easy to add a test to verify that the character at the position to be edited is what the edit expects. The additional effort would be paid back the first time it finds an example of inconsistent data.
Bill
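The consistency test suggested above could look something like the following sketch. It is not BrowserUk's algorithm. The helper name `apply_edit`, its argument order, and the sample data are hypothetical, and it assumes the third column of each edit line holds the base expected at that position, which is how the sample data earlier in the thread reads.

```perl
use strict;
use warnings;

# Hypothetical helper: apply one edit only if the base currently at the
# 1-based position matches what the edit expects.
# Returns 1 on success, 0 on a mismatch or an out-of-range position.
sub apply_edit {
    my ($seq_ref, $pos, $expected, $replacement) = @_;
    return 0 if $pos < 1 || $pos > length $$seq_ref;
    my $found = substr $$seq_ref, $pos - 1, 1;
    return 0 if $found ne $expected;
    substr($$seq_ref, $pos - 1, 1) = $replacement;
    return 1;
}

# Assumed sample data: the second edit is deliberately inconsistent.
my %sequences = ( I => 'CATCAGT' );
for my $line ("I 2 A G", "I 4 T C") {
    my ($key, $pos, $expected, $replacement) = split ' ', $line;
    apply_edit(\$sequences{$key}, $pos, $expected, $replacement)
        or warn "Inconsistent edit for $key at position $pos: expected $expected\n";
}

print "$sequences{I}\n";
```

A mismatched edit is reported with `warn` and skipped, so one bad line does not silently corrupt the sequence. Whether to skip it or `die` instead depends on how much the input data is trusted.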
Re^6: Replacing substrings within hash values by BrowserUk (Patriarch) on Mar 25, 2016 at 01:57 UTC
Re^7: Replacing substrings within hash values by BillKSmith (Monsignor) on Mar 25, 2016 at 03:51 UTC