confused with strings

ktsirig has asked for the wisdom of the Perl Monks concerning the following question:

Hi all! I have this assignment to give in Biology class and I am stuck. Assignment: I am given a sequence of letters, say:
XXXXXXABCDXXXXXXXX
and I am interested in the part ABCD, which represents letters #7-#10, as you can see. I am then given the same sequence, which now contains characters like * and ! (only these 2 are allowed), say:
XX**XXX!!!X**AB*!C*DXX*!!!XXX**XXX
and I want to find out which letters now represent the part ABCD. If you count, you see that ABCD is now letters #14-#20 [7 characters (4 * and 3 !) were added prior to A and 10 prior to D].
What I think must be done is:
1) check how many (if any) * and/or ! were added prior to the start letter A (#7)
2) check how many (if any) * and/or ! were added prior to the end letter D (#10)
3) add those counts to the starting and ending positions of part ABCD
Has anyone got any hints to give me as to which functions of Perl will be useful for this problem?
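
The three steps above can be sketched as follows. This is only an illustration, assuming that `*` and `!` are the only inserted characters and that every other letter is unchanged; `mutated_position` is a hypothetical helper name, not part of the assignment.

```perl
use strict;
use warnings;

# Hypothetical helper: map a 1-based position in the original sequence
# to the 1-based position of the same letter in the edited sequence.
# Assumes '*' and '!' are the only inserted characters.
sub mutated_position {
    my ($mutated, $orig_pos) = @_;
    my $letters_seen = 0;
    for my $i (0 .. length($mutated) - 1) {
        my $char = substr $mutated, $i, 1;
        next if $char eq '*' or $char eq '!';    # inserted, not an original letter
        $letters_seen++;
        return $i + 1 if $letters_seen == $orig_pos;
    }
    return;    # the original position does not exist in the edited sequence
}

my $mutated = 'XX**XXX!!!X**AB*!C*DXX*!!!XXX**XXX';
my $start   = mutated_position($mutated, 7);
my $end     = mutated_position($mutated, 10);
print "#$start-#$end\n";    # prints "#14-#20"
```

Because it works from positions rather than from the letters themselves, this sketch does not depend on A or D being unique in the sequence.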

Re: confused with strings by abcde (Scribe) on Jan 11, 2006 at 11:12 UTC
I thought of the function index first, which gets the position of a string (or character, in this case).

```perl
my $seq = "XX**XXX!!!X**AB*!C*DXX*!!!XXX**XXX";
# The code presumes that it's a correct sequence.
# If you're getting the sequence from somewhere else you
# should check it to make sure it contains the right letters.
# index() counts from 0, so add 1 to match the 1-based numbering in the question.
print "#" . (index($seq, "A") + 1) . "-#" . (index($seq, "D") + 1) . "\n";
```

This is the easiest way; your method works too, but it would involve splitting up the string and operating on that, so using index is easier.
Re: confused with strings by GrandFather (Saint) on Jan 11, 2006 at 11:40 UTC
Are you given ABCD or are you given the position and length (or start and end) of the substring? Can the * and !'s occur anywhere? You should give a before and after sample. Something like this:

```
__DATA__
XXXXXXABCDXXXXXXXX
XX**XXX!!!X**AB*!C*DXX*!!!XXX**XXX
```

Want to print:

```
10 characters added
XXXXXXABCD**!!!***!*XX*!!!XXX**XXX
```

The following code may be a good starting point:

```perl
use strict;
use warnings;

my $match = 'ABCD';

# Lines come in pairs: the original sequence, then the edited one.
while (<DATA>) {
    my $org = $_;
    defined (my $mutated = <DATA>) or die "Missing edited line";
    # Everything up to and including the last letter of $match ...
    my $segment = substr $mutated, 0, index ($mutated, substr $match, 3, 1) + 1;
    # ... and everything after it.
    my $suffix = substr $mutated, index ($mutated, substr $match, 3, 1) + 1;
    (my $pInsert = $segment) =~ tr/!*//cd;    # keep only the inserted characters
    (my $pSegment = $segment) =~ tr/!*//d;    # keep only the original letters
    print length ($pInsert) . " characters added\n";
    print "$pSegment$pInsert$suffix\n";
}
__DATA__
XXXXXXABCDXXXXXXXX
XX**XXX!!!X**AB*!C*DXX*!!!XXX**XXX
```

Note that some error checking should be added and that there are a few assumptions about the match string and what the X sequences can actually contain.

DWIM is Perl's answer to Gödel
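
One piece of the error checking mentioned above could be a sanity check that the edited line really is the original plus `*`/`!` insertions and nothing else. A minimal sketch, assuming those two characters never occur in the original; `check_edit` is a hypothetical name, not part of the code in this reply.

```perl
use strict;
use warnings;

# Hypothetical helper: return the number of inserted '*' and '!' characters
# if $mutated is $original with only those characters added, else undef.
sub check_edit {
    my ($original, $mutated) = @_;
    (my $stripped = $mutated) =~ tr/*!//d;    # drop every inserted character
    return unless $stripped eq $original;     # something else was changed
    return ($mutated =~ tr/*!//);             # tr without /d only counts
}

my $count = check_edit('XXXXXXABCDXXXXXXXX', 'XX**XXX!!!X**AB*!C*DXX*!!!XXX**XXX');
print defined $count ? "$count characters inserted\n" : "sequences do not match\n";
```

The count returned here covers the whole line, whereas the "characters added" figure in the reply only covers the part up to the end of the match.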
Re: confused with strings by borisz (Canon) on Jan 11, 2006 at 10:36 UTC
```perl
$_ = 'XX**XXX!!!X**AB*!C*DXX*!!!XXX**XXX';
/A[!\*]*B[!\*]*C[!\*]*D/g
    and print '#', pos() + 1 - length($&), '-#', pos();
__OUTPUT__
#14-#20
```

Boris
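
The same idea can be generalised by building the pattern from the motif instead of typing it by hand. The sketch below is a variant under stated assumptions, not Boris's code: it uses the built-in `@-` and `@+` arrays (match start and end offsets) rather than `$&`, and `motif_span` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: find $motif in $seq, allowing any run of '*' or '!'
# between consecutive motif letters. Returns 1-based (start, end), or an
# empty list when the motif is empty or not found.
sub motif_span {
    my ($seq, $motif) = @_;
    return unless length $motif;
    # 'ABCD' becomes A[*!]*B[*!]*C[*!]*D; quotemeta guards unusual characters.
    my $pattern = join '[*!]*', map { quotemeta($_) } split //, $motif;
    return unless $seq =~ /$pattern/;
    return ($-[0] + 1, $+[0]);    # $-[0] is the 0-based start, $+[0] is one past the end
}

my ($start, $end) = motif_span('XX**XXX!!!X**AB*!C*DXX*!!!XXX**XXX', 'ABCD');
print "#$start-#$end\n";    # prints "#14-#20"
```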
Re: confused with strings by Perl Mouse (Chaplain) on Jan 11, 2006 at 11:03 UTC
If there is just one A and one D, you can simply use the `index` function to find the indices of the characters in the string. `Perl --((8:>*`
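
The "just one A and one D" condition can be checked before trusting the result. A minimal sketch, assuming a hypothetical helper `unique_index`: it compares `index` (first occurrence) with `rindex` (last occurrence) and converts the zero-based offset to the 1-based numbering used in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: 1-based position of $char in $seq,
# or undef if $char is missing or occurs more than once.
sub unique_index {
    my ($seq, $char) = @_;
    my $first = index($seq, $char);
    return if $first < 0;                        # not found at all
    return if $first != rindex($seq, $char);     # found more than once
    return $first + 1;
}

my $seq   = 'XX**XXX!!!X**AB*!C*DXX*!!!XXX**XXX';
my $start = unique_index($seq, 'A');
my $end   = unique_index($seq, 'D');
if (defined $start and defined $end) {
    print "#$start-#$end\n";    # prints "#14-#20"
}
else {
    print "A or D is missing or repeated\n";
}
```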
