in reply to Merging Two Strings

This oughta do it.
my $str3 = 'ATGATG';
my $str4 = 'ATGATG';
my $catted = "$str3 $str4";
$catted =~ s/(.*)(.+) (?=\2)/$1/;
print "$catted\n";
Make the + a {2,} to specify 2 or more character minimum overlap.
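To make the minimum-overlap tweak concrete, here is a minimal sketch that wraps the substitution in a function and interpolates the minimum length into the quantifier. The name merge_min_overlap and the $min parameter are made up for illustration, and it assumes neither input contains a space.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): merge two strings, dropping an
# overlap of at least $min characters between the end of $left and the start
# of $right. Assumes neither string contains a space.
sub merge_min_overlap {
    my ($left, $right, $min) = @_;
    my $catted = "$left $right";
    # Same substitution as above, with + generalised to {$min,}
    $catted =~ s/(.*)(.{$min,}) (?=\2)/$1/;
    return $catted;
}

print merge_min_overlap('ATGATG',  'ATGATG', 1), "\n";   # ATGATGATG
print merge_min_overlap('ATGATG',  'ATGATG', 4), "\n";   # ATGATG
print merge_min_overlap('GATTACA', 'ACATGC', 2), "\n";   # GATTACATGC
```

Because (.*) is greedy, the second group ends up with the shortest overlap that still satisfies the minimum, which is why raising $min from 1 to 4 changes the result for the ATGATG pair.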

Caution: Contents may have been coded under pressure.

Re^2: Merging Two Strings
by sauoq (Abbot) on Oct 27, 2005 at 15:59 UTC

    For one, if there is no overlap, that'll return a string with a space in it. For another, he did say he wanted the maximal overlap with the beginning of string 2. Yours gives the minimal.

    -sauoq
    "My two cents aren't worth a dime.";
    
      Those are spec issues, not defects in my solution. There's no spec saying what happens if there's no overlap, so how is it wrong to return a string with a space in it? It's better than starting a game of nethack or deleting all the files on the user's disk.

      The maximal overlap rule conflicts with the last-overlap rule. Doing maximal overlap is easier than what I posted (you remove the first capture group and substitute nothing in). And changing the + to a * ensures that the space gets removed, even if the overlap is empty. Does this conform to your understanding of the specs?

      $catted =~ s/(.*) (?=\1)//;
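As a quick check of the maximal-overlap substitution above, the following sketch runs it on an overlapping pair, a non-overlapping pair, and the ATGATG pair. The wrapper name merge_max_overlap is hypothetical, and the space is assumed to occur in neither input.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not from the thread) around the one-group substitution.
sub merge_max_overlap {
    my ($left, $right) = @_;
    my $catted = "$left $right";   # assumes neither input contains a space
    # Remove the longest suffix of $left that is also a prefix of $right,
    # together with the separating space. An empty overlap still matches,
    # so the space is removed even when the strings share nothing.
    $catted =~ s/(.*) (?=\1)//;
    return $catted;
}

print merge_max_overlap('GATTACA', 'ACATGC'), "\n";   # GATTACATGC
print merge_max_overlap('AAA',     'CCC'),    "\n";   # AAACCC
print merge_max_overlap('ATGATG',  'ATGATG'), "\n";   # ATGATG
```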

      Caution: Contents may have been coded under pressure.

        That merges ATGATG and ATGATG as ATGATG - not what the OP gave as an explicit example.

        The problem is that a regex returns the first (leftmost) match, which may not be the best match. In this case the first match is the whole string, whereas the best match is the last repeated substring (ATG).
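To illustrate the point about first match versus best match, this sketch lists every possible overlap for the ATGATG pair and then shows which one the pattern actually picks. The helper all_overlaps is a made-up name, and it uses substr and index rather than a regex to enumerate the candidates.

```perl
use strict;
use warnings;

# Hypothetical helper: every suffix of $left that is also a prefix of $right,
# longest first. The empty overlap always qualifies and comes last.
sub all_overlaps {
    my ($left, $right) = @_;
    my @found;
    for my $len (reverse 0 .. length($left)) {
        my $suffix = substr($left, length($left) - $len);
        push @found, $suffix if index($right, $suffix) == 0;
    }
    return @found;
}

my @candidates = all_overlaps('ATGATG', 'ATGATG');
print join(', ', map { "'$_'" } @candidates), "\n";   # 'ATGATG', 'ATG', ''

# The pattern succeeds at the leftmost start position it can, which
# corresponds to the longest candidate rather than the 'ATG' one.
my $catted = 'ATGATG ATGATG';
if ($catted =~ /(.*) (?=\1)/) {
    print "match starts at offset ", $-[0], " with overlap '$1'\n";   # offset 0, 'ATGATG'
}
```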


        Perl is Huffman encoded by design.
        Does this conform to your understanding of the specs?

        Yes. (More precisely, with my assumption about which of the OP's examples is wrong.) I think that's almost equivalent† to my much more verbose, probably slower, certainly uglier C-style function. Very nice. ++

        † It would be completely equivalent except that yours won't work on arbitrary binary data since you require a unique character not otherwise present in the strings to build your concatenation. That obviously wouldn't pose a problem for the OP, however.
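For data that may contain any byte, including the space, a separator-free version is possible. The sketch below is not the C-style function mentioned above (which does not appear in this part of the thread); it is one assumed way to get the maximal overlap with substr comparisons, under the hypothetical name merge_no_sentinel.

```perl
use strict;
use warnings;

# Hypothetical helper: merge on the longest suffix of $left that equals a
# prefix of $right, without joining the strings around a separator character.
sub merge_no_sentinel {
    my ($left, $right) = @_;
    my $max = length($left) < length($right) ? length($left) : length($right);
    for my $len (reverse 1 .. $max) {
        # Compare the last $len characters of $left with the first $len of $right
        if (substr($left, -$len) eq substr($right, 0, $len)) {
            return $left . substr($right, $len);
        }
    }
    return $left . $right;   # no overlap: plain concatenation
}

print merge_no_sentinel('GATTACA', 'ACATGC'), "\n";   # GATTACATGC
print merge_no_sentinel('a b',     'b c'),    "\n";   # a b c
print merge_no_sentinel('AAA',     'CCC'),    "\n";   # AAACCC
```

The second call uses inputs that contain spaces, which is the case the regex approach cannot handle reliably.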

        -sauoq
        "My two cents aren't worth a dime.";