typsy_in has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I am trying to find the common pattern between 2 strings. For example, I have
str1=a11211b123 str2=a11222b123
The Perl script should print out a112[11][22]b123. Could someone tell me a better way of doing this? Thanks, Vijay

Replies are listed 'Best First'.
Re: Finding Overlapping between 2 strings
by Corion (Patriarch) on Jan 03, 2009 at 15:13 UTC

    Well, to show you a better way, we need to see the code you've written so far. You could also tell us why you think your code isn't ideal.

    If your strings are of equal length and the overlappings start at the same position, finding them is easy:

    my $str1 = 'a11211b123';
    my $str2 = 'a11222b123';
    my $overlapped = $str1 ^ $str2;
    $overlapped =~ s/\0/X/g;
    print $overlapped;

    If your strings are of differing lengths, you will have to look at something like Algorithm::Diff.
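
    To make the XOR idea a bit more concrete, here is a minimal sketch that reports where two equal-length strings differ. The helper name diff_runs is made up for illustration. It relies on the fact that XORing two identical bytes gives a NUL byte, so every run of non-NUL bytes in the mask marks a stretch where the strings disagree. It assumes both arguments are plain strings of the same length.

```perl
use strict;
use warnings;

# Hypothetical helper: return [offset, length] pairs for each run
# where two equal-length strings differ.
sub diff_runs {
    my ( $s1, $s2 ) = @_;
    die "strings must be the same length\n"
        unless length($s1) == length($s2);

    # XOR of two equal bytes is NUL, so non-NUL runs mark the differences.
    my $mask = $s1 ^ $s2;

    my @runs;
    while ( $mask =~ /[^\0]+/g ) {
        # @- and @+ hold the start and end offsets of the last match.
        push @runs, [ $-[0], $+[0] - $-[0] ];
    }
    return @runs;
}

for my $run ( diff_runs( 'a11211b123', 'a11222b123' ) ) {
    my ( $pos, $len ) = @$run;
    print "differ at offset $pos for $len chars\n";
}
# prints: differ at offset 4 for 2 chars
```

    The offsets can then be fed to substr on either string to pull out the differing pieces.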

Re: Finding Overlapping between 2 strings
by bart (Canon) on Jan 03, 2009 at 16:06 UTC
    • split both strings into arrays of characters
    • use Algorithm::Diff to compare the two arrays
    • convert the output into the format you like
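
    The three steps above can be sketched roughly as follows. To keep the example runnable without any CPAN module, a plain longest-common-subsequence table stands in for Algorithm::Diff here; that swap is an assumption of this sketch, not what the module does internally, and the name bracket_diff is made up. Unlike the XOR approach, it copes with strings of different lengths.

```perl
use strict;
use warnings;

# Hypothetical helper: bracket the differing stretches of two strings.
# A simple LCS table replaces Algorithm::Diff so the sketch is core-only.
sub bracket_diff {
    my ( $s1, $s2 ) = @_;

    # Step 1: split both strings into arrays of characters.
    my @a = split //, $s1;
    my @b = split //, $s2;

    # Step 2: compare. $lcs[$i][$j] is the LCS length of
    # @a[$i..$#a] and @b[$j..$#b].
    my @lcs;
    push @lcs, [ (0) x ( @b + 1 ) ] for 0 .. scalar(@a);
    for my $i ( reverse 0 .. $#a ) {
        for my $j ( reverse 0 .. $#b ) {
            if ( $a[$i] eq $b[$j] ) {
                $lcs[$i][$j] = $lcs[ $i + 1 ][ $j + 1 ] + 1;
            }
            else {
                my ( $down, $right ) = ( $lcs[ $i + 1 ][$j], $lcs[$i][ $j + 1 ] );
                $lcs[$i][$j] = $down >= $right ? $down : $right;
            }
        }
    }

    # Step 3: walk the table and build the bracketed output.
    my ( $out, $only1, $only2 ) = ( '', '', '' );
    my $flush = sub {
        $out .= "[$only1][$only2]" if length($only1) || length($only2);
        ( $only1, $only2 ) = ( '', '' );
    };
    my ( $i, $j ) = ( 0, 0 );
    while ( $i < @a && $j < @b ) {
        if ( $a[$i] eq $b[$j] ) {
            $flush->();              # close any pending difference
            $out .= $a[$i];
            $i++;
            $j++;
        }
        elsif ( $lcs[ $i + 1 ][$j] >= $lcs[$i][ $j + 1 ] ) {
            $only1 .= $a[ $i++ ];    # character only in the first string
        }
        else {
            $only2 .= $b[ $j++ ];    # character only in the second string
        }
    }
    # Whatever is left over on either side is a trailing difference.
    $only1 .= join '', @a[ $i .. $#a ];
    $only2 .= join '', @b[ $j .. $#b ];
    $flush->();
    return $out;
}

print bracket_diff( 'a11211b123',  'a11222b123' ),  "\n";  # a112[11][22]b123
print bracket_diff( 'xa11211b123', 'a11222b123x' ), "\n";  # [x][]a112[11][22]b123[][x]
```

    The table costs time and memory proportional to the product of the two lengths, so for long inputs the real module is probably the better choice.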
Re: Finding Overlapping between 2 strings
by ww (Archbishop) on Jan 03, 2009 at 15:17 UTC
    The Monastery mores hold that the poster of a question such as this should:
    1. Show some effort, i.e., post what you've tried
    2. Tell how that code fails to meet your expectations (including error messages, if any)
    3. Use the tools provided here (including Search, Super_Search, Tutorials, etc.) to find answers before posting a question that has already been answered repeatedly.

    Hint: search terms such as "common strings" may be helpful.

Re: Finding Overlapping between 2 strings
by BrowserUk (Patriarch) on Jan 03, 2009 at 16:47 UTC

    Something like this might be what you're looking for:

    #! perl -slw
    use strict;

    sub common {
        die 'Bad args' unless @_ == 2;
        my $output = $_[ 0 ] ^ $_[ 1 ];
        $output =~ s[([^\0]+|[\0]+)]{
            my $p = $-[0];
            my $l = $+[0] - $-[ 0 ];
            ord( $1 ) == 0
                ? substr( $_[ 0 ], $p, $l )
                : join '', map{ '[' . substr( $_, $p, $l ) . ']' } @_
        }ge;
        return $output;
    }

    while( my $str1 = <DATA> ) {
        my $str2 = <DATA>;
        print common( $str1, $str2 );
    }

    __END__
    a11211b123
    a11222b123
    The quick brown fox jumps over the lazy dog
    the quiet green fir looms over the lazy dog
    Send reinforcements; we're going to advance
    Send 3 and sixpence; we're going to a dance
    Time flies like an arrow
    Dung flies like a  brick

    Output:

    c:\test>junk1
    a112[11][22]b123
    [T][t]he qui[ck][et] [b][g]r[ow][ee]n f[ox][ir] [jump][loom]s over the lazy dog
    Send [rei][3 a]n[forcem][d sixp]en[ts][ce]; we're going to a[dv][ d]ance
    [Time][Dung] flies like a[n][ ] [a][b]r[row][ick]

    Not sure about the usefulness of the output format though.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Finding Overlapping between 2 strings
by FunkyMonk (Bishop) on Jan 04, 2009 at 00:10 UTC
    This is my first adventure with Algorithm::Diff, so perhaps it isn't the best use of the module:

    use feature 'say';    # say() is not enabled by default
    use Algorithm::Diff;

    say diff_str( qw/xa11211b123 a11222b123x/ );

    sub diff_str {
        my $diff = Algorithm::Diff->new( map { [ split // ] } @_ );
        my $diff_str = '';
        while ( $diff->Next ) {
            my ($h1, $h2) = map { join "", $diff->Items( $_ ) } 1, 2;
            $diff_str .= $diff->Diff ? "[$h1][$h2]" : $h1;
        }
        return $diff_str;
    }

    #Output: [x][]a112[11][22]b123[][x]


    Unless I state otherwise, all my code runs with strict and warnings
Re: Finding Overlapping between 2 strings
by swampyankee (Parson) on Jan 03, 2009 at 17:10 UTC

    We can't tell you a better way of doing anything unless you show us the code you're using to do it presently.



Re: Finding Overlapping between 2 strings
by spx2 (Deacon) on Jan 04, 2009 at 15:47 UTC