Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

That is, I want to match the right edge of the first string (say, the last 12 characters) against the left edge of the second string (say, the first 12 characters). The number of matched characters can vary, but it must be an exact match. This is needed for a contig generation tool. Can anyone suggest a site with Perl or C++ code that does this kind of thing?
  • Comment on What is the Regular expression to search for boundry match for two strings?

Replies are listed 'Best First'.
Re: What is the Regular expression to search for boundry match for two strings?
by snowcrash (Friar) on Apr 03, 2001 at 10:05 UTC
    i hope i've understood your question right:
    #!/usr/bin/perl -w
    use strict;

    my $length = 6;
    my $left   = "xfafasfsdfasdfasdsfFOOBAR";
    my $right  = "FOOBAR4t11";

    if ( substr($left, -$length) eq substr($right, 0, $length) ) {
        print "Match!\n";
    }
    Update: Oh, you asked for a regular expression:
    my $intersect = substr($left, -$length);
    # \Q...\E quotes any regex metacharacters that $intersect may contain
    if ( $right =~ /^\Q$intersect\E/ ) {
        print "matched again!\n";
    }

    snowcrash //////
      Thank you very much. This will help me.
      Can we perform the match by first concatenating the two strings and then querying the result?
      my $left  = "xfafasfsdfasdfasdsfFOOBAR";
      my $right = "FOOBAR4t11";
      # my $concat = "xfafasfsdfasdfasdsfFOOBARFOOBAR4t11";
      my $concat = $left . $right;

      I should be able to hold on to FOOBAR in a back reference so that I can do further manipulation.
      I need it this way because I do not know the size of the boundaries that will match, and I would like to match as many characters as possible.
      After this match, the two strings should be concatenated. The concatenated string will have a single copy of FOOBAR in the middle, like:
      my $conc = "xfafasfsdfasdfasdsfFOOBAR4t11";
      Now this concatenated string will need to be matched against one more string, say:
      my $str3 = "AR4t11xysadgsfje";

      Now AR4t11 will match and the above process is repeated.
      How efficient will this be if I have strings of about 500 characters each and need to compare about 10000 such sequences with each other?

      Thank you.
      braj
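
The process described above (find the largest overlap, join the strings with a single copy of it, then repeat with the next string) can be sketched without a regular expression. This is only an illustration under stated assumptions: overlap_length and merge_overlap are hypothetical helper names, not part of any module, and the loop simply tries the longest candidate overlap first.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: length of the longest suffix of $left
# that is also a prefix of $right (0 if there is none).
sub overlap_length {
    my ($left, $right) = @_;
    my $max = length($left) < length($right) ? length($left) : length($right);
    for (my $n = $max; $n > 0; $n--) {
        return $n if substr($left, -$n) eq substr($right, 0, $n);
    }
    return 0;
}

# Hypothetical helper: join two strings, keeping one copy of the overlap.
sub merge_overlap {
    my ($left, $right) = @_;
    my $n = overlap_length($left, $right);
    return $left . substr($right, $n);
}

my $conc = merge_overlap("xfafasfsdfasdfasdsfFOOBAR", "FOOBAR4t11");
print "$conc\n";    # xfafasfsdfasdfasdsfFOOBAR4t11

# Repeat the process with the third string; "AR4t11" is the overlap here.
$conc = merge_overlap($conc, "AR4t11xysadgsfje");
print "$conc\n";    # xfafasfsdfasdfasdsfFOOBAR4t11xysadgsfje
```

On efficiency: each call tries up to about 500 candidate lengths, and comparing 10000 sequences with each other in both orders means roughly 10^8 calls, so a brute-force all-against-all pass like this may well be too slow in practice.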
Re: What is the Regular expression to search for boundry match for two strings?
by I0 (Priest) on Apr 03, 2001 at 11:53 UTC
    Is this what you mean?
    "$left#$right" =~ /(.*)#\1/
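
To show how the captured overlap from this pattern might be used, here is a minimal sketch with the strings from earlier in the thread. It assumes that '#' occurs in neither string and that neither string contains a newline (since . does not match a newline by default). Because the regex engine returns the leftmost match, and the capture must end right before the '#', $1 comes out as the longest suffix of $left that is also a prefix of $right.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $left  = "xfafasfsdfasdfasdsfFOOBAR";
my $right = "FOOBAR4t11";

# Assumption: '#' appears in neither string, so it is safe as a separator.
# In list context the match returns the captured group; it always succeeds,
# with an empty capture when the strings do not overlap at all.
my ($overlap) = "$left#$right" =~ /(.*)#\1/;

# Keep one copy of the overlap when joining the two strings.
my $merged = $left . substr($right, length($overlap));

print "overlap: $overlap\n";    # overlap: FOOBAR
print "merged:  $merged\n";     # merged:  xfafasfsdfasdfasdsfFOOBAR4t11
```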
Re: What is the Regular expression to search for boundry match for two strings?
by dvergin (Monsignor) on Apr 03, 2001 at 09:54 UTC
    Hmmm... perhaps an example would help us more easily understand what you are describing.
Re: What is the Regular expression to search for boundry match for two strings?
by scain (Curate) on Apr 03, 2001 at 18:28 UTC
    By contig generation tool, do you mean for DNA sequences? If so, you should definitely check out bioperl.org.

    Scott