in reply to searching for strings

Your problem _sounds_ complicated, but it's pretty simple, really.

First off, the 2 matching strings have to be the same length, so you can use the length function to check that.

Next, if you use substr to look at the last character of 2 strings, the first n-1 characters have to be equal and the last characters have to differ by 1 when compared using ord, or the strings aren't a match.

Finally, you can use split to break your strings apart using a sequence of digits as the delimiter, like: my @fields = split /(\d+)/, $str; Because the pattern /(\d+)/ has capturing parentheses, the delimiter gets returned too, so you can make sure that the string fields are the same and that the number field differs by only one.
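To see what that split returns, here is a minimal sketch. The sample strings are made up for illustration; only split and the /(\d+)/ pattern come from the description above.

```perl
use strict;
use warnings;

# Hypothetical sample strings, not taken from the original question
my @fields_a = split /(\d+)/, 'BBC49';
my @fields_b = split /(\d+)/, 'A1B22C';

# The capturing parentheses make split return the digit runs as fields too;
# trailing empty fields are dropped by split.
print join('|', @fields_a), "\n";    # BBC|49
print join('|', @fields_b), "\n";    # A|1|B|22|C
```

Note that a string that starts with digits would get an empty leading field, which is harmless as long as both strings are split the same way.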

Once you've got all that bundled into a comparison function, all you have to do is scan the list for matches for each string, and pair them up...
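Roughly, the comparison function and the scan could look like the sketch below. The name is_match and the sample list are hypothetical, and this version relies only on the split-based check, comparing the number fields numerically; it assumes a match means exactly one number field differs by 1.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $x and $y split into the same fields
# except for exactly one numeric field whose values differ by 1.
sub is_match {
    my ($x, $y) = @_;
    my @fx = split /(\d+)/, $x;
    my @fy = split /(\d+)/, $y;
    return 0 unless @fx == @fy;

    my $diffs = 0;
    for my $i (0 .. $#fx) {
        next if $fx[$i] eq $fy[$i];
        # Fields that differ must both be numbers exactly 1 apart
        return 0 unless $fx[$i] =~ /^\d+$/ && $fy[$i] =~ /^\d+$/;
        return 0 unless abs($fx[$i] - $fy[$i]) == 1;
        $diffs++;
    }
    return $diffs == 1 ? 1 : 0;
}

# Hypothetical sample data
my @list = qw(BBC49 XYZ7 BBC50 XYZ9);

# Naive pairwise scan: compare every string with every later one
my @pairs;
for my $i (0 .. $#list - 1) {
    for my $j ($i + 1 .. $#list) {
        push @pairs, [ $list[$i], $list[$j] ]
            if is_match($list[$i], $list[$j]);
    }
}
print "$_->[0] <-> $_->[1]\n" for @pairs;
```

With this sample list, the only pair reported is BBC49 and BBC50; XYZ7 and XYZ9 are rejected because their numbers differ by 2.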

I think you can get the rest of the way to your answer from here...


Mike

Re^2: searching for strings
by BrowserUk (Patriarch) on Aug 06, 2007 at 11:12 UTC
    Next, if you use substr to look at the last character of 2 strings, the first n-1 characters have to be equal and the last characters have to differ by 1 when compared using ord, or the strings aren't a match.

    That doesn't work for 'BBC49' and 'BBC50'.

    Also, the process you are describing is O(N^2). Feasible, but impractical for large lists.
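    One possible way around the pairwise scan is a hash lookup. This is only a sketch and not something proposed in this thread: it assumes every string ends in a single run of digits and that a match means "same prefix, number larger by 1". The sample data and variable names are made up.

```perl
use strict;
use warnings;

# Hypothetical sample data; assumes each string ends in one run of digits
my @list = qw(BBC49 ABC99 BBC50 XYZ7 ABC100);

# Index every string once so each lookup is constant time on average
my %seen = map { $_ => 1 } @list;

my @pairs;
for my $str (@list) {
    # Split into a prefix and the trailing number; skip strings without one
    my ($prefix, $num) = $str =~ /^(.*?)(\d+)$/ or next;

    # Build the string that would be this one's partner and look it up.
    # Caveat: numeric addition drops leading zeros ('007' becomes 8).
    my $next = $prefix . ($num + 1);
    push @pairs, [ $str, $next ] if $seen{$next};
}
print "$_->[0] <-> $_->[1]\n" for @pairs;
```

    Each string is examined once, so the work grows roughly linearly with the list instead of quadratically.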


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: searching for strings
by Limbic~Region (Chancellor) on Aug 06, 2007 at 13:23 UTC
    RMGir,
    First off, the 2 matching strings have to be the same length, so you can use the length function to check that.

    I think the problem is more complicated than you think. As BrowserUk points out, the OP has indicated that 'BBC49' and 'BBC50' are a match. Even if you corrected for that, I think your first discriminating factor would be wrong. Consider ABC99 and ABC100.
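    A quick sketch of why both checks fall over on these two pairs. It only uses length, substr and ord as described in the parent post; the %result hash is just a hypothetical place to keep the numbers.

```perl
use strict;
use warnings;

my %result;
for my $pair ([qw(BBC49 BBC50)], [qw(ABC99 ABC100)]) {
    my ($x, $y) = @$pair;

    # Check 1: do the two strings have the same length?
    my $same_length = length($x) == length($y) ? 1 : 0;

    # Check 2: how far apart are the last characters, compared with ord?
    my $last_diff = abs(ord(substr($x, -1)) - ord(substr($y, -1)));

    $result{"$x/$y"} = [ $same_length, $last_diff ];
    print "$x/$y: same length = $same_length, last-char difference = $last_diff\n";
}
```

    For both pairs the last characters are '9' and '0', which are 9 apart under ord, and ABC99/ABC100 additionally fails the length test, even though the numbers themselves differ by exactly 1.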

    Cheers - L~R